Abstract:
This thesis introduces a novel approach to automatic timeline generation from news data, an essential task in Information Retrieval. Our work addresses two challenges: generating timelines and scaling across languages. We focus mainly on Arabic Timeline Generation, which is under-resourced in the literature. Our approach builds on the improved capabilities of Large Language Models (LLMs), particularly their accuracy and efficiency in text generation and summarization. We present the architecture of our system and highlight the components that distinguish it from other approaches in the literature, in particular the use of an LLM, the state-of-the-art GPT-4 Turbo model, to extract events and dates from news sources. We demonstrate that our system outperforms existing methods on key accuracy metrics. Because the system does not rely on language-specific features, it can generate timelines independently of the input language, which makes it scalable and adaptable to various applications. To our knowledge, this is the first work to use LLMs for the Timeline Generation task and the first to build a Timeline Generation system independent of language-specific features. The experimental results indicate that our approach fills critical gaps in Timeline Generation. This work thus represents a substantial advance in Timeline Generation, offering new grounds for research and application in various contexts.