Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/33563
Title: Combining Semantic Matching, Word Embeddings, Transformers, and LLMs for Enhanced Document Ranking: Application in Systematic Reviews
Authors: Mitrov, Goran
Stanoev, Boris
Gievska, Sonja 
Mirceva, Georgina
Zdravevski, Eftim 
Issue Date: 4-Sep-2024
Publisher: MDPI AG
Journal: Big Data and Cognitive Computing
Abstract: The rapid increase in scientific publications has made it challenging to keep up with the latest advancements. Conducting systematic reviews using traditional methods is both time-consuming and difficult. To address this, new review formats like rapid and scoping reviews have been introduced, reflecting an urgent need for efficient information retrieval. This challenge extends beyond academia to many organizations where numerous documents must be reviewed in relation to specific user queries. This paper focuses on improving document ranking to enhance the retrieval of relevant articles, thereby reducing the time and effort required by researchers. By applying a range of natural language processing (NLP) techniques, including rule-based matching, statistical text analysis, word embeddings, and transformer- and LLM-based approaches like Mistral LLM, we assess the similarity of articles to user-specific inputs and prioritize the articles according to relevance. We propose a novel methodology, Weighted Semantic Matching (WSM) + MiniLM, combining the strengths of the different methodologies. For validation, we employ global metrics such as precision at K, recall at K, average rank, median rank, and pairwise comparison metrics, including higher rank count, average rank difference, and median rank difference. Our proposed algorithm achieves optimal performance, with an average recall at 1000 of 95% and an average median rank of 185 for selected articles across the five datasets evaluated. These findings are promising for pinpointing the relevant articles and reducing manual work.
URI: http://hdl.handle.net/20.500.12188/33563
DOI: 10.3390/bdcc8090110
Appears in Collections:Faculty of Computer Science and Engineering: Journal Articles
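
The global metrics named in the abstract (precision at K, recall at K, average rank, median rank) can be illustrated with a minimal sketch. It uses the standard textbook definitions, which may differ in detail from the paper's own implementation. The function names and the toy document IDs are hypothetical and do not come from the paper.

```python
from statistics import mean, median


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def relevant_ranks(ranked_ids, relevant_ids):
    """1-based positions of the relevant documents found in the ranking."""
    return [
        rank
        for rank, doc_id in enumerate(ranked_ids, start=1)
        if doc_id in relevant_ids
    ]


# Hypothetical toy data: a ranking produced by some scoring method and
# the set of articles a reviewer selected as relevant.
ranked = ["d3", "d7", "d1", "d9", "d4", "d2"]
relevant = {"d7", "d4", "d8"}  # "d8" is never retrieved in this ranking

ranks = relevant_ranks(ranked, relevant)
print("precision@3:", precision_at_k(ranked, relevant, 3))
print("recall@5:", recall_at_k(ranked, relevant, 5))
print("average rank:", mean(ranks))
print("median rank:", median(ranks))
```

In this sketch, lower average and median ranks mean the relevant articles sit closer to the top of the list, so a reviewer reaches them after screening fewer documents.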

Files in This Item:
File                    Size        Format
BDCC-08-00110-v2.pdf    713.99 kB   Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.