Faculty of Computer Science and Engineering

Permanent URI for this community: https://repository.ukim.mk/handle/20.500.12188/5

The Faculty of Computer Science and Engineering (FCSE) within UKIM is the largest and most prestigious faculty in the field of computer science and technologies in Macedonia, and among the largest faculties in that field in the region. The FCSE teaching staff consists of 50 professors and 30 associates. These include many “best in field” personnel, such as the most referenced scientists in Macedonia and the most influential professors in the ICT industry in the Republic of Macedonia.

Search Results

Now showing 1 - 6 of 6
  • Item type: Publication
    RAGCare-QA: A Benchmark Dataset for Evaluating Retrieval-Augmented Generation Pipelines in Theoretical Medical Knowledge
    (Cold Spring Harbor Laboratory Press, 2025)
    Dobreva, Jovana; Karasmanakis, Ivana; Ivanisevic, Filip; Horvat, Tadej; Gams, Matjaz
    The paper introduces RAGCare-QA, a dataset of 420 theoretical medical knowledge questions for assessing Retrieval-Augmented Generation (RAG) pipelines in medical education and evaluation settings. The dataset contains single-choice questions from six medical specialties (Cardiology, Endocrinology, Gastroenterology, Family Medicine, Oncology, and Neurology) at three levels of complexity (Basic, Intermediate, and Advanced). Each question is labeled with the best-fitting RAG implementation complexity level: Basic RAG (315 questions, 75.0%), Multi-vector RAG (82 questions, 19.5%), or Graph-enhanced RAG (23 questions, 5.5%). The questions emphasize theoretical medical knowledge of fundamental concepts, pathophysiology, diagnostic criteria, and treatment principles important in medical education. The dataset is a useful tool for assessing RAG-based medical education systems, allowing researchers to fine-tune retrieval methods for various categories of theoretical medical knowledge questions.
  • Item type: Publication
    Benchmarking OpenAI’s APIs and other Large Language Models for Repeatable and Efficient Question Answering Across Multiple Documents
    (IEEE, 2024-09-08)
    Filipovska, Elena; Mladenovska, Ana; Bajrami, Merxhan; Dobreva, Jovana; Hillman, Velislava
    The rapid growth of document volumes and complexity in various domains necessitates advanced automated methods to enhance the efficiency and accuracy of information extraction and analysis. This paper aims to evaluate the efficiency and repeatability of OpenAI’s APIs and other Large Language Models (LLMs) in automating question-answering tasks across multiple documents, specifically focusing on analyzing Data Privacy Policy (DPP) documents of selected EdTech providers. We test how well these models perform on large-scale text processing tasks using OpenAI’s LLMs (GPT 3.5 Turbo, GPT 4, GPT 4o) and APIs in several frameworks: direct API calls (i.e., one-shot learning), LangChain, and Retrieval Augmented Generation (RAG) systems. We also evaluate a local deployment of quantized versions (with FAISS) of LLMs (Llama-2-13B-chat-GPTQ). Through systematic evaluation against predefined use cases and a range of metrics, including response format, execution time, and cost, our study aims to provide insights into the optimal practices for document analysis. Our findings demonstrate that using OpenAI’s LLMs via API calls is a workable workaround for accelerating document analysis when a local GPU-powered infrastructure is not a viable solution, particularly for long texts. On the other hand, local deployment is valuable for keeping the data within private infrastructure. Our findings show that the quantized models retain substantial relevance even with fewer parameters than ChatGPT and do not impose processing restrictions on the number of tokens. This study offers insights on maximizing the use of LLMs for better efficiency and data governance, in addition to confirming their usefulness in improving document analysis procedures.
  • Item type: Publication
    Post COVID depression prediction using Twitter data
    (Ss Cyril and Methodius University in Skopje, Faculty of Computer Science and Engineering, Republic of North Macedonia, 2023-07)
    Spirovska, Eva; Dobreva, Jovana; Lucas, Mary; Vodenska, Irena; Chitkushev, Lou
    This study aims to investigate the prevalence of post-COVID-19 depression by collecting, preprocessing, and analyzing English-language tweets using several natural language processing (NLP) models. Our primary objective is to identify depression-related tweets and develop a machine learning (ML) model for depression prediction. Two datasets are employed for this research: the first is a publicly available depression dataset from Kaggle, and the second is a long COVID dataset obtained from Twitter between April 2020 and April 2022. By leveraging NLP techniques and ML algorithms, we analyze these datasets to gain insights into the pandemic’s impact on mental health and identify key features associated with depression. Although the chosen classification model had promising results, it still misclassified certain data, prompting the incorporation of Twitter account classification. This integration resulted in tweets being classified more accurately.
  • Item type: Publication
    Survey of NLP in pharmacology: Methodology, tasks, resources, knowledge, and tools
    (2022-08-22)
    Trajkovski, Vangel; Dimitrieva, Makedonka; Dobreva, Jovana; Jovanovik, Milos
    Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process human language, understand it to a certain degree, and use it in various applications. This area has developed rapidly in the last few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adverse drug interactions in social media. We split our coverage into five categories to survey modern NLP methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We split each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in tabular form. The resulting survey presents a comprehensive overview of the area, useful to practitioners and interested observers.
  • Item type: Publication
    Forensics investigation comparison of privacy-oriented cryptocurrencies
    (Scientific Technical Union of Mechanical Engineering "Industry 4.0", 2022)
    Taneska, Marija; Dobreva, Jovana
    Digital cryptocurrencies, especially privacy-oriented cryptocurrencies, have experienced significant growth in usage over the past years. The increased usage of privacy-oriented cryptocurrencies, driven by the privacy and anonymity they offer, allows cybercriminals to carry out illegal transactions that are harder to trace than Bitcoin transactions. In this paper, we provide a forensic overview of the privacy-oriented cryptocurrencies Monero, Verge, Dash, and Zcash. We analyse forensic experiments with these cryptocurrencies and draw some assumptions and conclusions from the analysed experiments.
  • Item type: Publication
    Application of Machine Learning in DES Cryptanalysis
    (2020-09-24)
    Andonov, Stefan; Dobreva, Jovana; Lumburovska, Lina; Pavlov, Stefan
    The use of machine learning is expanding across all scientific fields, and the discipline has become increasingly popular in recent years. In this paper we consider the application of machine learning in cryptanalysis, specifically in the cryptanalysis of the DES algorithm. This algorithm works in 16 rounds, and we perform two analyses: one for a single round and one for all rounds. We use different datasets and a specific neural network for each analysis. We present results from several experiments with different datasets and different keys. Furthermore, we analyze and compare the obtained results, present them visually and textually, and draw some conclusions.