Faculty of Computer Science and Engineering

Permanent URI for this community: https://repository.ukim.mk/handle/20.500.12188/5

The Faculty of Computer Science and Engineering (FCSE) within UKIM is the largest and most prestigious faculty in the field of computer science and technologies in Macedonia, and among the largest faculties in that field in the region. The FCSE teaching staff consists of 50 professors and 30 associates, including some of the most cited scientists in Macedonia and the most influential professors in the country's ICT industry.


Search Results

Now showing 1 - 10 of 13
  • Item type: Publication
    Benchmarking Sentence Encoders in Associating Indicators With Sustainable Development Goals and Targets
    (Institute of Electrical and Electronics Engineers (IEEE), 2025)
    Gjorgjevikj, Ana
    The United Nations’ 2030 Agenda for Sustainable Development balances the economic, environmental, and social dimensions of sustainable development in 17 Sustainable Development Goals (SDGs), monitored through a well-defined set of targets and global indicators. Although essential for humanity’s future well-being, this monitoring is still challenging due to the variable quality of the statistical data of global indicators compiled at the national level and the diversity of indicators used to monitor sustainable development at the subnational level. Associating indicators other than the global ones with the SDGs/targets may help not only to expand the statistical data, but also to better align the efforts toward sustainable development taken at the (sub)national level. This article presents a model-agnostic framework for associating such indicators with the SDGs and targets by comparing their textual descriptions in a common representation space. While removing the dependence on the quantity and quality of the statistical data of the indicators, it provides human experts with data-driven suggestions on the complex and not always obvious associations between the indicators and the SDGs/targets. A comprehensive domain-specific benchmarking of a diverse sentence encoder portfolio was performed first, followed by fine-tuning of the best ones on a newly created dataset. Five sets of indicators used at the (sub)national level of governance (around 800 indicators in total) were used for the evaluation. Finally, the influence of 40 factors on the results was analyzed using explainable artificial intelligence (xAI) methods.
The results show that 1) certain sentence encoders are better suited to solving the task than others (potentially due to their diverse pre-training datasets), and 2) fine-tuning not only improves predictive performance over the baselines but also reduces sensitivity to changes in indicator description length (performance drops by up to 17% for baseline m...
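The association step described above — embedding indicator and target descriptions in a common representation space and ranking targets by similarity — can be sketched as follows. The bag-of-words "encoder", the paraphrased target texts, and the function names are illustrative stand-ins, not the paper's actual models or data:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a pretrained sentence encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical target descriptions (paraphrased, not the official UN wording).
targets = {
    "SDG 6.1": "achieve access to safe and affordable drinking water",
    "SDG 7.2": "increase the share of renewable energy",
    "SDG 4.1": "ensure all children complete free primary and secondary education",
}

def suggest(indicator, k=2):
    """Return the k targets whose descriptions are most similar to the indicator's."""
    emb = embed(indicator)
    ranked = sorted(targets, key=lambda t: cosine(emb, embed(targets[t])), reverse=True)
    return ranked[:k]

print(suggest("share of households with access to safe drinking water"))
```

A real pipeline would swap `embed` for one of the benchmarked sentence encoders; the ranking-by-similarity logic stays the same.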
  • Item type: Publication
    Evaluating Trustworthiness in AI: Risks, Metrics, and Applications Across Industries
    (MDPI AG, 2025-07-04)
    Nastoska, Aleksandra; Jancheska, Bojana; Rizinski, Maryan
    Ensuring the trustworthiness of artificial intelligence (AI) systems is critical as they become increasingly integrated into domains like healthcare, finance, and public administration. This paper explores frameworks and metrics for evaluating AI trustworthiness, focusing on key principles such as fairness, transparency, privacy, and security. This study is guided by two central questions: how can trust in AI systems be systematically measured across the AI lifecycle, and what are the trade-offs involved when optimizing for different trustworthiness dimensions? By examining frameworks such as the NIST AI Risk Management Framework (AI RMF), the AI Trust Framework and Maturity Model (AI-TMM), and ISO/IEC standards, this study bridges theoretical insights with practical applications. We identify major risks across the AI lifecycle stages and outline various metrics to address challenges in system reliability, bias mitigation, and model explainability. This study includes a comparative analysis of existing standards and their application across industries to illustrate their effectiveness. Real-world case studies, including applications in healthcare, financial services, and autonomous systems, demonstrate approaches to applying trust metrics. The findings reveal that achieving trustworthiness involves navigating trade-offs between competing metrics, such as fairness versus efficiency or privacy versus transparency, and emphasize the importance of interdisciplinary collaboration for robust AI governance. Emerging trends suggest the need for adaptive frameworks for AI trustworthiness that evolve alongside advancements in AI technologies. This paper contributes to the field by providing a comprehensive review of existing frameworks with guidelines for building resilient, ethical, and transparent AI systems, ensuring their alignment with regulatory requirements and societal expectations.
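As one concrete instance of the fairness metrics such evaluations rely on, a demographic parity difference can be computed in a few lines. The metric choice and the group predictions below are illustrative assumptions, not taken from the paper:

```python
def positive_rate(preds):
    """Fraction of positive (e.g., approval) decisions in a group."""
    return sum(preds) / len(preds)

def demographic_parity_diff(preds_a, preds_b):
    """Absolute gap in positive-prediction rates between two groups.

    A value near 0 suggests parity; larger values flag potential bias.
    """
    return abs(positive_rate(preds_a) - positive_rate(preds_b))

# Synthetic binary decisions (1 = approved) for two demographic groups.
group_a = [1, 0, 1, 1]
group_b = [1, 0, 0, 0]
print(demographic_parity_diff(group_a, group_b))  # 0.5
```

In practice this is one of several competing metrics; optimizing parity alone can trade off against accuracy or calibration, which is exactly the tension the paper highlights.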
  • Item type: Publication
    Multiword Discourse Markers Across Languages: A Linguistic and Computational Perspective
    (Wiley, 2025-04-22)
    Apostol, Elena‐Simona; Truică, Ciprian‐Octavian; Damova, Mariana; Silvano, Purificação; Oleškeviciene, Giedre Valunaite
    Discourse markers (DMs) are linguistic expressions that convey different semantic and pragmatic values, managing and organizing the structure of spoken and written discourses. They can be either single-word or multiword expressions (MWE), made up of conjunctions, adverbs, and prepositional phrases. Although DMs are the focus of many studies, some questions regarding the interoperability of taxonomies and automatic identification and classification require further research. We aim to tackle these issues by offering a critical analysis and discussing the constitution of a multilingual corpus in 10 languages, i.e., English, Lithuanian, Bulgarian, German, Macedonian, Romanian, Hebrew, Polish, European Portuguese, and Italian. The novel two-level annotation approach is based on (i) signaling the existence or non-existence of DMs in a given text, and (ii) applying the ISO-24617 standard to annotate the DMs’ discourse relation and communicative function in the corpora. Additionally, we introduce prediction models for detecting the presence of DMs within a text.
  • Item type: Publication
    AI Agents in Finance and Fintech: A Scientific Review of Agent-Based Systems, Applications, and Future Horizons
    (Tech Science Press, 2026)
    Rizinski, Maryan
    Artificial intelligence (AI) is reshaping financial systems and services, as intelligent AI agents increasingly form the foundation of autonomous, goal-driven systems capable of reasoning, learning, and action. This review synthesizes recent research and developments in the application of AI agents across core financial domains. Specifically, it covers the deployment of agent-based AI in algorithmic trading, fraud detection, credit risk assessment, robo-advisory, and regulatory compliance (RegTech). The review focuses on advanced agent-based methodologies, including reinforcement learning, multi-agent systems, and autonomous decision-making frameworks, particularly those leveraging large language models (LLMs), contrasting these with traditional AI or purely statistical models. Our primary goals are to consolidate current knowledge, identify significant trends and architectural approaches, review the practical efficiency and impact of current applications, and delineate key challenges and promising future research directions. The increasing sophistication of AI agents offers unprecedented opportunities for innovation in finance, yet presents complex technical, ethical, and regulatory challenges that demand careful consideration and proactive strategies. This review aims to provide a comprehensive understanding of this rapidly evolving landscape, highlighting the role of agent-based AI in the ongoing transformation of the financial industry, and is intended to serve financial institutions, regulators, investors, analysts, researchers, and other key stakeholders in the financial ecosystem.
  • Item type: Publication
    Large language models in food and nutrition science: Opportunities, challenges, and the case of FoodyLLM
    (Elsevier BV, 2026)
    Gjorgjevikj, Ana; Martinc, Matej; Cenikj, Gjorgjina; Drole, Jan
    Background: Reliable nutrient profiling and semantic interoperability are essential for scalable dietary assessment, food labeling (e.g., traffic-light schemes), and FAIR integration of food composition and consumption data. However, general-purpose large language models (LLMs) are not systematically exposed to structured recipe–nutrition mappings and food ontologies, limiting their accuracy and trustworthiness in food and nutrition tasks.
    Scope and approach: We review recent LLM advances in life sciences and healthcare and analyze the gap in food and nutrition applications. To address this gap, we introduce FoodyLLM, a domain-specialized LLM fine-tuned on 225k task-aligned QA pairs for (i) recipe nutrient estimation, (ii) traffic-light classification, and (iii) ontology-based entity linking to support FAIR food data interoperability. We benchmark FoodyLLM against strong general-purpose baselines (e.g., Llama 3 8B, Gemini 2.0) under zero-/few-shot prompting across five evaluation folds.
    Key findings: FoodyLLM substantially outperforms general-purpose LLMs across all tasks: for nutrient estimation across all macronutrients (fat, protein, salt, saturates, sugar), accuracy increases from 0.43–0.63 to 0.91–0.97; for traffic-light classification across all nutrients and color categories, macro F1 improves from 0.46–0.80 to 0.86–0.97; and for ontology-based food entity linking across FoodOn, SNOMED-CT, and Hansard, macro F1 increases from 0.33–0.44 (best general-purpose baseline) to 0.93–0.98 on artificial NEL data, and from 0.24–0.51 to 0.67–0.84 on real corpora (CafeteriaSA and CafeteriaFCD). Overall, our results demonstrate the practical value of domain-specialized LLMs in food and nutrition research. They enable automated dietary assessment, large-scale nutritional monitoring, and FAIR data integration, while opening new pathways toward sustainable and personalized nutrition.
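For context on the traffic-light classification task, a small rule-based reference can be sketched. The thresholds below are the commonly cited UK FSA front-of-pack cut-offs per 100 g of food, included as an assumed illustration (verify against current guidance); they are not necessarily the labeling scheme used in the paper's dataset:

```python
# Nutrient: (green upper bound, red lower bound), in g per 100 g of food.
# Values between the two bounds are labeled amber.
THRESHOLDS = {
    "fat":       (3.0, 17.5),
    "saturates": (1.5, 5.0),
    "sugars":    (5.0, 22.5),
    "salt":      (0.3, 1.5),
}

def traffic_light(nutrient, grams_per_100g):
    """Rule-based FSA-style front-of-pack label for one nutrient."""
    green_max, red_min = THRESHOLDS[nutrient]
    if grams_per_100g <= green_max:
        return "green"
    if grams_per_100g > red_min:
        return "red"
    return "amber"

print(traffic_light("sugars", 12.0))  # amber
```

A fine-tuned model like the one described learns to emit these labels directly from a recipe's text, without the nutrient values being given explicitly — which is what makes the task non-trivial for an LLM.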
  • Item type: Publication
    Preserving Macedonian Culinary Heritage: Fine-Tuning a Large Language Model for Recipe Generation in a Low-Resource Language
    (IEEE, 2025-12-08)
    Peshevski, Dimitar; Sasanski, Darko
    We introduce the first fine-tuned large language model for recipe instruction generation in Macedonian. Building on VezilkaLLM-Instruct, a 4-billion parameter model, we fine-tune it using a curated dataset of 36,000 recipes with detailed cooking instructions. Our key contributions include: (1) the development of a domain-adapted language model for a low-resource language; (2) the demonstration that relatively small LLMs can be effectively adapted to specialized culinary tasks; and (3) the proposal of a dual evaluation framework that combines semantic similarity and verb overlap analyses to assess both content generalization and procedural accuracy. Fine-tuning results in a mean cosine similarity of 0.90 and significantly increases the overlap of domain-specific cooking verbs, indicating improved generation quality. These results highlight the potential of targeted fine-tuning approaches for domain-specific applications in underrepresented languages and provide a foundation for further research in computational culinary heritage.
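The dual evaluation framework — semantic similarity plus cooking-verb overlap — might be sketched as follows. The bag-of-words embedding, the verb inventory, and the example sentences are assumptions for illustration, not the paper's actual encoder or lexicon:

```python
import math
from collections import Counter

# Illustrative domain verb inventory; the paper would use a Macedonian lexicon.
COOKING_VERBS = {"chop", "mix", "bake", "fry", "boil", "stir", "knead", "simmer"}

def embed(text):
    """Toy bag-of-words vector; a stand-in for a sentence embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Semantic-similarity side of the evaluation: cosine over embeddings."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def verb_overlap(generated, reference):
    """Procedural-accuracy side: Jaccard overlap of domain verbs."""
    g = COOKING_VERBS & set(generated.lower().split())
    r = COOKING_VERBS & set(reference.lower().split())
    return len(g & r) / len(g | r) if g | r else 0.0

ref = "knead the dough then bake for 40 minutes"
gen = "knead the dough and bake until golden"
print(cosine(embed(gen), embed(ref)), verb_overlap(gen, ref))
```

The two scores answer different questions: the embedding similarity checks that the generated text is about the right content, while the verb overlap checks that the right cooking actions appear, which a purely semantic score can miss.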
  • Item type: Publication
    Advancing AI in Higher Education: A Comparative Study of Large Language Model-Based Agents for Exam Question Generation, Improvement, and Evaluation
    (MDPI AG, 2025-03-04)
    Nikolovski, Vlatko
    The transformative capabilities of large language models (LLMs) are reshaping educational assessment and question design in higher education. This study proposes a systematic framework for leveraging LLMs to enhance question-centric tasks: aligning exam questions with course objectives, improving clarity and difficulty, and generating new items guided by learning goals. The research spans four university courses—two theory-focused and two application-focused—covering diverse cognitive levels according to Bloom’s taxonomy. A balanced dataset ensures representation of question categories and structures. Three LLM-based agents—VectorRAG, VectorGraphRAG, and a fine-tuned LLM—are developed and evaluated against a meta-evaluator, supervised by human experts, to assess alignment accuracy and explanation quality. Robust analytical methods, including mixed-effects modeling, yield actionable insights for integrating generative AI into university assessment processes. Beyond exam-specific applications, this methodology provides a foundational approach for the broader adoption of AI in post-secondary education, emphasizing fairness, contextual relevance, and collaboration. The findings offer a comprehensive framework for aligning AI-generated content with learning objectives, detailing effective integration strategies, and addressing challenges such as bias and contextual limitations. Overall, this work underscores the potential of generative AI to enhance educational assessment while identifying pathways for responsible implementation.
  • Item type: Publication
    Few-Shot Semantic Segmentation in Remote Sensing: A Review on Definitions, Methods, Datasets, Advances and Future Trends
    (MDPI AG, 2026-02-18)
    Petrov, Marko; Pandilova, Ema
    Semantic segmentation in remote sensing images, which is the task of classifying each pixel of the image in a specific category, is widely used in areas such as disaster management, environmental monitoring, precision agriculture, and many others. However, traditional semantic segmentation methods face a major challenge: they require large amounts of annotated data to train effectively. To tackle this challenge, few-shot semantic segmentation has been introduced, where the models can learn and adapt quickly to new classes from just a few annotated samples. This paper presents a comprehensive review of recent advances in few-shot semantic segmentation (FSSS) for remote sensing, covering datasets, methods, and emerging research directions. We first outline the fundamental principles of few-shot learning and summarize commonly used remote-sensing benchmarks, emphasizing their scale, geographic diversity, and relevance to episodic evaluation. Next, we categorize FSSS methods into major families (meta-learning, conditioning-based, and foundation-assisted approaches) and analyze how architectural choices, pretraining strategies, and inference protocols influence performance. The discussion highlights empirical trends across datasets, the behavior of different conditioning mechanisms, the impact of self-supervised and multimodal pretraining, and the role of reproducibility and evaluation design. Finally, we identify key challenges and future trends, including benchmark standardization, integration with foundation and multimodal models, efficiency at scale, and uncertainty-aware adaptation. Collectively, they signal a shift toward unified, adaptive models capable of segmenting novel classes across sensors, regions, and temporal domains with minimal supervision.
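A minimal sketch of the conditioning-based family mentioned above — masked average pooling of support features into class prototypes, then nearest-prototype labeling of query pixels — assuming random features in place of a real CNN/ViT backbone:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, D = 8, 8, 16                      # feature-map height, width, channel depth

# 1-shot support example: per-pixel features plus a binary mask of the novel class.
support_feat = rng.normal(size=(H, W, D))
support_mask = np.zeros((H, W), bool)
support_mask[2:5, 2:5] = True           # annotated foreground region

# Masked average pooling: one prototype each for foreground and background.
fg_proto = support_feat[support_mask].mean(axis=0)
bg_proto = support_feat[~support_mask].mean(axis=0)

# Label every query pixel by its nearest prototype in feature space.
query_feat = rng.normal(size=(H, W, D))
d_fg = np.linalg.norm(query_feat - fg_proto, axis=-1)
d_bg = np.linalg.norm(query_feat - bg_proto, axis=-1)
pred_mask = d_fg < d_bg                 # predicted segmentation of the query image

print(pred_mask.shape)  # (8, 8)
```

With random features the prediction is meaningless, of course; the point is the mechanism — the support set conditions the segmenter through the prototypes, so no weights are updated for a new class.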
  • Item type: Publication
    A Unified Framework for Alzheimer’s Disease Knowledge Graphs: Architectures, Principles, and Clinical Translation
    (MDPI, 2025-05-19)
    Dobreva, Jovana
    This review paper synthesizes the application of knowledge graphs (KGs) in Alzheimer’s disease (AD) research, guided by two basic questions: what types of input data are available to construct these knowledge graphs, and what purpose each knowledge graph is intended to fulfill. We synthesize results from existing works to illustrate how diverse knowledge graph structures behave in different data availability settings with distinct application targets in AD research. Through comparative analysis, we define best methodological practices by data type (literature, structured databases, neuroimaging, and clinical records) and application of interest (drug repurposing, disease classification, mechanism discovery, and clinical decision support). From this analysis, we recommend AD-KG 2.0, a new framework that consolidates best practices into a unifying architecture with well-defined decision pathways for implementation. Our key contributions are as follows: (1) a dynamic adaptation mechanism that automatically adapts methodological elements according to both data availability and application objectives, (2) a specialized semantic alignment layer that harmonizes terminologies across biological scales, and (3) a multi-constraint optimization approach for knowledge graph building. The framework accommodates a variety of applications, including drug repurposing, patient stratification for precision medicine, disease progression modeling, and clinical decision support. With its decision-tree structure and layered pipeline architecture, our system offers researchers precise directions on how to use knowledge graphs in AD research by aligning methodological choices with the available data and application goals. We provide precise component designs and adaptation processes that deliver optimal performance across varying research and clinical settings.
    We conclude by addressing implementation challenges and future directions for translating knowledge graph technologies from research tools to clinical use, with a specific focus on interpretability, workflow integration, and regulatory matters.
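The drug-repurposing pattern such a knowledge graph supports can be illustrated with a toy triple store; all entities and edges below are synthetic placeholders, not real AD findings:

```python
# Synthetic (subject, predicate, object) triples in the shape a disease KG uses.
triples = [
    ("drugA", "targets", "GENE1"),
    ("drugB", "targets", "GENE2"),
    ("GENE1", "associated_with", "Alzheimer's disease"),
    ("GENE3", "associated_with", "Alzheimer's disease"),
]

def repurposing_candidates(disease):
    """Drugs whose targets include a gene associated with the disease.

    This two-hop traversal (drug -> gene -> disease) is the simplest form of
    the path queries a drug-repurposing application runs over a KG.
    """
    genes = {s for (s, p, o) in triples if p == "associated_with" and o == disease}
    return sorted(s for (s, p, o) in triples if p == "targets" and o in genes)

print(repurposing_candidates("Alzheimer's disease"))  # ['drugA']
```

Real systems express the same traversal as a SPARQL or Cypher query over millions of triples, with the semantic alignment layer ensuring that gene and drug identifiers from different sources refer to the same nodes.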
  • Item type: Publication
    From linguistic linked data to big data
    (2024-05-22)
    Apostol, Elena-Simona; Garabík, Radovan; Gkirtzou, Katerina; Gromann, Dagmar
    With advances in the field of Linked (Open) Data (LOD), language data on the LOD cloud has grown in number, size, and variety. With this increased volume and variety of language data, optimizing the methods for distributing, storing, and querying these data becomes more central. To this end, this position paper investigates use cases at the intersection of Linguistic Linked Open Data (LLOD) and Big Data, reviews existing approaches to utilizing Big Data techniques within the context of linked data, and discusses the challenges and benefits of this union.