Faculty of Computer Science and Engineering

Permanent URI for this community: https://repository.ukim.mk/handle/20.500.12188/5

The Faculty of Computer Science and Engineering (FCSE) within UKIM is the largest and most prestigious faculty in the field of computer science and technologies in Macedonia, and among the largest faculties in that field in the region. The FCSE teaching staff consists of 50 professors and 30 associates. These include many “best in field” personnel, such as the most referenced scientists in Macedonia and the most influential professors in the ICT industry in the Republic of Macedonia.

Search Results

Now showing 1 - 2 of 2
  • Item type: Publication
    PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts Using Transfer Learning
    (MDPI, 2023-01-09)
    Jofche, Nasi
    Even though named entity recognition (NER) has seen tremendous development in recent years, some domain-specific use cases still require tagging of unique entities, which pre-trained models do not handle well. Solutions based on enhancing pre-trained models or creating new ones are efficient, but creating reliable labeled training data for them to learn from is still challenging. In this paper, we introduce PharmKE, a text analysis platform tailored to the pharmaceutical industry that uses deep learning at several stages to perform an in-depth semantic analysis of relevant publications. The proposed methodology is used to produce reliably labeled datasets by leveraging cutting-edge transfer learning; these datasets are later used to train models for specific entity labeling tasks. By building models for the well-known text-processing libraries spaCy and AllenNLP, this technique is used to find Pharmaceutical Organizations and Drugs in texts from the pharmaceutical domain. The PharmKE platform also incorporates the NER findings to resolve co-references of entities and examine the semantic linkages in each phrase, creating a foundation for further text analysis tasks, such as fact extraction and question answering. Additionally, the knowledge graph created by DBpedia Spotlight for a specific pharmaceutical text is expanded using the identified entities. The proposed methodology achieves an F1-score of about 96% on the NER tasks, which is up to 2% better than the scores of the fine-tuned BERT and BioBERT models developed using the same dataset. The ultimate benefit of the platform is that pharmaceutical domain specialists can more easily identify the knowledge extracted from the input texts thanks to the platform’s visualization of the model findings. Likewise, the proposed techniques can be integrated into mobile and pervasive systems to give patients more relevant and comprehensive information from scanned medication guides. Similarly, they can provide preliminary insights to patients and even medical personnel on whether a drug from a different vendor is compatible with the patient’s prescription medication.
  • Item type: Publication
    Improving NER Performance by Applying Text Summarization on Pharmaceutical Articles
    (Springer International Publishing, 2020-10-30)
    Dobreva, Jovana; Jofche, Nasi
    Analyzing long text articles in the pharmaceutical domain, for the purpose of knowledge extraction and recognizing entities of interest, is a tedious task. In our previous research efforts, we were able to develop a platform which successfully extracts entities and facts from pharmaceutical texts and populates a knowledge graph with the extracted knowledge. However, one drawback of our approach was the processing time: the analysis of a single text source was not interactive enough, and the batch processing of entire article datasets took too long. In this paper, we propose a modified pipeline where the texts are summarized before the analysis begins. With this, the source articles are reduced significantly, to a compact version which contains only the most commonly encountered entities. We show that by reducing the text size, we get knowledge extraction results comparable to the full-text analysis approach and, at the same time, we significantly reduce the processing time, which is essential for getting both real-time results on single text sources and faster results when analyzing entire batches of collected articles from the domain.
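
The summarize-then-analyze idea described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the frequency-based extractive summarizer stands in for whatever summarization method the authors use, the small `LEXICON` lookup stands in for their trained NER models, and the names `summarize` and `extract_entities` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical entity lexicon, for illustration only.
# The platform described above uses trained NER models, not a lookup table.
LEXICON = {"aspirin": "DRUG", "ibuprofen": "DRUG", "pfizer": "PHARMA_ORG"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    sentences = split_sentences(text)
    freq = Counter(tok for s in sentences for tok in tokenize(s))

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def extract_entities(text):
    """Return sorted (token, label) pairs for tokens found in the lexicon."""
    return sorted({(tok, LEXICON[tok]) for tok in tokenize(text) if tok in LEXICON})


if __name__ == "__main__":
    article = ("Pfizer makes aspirin. Aspirin is sold by Pfizer. "
               "The weather was nice.")
    summary = summarize(article)
    print("Summary:", summary)
    print("Entities from summary:", extract_entities(summary))
    print("Entities from full text:", extract_entities(article))
```

Because the summarizer favors sentences built from frequently repeated words, sentences that mention recurring entities tend to survive, so the downstream extraction step processes less text while still seeing the most common entities. Whether that holds on real pharmaceutical articles depends on the summarizer and models actually used.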