  • RDFGraphGen: An RDF Graph Generator Based on SHACL Shapes
    (Springer Nature (Singapore), 2026-04-01)
    Vecovska, Marija; Jakubowski, Maxime; Hose, Katja
    Developing and testing modern RDF-based applications often requires access to RDF datasets with certain characteristics. Unfortunately, it is very difficult to find publicly available domain-specific knowledge graphs that conform to a particular set of characteristics. Hence, in this paper we propose RDFGraphGen, an open-source RDF graph generator that uses characteristics provided in the form of SHACL (Shapes Constraint Language) shapes to generate synthetic RDF graphs. RDFGraphGen is domain-agnostic, with configurable graph structure, value constraints, and distributions. It also comes with a number of predefined values for popular schema.org classes and properties, which make the generated graphs more realistic. Our results show that RDFGraphGen is scalable and can generate small, medium, and large RDF graphs in any domain.
  • Inferring Cuisine - Drug Interactions Using the Linked Data Approach
    (Springer Nature, 2015-03-20)
    Food - drug interactions are well studied; however, much less is known about cuisine - drug interactions. Non-native cuisines are becoming increasingly popular as they are available in (almost) all regions of the world. Here we address the problem of how known negative food - drug interactions are spread across different cuisines. We show that different drug categories have different distributions of negative effects in different parts of the world. The effects that certain ingredients have on different drug categories and in different cuisines are also analyzed. This analysis aims to stress the importance of cuisine - drug interactions for patients who are being administered drugs with known negative food interactions. A patient undergoing treatment with one such drug should be advised not only about the possible negative food - drug interactions, but also about the cuisines that could be excluded from the patient's diet.
  • Software for the GeoSPARQL Compliance Benchmark
    (Elsevier, 2021-05)
    Homburg, Timo; Spasić, Mirko
    Checking the compliance of geospatial triplestores with the GeoSPARQL standard represents a crucial step for many users when selecting the appropriate storage solution. This publication presents the software which comprises the GeoSPARQL compliance benchmark — a benchmark which checks RDF triplestores for compliance with the requirements of the GeoSPARQL standard. Users can execute this benchmark within the HOBBIT benchmarking platform to quantify the extent to which the GeoSPARQL standard is implemented within the triplestore of interest. This enables users to make an informed decision when choosing an RDF storage solution and helps assess the general state of adoption of geospatial technologies on the Semantic Web.
  • Temporal Authorization Graphs: Pros, Cons and Limits
    (Springer International Publishing, 2022-01)
    Popovski, Ognen
    As more private data enters the web, defining authorization for its access is crucial for privacy protection. This paper proposes a policy language that leverages the expressiveness and popularity of SPARQL for flexible access control management and enforces the protection using temporal graphs. The temporal graphs are created during the authentication phase and are cached for further use. They enable design-time policy testing and debugging, which is necessary to guarantee correctness. Security never comes with convenience, and this paper examines the environments in which the temporal graphs are suitable. Based on the evaluation results, an approximate function is defined that determines suitability from the expected temporal graph size.
  • A GeoSPARQL Compliance Benchmark
    (2021-02-11)
    Timo Homburg; Mirko Spasić
    We propose a series of tests that check for the compliance of RDF triplestores with the GeoSPARQL standard. The purpose of the benchmark is to test how many of the requirements outlined in the standard a tested system supports and to push triplestores forward in achieving full GeoSPARQL compliance. This topic is of concern because the support for GeoSPARQL varies greatly between different triplestore implementations, and such support is of great importance for the domain of geospatial RDF data. Additionally, we present a comprehensive comparison of triplestores, providing insight into their current GeoSPARQL support.
  • Learning Robust Food Ontology Alignment
    (IEEE, 2022-12-17)
    Mijalcheva, Viktorija; Davcheva, Ana
    In today’s knowledge society, a large number of information systems use many different individual schemes to represent data. Ontologies are a promising approach for formal knowledge representation and their number is growing rapidly. The semantic linking of these ontologies is a necessary prerequisite for establishing interoperability between the large number of services that structure the data with these ontologies. Consequently, the alignment of ontologies becomes a central issue when building a worldwide Semantic Web. There is a need to develop automatic or at least semi-automatic techniques to reduce the burden of manually creating and maintaining alignments. Ontologies are seen as a solution to data heterogeneity on the Web. However, the available ontologies are themselves a source of heterogeneity. On the Web, there are multiple ontologies that refer to the same domain, and with that comes the challenge of a given graph-based system using multiple ontologies whose taxonomies differ but whose semantics are the same. This can be overcome by aligning the ontologies or by finding the correspondence between their components. In this paper, we propose a method for indexing ontologies in support of a neural-network-based solution for ontology alignment. In this process, for each semantic resource we combine the graph-based representations from the RDF2vec model with the text representation from the BERT model in order to capture the semantic and structural features. This methodology is evaluated using the FoodOn and OntoFood ontologies, based on the Food Onto Map alignment dataset, which contains 155 unique and validly aligned resources. Using these limited resources, we managed to obtain an accuracy of 74% and an F1 score of 75% on the test set, which is a promising result that can be further improved in the future. Furthermore, the methodology presented in this paper is both robust and ontology-agnostic. It can be applied to any ontology, regardless of the domain.
  • Machine Learning Approaches for Smart City Energy Management
    (Faculty of Electrical Engineering and Information Technologies - Skopje, 2020-12)
    Mladenovikj, Valerija; Ilieva, Tamara
    With the constant increase in population and the growing impact of climate change, energy efficiency at the household and city-wide level is a key factor in the transformation into smart cities. Recently, machine learning approaches have proven beneficial in addressing several global problems, especially in areas where large amounts of data are available. In this paper, we propose the use of machine learning methods to analyze the energy consumption behavior of households on a daily and seasonal basis, in order to detect the parts of the day and the seasons in which they have peak energy consumption. Our machine learning models allow us to segment households into different groups according to their daily and seasonal behavior. Both energy suppliers and individual households may benefit from the segmentation carried out in this paper. By knowing the daily behavior of their customers, energy suppliers can be precisely aware of the expected energy consumption of the different customer groups in different parts of the day. They can also precisely target households with energy efficiency programs and provide more reliable estimates of energy savings. Individual households can reduce costs by increasing energy consumption during cheaper off-peak tariff periods, and would also potentially be more careful about their behavior if they knew how efficient their energy consumption is compared to other households. Therefore, we believe that the analysis in this paper can provide a solid foundation for the construction of an energy efficiency system, which is necessary for creating a smart city.
  • PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts using Transfer Learning
    (2021-02-25)
    Jofche, Nasi
    The challenge of recognizing named entities in a given text has been a very dynamic field in recent years. This is due to the advances in neural network architectures, the increase in computing power and the availability of diverse labeled datasets, which deliver pre-trained, highly accurate models. These tasks are generally focused on tagging common entities, but domain-specific use-cases require tagging custom entities which are not part of the pre-trained models. This can be solved by either fine-tuning the pre-trained models, or by training custom models. The main challenge lies in obtaining reliable labeled training and test datasets, as manual labeling would be a highly tedious task. In this paper we present PharmKE, a text analysis platform focused on the pharmaceutical domain, which applies deep learning through several stages for thorough semantic analysis of pharmaceutical articles. It performs text classification using state-of-the-art transfer learning models, and thoroughly integrates the results obtained through a proposed methodology. The methodology is used to create accurately labeled training and test datasets, which are then used to train models for custom entity labeling tasks, centered on the pharmaceutical domain. The obtained results are compared to the fine-tuned BERT and BioBERT models trained on the same dataset. Additionally, the PharmKE platform integrates the results obtained from named entity recognition tasks to resolve co-references of entities and analyze the semantic relations in every sentence, thus setting up a baseline for additional text analysis tasks, such as question answering and fact extraction. The recognized entities are also used to expand the knowledge graph generated by DBpedia Spotlight for a given pharmaceutical text.
  • DD-RDL: Drug-Disease Relation Discovery and Labeling
    (Springer International Publishing, 2022-04-12)
    Dobreva, Jovana
    Drug repurposing, which is concerned with the study of the effectiveness of existing drugs on new diseases, has been growing in importance in the last few years. One of the core methodologies for drug repurposing is text mining, where novel biological entity relationships are extracted from existing biomedical literature and publications, whose number has skyrocketed in the last couple of years. This paper proposes an NLP approach for drug-disease relation discovery and labeling (DD-RDL), which employs a series of steps to analyze a corpus of abstracts of scientific biomedical research papers. The proposed ML pipeline restructures the free text from a set of words into drug-disease pairs using state-of-the-art text mining methodologies and natural language processing tools. The model’s output is a set of extracted triples in the form (drug, verb, disease), where each triple describes a relationship between a drug and a disease detected in the corpus. We evaluate the model based on a gold standard dataset for drug-disease relationships, and we demonstrate that it is possible to achieve similar results without requiring a large amount of annotated biological data or predefined semantic rules. Additionally, as an experimental case, we analyze the research papers published as part of the COVID-19 Open Research Dataset (CORD-19) to extract and identify relations between drugs and diseases related to the ongoing pandemic.
  • MOCHA2017: The Mighty Storage Challenge at ESWC 2017
    (Springer International Publishing, 2017-10)
    Georgala, Kleanthi; Spasić, Mirko; Petzka, Henning; Röder, Michael
    The aim of the Mighty Storage Challenge (MOCHA) at ESWC 2017 was to test the performance of solutions for SPARQL processing in aspects that are relevant for modern applications. These include ingesting data, answering queries on large datasets and serving as a backend for applications driven by Linked Data. The challenge tested the systems against data derived from real applications and with realistic loads. An emphasis was put on dealing with data in the form of streams or updates.