Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/31370
DC Field | Value | Language
dc.contributor.author | Stanoev, Boris | en_US
dc.contributor.author | Mitrov, Goran | en_US
dc.contributor.author | Kulakov, Andrea | en_US
dc.contributor.author | Mirceva, Georgina | en_US
dc.contributor.author | Lameski, Petre | en_US
dc.contributor.author | Zdravevski, Eftim | en_US
dc.date.accessioned | 2024-09-25T10:06:33Z | -
dc.date.available | 2024-09-25T10:06:33Z | -
dc.date.issued | 2024-04-01 | -
dc.identifier.uri | http://hdl.handle.net/20.500.12188/31370 | -
dc.description.abstract | With the exponential growth of data, extracting actionable insights becomes resource-intensive. In many organizations, normalized relational databases store a significant portion of this data, where tables are interconnected through some relations. This paper explores relational learning, which involves joining and merging database tables, often normalized in the third normal form. The subsequent processing includes extracting features and utilizing them in machine learning (ML) models. In this paper, we experiment with the propositionalization algorithm (i.e., Wordification) for feature engineering. Next, we compare the algorithms PropDRM and PropStar, which are designed explicitly for multi-relational data mining, to traditional machine learning algorithms. Based on the performed experiments, we concluded that Gradient Boost, compared to PropDRM, achieves similar performance (F1 score, accuracy, and AUC) on multiple datasets. PropStar consistently underperformed on some datasets while being comparable to the other algorithms on others. In summary, the propositionalization algorithm for feature extraction makes it feasible to apply traditional ML algorithms for relational learning directly. In contrast, approaches tailored specifically for relational learning still face challenges in scalability, interpretability, and efficiency. These findings have a practical impact that can help speed up the adoption of machine learning in business contexts where data is stored in relational format without requiring domain-specific feature extraction. | en_US
dc.publisher | MDPI AG | en_US
dc.relation.ispartof | Big Data and Cognitive Computing | en_US
dc.subject | data mining; relational learning; propositionalization; machine learning; deep learning | en_US
dc.title | Automating Feature Extraction from Entity-Relation Models: Experimental Evaluation of Machine Learning Methods for Relational Learning | en_US
dc.type | Journal Article | en_US
dc.identifier.doi | 10.3390/bdcc8040039 | -
dc.identifier.url | https://www.mdpi.com/2504-2289/8/4/39/pdf | -
dc.identifier.volume | 8 | -
dc.identifier.issue | 4 | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
Appears in Collections: Faculty of Computer Science and Engineering: Journal Articles
Files in This Item:
File | Size | Format
BDCC-08-00039.pdf | 680.42 kB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.