Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/10134
DC Field | Value | Language
dc.contributor.author | Dobreva, Jovana | en_US
dc.contributor.author | Jofche, Nasi | en_US
dc.contributor.author | Jovanovik, Milos | en_US
dc.contributor.author | Trajanov, Dimitar | en_US
dc.date.accessioned | 2021-02-17T09:01:24Z | -
dc.date.available | 2021-02-17T09:01:24Z | -
dc.date.issued | 2020-10-30 | -
dc.identifier.uri | http://hdl.handle.net/20.500.12188/10134 | -
dc.description.abstract | Analyzing long text articles in the pharmaceutical domain, for the purpose of knowledge extraction and recognizing entities of interest, is a tedious task. In our previous research efforts, we were able to develop a platform which successfully extracts entities and facts from pharmaceutical texts and populates a knowledge graph with the extracted knowledge. However, one drawback of our approach was the processing time; the analysis of a single text source was not interactive enough, and the batch processing of entire article datasets took too long. In this paper, we propose a modified pipeline where the texts are summarized before the analysis begins. With this, the source articles are reduced significantly, to a compact version which contains only the most commonly encountered entities. We show that by reducing the text size, we get knowledge extraction results comparable to the full text analysis approach and, at the same time, we significantly reduce the processing time, which is essential for getting both real-time results on single text sources, and faster results when analyzing entire batches of collected articles from the domain. | en_US
dc.language.iso | en | en_US
dc.publisher | Springer International Publishing | en_US
dc.subject | Named entity recognition | en_US
dc.subject | Data processing | en_US
dc.subject | Text summarization | en_US
dc.subject | Knowledge extraction | en_US
dc.subject | Knowledge graphs | en_US
dc.title | Improving NER Performance by Applying Text Summarization on Pharmaceutical Articles | en_US
dc.type | Book chapter | en_US
dc.relation.conference | ICT Innovations 2020 | en_US
dc.identifier.doi | 10.1007/978-3-030-62098-1_8 | -
dc.identifier.url | http://link.springer.com/content/pdf/10.1007/978-3-030-62098-1_8 | -
item.grantfulltext | none | -
item.fulltext | No Fulltext | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
Appears in Collections:Faculty of Computer Science and Engineering: Conference papers

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.