Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/27287
Title: CafeteriaSA corpus: scientific abstracts annotated across different food semantic resources
Authors: Cenikj, Gjorgjina
Valenčič, Eva
Ispirova, Gordana
Ogrinc, Matevž
Stojanov, Riste 
Korošec, Peter
Cavalli, Ermanno
Seljak, Barbara Koroušić
Eftimov, Tome
Issue Date: 1-Jan-2022
Publisher: Oxford University Press (OUP)
Journal: Database
Abstract: In recent decades, a great deal of work has gone into predictive modeling of issues related to human and environmental health. Addressing healthcare issues is made possible by several biomedical vocabularies and standards, which play a crucial role in understanding health information, together with a large amount of health-related data. However, despite the many resources available and the work done in the health and environmental domains, the food and nutrition domain lacks semantic resources and interconnections between them. To address this, within the European Food Safety Authority–funded project CAFETERIA, we have developed the first annotated corpus of 500 scientific abstracts, containing 6407 food entities annotated with respect to the Hansard taxonomy, 4299 with respect to FoodOn and 3623 with respect to SNOMED-CT. The CafeteriaSA corpus will enable further development of natural language processing methods for extracting food information from scientific text.
URI: http://hdl.handle.net/20.500.12188/27287
DOI: 10.1093/database/baac107
Appears in Collections: Faculty of Computer Science and Engineering: Journal Articles

Files in This Item:
File: baac107.pdf
Size: 3.85 MB
Format: Adobe PDF