Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/25433
DC Field | Value | Language
dc.contributor.author | da Purificação Silvano, Maria | en_US
dc.contributor.author | Damova, Mariana | en_US
dc.contributor.author | Oleskeviciené Valunaité, Giedré | en_US
dc.contributor.author | Liebeskind, Chaya | en_US
dc.contributor.author | Chiarcos, Christian | en_US
dc.contributor.author | Trajanov, Dimitar | en_US
dc.contributor.author | Ciprian-Octavian, Truica | en_US
dc.contributor.author | Apostol, Elena-Simona | en_US
dc.contributor.author | Baczkowska, Anna | en_US
dc.date.accessioned | 2023-01-18T09:42:07Z | -
dc.date.available | 2023-01-18T09:42:07Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | http://hdl.handle.net/20.500.12188/25433 | -
dc.description.abstract | Discourse markers carry information about discourse structure and organization, and also signal local dependencies or the epistemological stance of the speaker. They provide instructions on how to interpret the discourse, and their study is paramount to understanding the mechanism underlying discourse organization. This paper presents a new language resource, an ISO-based annotated multilingual parallel corpus for discourse markers. The corpus comprises nine languages: Bulgarian, Lithuanian, German, European Portuguese, Hebrew, Romanian, Polish, and Macedonian, with English as a pivot language. In order to represent the meaning of the discourse markers, we propose an annotation scheme of discourse relations from ISO 24617-8 with a plug-in to ISO 24617-2 for communicative functions. We describe an experiment in which we applied the annotation scheme to assess its validity. The results reveal that, although some extensions are required to cover all the multilingual data, it provides a proper representation of the discourse markers' value. Additionally, we report some relevant contrastive phenomena concerning the interpretation of discourse markers and their role in discourse. This first step will allow us to develop deep learning methods to identify and extract discourse relations and communicative functions, and to represent that information as Linguistic Linked Open Data (LLOD). | en_US
dc.relation.ispartof | Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022) | en_US
dc.subject | multilingual corpus, discourse markers, ISO-based annotation scheme, discourse relations, communicative functions | en_US
dc.title | ISO-based annotated multilingual parallel corpus for discourse markers | en_US
dc.type | Proceedings | en_US
item.grantfulltext | open | -
item.fulltext | With Fulltext | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
Appears in Collections: Faculty of Computer Science and Engineering: Journal Articles
Files in This Item:
File | Description | Size | Format
596162.pdf | | 434.79 kB | Adobe PDF

Page view(s): 32 (checked on May 21, 2024)
Download(s): 8 (checked on May 21, 2024)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.