Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/30399
DC Field | Value | Language
dc.contributor.author | Damova, Mariana | en_US
dc.contributor.author | Mishev, Kostadin | en_US
dc.contributor.author | Valunaite Oleskeviciene, Giedre | en_US
dc.contributor.author | Liebeskind, Chaya | en_US
dc.contributor.author | da Purificação Silvano, Maria | en_US
dc.contributor.author | Trajanov, Dimitar | en_US
dc.contributor.author | Truica, Ciprian-Octavian | en_US
dc.contributor.author | Apostol, Elena-Simona | en_US
dc.contributor.author | Chiarcos, Christian | en_US
dc.contributor.author | Baczkowska, Anna | en_US
dc.date.accessioned | 2024-06-05T09:23:23Z | -
dc.date.available | 2024-06-05T09:23:23Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://hdl.handle.net/20.500.12188/30399 | -
dc.description.abstract | Using language models to detect or predict the presence of language phenomena in text has become a mainstream research topic. With the rise of generative models, experiments using deep learning and transformer models have triggered intense interest. Aspects such as the precision of predictions, portability to other languages or phenomena, and scale have been central to the research community. Discourse markers, as language phenomena, perform important functions, such as signposting, signalling, and rephrasing, by facilitating discourse organization. Our paper addresses discourse marker detection, a complex task because it pertains to a language phenomenon manifested by expressions that can occur as content words in some contexts and as discourse markers in others. We have adopted a language-agnostic model trained in English to predict the presence of discourse markers in texts in 8 other languages unseen by the model, with the goal of evaluating how well the model performs on languages with different structural and lexical properties. We report on the process of evaluating and validating the model's performance across European Portuguese, Hebrew, German, Polish, Romanian, Bulgarian, Macedonian, and Lithuanian, and on the results of this validation. This research is a key step towards multilingual language processing. | en_US
dc.relation.ispartof | Language, Data and Knowledge 2023 (LDK 2023): Proceedings of the 4th Conference on Language, Data and Knowledge | en_US
dc.title | Validation of language agnostic models for discourse marker detection | en_US
dc.type | Proceedings | en_US
item.grantfulltext | open | -
item.fulltext | With Fulltext | -
crisitem.author.dept | Faculty of Computer Science and Engineering | -
Appears in Collections: Faculty of Computer Science and Engineering: Journal Articles
Files in This Item:
File | Description | Size | Format
649361.pdf | | 231.86 kB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.