Aspect-term Extraction from Albanian Reviews with Topic Modeling Techniques
Date Issued
2020-09-24
Author(s)
Axhiu, Majlinda
Aliu, Azir
Abstract
With the exponential growth of online data generated by social network users in every language, the need for sentiment analysis is growing as well. However, the overall sentiment of an opinion is no longer sufficient, which is why Aspect-based Sentiment Analysis (ABSA) is in high demand. Our aim is to address the first phase of the ABSA task, namely extracting aspect terms from reviews written in Albanian. Given the lack of research in this field for this language and the lack of resources, we have chosen an unsupervised approach rather than a supervised one. Within this approach, two of the most widely used models, considered state of the art for topic modeling, are Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF). We carried out a comparative analysis of these two models using a dataset that we created from Facebook reviews in the restaurant domain. Both models successfully extracted the aspects. As a sample of the results, we list the top 10 words extracted by each model, grouped into three different topics. According to the evaluation measures (Precision, Recall and F1-score), both models performed well at extracting the aspects, with NMF achieving higher accuracy than LDA. NMF was also more accurate in assigning the aspects to different topics.
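The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that a model's extracted terms are compared as a set against a manually annotated set of gold aspect terms; the abstract does not describe the exact evaluation protocol, so this setup is an assumption. The function name `precision_recall_f1` and the English placeholder words are hypothetical and are not taken from the paper's Albanian dataset.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted terms with gold aspect terms (set-based, assumed protocol)."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)  # terms that are both extracted and gold
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical placeholder data, not results from the paper.
gold_aspects = {"food", "service", "price", "staff"}
model_terms = ["food", "service", "place", "price", "good"]

p, r, f = precision_recall_f1(model_terms, gold_aspects)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same function on the top words produced by each model would give one comparable triple of scores per model, which is roughly the kind of comparison the abstract reports for LDA and NMF.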
File(s)
Name
aspect-term-extraction-from-albanian-reviews-with--topic-modeling-techniques.pdf
Size
366.67 KB
Format
Adobe PDF
Checksum
(MD5):d15a3873a0112df4fbd1d3c41661c49c
