Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/22843
Title: Framework for Real-Time Parallel and Distributed Natural Language Processing
Authors: Mileski, D.
Zdraveski, Vladimir 
Kostoska, Magdalena 
Gushev, Marjan 
Keywords: framework, real-time processing, natural language processing, parallel processing, distributed processing
Issue Date: 2021
Publisher: IEEE
Conference: 44th International Convention on Information, Communication and Electronic Technology (MIPRO)
Abstract: In this paper, we present a new framework for parallel and distributed processing of real-time text streams, capable of executing Natural Language Processing (NLP) algorithms. The focus is set on acceleration based on attention for building the topology, and not on the individual NLP algorithms. We elaborate on the configuration of our specific use case and discuss the reduction of the time required for system configuration in order to use the benefits of virtualization and containers. Research hypothesis: We can process more text tuples per unit time using the newly developed framework for an algorithm that divides the sequential algorithm into smaller jobs and tasks, including tokenisation, part-of-speech tagging, stopword removal, and sentiment analysis, where each of these individual jobs is a specific node in the Apache Storm-based topology. We have conducted an experimental proof-of-concept and found the optimal configuration, confirming the validity of the hypothesis.
URI: http://hdl.handle.net/20.500.12188/22843
Appears in Collections: Faculty of Computer Science and Engineering: Conference papers
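
Illustrative note on the abstract: the decomposition it describes, a sequential NLP algorithm split into small jobs that each become one node of a topology, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation and not Apache Storm code. The stage functions, the tiny word lists, and the placeholder tagger are all hypothetical; in a Storm-based topology each stage would run as its own node and could be parallelised independently.

```python
from typing import Any, Callable, Dict, List

# Hypothetical toy lexicons, for illustration only.
STOPWORDS = {"the", "a", "is", "and", "this"}
POSITIVE = {"fast", "good", "great"}
NEGATIVE = {"slow", "bad", "poor"}

Record = Dict[str, Any]
Stage = Callable[[Record], Record]


def tokenise(record: Record) -> Record:
    # Split on whitespace, strip simple punctuation, lowercase.
    words = [w.strip(".,!?").lower() for w in record["text"].split()]
    return {**record, "tokens": [w for w in words if w]}


def tag_pos(record: Record) -> Record:
    # Placeholder tagger (assumption): a real system would use a trained model.
    tagged = [(w, "ADV" if w.endswith("ly") else "X") for w in record["tokens"]]
    return {**record, "tagged": tagged}


def remove_stopwords(record: Record) -> Record:
    kept = [(w, tag) for w, tag in record["tagged"] if w not in STOPWORDS]
    return {**record, "tagged": kept}


def score_sentiment(record: Record) -> Record:
    # Count positive words minus negative words.
    words = [w for w, _ in record["tagged"]]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return {**record, "sentiment": score}


# Stage order follows the jobs listed in the abstract.
STAGES: List[Stage] = [tokenise, tag_pos, remove_stopwords, score_sentiment]


def run_pipeline(texts: List[str], stages: List[Stage] = STAGES) -> List[Record]:
    # Sequential baseline: every text tuple passes through every stage in order.
    results = []
    for text in texts:
        record: Record = {"text": text}
        for stage in stages:
            record = stage(record)
        results.append(record)
    return results
```

Because each stage takes one record and returns a new one without touching shared state, the stages could in principle be placed on separate workers, which is the property the research hypothesis relies on.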

Files in This Item:
File: MIPRO2021_Framework_for_real_time_parallel_and_distributed_Natural_Language_Processing.pdf
Size: 530.72 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.