Exploring the Terrier Information Retrieval Platform for Web Search of Documents Written in Macedonian
Date Issued
2013
Author(s)
Vangelovski, Vasil
Abstract
Terrier is a modular and scalable platform for rapid
development of Information Retrieval (IR) systems. This
paper presents a short overview of the Terrier architecture and
describes ways in which it can be extended for more effective
indexing and searching of documents written in the Macedonian
language. Although Terrier supports out-of-the-box search in a
few non-English languages, the Macedonian language poses
some specific challenges, especially when searching Web
content. An integrated search platform is
developed for the purpose of this research, extending the text
retrieval engine with more advanced content-filtering
capabilities. Some of the proposed methods can be easily
applied to other non-English languages.
File(s)
Name
10CiiT-24.pdf
Size
264.23 KB
Format
Adobe PDF
Checksum
(MD5):92bc184f88b597c193a95d3f2790c8f5
