Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/21384
Title: | Simplifying parallel implementation of algorithms on Hadoop with Pig Latin |
Authors: | Zdravevski, Eftim; Lameski, Petre; Kulakov, Andrea; Filiposka, Sonja; Trajanov, Dimitar |
Keywords: | Hadoop, MapReduce, HBase, Pig, parallel algorithms, distributed algorithms |
Issue Date: | 2015 |
Conference: | CIIT |
Abstract: | In this paper we present a general technique for parallelizing regular algorithms with the tools the Hadoop ecosystem offers: MapReduce, HDFS, HBase and Pig. This framework can be applied for parallelizing algorithms for feature selection, clustering, machine learning etc. It consists of several steps: load the datasets in HDFS, apply some transformations if they are needed, store the datasets in HBase, and implement the algorithm in Pig with the help of User Defined Functions. |
URI: | http://hdl.handle.net/20.500.12188/21384 |
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
SimplifyingMapReducedevelopmentonHadoopandHBasewithPigLatin-EftimZdravevski.pdf | | 312.49 kB | Adobe PDF |