Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/21384
Title: | Simplifying parallel implementation of algorithms on Hadoop with Pig Latin |
Authors: | Zdravevski, Eftim; Lameski, Petre; Kulakov, Andrea; Filiposka, Sonja; Trajanov, Dimitar |
Keywords: | Hadoop, MapReduce, HBase, Pig, parallel algorithms, distributed algorithms |
Issue Date: | 2015 |
Conference: | CIIT |
Abstract: | In this paper we present a general technique for parallelizing regular algorithms with the tools the Hadoop ecosystem offers: MapReduce, HDFS, HBase and Pig. This framework can be applied for parallelizing algorithms for feature selection, clustering, machine learning etc. It consists of several steps: load the datasets in HDFS, apply some transformations if they are needed, store the datasets in HBase, and implement the algorithm in Pig with the help of User Defined Functions. |
URI: | http://hdl.handle.net/20.500.12188/21384 |
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |
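
The abstract lists the steps of the technique but this record carries no code. As a rough illustration only, the sketch below mimics in plain Python the group-then-apply-a-function pattern that Pig Latin expresses with `GROUP ... BY` followed by `FOREACH ... GENERATE`. It is not the authors' implementation: the toy data, the long-format row layout, the variance-based score, and the names `ROWS`, `score_feature`, `group_by_feature` and `score_all` are all assumptions made for this example, and nothing here touches HDFS, HBase or Pig.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical toy dataset in long format: (row_id, feature_name, value).
# The real layout of the datasets in HBase is not given in the abstract.
ROWS = [
    (1, "f1", 1.0), (2, "f1", 1.0), (3, "f1", 1.0),
    (1, "f2", 0.0), (2, "f2", 2.0), (3, "f2", 4.0),
]


def score_feature(values):
    """Stand-in for a User Defined Function: population variance of one feature."""
    return pvariance(values)


def group_by_feature(rows):
    """Analogue of Pig's GROUP ... BY: collect the values of each feature."""
    groups = defaultdict(list)
    for _row_id, feature, value in rows:
        groups[feature].append(value)
    return groups


def score_all(rows):
    """Analogue of FOREACH ... GENERATE udf(...): score each group on its own."""
    groups = group_by_feature(rows)
    return {feature: score_feature(values) for feature, values in groups.items()}


scores = score_all(ROWS)
```

Each group is scored without reference to any other group, which is the property that would let a framework such as MapReduce distribute the per-feature work across nodes.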
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
SimplifyingMapReducedevelopmentonHadoopandHBasewithPigLatin-EftimZdravevski.pdf | | 312.49 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.