Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/20825
Title: Row key designs of NoSQL database tables and their impact on write performance
Authors: Zdravevski, Eftim; Lameski, Petre; Kulakov, Andrea
Keywords: NoSQL, HBase, Hadoop, table design, row key, primary key, clustered index
Issue Date: 17-Feb-2016
Publisher: IEEE
Conference: 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)
Abstract: In several NoSQL database systems, including HBase, each table has only one index, the row key, which also serves as the clustered index. Additional indexes are not available out of the box. As a result, row key design is the most important decision when designing tables, because an inappropriate design can severely degrade performance and increase costs. Different row key designs suit different problems, and in this paper we analyze the performance, characteristics, and applicability of each. In particular, we investigate the effect of various techniques for modeling row keys: sequences, salting, padding, hashing, and modulo operations. We propose four designs based on these techniques and analyze their performance on different HBase clusters when loading HDFS files of various sizes. The experiments show that particular designs consistently outperform others on differently sized clusters, both in execution time and in the evenness of load distribution across nodes.
Appears in Collections: Faculty of Computer Science and Engineering: Conference papers
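
The techniques named in the abstract (sequences, padding, salting with a modulo operation, hashing) can be illustrated with a minimal Python sketch. This is not the paper's code, and it does not reproduce the four designs the authors propose; the function names, the key width of 10 digits, the bucket count of 8, and the `-` separator are all assumptions made for illustration. The sketch only builds key strings and does not talk to HBase.

```python
import hashlib

WIDTH = 10  # assumed fixed width for the numeric part of every key


def sequential_key(seq: int) -> str:
    """Zero-padded sequence number.

    Padding makes lexicographic order match numeric order, but
    consecutive keys are adjacent, so new writes cluster at the
    end of the key range.
    """
    return f"{seq:0{WIDTH}d}"


def salted_key(seq: int, buckets: int = 8) -> str:
    """Prefix the padded sequence with a salt computed as seq modulo buckets.

    Consecutive sequence numbers receive different prefixes, so they
    fall into different parts of the sorted key space.
    """
    return f"{seq % buckets:02d}-{sequential_key(seq)}"


def hashed_key(seq: int, prefix_len: int = 4) -> str:
    """Prefix the padded sequence with the first hex digits of a hash.

    SHA-256 is an arbitrary choice here; any stable hash would do.
    """
    digest = hashlib.sha256(str(seq).encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{sequential_key(seq)}"


if __name__ == "__main__":
    for n in range(4):
        print(sequential_key(n), salted_key(n), hashed_key(n))
```

Because the salt and the hash prefix are both derived from the sequence number itself, a reader who knows the sequence number can recompute the full key for a point lookup; range scans over the original order, however, have to visit every prefix.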

Files in This Item:
File: 2015_HBase_Rowkeys_PDP_2016_EftimZdravevski.pdf
Size: 1.35 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.