Minimal MapReduce algorithms
Refereed conference paper presented and published in conference proceedings



Other information
Abstract: MapReduce has become a dominant parallel computing paradigm for big data, i.e., colossal datasets at the scale of terabytes or higher. Ideally, a MapReduce system should achieve a high degree of load balancing among the participating machines, and minimize the space usage, CPU and I/O time, and network transfer at each machine. Although these principles have guided the development of MapReduce algorithms, limited emphasis has been placed on enforcing serious constraints on the aforementioned metrics simultaneously. This paper presents the notion of a minimal algorithm, that is, an algorithm that guarantees the best parallelization in multiple aspects at the same time, up to a small constant factor. We show the existence of elegant minimal algorithms for a set of fundamental database problems, and demonstrate their excellent performance with extensive experiments. Copyright © 2013 ACM.
All Author(s) List: Tao Y., Lin W., Xiao X.
Name of Conference: 2013 ACM SIGMOD Conference on Management of Data, SIGMOD 2013
Start Date of Conference: 22/06/2013
End Date of Conference: 27/06/2013
Place of Conference: New York, NY
Country/Region of Conference: United States of America
Detailed description: organized by ACM
Year: 2013
Month: 7
Day: 29
Pages: 529-540
ISBN: 9781450320375
ISSN: 0730-8078
Languages: English-United Kingdom
Keywords: Big data, MapReduce, Minimal algorithm

Last updated on 2020-14-10 at 01:58