Independent range sampling
Refereed conference paper presented and published in conference proceedings

Times Cited
Altmetrics Information

Other information
AbstractThis paper studies the independent range sampling prob-lem. The input is a set P of n points in ℝ. Given an interval q = [x, y] and an integer t ≥ 1, a query returns t elements uniformly sampled (with/without replacement) from P ∩ q. The sampling result must be independent from those returned by the previous queries. The objective is to store P in a structure for answering all queries efficiently. If P fits in memory, the problem is interesting when P is dynamic (i.e., allowing insertions and deletions). The state of the art is a structure of O(n) space that answers a query in O(t log n) time, and supports an update in O(log n) time. We describe a new structure of O(n) space that answers a query in O(log n+t) expected time, and supports an update in O(log n) time. If P does not fit in memory, the problem is challeng-ing even when P is static. The best known structure in-curs O(logB n + t) I/Os per query, where B is the block size. We develop a new structure of O(n/B) space that an-swers a query in O(log*(n/B)+logB n+(t/B) logM/B(n/B)) amortized expected I/Os, where M is the memory size, and log*(n/B) is the number of iterative log2(.) operations we need to perform on n/B before going below a constant. We also give a lower bound argument showing that this is nearly optimal-in particular, the multiplicative term log M/B(n/B) is necessary. Copyright 2014 ACM.
All Author(s) ListHu X., Qiao M., Tao Y.
Name of Conference33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2014
Start Date of Conference22/06/2014
End Date of Conference27/06/2014
Place of ConferenceSnowbird, UT
Country/Region of ConferenceUnited States of America
Detailed descriptionorganized by ACM,
Pages246 - 255
LanguagesEnglish-United Kingdom
KeywordsIndependent range sampling, Lower bound, Range reporting

Last updated on 2021-16-02 at 23:53