Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
Refereed conference paper presented and published in conference proceedings

Researchers from The Chinese University of Hong Kong

Other information
Abstract: We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they allow new components to be introduced on the fly as needed. This, however, poses an important challenge to distributed estimation -- how to handle new components efficiently and consistently. To tackle this problem, we propose a new estimation method, which allows new components to be created locally in individual computing nodes. Components corresponding to the same cluster will be identified and merged via a probabilistic consolidation scheme. In this way, we can maintain the consistency of estimation with very low communication cost. Experiments on large real-world data sets show that the proposed method can achieve high scalability in distributed and asynchronous environments without compromising the mixing performance.
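The consolidation idea described in the abstract can be illustrated with a deliberately simplified sketch: each worker creates components locally (summarized here as a mean vector and a data count), and a central step merges components that appear to describe the same cluster. Note this is a hedged illustration only; the greedy distance threshold below stands in for the paper's probabilistic consolidation scheme, and the function name and criterion are assumptions, not the authors' actual algorithm.

```python
import numpy as np

def consolidate(local_components, threshold=1.0):
    """Merge components collected from multiple workers.

    local_components: list of (mean, count) pairs, possibly from
    different computing nodes. Components whose means lie within
    `threshold` of an already-merged component are combined, with the
    merged mean taken as the count-weighted average. (The actual paper
    scores candidate merges probabilistically; a Euclidean cutoff is
    used here purely for illustration.)
    """
    merged = []  # each entry: (count-weighted mean, total count)
    for mean, count in local_components:
        mean = np.asarray(mean, dtype=float)
        for i, (m, c) in enumerate(merged):
            if np.linalg.norm(mean - m) < threshold:
                new_c = c + count
                merged[i] = ((m * c + mean * count) / new_c, new_c)
                break
        else:
            merged.append((mean, count))
    return merged
```

For example, if worker A reports components with means [0, 0] and [5, 5], and worker B reports [0.1, 0.1] and [9, 9], the first and third are identified as the same cluster and merged, leaving three consolidated components.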
Authors: Ruohui Wang, Dahua Lin
Conference name: International Joint Conference on Artificial Intelligence (IJCAI)
Conference start date: 19.08.2017
Conference end date: 25.08.2017
Conference location: Melbourne
Conference country/region: Australia
Proceedings title: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Year of publication: 2017
Month: 8
Language: American English
Pages: 4632 - 4639

Last updated: 16.01.2021 at 00:56