Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers
Refereed conference paper presented and published in conference proceedings

Other information
Abstract: The high memory consumption and computational costs of recurrent neural network language models (RNNLMs) limit their wider application on resource-constrained devices. In recent years, neural network quantization techniques capable of producing extremely low-bit compression, for example binarized RNNLMs, have been gaining increasing research interest. Directly training quantized neural networks is difficult. By formulating quantized RNNLM training as an optimization problem, this paper presents a novel method to train quantized RNNLMs from scratch using the alternating direction method of multipliers (ADMM). This method can also flexibly adjust the trade-off between the compression rate and model performance using tied low-bit quantization tables. Experiments on two tasks, Penn Treebank (PTB) and Switchboard (SWBD), suggest that the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full-precision baseline RNNLMs. Model training also converged up to 5 times faster than the baseline binarized RNNLM quantization.
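As a rough illustration of the ADMM decomposition referred to in the abstract, the sketch below alternates between gradient updates of the full-precision weights under an augmented-Lagrangian penalty, projection onto a tied low-bit quantization table, and a dual-variable update. The helper names (quantize_to_table, admm_quantize), the toy quadratic loss, and the example table values are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of ADMM-based low-bit weight quantization (illustrative only).
import numpy as np

def quantize_to_table(w, table):
    """Project each weight onto its nearest value in a tied quantization table."""
    table = np.asarray(table)
    idx = np.abs(w[..., None] - table).argmin(axis=-1)
    return table[idx]

def admm_quantize(W, loss_grad, table, rho=1e-3, lr=1e-2,
                  outer_iters=20, inner_steps=50):
    """ADMM loop: (1) SGD on loss + (rho/2)*||W - G + U||^2,
    (2) project W + U onto the quantization table, (3) dual update U += W - G."""
    G = quantize_to_table(W, table)   # auxiliary quantized variable
    U = np.zeros_like(W)              # scaled dual variable
    for _ in range(outer_iters):
        for _ in range(inner_steps):  # (1) W-update via a few gradient steps
            W = W - lr * (loss_grad(W) + rho * (W - G + U))
        G = quantize_to_table(W + U, table)  # (2) G-update (projection)
        U = U + W - G                        # (3) dual-variable update
    return G

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_target = rng.normal(size=(8, 8))
    W = rng.normal(size=(8, 8))
    # Toy quadratic loss standing in for the language-model training objective.
    loss_grad = lambda w: 2.0 * (w - W_target)
    # A tied 2-bit quantization table shared across the weight matrix (assumed).
    table = [-0.5, -0.1, 0.1, 0.5]
    W_q = admm_quantize(W, loss_grad, table)
    print(np.unique(W_q))  # all quantized weights come from the tied table
```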
Authors: Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Mei-Ling Meng
Conference name: IEEE ICASSP 2020
Conference start date: 04.05.2020
Conference end date: 08.05.2020
Conference venue: Barcelona
Conference country/region: Spain
Proceedings title: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Year of publication: 2020
Publisher: IEEE
Pages: 7939 - 7943
ISSN: 1520-6149
Language: American English

Last updated on 2021-09-05 at 00:12