Reading scene text in deep convolutional sequences
Refereed conference paper presented and published in conference proceedings



Other information
Abstract: We develop a Deep-Text Recurrent Network (DTRN) that regards scene text reading as a sequence labelling problem. We leverage recent advances in deep convolutional neural networks to generate an ordered high-level sequence from a whole word image, avoiding the difficult character segmentation problem. Then a deep recurrent model, building on long short-term memory (LSTM), is developed to robustly recognise the generated CNN sequences, departing from most existing approaches, which recognise each character independently. Our model has a number of appealing properties in comparison to existing scene text recognition methods: (i) it can recognise highly ambiguous words by leveraging meaningful context information, allowing it to work reliably without either pre- or post-processing; (ii) the deep CNN feature is robust to various image distortions; (iii) it retains the explicit order information in the word image, which is essential to discriminate word strings; (iv) the model does not depend on a pre-defined dictionary, and it can process unknown words and arbitrary strings. It achieves impressive results on several benchmarks, advancing the state of the art substantially.
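Authors: He P., Huang W., Qiao Y., Loy C.C., Tang X.
Conference name: 30th AAAI Conference on Artificial Intelligence, AAAI 2016
Conference start date: 12.02.2016
Conference end date: 17.02.2016
Conference location: Phoenix
Conference country/region: United States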
Detailed description: organized by AAAI
Publication year: 2016
Month: 1
Day: 1
Pages: 3501 - 3508
ISBN: 9781577357605
Language: British English
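
Illustrative sketch: The abstract describes turning a whole word image into an ordered sequence and then labelling that sequence without segmenting characters or consulting a dictionary. The Python below is a minimal sketch of those two ideas only, not the authors' implementation. It assumes that the ordered sequence comes from a fixed-width sliding window and that per-frame labels are merged with a CTC-style collapse; the abstract states neither detail. The names `sliding_windows`, `collapse_labels` and `BLANK` are hypothetical, and the CNN and LSTM themselves are omitted.

```python
from typing import List, Optional, Sequence

# Hypothetical "no character" label, as used in CTC-style decoding (an assumption).
BLANK = "-"


def sliding_windows(
    image: Sequence[Sequence[float]], width: int, stride: int
) -> List[List[List[float]]]:
    """Cut a word image (rows x columns) into an ordered, left-to-right list of patches.

    Each patch would be fed to a CNN to obtain one feature vector, so the
    sequence order mirrors the character order in the word.
    """
    n_cols = len(image[0])
    patches = []
    for start in range(0, n_cols - width + 1, stride):
        patches.append([list(row[start:start + width]) for row in image])
    return patches


def collapse_labels(frame_labels: Sequence[str], blank: str = BLANK) -> str:
    """Merge consecutive repeated labels, then drop blanks (greedy CTC-style collapse).

    No dictionary is consulted, so arbitrary strings can be produced.
    """
    out: List[str] = []
    prev: Optional[str] = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)


if __name__ == "__main__":
    # A toy 2 x 6 "image": three non-overlapping windows of width 2.
    toy_image = [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
    print(len(sliding_windows(toy_image, width=2, stride=2)))
    # Per-frame labels such as a recurrent labeller might emit.
    print(collapse_labels(["h", "h", "e", "l", "-", "l", "o", "o"]))
```

In this sketch, a blank between two identical labels is what lets a doubled letter survive the collapse, which is one way a sequence labeller can emit arbitrary strings without segmenting characters first.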

Last updated: 2020-05-08 at 03:42