Detecting comments showing risk for suicide in YouTube
Refereed conference paper presented and published in conference proceedings

Other information
Abstract: Natural language processing (NLP) for Cantonese, which mixes Traditional Chinese, borrowed characters that represent spoken terms, and English, is largely underdeveloped. Applying NLP to detect social media posts showing suicide risk, a rare event in the general population, is even more challenging. This paper tries different text mining methods to classify Cantonese comments on YouTube according to whether they indicate suicide risk. Based on word vector features, classification algorithms such as SVM, AdaBoost, Random Forest, and LSTM are employed to detect the comments' risk level. To address the class imbalance in the data, both re-sampling and focal loss methods are used. With improvements at both the data and algorithm levels, the LSTM algorithm achieves more satisfactory classification results on the test set (84.3% and 84.5% g-mean, respectively). The study demonstrates the potential of automatically detecting suicide risk in Cantonese social media posts.
Date accepted by publisher: 18.10.2018
Authors: Jiahui Gao, Qijin Cheng, Philip L. H. Yu
Conference name: The Future Technologies Conference (FTC) 2018
Conference start date: 13.11.2018
Conference end date: 14.11.2018
Conference location: Vancouver
Conference country/region: Canada
Proceedings title: Proceedings of the Future Technologies Conference (FTC) 2018: Advances in intelligent systems and computing
Year of publication: 2019
Volume: 880
Publisher: Springer
Pages: 385 - 400
ISBN: 978-3-030-02685-1
eISBN: 978-3-030-02686-8
Language: American English
Keywords: suicide, text mining, social media, Cantonese, sentiment analysis

Last updated: 2020-23-10 at 02:33