Discovering Binding Cores in Protein-DNA Binding Using Association Rule Mining with Statistical Measures
Publication in refereed journal



Other information
Abstract: Understanding binding cores is of fundamental importance in deciphering Protein-DNA (TF-TFBS) binding and for the deep understanding of gene regulation. Traditionally, binding cores are identified in resolved high-resolution 3D structures. However, it is expensive, labor-intensive and time-consuming to obtain these structures. Hence, it is promising to discover binding cores computationally on a large scale. Previous studies successfully applied association rule mining to discover binding cores from TF-TFBS binding sequence data only. Despite the successful results, there are limitations such as the use of tight support and confidence thresholds, the distortion by statistical bias in counting pattern occurrences, and the lack of a unified scheme to rank TF-TFBS associated patterns. In this study, we proposed an association rule mining algorithm incorporating statistical measures and ranking to address these limitations. Experimental results demonstrated that, even when the threshold on support was lowered to one-tenth of the value used in previous studies, a satisfactory verification ratio was consistently observed under different confidence levels. Moreover, we proposed a novel ranking scheme for TF-TFBS associated patterns based on p-values and co-support values. By comparing with other discovery approaches, the effectiveness of our algorithm was demonstrated. Eighty-four binding cores with PDB support are uniquely identified.
Authors: Wong MH, Sze-To HY, Lo LY, Chan TM, Leung KS
Journal name: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Detailed description: DOI 10.1109/TCBB.2014.2343952.
Year of publication: 2015
Month: 1
Day: 1
Volume: 12
Issue: 1
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 142 - 154
ISSN: 1545-5963
eISSN: 1557-9964
Language: British English
Keywords: association rule mining; binding cores; Protein-DNA binding; statistical measures
Web of Science subject categories: Biochemical Research Methods; Biochemistry & Molecular Biology; Computer Science; Computer Science, Interdisciplinary Applications; Mathematics; Mathematics, Interdisciplinary Applications; Statistics & Probability

Last updated on 2020-22-10 at 00:22