Modeling Associated Protein-DNA Pattern Discovery with Unified Scores
Publication in refereed journal


Abstract: Understanding protein-DNA interactions, specifically transcription factor (TF) and transcription factor binding site (TFBS) bindings, is crucial in deciphering gene regulation. The recent associated TF-TFBS pattern discovery combines one-sided motif discovery on both the TF and the TFBS sides. Using sequences only, it identifies the short protein-DNA binding cores available only in high-resolution 3D structures. The discovered patterns lead to promising subtype and disease analysis applications. While the related studies use either association rule mining or existing TFBS annotations, none has proposed a formal unified (both-sided) model to prioritize the top verifiable associated patterns. We propose the unified scores and develop an effective pipeline for associated TF-TFBS pattern discovery. Our stringent instance-level evaluations show that the patterns with the top unified scores match the binding cores in 3D structures considerably better than those of previous works, where up to 90 percent of the top 20 scored patterns are verified. We also introduce extended verification from literature surveys, where the high unified scores correspond to an even higher verification percentage. The top-scored patterns are confirmed to match the known WRKY binding cores with no available 3D structures and agree well with the top binding affinities of in vivo experiments.
Authors: Chan TM, Lo LY, Sze-To HY, Leung KS, Xiao X, Wong MH
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Details: ISSN: 1545-5963; Digital Object Identifier: 10.1109/TCBB.2013.60.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 696-707
Keywords: binding rules; Bioinformatics; motif discovery; protein-DNA interactions; TF-TFBS associated pattern discovery
Web of Science subject categories: Biochemical Research Methods; Biochemistry & Molecular Biology; Computer Science; Computer Science, Interdisciplinary Applications; Mathematics; Mathematics, Interdisciplinary Applications; Statistics & Probability

Last updated: 2020-27-10 at 00:52