Classification of RNA sequences with pseudoknots using features based on partial sequences
Refereed conference paper presented and published in conference proceedings


Abstract: Classifying whether an RNA sequence contains pseudoknots is a challenging and meaningful problem in bioinformatics. Predicting RNA secondary structures with pseudoknots is an NP-complete problem, whereas pseudoknot-free structures can be predicted in O(n^3) time. A preliminary classification of whether a sequence contains pseudoknots, performed before structure prediction, can therefore improve the efficiency of RNA secondary structure prediction. This paper presents a classifier for the existence of pseudoknots in an RNA sequence. Features are chosen from partial sequence content, and thousands of RNA sequences with validated structures are used to train the classifier. The best result achieves 87% accuracy in 10-fold cross-validation and around 75% accuracy on a validated testing dataset. Moreover, the method may reveal how partial sequence content affects the formation of pseudoknots.
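The abstract does not say which partial-sequence features the authors use. As a rough illustration only, the sketch below assumes one plausible reading: split the sequence into a few contiguous segments and describe each segment by its k-mer frequencies. The function names, the segment count, and the choice of k are hypothetical and are not taken from the paper.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides; other symbols are ignored


def kmer_frequencies(segment, k=2):
    """Return relative frequencies of all 4**k k-mers in one segment."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(segment) - k + 1):
        kmer = segment[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous symbols
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def partial_sequence_features(sequence, n_segments=3, k=2):
    """Concatenate per-segment k-mer frequencies into one feature vector.

    Assumption: "partial sequence" is modelled here as equal-width,
    non-overlapping segments; the paper may define it differently.
    """
    seq = sequence.upper().replace("T", "U")
    bounds = [round(i * len(seq) / n_segments) for i in range(n_segments + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        features.extend(kmer_frequencies(seq[start:end], k))
    return features
```

A vector built this way has n_segments * 4**k entries and could be passed to any standard classifier and evaluated with 10-fold cross-validation, as the abstract describes.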
Authors: Tong K.-K., Cheung K.-Y., Lee K.-H., Leung K.-S.
Conference name: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015
Conference location: Niagara Falls
Keywords: classification, pseudoknot, RNA secondary structure prediction

Last updated: 2020-17-10 at 01:17