SOMM4mC: a second-order Markov model for DNA N4-methylcytosine site prediction in six species
Publication in refereed journal



Other information
Abstract
Motivation: DNA N4-methylcytosine (4mC) modification is an important epigenetic modification in prokaryotic DNA due to its role in regulating DNA replication and protecting the host DNA against degradation. An efficient algorithm to identify 4mC sites is needed for downstream analyses.

Results: In this study we propose a new prediction method named SOMM4mC based on a second-order Markov model, which makes use of the transition probability between adjacent nucleotides to identify 4mC sites. The results show that the first-order and second-order Markov models are superior to the three existing algorithms in all six species (C. elegans, D. melanogaster, A. thaliana, E. coli, G. subterraneus, and G. pickeringii) where benchmark datasets are available. Moreover, SOMM4mC classifies better than the first-order Markov model. In particular, for E. coli and C. elegans, the overall accuracies of SOMM4mC are 91.8% and 87.6%, which are 8.5% and 6.1% higher than those of the latest method 4mcPred-SVM, respectively. This shows that SOMM4mC captures more discriminant sequence information through the dependency between adjacent nucleotides.
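
As an illustration of the general idea of a second-order Markov classifier, the following minimal Python sketch estimates the probability of each nucleotide given the two preceding ones, separately for positive (4mC) and negative sequences, and scores a query by the log-odds of the two models. It is not the SOMM4mC implementation: the position-independent transition table, the add-one pseudocount, the omission of an initial-dinucleotide term, the zero decision threshold, the function names, and the toy sequences are all assumptions made for brevity.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov2(seqs, pseudocount=1.0):
    """Estimate P(x_i | x_{i-2} x_{i-1}) with additive smoothing (assumed)."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(2, len(s)):
            ctx, nxt = s[i - 2:i], s[i]
            # Skip windows containing ambiguous symbols such as N.
            if all(c in ALPHABET for c in ctx + nxt):
                counts[ctx][nxt] += 1
    probs = {}
    for a in ALPHABET:
        for b in ALPHABET:
            ctx = a + b
            total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
            probs[ctx] = {
                c: (counts[ctx][c] + pseudocount) / total for c in ALPHABET
            }
    return probs


def log_odds(seq, pos_model, neg_model):
    """Sum of log ratios of transition probabilities (positive vs negative)."""
    score = 0.0
    for i in range(2, len(seq)):
        ctx, nxt = seq[i - 2:i], seq[i]
        if ctx in pos_model and nxt in pos_model[ctx]:
            score += math.log(pos_model[ctx][nxt] / neg_model[ctx][nxt])
    return score


def predict(seq, pos_model, neg_model, threshold=0.0):
    """Call a sequence positive when its log-odds exceeds the threshold."""
    return log_odds(seq, pos_model, neg_model) > threshold


# Toy training data (hypothetical, not taken from the benchmark datasets).
positives = ["ACGCGCGA", "TCGCGCGT"]
negatives = ["ATATATAT", "TTATATAA"]
pos_model = train_markov2(positives)
neg_model = train_markov2(negatives)
```

With these toy models, `predict("CGCGCG", pos_model, neg_model)` evaluates to `True` and `predict("ATATAT", pos_model, neg_model)` to `False`. A first-order variant would condition on a single preceding nucleotide instead of two.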

Availability: The web server of SOMM4mC is freely accessible at www.insect-genome.com/SOMM4mC.

Supplementary information: Supplementary data are available at Bioinformatics online.
Date accepted by publisher: 08.05.2020
Authors: Yang J, Lang K, Zhang G, Fan X, Chen Y, Pian C
Journal name: Bioinformatics
Year of publication: 2020
Month: 8
Day: 15
Volume: 36
Issue: 14
Pages: 4103 - 4105
ISSN: 1367-4803
eISSN: 1460-2059
Language: British English

Last updated on 2020-21-11 at 23:40