rG4-seeker enables high confidence identification of novel rG4 motifs from rG4-seq experiment via platform-specific noise modeling
Refereed conference paper presented and published in conference proceedings

CUHK Researchers

Other information
Abstract: The emergence of RNA-seq has revolutionized the study and understanding of the transcriptome and has empowered many high-throughput platforms for mapping RNA structural and regulatory elements. However, the subsequent bioinformatic searches used to retrieve elements of interest often pick up noise that affects overall research interpretation and outcomes. Nevertheless, this noise is often regarded as a normal consequence of biological variance and is tolerated by conducting statistical tests on replicated experiments.
We have recently developed RNA G-quadruplex sequencing (rG4-seq) for transcriptome-wide mapping of RNA G-quadruplexes (rG4s) by exploiting their intrinsic reverse transcriptase-stalling (RTS) properties. RNA G-quadruplex secondary structures are proposed to play significant regulatory roles in transcriptional, post-transcriptional and translational processes. In this study, we investigated non-biological, platform-specific noise in rG4-seq and demonstrated how noise modeling can improve both the sensitivity and the specificity of rG4 detection in a replicate-independent manner.
An in-depth re-analysis of HeLa rG4-seq datasets revealed that the RNA fragmentation process in rG4-seq chemistry is associated with a distinct distribution of background RTS signal, which was the most significant source of noise. By modeling, and thus eliminating, the effect of this noise on RTS measurements, we formulated an improved rG4 detection pipeline called rG4-seeker. In contrast to the original pipeline, which achieved a 12% false discovery rate (FDR) with a four-replicate combined analysis, the new implementation enables reliable single-replicate analysis at FDR <2% while recalling ~80% of the rG4 motifs identified previously. The unrecalled rG4 motifs were found to map to transcript regions of significantly higher GC ratio, where RTS signals were likely compromised by sequencing bias, rendering the rG4 detection outcomes inconclusive. Furthermore, with rG4-seeker we identified hundreds of novel rG4s whose nucleotide sequences do not match existing motif definitions, and candidates were experimentally validated. This information provides new insights for interpreting the nucleotide sequence rules that govern rG4 formation.
Using rG4-seq analysis as a showcase, our research demonstrates how understanding platform-specific noise can help tailor bioinformatic analysis for better interpretation of high-throughput RNA structural/regulatory element probing experiments, which promises to further elucidate the transcriptome-wide landscape of RNA regulation and interactions.
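Date accepted by publisher: 10.04.2019
Authors: Eugene Yui-Ching Chow, Kaixin Lyu, Chun Kit Kwok, Ting-Fung Chan
Conference name: 24th Annual Meeting of the RNA Society 2019
Conference start date: 11.06.2019
Conference end date: 16.06.2019
Conference location: Krakow, Poland
Conference country/region: Poland
Year of publication: 2019
Language: American English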

Last updated: 2019-21-10 at 16:06