Challenges rising from learning motif evaluation functions using genetic programming
Refereed conference paper presented and published in conference proceedings


Other information
Abstract: Motif discovery is an important Bioinformatics problem for deciphering gene regulation. Numerous sequence-based approaches have been proposed employing human specialist motif models (evaluation functions), but performance is so unsatisfactory on benchmarks that the underlying information seems to have already been exploited. However, we have found that even a simple modified representation still achieves considerably high performance on a challenging benchmark, implying potential for sequence-based motif discovery. Thus we raise the problem of learning motif evaluation functions. We employ Genetic programming (GP) which has the potential to evolve human competitive models. We take advantage of the terminal set containing specialist-model-like components and have tried three fitness functions. Results exhibit both great challenges and potentials. No models learnt can perform universally well on the challenging benchmark, where one reason may be the data appropriateness for sequence-based motif discovery. However, when applied on different widely-tested datasets, the same models achieve comparable performance to existing approaches based on specialist models. The study calls for further novel GP to learn different levels of effective evaluation models from strict to loose ones on exploiting sequence information for motif discovery, namely quantitative functions, cardinal rankings, and learning feasibility classifications. Copyright 2010 ACM.
All Author(s) List: Lo L.-Y., Chan T.-M., Lee K.-H., Leung K.-S.
Name of Conference: 12th Annual Genetic and Evolutionary Computation Conference, GECCO-2010
Start Date of Conference: 07/07/2010
End Date of Conference: 11/07/2010
Place of Conference: Portland, OR
Country/Region of Conference: United States of America
Detailed description: Organized by ACM SIGEVO. To ORKTS: Genetic Programming (GP), modelling
Pages: 171-178
Languages: English-United Kingdom
Keywords: Bioinformatics, Genetic Programming (GP), Modelling, Motif Discovery, Transcription Factor Binding Site (TFBS)

Last updated on 2021-26-01 at 23:34