Multifractal characterisation of length sequences of coding and noncoding segments in a complete genome
Abstract: The coding and noncoding length sequences constructed from a complete genome are characterised by multifractal analysis. The dimension spectrum D_q and its derivative, the 'analogous' specific heat C_q, are calculated for the coding and noncoding length sequences of bacteria, where q is the moment order of the partition sum of the sequences. From the shape of the D_q and C_q curves, it is seen that there exists a clear difference between the coding/noncoding length sequences of all organisms considered and a completely random sequence. The complexity of noncoding length sequences is higher than that of coding length sequences for bacteria. Almost all D_q curves for coding length sequences are flat, so their multifractality is small, whereas almost all D_q curves for noncoding length sequences are multifractal-like. It is seen that the 'analogous' specific heats of noncoding length sequences of bacteria have a rich variety of behaviour which is much more complex than that of coding length sequences. We propose to characterise the bacteria according to the types of the C_q curves of their noncoding length sequences. This new type of classification allows a better understanding of the relationship among bacteria at the global gene level instead of the nucleotide sequence level. © 2001 Elsevier Science B.V. All rights reserved.
All Author(s) List: Yu ZG, Anh V, Lau KS
Journal name: Physica A: Statistical Mechanics and its Applications
Volume Number: 301
Issue Number: 1-4
Pages: 351-361
Languages: English-United Kingdom
Keywords: 'analogous' specific heat; coding/noncoding segments; complete genome; length sequence; multifractal analysis
Web of Science Subject Categories: Physics; Physics, Multidisciplinary
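
The abstract names the quantities computed (the partition sum with moment order q, the dimension spectrum D_q and the 'analogous' specific heat C_q) but this record does not give their formulas. The sketch below is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes the common box-counting formulation, in which the length sequence is normalised to a probability measure, the partition sum is Z(q) = sum of mu(B)^q over dyadic boxes B of size eps, tau(q) is the slope of ln Z(q) against ln eps, D_q = tau(q)/(q - 1), and C_q is approximated by the second difference 2 tau(q) - tau(q+1) - tau(q-1). All function names and the sample lengths are hypothetical.

```python
import numpy as np

def normalised_measure(lengths):
    """Turn a length sequence into a probability measure.

    Simplifying assumption: the sequence is truncated to the largest
    power-of-two length so that dyadic boxes divide it evenly.
    """
    p = np.asarray(lengths, dtype=float)
    n = 1 << (len(p).bit_length() - 1)  # largest power of two <= len(p)
    p = p[:n]
    return p / p.sum()

def scaling_slope(lengths, stat):
    """Least-squares slope of stat(box masses) against ln(eps), eps = 2**-k.

    Needs at least 4 segments so that two or more scales are available.
    """
    p = normalised_measure(lengths)
    max_level = len(p).bit_length() - 1
    levels = np.arange(1, max_level + 1)
    log_eps = -levels * np.log(2.0)
    values = []
    for k in levels:
        mu = p.reshape(2 ** k, -1).sum(axis=1)  # mass in each of 2**k boxes
        values.append(stat(mu[mu > 0]))         # ignore empty boxes
    return np.polyfit(log_eps, values, 1)[0]

def tau(lengths, q):
    """Mass exponent: slope of ln Z(q) against ln(eps)."""
    return scaling_slope(lengths, lambda mu: np.log(np.sum(mu ** q)))

def dimension(lengths, q):
    """Generalised dimension D_q; q = 1 uses the information-dimension limit."""
    if np.isclose(q, 1.0):
        return scaling_slope(lengths, lambda mu: np.sum(mu * np.log(mu)))
    return tau(lengths, q) / (q - 1.0)

def specific_heat(lengths, q):
    """Second-difference approximation of -d^2 tau / dq^2 (assumed form of C_q)."""
    return 2.0 * tau(lengths, q) - tau(lengths, q + 1.0) - tau(lengths, q - 1.0)

if __name__ == "__main__":
    # Made-up segment lengths, for illustration only (not genome data).
    lengths = [300, 1200, 90, 45, 2100, 600, 150, 75,
               900, 330, 60, 1500, 210, 480, 30, 720]
    for q in (-2.0, 0.5, 2.0, 5.0):
        print(q, dimension(lengths, q), specific_heat(lengths, q))
```

For a perfectly uniform length sequence this sketch gives D_q = 1 and C_q = 0 for every q, which is the flat, non-multifractal case; departures from that flat curve are what the abstract describes as multifractal-like behaviour.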

