Statistical properties and linguistic coherence in noncoding DNA sequences

Authors

  • B. Cantú-Bolán
  • E. Hernández-Lemus

Keywords

DNA, genomics, statistical linguistics, fractal dimension

Abstract

It has generally been thought that the vast majority (about 95%) of the DNA of living organisms consists of what is now called non-coding DNA (NC-DNA). No mechanism of genetic expression was known for NC-DNA, in contrast to the protein expression of coding DNA (C-DNA). NC-DNA was therefore traditionally assigned the role of a shield, with no biological function of its own, protecting the C-DNA against random attack by mutagenic agents. Nevertheless, and in some sense motivated by the discovery of the tertiary structure of the genetic code, studies of the nature and biological function of NC-DNA began. Tools from multifractal theory and statistical linguistics have recently been applied to the analysis of coherence and correlation in NC-DNA fragments. These analyses revealed long-range correlations, coherent patterns, and even some well-defined structural features. Such structure and correlation would not be found in a random nucleotide sequence, which is what NC-DNA was originally thought to be.

Published

2005-01-01

How to Cite

[1]
B. Cantú-Bolán and E. Hernández-Lemus, “Statistical properties and linguistic coherence in noncoding DNA sequences”, Rev. Mex. Fis. E, vol. 51, no. 2 Jul-Dec, pp. 118–125, Jan. 2005.