Statistical properties and linguistic coherence in noncoding DNA sequences
Keywords:
DNA, genomics, statistical linguistics, fractal dimension
Abstract
The vast majority of the DNA of living organisms (about 95%) has generally been thought to consist of what is now called non-coding DNA (NC-DNA). No mechanism of genetic expression was known for NC-DNA, in contrast to protein expression for coding DNA (C-DNA). NC-DNA was therefore traditionally assigned the role of a protective cover, with no biological function of its own, against the random attack of mutagenic elements on C-DNA. Nevertheless, motivated in part by the discovery of the tertiary structure of the genetic code, studies of the nature and biological function of NC-DNA began. Some tools of multifractal theory and statistical linguistics have recently been applied to the analysis of coherence and correlation in non-coding DNA fragments. As a result, long-range correlations, coherent patterns, and even some well-defined structural features have been found. Such structure and correlation would not be found in a random nucleotide sequence, which is what NC-DNA was originally thought to be.
License
Copyright (c) 2019 Revista Mexicana de Física E
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors retain copyright and grant the Revista Mexicana de Física E right of first publication with the work simultaneously licensed under a CC BY-NC-ND 4.0 that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.