Mosaic organization of DNA nucleotides

FEBRUARY 1994 | C-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger
Long-range power-law correlations have been reported in DNA sequences containing noncoding regions. This study investigates whether such correlations are simply a consequence of the known mosaic structure ("patchiness") of DNA. Two types of control sequences were analyzed: one without and one with long-range correlations. An alternative fluctuation analysis method, detrended fluctuation analysis (DFA), was used to distinguish local patchiness from long-range correlations. Applying DFA to DNA sequences showed that patchiness alone is not sufficient to explain their long-range correlation properties.

The DFA method divides the sequence into nonoverlapping boxes of length $ \ell $, calculates the local trend in each box, and subtracts it to obtain the detrended walk. The variance of the detrended walk is then calculated, and its square root, the root-mean-square fluctuation, determines the scaling properties of the sequence. If only short-range correlations exist, this fluctuation scales as $ \ell^{1/2} $, while long-range correlations yield a scaling exponent $ \alpha \neq 1/2 $.

The two types of control sequences were distinguishable by their scaling exponents: $ \alpha = 0.51 $ for the uncorrelated control and $ \alpha = 0.61 $ for the correlated control. The E. coli genomic sequence, composed primarily of coding regions, had the same exponent as the uncorrelated control, while the human T-cell receptor alpha/delta locus, which contains noncoding regions, had the same exponent as the correlated control. This suggests that the scaling exponent $ \alpha $ reflects the overall distribution properties of the sequence, not just the standard deviation.

The DFA method also allows the characteristic length scale of patches to be identified. The crossover from $ \alpha \approx 0.5 $ to larger $ \alpha $ values occurs at $ \ell \approx 2500 $, which corresponds to the characteristic length scale of the uncorrelated control sequence. This crossover was also observed in the E. coli data, suggesting a characteristic patch size in the E. coli nucleotide sequence. In contrast, no such crossover was observed in the correlated control sequence, indicating the absence of a characteristic patch size.

The study demonstrates that the DFA method can unambiguously differentiate between long-range correlations and patchiness, and that patchiness alone cannot account for long-range correlations in noncoding regions. The results also show that the scaling exponent $ \alpha $ is a measure of the degree of long-range correlation, and that the characteristic length scale of patches can be determined from the crossover behavior.
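
The box-and-detrend procedure summarized above can be illustrated with a minimal sketch. It is not the authors' code, and several details are assumptions that the summary does not state: the purine/pyrimidine step rule in `dna_walk` (+1 for C/T, -1 for A/G), the use of a least-squares straight line as the local trend, and the function names and box sizes, which are chosen here only for illustration. The exponent $ \alpha $ is estimated as the slope of $ \log F $ against $ \log \ell $, where $ F $ is the root-mean-square fluctuation of the detrended walk.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk over a sequence: +1 for C/T, -1 for A/G (assumed rule)."""
    steps = {"C": 1, "T": 1, "A": -1, "G": -1}
    walk, y = [], 0
    for base in seq.upper():
        y += steps[base]
        walk.append(y)
    return walk


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_fluctuation(walk, box):
    """RMS deviation of the walk from a straight-line trend fitted in each box."""
    n_boxes = len(walk) // box  # nonoverlapping boxes; any leftover tail is ignored
    xs = list(range(box))
    total = 0.0
    for b in range(n_boxes):
        ys = walk[b * box:(b + 1) * box]
        slope, intercept = linear_fit(xs, ys)
        total += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(total / (n_boxes * box))


def dfa_exponent(walk, boxes):
    """Slope of log F(box) versus log box, an estimate of alpha."""
    log_l = [math.log(b) for b in boxes]
    log_f = [math.log(dfa_fluctuation(walk, b)) for b in boxes]
    slope, _ = linear_fit(log_l, log_f)
    return slope


# Synthetic uncorrelated sequence (not data from the study).
random.seed(0)
control = "".join(random.choice("ACGT") for _ in range(20000))
alpha = dfa_exponent(dna_walk(control), [8, 16, 32, 64, 128, 256])
print(f"alpha = {alpha:.2f}")  # should come out close to 0.5 for an uncorrelated sequence
```

For a patchy sequence, plotting $ \log F $ against $ \log \ell $ rather than fitting a single slope would be the natural way to look for the crossover described above, since a single fitted exponent averages over both regimes.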