Principal Component Analysis and Large-Scale Correlations in
Non-Coding Sequences of Human DNA
Michael Teitelman and Frank H. Eeckman
Journal of Computational Biology,
3(4), 573-576 (1996).
Abstract
We have calculated a full set of second-order correlation functions of
nucleotides in non-coding DNA. They are found to be invariant under
the exchange of A and T and, independently, under the exchange of C and G.
Considering correlation functions as a 4x4 matrix with a symmetrical
basis, we have found the principal components - objects with zero
cross-correlations. These three principal components
represent the base compositions (A+T-C-G), (A-T), and (C-G).
The long-range behavior of these principal components yields power-law
dependencies with different critical exponents.
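
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the second-order correlation function at separation r is defined as C_ab(r) = P_ab(r) - P_a P_b, a normalization the abstract does not specify. The function names, the (A, C, G, T) ordering, and the log-log least-squares fit for the exponent are all hypothetical choices made for illustration.

```python
import numpy as np

BASES = "ACGT"  # assumed ordering of the 4x4 matrix rows and columns

def correlation_matrix(seq, r):
    """Assumed definition: C[a, b] = P_ab(r) - P_a * P_b at separation r."""
    idx = np.array([BASES.index(ch) for ch in seq])
    onehot = np.eye(4)[idx]                  # shape (N, 4), one row per base
    n = len(idx) - r                         # number of pairs at distance r
    left, right = onehot[:n], onehot[r:]
    joint = left.T @ right / n               # joint frequencies P_ab(r)
    return joint - np.outer(left.mean(axis=0), right.mean(axis=0))

# The three compositions named in the abstract, written in (A, C, G, T) order.
COMPONENTS = {
    "A+T-C-G": np.array([1.0, -1.0, -1.0, 1.0]),
    "A-T":     np.array([1.0,  0.0,  0.0, -1.0]),
    "C-G":     np.array([0.0,  1.0, -1.0,  0.0]),
}

def project(C):
    """Return component names and the 3x3 matrix V C V^T.

    Diagonal entries are the correlations of each component with itself;
    off-diagonal entries are the cross-correlations between components.
    """
    names = list(COMPONENTS)
    V = np.stack([COMPONENTS[name] for name in names])
    return names, V @ C @ V.T

def power_law_exponent(rs, values):
    """Slope of log|value| against log r (a simple least-squares fit)."""
    slope, _intercept = np.polyfit(np.log(rs), np.log(np.abs(values)), 1)
    return slope
```

If a correlation matrix is exactly invariant under the A-T and C-G exchanges, the off-diagonal entries returned by project vanish, which is the sense in which the three compositions have zero cross-correlations. For a real sequence, one would evaluate the diagonal entries over a range of r and pass them to power_law_exponent to estimate each component's exponent.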