+ about this bibliography

This bibliography started out with a narrow focus: non-trivial long-range statistical correlations in DNA sequences. Over time I have collected papers on other topics as well, and the collection now covers the most basic features of DNA and protein sequences: those that concern the sequences as symbolic strings.

There are roughly two large categories: static and dynamic. The static category treats DNA and protein sequences as fixed entities and uses various statistical methods (counting is the simplest one!) to characterize features in the text of these sequences. Sequence statistics used for predicting genes are treated in a separate bibliography: http://www.nslij-genetics.org/gene/
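As a minimal illustration of the "counting" approach mentioned above, the sketch below tallies symbol frequencies and GC content for a sequence given as a plain string. The function names `base_composition` and `gc_content` are hypothetical and chosen only for this example; they are not taken from any paper in the bibliography.

```python
from collections import Counter


def base_composition(seq):
    """Return the relative frequency of each symbol in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    return {base: counts[base] / total for base in sorted(counts)}


def gc_content(seq):
    """Return the fraction of symbols in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)
```

Calling `base_composition("AACG")` gives the frequencies of A, C and G in that toy string; real sequences would normally be read from a file first, and ambiguous symbols such as N are counted like any other character here.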

The dynamic category treats DNA sequences as a product of evolution, and investigates which particular process led to which particular feature. Although point mutation is the most studied dynamical process, I'm more interested in large-scale duplication processes. To keep this bibliography focused, I created a separate page on gene duplication: http://www.nslij-genetics.org/duplication/

The "usual suspects", the topics that motivated this bibliography in the first place: Large-scale base composition variations (heterogeneity, isochore...); Basic properties of base composition (entropy...); Characterizing whole-sequence correlation structure (correlation function, traditional spectral analysis, wavelet analysis...); Multiple-scaled features (fractal, 1/f noise, self-similarity, scale-invariance, domains-within-domains...)
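Two of the quantities named above, entropy and the correlation function, can be sketched in a few lines. This is only one common convention, assumed here for illustration: entropy is the Shannon entropy of the single-symbol frequencies in bits, and the correlation between symbols a and b at distance d is the frequency of the pair minus the product of the single-symbol frequencies. Individual papers may define both differently, and the names `shannon_entropy` and `correlation` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def correlation(seq, a, b, d):
    """Assumed definition: P_ab(d) - P_a * P_b.

    P_ab(d) is the fraction of positions i with seq[i] == a and
    seq[i + d] == b; P_a and P_b are the overall symbol frequencies.
    """
    n_pairs = len(seq) - d
    if d < 1 or n_pairs <= 0:
        raise ValueError("d must be at least 1 and smaller than len(seq)")
    hits = sum(1 for i in range(n_pairs) if seq[i] == a and seq[i + d] == b)
    p_a = seq.count(a) / len(seq)
    p_b = seq.count(b) / len(seq)
    return hits / n_pairs - p_a * p_b
```

Evaluating `correlation` over a range of distances d gives the kind of curve whose slow decay is what "long-range correlation" refers to; a value near zero at every d is what an uncorrelated sequence would show.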

Other topics that this bibliography is expanding to:

a/symmetry
about strand symmetry or strand asymmetry
isochore
isochore, large-scale variation of GC-content
period:3
periodicity of three bases
period:10-11
periodicity of ten or eleven bases
repeats
tandem repeats (not many papers listed)
dna-music
about DNA/protein music (more information on this topic can be found at http://www.nslij-genetics.org/dnamusic/ )
sequence
sequencing papers (since they usually also include analysis results); more links to whole-genome sequencing can be found on another page: http://www.nslij-genetics.org/seq/
protein
protein sequence analysis
bending
DNA bending (not many papers listed)
binding-energy
double helix binding energy (not many papers listed)
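
To make the periodicity topics in the list above (period:3, period:10-11) concrete, here is a rough sketch of one way to measure the strength of a given period: sum the squared Fourier amplitudes of the four base indicator sequences at the corresponding frequency. The function name `power_at_period` and the normalization by sequence length are assumptions made for this example, not a method prescribed by any listed paper.

```python
import cmath


def power_at_period(seq, period):
    """Summed spectral power of the A, C, G, T indicator sequences
    at frequency 1/period, divided by the sequence length."""
    seq = seq.upper()
    n = len(seq)
    if n == 0 or period <= 0:
        raise ValueError("need a non-empty sequence and a positive period")
    total = 0.0
    for base in "ACGT":
        # Fourier sum over the positions where this base occurs
        amplitude = sum(
            cmath.exp(-2j * cmath.pi * i / period)
            for i, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(amplitude) ** 2
    return total / n
```

For a perfectly period-3 toy string such as `"ACG" * 4`, the power at period 3 is large while the power at period 4 vanishes; in real coding regions the period-3 signal is much weaker and is usually judged against the background at neighbouring frequencies.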