BIBLIOGRAPHY

Features, Patterns, Correlations in DNA and Protein Texts

© Copyright, 2002-2006, Wentian Li of Feinstein Institute for Medical Research (1995-1996 Columbia University, 1996-2001 Rockefeller University). Last updated: February 26, 2008 (~1670 papers, excluding preprints)



DNA sequences carry biological information, DNA molecules (double-stranded helices) obey physical laws, DNA genomes are products of generations of evolution, and DNA texts can be read as strings of symbols. All these different aspects of DNA sequences (and, to some extent, protein sequences) make their analysis a multidisciplinary topic that touches biochemistry, biophysics, and evolutionary biology, as well as mathematics, statistics, computer science, and statistical physics.

This bibliography is an attempt to collect papers on DNA sequences: statistical patterns, statistical features, statistical correlations, etc. No claim is made about the completeness of this bibliography, and it is updated on an irregular basis.




... But I am very much excited by your article in May 30th [1953] Nature, and think that bring Biology over into the group of "exact" sciences. I plan to be in England through most of September, and hope to have a chance to talk to you about all that, but I would like to ask a few questions now. If your point of view is correct [,] each organism will be characteristized by a long number written in quadrucal (?) system with figures 1, 2, 3, 4 standing for different bases ... This would open a very exciting possibility of theoretical research based on combinatorix [sic] and the theory of numbers! ... I have a feeling this can be done. What do you think?
George Gamow (1904-1968), in a letter to Watson and Crick (1953).

From the point of view of the theory of information, the works of Shakespeare, with the same number of letters and signs aligned at random by a monkey, would have the same value. It is this lack of definition of the value of information that makes it difficult to use in biology. What could be considered as "objective" in the Shakespearean information that would distinguish it from the monkey's information? Essentially the transmissibility. The value of influence, therefore of evolution. ...
Jacques Monod (1910-1976), notebook (1959).

"Whereas ordinary mortals are content to mimic others, creative geniuses are condemned to plagiarize themselves" is my shorter, albeit inarticulate, version of what Van Veen said in Ada by Vladimir Nabokov. Indeed, it seems that vaunted geniuses seldom invented more than one modus operandi during their lifetimes, and even civilization has largely been dependent upon plagiarizing a small number of creative works; e.g., the multitudes of Gothic churches can be viewed as pan European plagiarism of the abbey church of St. Denis and/or the cathedral at Sens. This is not surprising for new genes sensu stricto has seldom been invented. Evolution rather relies on plagiarizing an old and tested theme; the mechanism of evolution by gene duplication. ... this principle of repetitious recurrence pervades both the construction of coding sequences in the genome, which can be regarded as being representative of nature, and musical composition which can be regarded as the most abstract and therefore the most intellectual expression of nature.
Susumu Ohno (1928-2000) and Midori Ohno, Immunogenetics, 24:71-78 (1986).

Searching for an objective reconstruction of the vanished past must surely be the most challenging task in biology. I need to say this because, today, given the powerful tools of molecular biology, we can answer many questions simply by looking up the answer in Nature - and I do not mean the journal of the same name. ... In one sense, everything in biology has already been 'published' in the form of DNA sequences of genomes; but, of course, this is written in a language we do not yet understand. Indeed, I would assert that the prime task of biology is to learn and understand this language so that we could then compute organisms from their DNA sequences. ... We are at the dawn of proper theoretical biology.
Sidney Brenner, in Evolution of Life, eds. S Osawa and T Honjo (Springer-Verlag) (1991).

While ... human genome projects ... were launched only in the past decade, the technoscientific imaginary and the discursive practices that have animated them, specifically the textual and linguistic representations of the genome, are quite old. In their (post?) modern form they first emerged in the late 1940s and were then fully elaborated within the work on the genetic code in the 1950s and 1960s.
Lily Kay (science historian, 1947-2000), in Who Wrote the Book of Life? (2000).




French: Une bibliographie sur les caractéristiques, motifs et corrélations dans les séquences protéiques et l'ADN
German: Ein Literaturverzeichnis über Eigenschaften, Strukturen und Korrelationen in DNA- und Proteinsequenzen
Italian: Una bibliografia su caratteristiche, pattern, correlazioni in sequenze di DNA e proteine
Portuguese: Uma bibliografia sobre características, modelos, correlações em ADN e textos de proteínas
Spanish: Bibliografía sobre las propiedades, patrones y correlaciones en textos de ADN y proteínas
