Analysis of eukaryotic promoter sequences reveals a systematically
occurring CT-signal
Niels I. Larsen, Jacob Engelbrecht, and Søren Brunak
Nucleic Acids Research, 23(7):1223-1231 (April 11, 1995)
Abstract
A general data study of eukaryotic promoter sequences from widely different species is
presented. Mammalian promoters with known transcription initiation sites represented the
largest subclass of the data, and for this group neural network algorithms were trained to
predict the location of the initiation site in a test set. The prediction accuracy of this local
method was higher than what could be expected from the known non-local structure of
eukaryotic promoters. Subsequent analysis revealed, in addition to the consensus of the two
known important subregions (the TATA-box TATAAA and the Cap-signal CA), a CT-signal
positioned on average seven nucleotides downstream of the transcription initiation site.
The consensus of the CT-signal is CTNCNG. The details of this core promoter element
were disclosed using multiple alignment and have previously been described only in a few
isolated examples.
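
To make the consensus notation concrete, the following minimal sketch scans a sequence for CTNCNG in a window downstream of a given initiation site, reading N as any of A, C, G, or T. It is an illustration only and not the method of the paper, which relied on neural networks and multiple alignment. The function names, the 20-nucleotide window, and the convention that offset 0 is the initiation site itself are assumptions made for this example.

```python
import re

# Assumed mapping for the consensus symbols used here; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate a consensus string such as 'CTNCNG' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_motif_offsets(seq, tss_index, consensus="CTNCNG", window=(0, 20)):
    """Return offsets, relative to tss_index, where the consensus matches.

    tss_index is the 0-based position of the initiation site in seq
    (hypothetical convention). Only matches lying entirely inside the
    half-open window [tss_index + window[0], tss_index + window[1]) count.
    """
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    start = max(0, tss_index + window[0])
    end = min(len(seq), tss_index + window[1])
    offsets = []
    # Test every start position so that overlapping matches are not missed.
    for i in range(start, end - len(consensus) + 1):
        if pattern.match(seq, i):
            offsets.append(i - tss_index)
    return offsets


if __name__ == "__main__":
    # Synthetic toy sequence (not real data): Cap-signal CA at the
    # initiation site, then a CTNCNG instance seven nucleotides downstream.
    toy = "A" * 10 + "CA" + "AAAAA" + "CTTCAG" + "AAAA"
    print(find_motif_offsets(toy, tss_index=10))
```

A strict consensus match of this kind is cruder than a position weight matrix derived from an alignment, since it treats every non-N position as invariant; it serves only to show how the six-letter pattern is read.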