1 Instituto de Investigaciones en Matematicas Aplicadas y en Sistemas, Universidad Nacional Autonoma de Mexico, Ciudad Universitaria, Mexico D.F., 04510 Mexico
2 Centro de Investigacion sobre Fijacion de Nitrogeno, Universidad Nacional Autonoma de Mexico, Cuernavaca A.P. 565-A, Morelos 62100, Mexico, and corresponding author
Computer Applications in the Biosciences 12(5), 415-422 (Oct, 1996)
Results. On the basis of an analysis of an exhaustive collection of regulatory regions in Escherichia coli, a grammatical model for the regulatory regions of σ70 promoters has been developed. The terminal symbols of the grammar represent individual binding sites for activator and repressor proteins, and include the precise position of each site relative to transcription initiation. By combining these symbols, the grammar generates a large number of different sentences, each of which can be matched against a collection of regulatory regions by means of weight matrices specific to the set of sites for each individual protein. On the basis of this grammatical model, a Prolog parser is presented here. Specific subgrammars for ArgR, LexA and TyrR were implemented. When parsing a collection of 128 σ70 promoter regions, the syntactic recognizer produces a much lower number of false-positive sites than the standard search using weight matrices.
Availability. A WWW interface is under development and will be freely accessible at the URL: http://www.cifn.unam.mx/Computational_Biology/index.htm.
Contact. E-mail: collado@cifn.unam.mx
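Illustration. The contrast drawn in the Results paragraph, between a plain weight-matrix search and a search constrained by site position relative to transcription initiation, can be sketched as follows. This is a minimal Python sketch, not the Prolog parser described above: the count matrix, the threshold, the allowed position window and the names `COUNTS`, `log_odds_matrix`, `scan` and `positional_filter` are all hypothetical and chosen only for illustration. It assumes sequences contain only the letters A, C, G and T.

```python
import math

# Hypothetical toy count matrix for a 4-bp site (NOT data from the paper).
# COUNTS[base][i] is the number of aligned sites with `base` at position i.
COUNTS = {
    "A": [8, 1, 1, 7],
    "C": [1, 1, 8, 1],
    "G": [1, 8, 1, 1],
    "T": [2, 2, 2, 3],
}


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores."""
    width = len(next(iter(counts.values())))
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Plain weight-matrix search: (offset, score) for every window
    whose summed score is at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def positional_filter(hits, tss_index, allowed):
    """Keep only hits whose start, measured relative to the transcription
    start at `tss_index`, lies inside the inclusive `allowed` range.
    Returns (relative_position, score) pairs."""
    low, high = allowed
    return [
        (start - tss_index, score)
        for start, score in hits
        if low <= start - tss_index <= high
    ]


if __name__ == "__main__":
    # Toy region with two copies of the consensus AGCA; the transcription
    # start is assumed (hypothetically) to be at index 14.
    region = "AGCA" + "T" * 6 + "AGCA" + "TT"
    pwm = log_odds_matrix(COUNTS)
    all_hits = scan(region, pwm, threshold=3.0)
    print("matrix-only hits:", all_hits)
    print("position-constrained hits:",
          positional_filter(all_hits, tss_index=14, allowed=(-6, -1)))
```

In this toy region the matrix-only scan reports both copies of the site (offsets 0 and 10), whereas the position-constrained step keeps only the copy at -4 relative to the assumed transcription start. This is the sense in which adding positional information can reduce false positives; the grammar in the paper additionally combines sites for several proteins, which this sketch does not attempt.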