FANS-REF

Functional Analysis of Nucleotide Sequences Reference Data Bank

Release 9.2 (September, 1994)

AUTHOR

M.S.Gelfand
Institute of Protein Research
Russia Academy of Sciences
Pushchino, Moscow Region, 142292
Russia

phone:	7-(095)-135-11-41, 7-(095)-938-82-19
fax: 	7-(095)-135-99-84
e-mail:	misha@imb.imb.ac.ru

SCOPE

This release contains 1696 references.

The following areas are essentially covered:

The following areas are referenced extensively but not exhaustively:

There are also some references on:

References are supplied with keywords. The list of keywords currently consists of 166 entries (see below).

FORMAT

Each entry can contain the following fields (in an arbitrary order): and terminates with the double slash
//

The field name starts at position 1.

The field content is indented to position 4.

Multiple contents are separated by semicolons.

Each 'identifier' is unique and consists of 8 characters:

NNNIYYSS
where:
NNN
are first 3 letters of the last name of the first author;
I
is first letter of the first name of the first author (exceptions to this rule can occur);
YY
last 2 digits of the publication year;
SS
serial number of entries with the same identifier prefixes (starting with 01).

A compatibility with the SEQANALREF data bank identifiers (see ACKNOWLEDGEMENTS) was attempted but not always sustained, especially in relaively late references.

The 'keywords' field contains keyword(s) describing the content of the publication. The list of the keywords is given below. Keywords describe, when appropriate, field, approach, structures, processes, ferments and sites being analyzed, and organisms involved.

The 'comment' field contains the information about the language of the publication (if not English), 'In press' note and maybe other short notes.

Entries are listed in identifier alphanumerical order.


EXAMPLE

identifier
   GELM9201
authors
   Gelfand M.S.
title
   Computer functional analysis of nucleotide sequences: problems 
   and approaches
location
   in: Mathematical methods of the analysis of biopolymer sequences
   (DIMACS vol.8) (S.G.Gindikin, ed) (AMS, Providence RI, 1992) 19-61
keywords
   reviews; DNA models; statistics; prediction; functional sites;
   protein-coding regions; biomolecular linguistics; periodicity
//

DISTRIBUTION

The data bank (file 'fans-ref.dat') can be copied freely provided that this document (file 'fans-ref.doc') is reproduced with each copy.

Versions starting with 9.1 can be obtained from anonymous ftp site imb.imb.ac.ru (147.45.3.14), directory /Biblio, as the single compressed file fans-ref.zip).

Version 3.4 of the data bank can be obtained from the EMBL file server at the Internet e-mail address

NetServ@EMBL-Heidelberg.DE

The file should contain nothing but the file server commands, e.g.

   HELP
   HELP REFLIST
   DIR REFLIST
   GET REFLIST:FANS_REF.DOC
   GET REFLIST:FANS_REF.DAT

Version 4.3 is being published in "Biotechnology Software" journal (vol. 8, nos. 5, 6 (1991); vol. 9 nos. 1, 2, 3, 4, 6 (1992), vol. 10 nos. 4, 6 (1993) etc.).

If the data bank is extensively used or incorporated into another data bank, an acknowledgement would be appreciated.


REGISTERED USERS

Registered users will regularly receive information about updates of the data bank and other developments.

To become a registered user one must fill the following form and mail and/or e-mail it to M.S.Gelfand at the address above.

....................... cut here ..........................................

======================= REGISTRATION FORM ============================
   FANS-REF (Release 9.2)

1. Name ______________________________________________________________

2. Address ___________________________________________________________
   ___________________________________________________________________

3. E-mail ____________________________________________________________

4. Fax _______________________________________________________________

5. Phone _____________________________________________________________

6. Field of research _________________________________________________
   ___________________________________________________________________
   ___________________________________________________________________

7. Other fields of interest __________________________________________
   ___________________________________________________________________
   ___________________________________________________________________

8. Which computer system do you use? _________________________________

9. Suggestions for the data bank development _________________________
   ___________________________________________________________________
   ___________________________________________________________________

10. Date ______________________________________________________________

....................... cut here ...........................................

SUBMISSION

I would be grateful for any information about missing references and data bank errors and misprints. This information should be mailed and/or e-mailed to M.S.Gelfand at the above address. I would also be very grateful for reprints in the areas covered by this data bank.

ACKNOWLEDGEMENTS

Some ideas and several entries from the SEQANALREF (version 9.60) data bank compiled by Dr. A.Bairoch were used. Some entries were taken from the compilation by Dr. S.Barron (see BARS9101)

I thank F.G.Ball, R.D.Blake, M.Borodovsky, B.Demeler, J.W.Fickett, D.Graur, R.Hanai, I.Iida, M.A.Jimenez-Montano, A.Konopka, E.V.Koonin, S.V.Korolev, E.V.Korotkov, J.Kypr, A.Marin, S.Ohno, P.A.Pevzner, A.A.Ptitsyn, V.A.Ratner, J.Rice, I.B.Rogozin, P.R.Sibbald, D.W.Smith, Yu.A.Sprizhitsky, V.B.Streletz, S.Tavare, E.N.Trifonov, M.S.Waterman, B.S.Weir, S.H.White, who supplied me with reprints of their works relevant to the data bank.

This data bank is or has been partially supported by the Russian State Program "Human Genome" (grant 534), Russian Fund of Fundamental Research (grant 94-04-12330), Department of Energy, USA (grant DE-FG02-94ER61919.A000), National Institutes of Health, USA (grant HG00783) and International Science Foundation (grant MTWOOO).


REMARK

I apologize to authors whose work in the areas marked as 'essentially covered' I have overlooked. I'll be glad to correct all shortcomings.

LIST OF KEYWORDS

Keywords are listed in the alphabetic order. Comments in brackets are not parts of keywords.
acceptor site (splicing)
Acinetobacter
amino acid control
Bacillus subtilis
biomolecular linguistics
branch point (splicing)
Caernorabilis elegans (nematode)
CAP receptor
caulimovirus
CCAAT box (in eukaryotic promoters)
chloroplasts
ciliates
codon usage
compilation (of functional sites)
complexity
consensus
Corynebacteria
coxsackievirus
cucumovirus
cytomegalovirus
databases
Dictyostelium discoideum (slime mold)
DNA models
DNA-protein interaction
DNA representation
DNA structure (three-dimensional or thermostability parameters)
DNAse I
donor site (splicing)
Drosophila
E.coli
echinoderms
enhancer (of transcription)
Entamoeba
Epstein-Barr virus
exons
eukaryotes
evolution
exon-intron structure
f1 (phage)
fd (phage)
flatworm
Fourier transform
fractals
frame shift site
functional sites
fungi
G4 (phage)
Gallus gallus
genetic code
genetic language
glucocorticoid control (of transcritpion in eukaryotes)
glucocorticoid receptor
green algae
group I introns
gyrase
heat shock control (of transcription)
herpes simplex
HIV (virus)
Homo sapiens
honeybee
hormone response
IKe (phage)
information theory
initiation site (translation)
integration
integration host factor
introns
invertebrates
Ising model
kinetoplastids
Klebsiella pneumoniae
Lactobacillus
lambda (phage)
lentivirus
LexA binding site
long-range correlations
M13 (phage)
mammals
Marchantia polymorpha
Markov chain
methylation
Micrococcus luteus
mitochondria
mouse
MS2 (phage)
multiple alignment
mutations
neural network
Neurospora
nick site
Nicotiana tabacum
noncoding regions
NotI site
nucleosomes
operator
origin (replication)
p53
pattern recognition (prediction methods)
periodicity (see also 'nucleosomes')
phages
phiX174 (phage)
plants
plasmids
Plasmodium
poliovirus
polyadenylation
polyomavirus
prediction
primates
probability (theory)
prokaryptes
promoter (of transcription)
protein-coding regions
protein ICP4
protein Mut
Pseudomonas
rabbit
rat
recombination
relaxase
repetitive elements
replication
restrictase EcoRI
restriction site
retrovirus
Rev responsive element
review
rhinovirus
rice
RNA polymerase II (eukaryotic)
RNA secondary structure
rodents
Rous sarcoma virus
Salmonella typhimurium
sequencing
Serratia marcences
similarity search
splicing
statistics
Streptococcus pneumoniae
SV40 (virus)
T4 (phage)
T7 (phage)
target site
termination site (translation)
terminator (of transcription)
Tetrahymena
tobacco
tobamovirus
topoisomerase I
topoisomerase II
Toxoplasma
translation
transcription
tRNA
tymovirus
vaccinia virus
vertebrates
virus BK
viruses
Walsh transform
weight matrix
wheat
Xenopus
yeast
Zea mays