Bibliography on gene and genome duplication (1998)

  1. A Amores, A Force, YL Yan, L Joly, C Amemiya, A Fritz, RK Ho, J Langeland, V Prince, YL Wang, M Westerfield, M Ekker, JH Postlethwait (1998), "Zebrafish hox clusters and vertebrate genome evolution", Science, 282:1711-1714.

  2. NM Brooke, J Garcia-Fernandez, PW Holland (1998), "The ParaHox gene cluster is an evolutionary sister of the Hox gene cluster", Nature, 392:920-922.
    Comments by Skrabanek and Wolfe (1998) : Three homeobox genes, Gsx, Xlox (Pdx) and Cdx, are clustered in the amphioxus genome and may also be clustered in mammals. These genes are similar to three of the Hox paralogy groups in terms of their sequence, expression, and order on the chromosome. ParaHox and Hox arose by duplication >520 Mya, before Hox duplicated further to produce the four clusters found in mammals.

  3. Wilfried W de Jong, Gert-Jan Caspers, Jack AM Leunissen (1998), "Genealogy of the alpha-crystallin-small heat-shock protein superfamily", International Journal of Biological Macromolecules, 22(3-4):151-162.
    [abstract]

  4. N El-Mabrouk, JH Nadeau, D Sankoff (1998), "Genome halving", in Combinatorial Pattern Matching, Lecture Notes in Computer Science, Vol. 1448, ed. Martin Farach-Colton (Springer). ISBN 3540647392.

  5. N El-Mabrouk, J Nadeau, D Sankoff (1998), "Genome halving", in Combinatorial Pattern Matching, ed. Martin Farach, Lecture Notes in Computer Science, Vol. 1448, pp. 235-250 (Springer). ISBN 3540647392.
    [PDF]
    [this is a computer science paper]

  6. O Eulenstein, B Mirkin, M Vingron (1998), "Duplication-based measures of difference between gene and species trees", Journal of Computational Biology, 5:135-148.

  7. MD Gale, KM Devos (1998), "Comparative genetics in the grasses", Proceedings of the National Academy of Sciences, 95:1971-1974.
    [html]
    Comments by Skrabanek and Wolfe (1998) : In our opinion, although there is strong evidence both that the maize genome is an ancient tetraploid and that there is substantial conservation of gene order among grasses, the 'Lego' model of Gale and co-workers is misleading and questionable. The circular representation of the aligned genomes of different species (including two putative sub-genomes from maize) implies either that there have been no chromosomal fusions or translocations during grass evolution (in which case the ancestor of the grasses must have had just a single, giant chromosome), or else that chromosomal fusions occur but each chromosome is only permitted to fuse with a particular designated partner. Neither of these seems plausible.

  8. TJ Gibson, J Spring (1998), "Genetic redundancy in vertebrates: polyploidy and persistence of genes encoding multidomain proteins", Trends in Genetics, 14:46-49.
    Comments by Skrabanek and Wolfe (1998) : Point mutations in developmental genes often have dominant deleterious phenotypes, whereas complete deletion of these genes often has no phenotype. Gibson and Spring argue that this is to be expected for genes encoding multidomain proteins and that this may prevent these genes from decaying into pseudogenes.

  9. Peter WH Holland (1998), "Major transitions in animal evolution: a developmental genetic perspective", American Zoologist, 38(6):829-842.
    [abstract]

  10. AL Hughes (1998), "Phylogenetic tests of the hypothesis of block duplication of homologous genes on human chromosomes 6, 9, and 1", Molecular Biology and Evolution, 15:854-870.
    Comments by Skrabanek and Wolfe (1998) : The combined results from this study and that of Endo et al. show that, of the 11 gene pairs on HSA6/HSA9 that have previously been proposed, a simultaneous origin seems possible for six: RXRA/RXRB, COL5A1/COL11A2, ORFX/RING3, PBX3/PBX2, C5/C4A and TNC/TNX. Gene order for these pairs is conserved except for one inversion of C5 and TNC. Other gene pairs are either much older (ABC2/TAP1, NOTCH1/INT3, PSMB7/PSMB8 and HSPA5/HSPA1A) or much younger (VARS1/VARS2). Despite their similar conclusions, the use of different molecular clock calibrations in the two studies causes them to disagree on the absolute date of the block duplication: 579-696 Mya in Hughes' study, but 161-580 Mya in Endo et al.'s. The latter group also put forward a convoluted hypothesis involving two rounds of duplication to explain the presence of older gene pairs in the region, whereas Hughes proposes that there may be some sort of selective constraint causing clustering of the ancient paralogues.

  11. MA Huynen, E van Nimwegen (1998), "The frequency distribution of gene family sizes in complete genomes", Molecular Biology and Evolution, 15:583-589.
    abstract: We compare the frequency distribution of gene family sizes in the complete genomes of six bacteria (Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, Mycoplasma pneumoniae, and Synechocystis sp. PCC6803), two Archaea (Methanococcus jannaschii and Methanobacterium thermoautotrophicum), one eukaryote (Saccharomyces cerevisiae), the vaccinia virus, and the bacteriophage T4. The sizes of the gene families versus their frequencies show power-law distributions that tend to become flatter (have a larger exponent) as the number of genes in the genome increases. Power-law distributions generally occur as the limit distribution of a multiplicative stochastic process with a boundary constraint. We discuss various models that can account for a multiplicative process determining the sizes of gene families in the genome. In particular, we argue that, in order to explain the observed distributions, gene families have to behave in a coherent fashion within the genome; i.e., the probabilities of duplications of genes within a gene family are not independent of each other. Likewise, the probabilities of deletions of genes within a gene family are not independent of each other.

  12. M Kasahara (1998), "What do the paralogous regions in the genome tell us about the origin of the adaptive immune system?", Immunological Reviews, 166:159-175.

  13. Marc Kirschner, John Gerhart (1998), "Evolvability", Proceedings of the National Academy of Sciences, 95(15):8420-8427.
    [html]

  14. James R Lupski (1998), "Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits", Trends in Genetics, 14(10):417-422.
    [abstract]

  15. MA Matzke, AJM Matzke (1998), "Polyploidy and transposons", Trends in Ecology & Evolution, 13(6):241.
    [abstract]

  16. Richard Mazzarella, David Schlessinger (1998), "Pathological consequences of sequence duplications in the human genome", Genome Research, 8(10):1007-1021.
    [abstract]

  17. M-J Pébusque, F Coulier, D Birnbaum, P Pontarotti (1998), "Ancient large scale genome duplications: phylogenetic and linkage analyses shed light on chordate genome evolution", Molecular Biology and Evolution, 15:1145-1159.
    [abstract]

  18. JH Postlethwait, YL Yan, MA Gates, S Horne, A Amores, A Brownlie, A Donovan, ES Egan, A Force, Z Gong, et al. (1998), "Vertebrate genome evolution and the zebrafish gene map", Nature Genetics, 18:345-349.
    Comments by Skrabanek and Wolfe (1998) : Duplicated genes linked to the four Hox clusters in mammals - the HSA 2/7/12/17 region - are also linked to them in zebrafish, and parts of the HSA 1/6/9/19 region are conserved. This implies that the duplications producing these regions occurred prior to the bony fish/tetrapod divergence, contrary to Lundin's proposal (Fig. 1). As noted in the commentary by Aparicio [49], although zebrafish and mammals show conservation of synteny, gene order is often rearranged in these examples. Postlethwait et al. also report some examples where a pair of linked genes in a mammal seems to correspond to two linked pairs in zebrafish and propose that additional duplications (either of chromosomal fragments or of the whole genome) may have occurred in this species.

  19. J Ramsey, DW Schemske (1998), "Pathways, mechanisms and rates of polyploid formation in flowering plants", Annual Review of Ecology and Systematics, 29:467-501.

  20. Sylvie Rouquier, Sylvie Taviaux, Barbara J Trask, Véronique Brand-Arpon, Ger van den Engh, Jacques Demaille, Dominique Giorgi (1998), "Distribution of olfactory receptor genes in the human genome", Nature Genetics, 18:243-250.
    [abstract]

  21. C Seoighe, KH Wolfe (1998), "Extent of genomic rearrangement after genome duplication in yeast", Proceedings of the National Academy of Sciences, 95:4447-4452.
    [html] [PDF]
    abstract: Whole-genome duplication approximately 10^8 years ago was proposed as an explanation for the many duplicated chromosomal regions in Saccharomyces cerevisiae. Here we have used computer simulations and analytic methods to estimate some parameters describing the evolution of the yeast genome after this duplication event. Computer simulation of a model in which 8% of the original genes were retained in duplicate after genome duplication, and 70-100 reciprocal translocations occurred between chromosomes, produced arrangements of duplicated chromosomal regions very similar to the map of real duplications in yeast. An analytical method produced an independent estimate of 84 map disruptions. These results imply that many smaller duplicated chromosomal regions exist in the yeast genome in addition to the 55 originally reported. We also examined the possibility of determining the original order of chromosomal blocks in the ancestral unduplicated genome, but this cannot be done without information from one or more additional species. If the genome sequence of one other species (such as Kluyveromyces lactis) were known it should be possible to identify 150-200 paired regions covering the whole yeast genome and to reconstruct approximately two-thirds of the original order of blocks of genes in yeast. Rates of interchromosome translocation in yeast and mammals appear similar despite their very different rates of homologous recombination per kilobase.

  22. MW Simmen, S Leitgeb, VH Clark, SJ Jones, A Bird (1998), "Gene number in an invertebrate chordate, Ciona intestinalis", Proceedings of the National Academy of Sciences, 95:4437-4440.
    [html]
    Comments by Skrabanek and Wolfe (1998) : A method for estimating the number of genes in any eukaryote on the basis of BLAST searches with random genomic and cDNA sequences. The method was tested using subsets of the data from the Caenorhabditis elegans genome project, and was then applied to the tunicate Ciona intestinalis. Its apparent accuracy, simplicity, and low cost - only 76 EST and 1487 genomic single-pass sequencing runs were made - make it attractive to apply to other organisms such as amphioxus and lamprey.

  23. EA Sistermans, RF de Coo, IJ De Wijs, BA Van Oost (1998), "Duplication of the proteolipid protein gene is the major cause of Pelizaeus-Merzbacher disease", Neurology, 50(6):1749-1754.
    [abstract]

  24. L Skrabanek, KH Wolfe (1998), "Eukaryotic genome duplication - where's the evidence?", Current Opinion in Genetics and Development, 8:694-700.
    abstract: Several eukaryotes, including maize, yeast and Xenopus, are degenerate polyploids formed by relatively recent whole-genome duplications. Ohno's conjecture that more ancient genome duplications occurred in an ancestor of vertebrates is probably at least partly true but the present shortage of gene sequence and map information from vertebrates makes it difficult to either prove or disprove this hypothesis. Candidate paralogous segments in mammalian genomes have been identified but the lack of statistical rigour means that many of the proposals in the literature are probably artefacts.
    [PDF] [html]

  25. Andreas Wagner (1998), "The fate of duplicated genes: loss or new function?", Bioessays, 20(10):785-788.
    [abstract]

  26. Kunitoshi Yamanaka, Li Fang, Masayori Inouye (1998), "The CspA family in Escherichia coli: multiple gene duplication for stress adaptation", Molecular Microbiology, 27(2):247-255.
    [abstract]

  27. YP Yuan, O Eulenstein, M Vingron, P Bork (1998), "Towards detection of orthologues in sequence databases", Bioinformatics, 14:285-289.
    abstract: MOTIVATION: Numerous homologous sequences from diverse species can be retrieved from databases using programs such as BLAST. However, due to multigene families, evolutionary relationship often cannot be easily determined and proper functional assignment becomes difficult. Thus, discrimination between orthologues and paralogues within BLAST output lists of homologous sequences becomes more and more important. RESULT: We therefore developed a method that attempts to construct a reconciled tree from a gene tree of selected sequences and its corresponding phylogenetic tree of the species involved (species tree). An interface on the Web is developed to enable users to analyse the BLAST result. BLAST outputs are parsed and, for the selected sequences, multiple alignments are constructed either globally or for local regions. Bootstrapped trees are returned and compared with the expected species tree. In cases of discrepancies, gene duplications are assumed and a reconciled tree is computed. The reconciled tree shows probable orthologues and paralogues as predicted.

  28. Jianzhi Zhang, Helene F. Rosenberg, Masatoshi Nei (1998), "Positive Darwinian selection after gene duplication in primate ribonuclease genes", Proceedings of the National Academy of Sciences, 95:3708-3713.
    [abstract] [html]