COMPARISON OF WHOLE BACTERIAL GENOMES

Benkahla A., Ogata H., Alimi P., Audic S., Poirot O., Abergel C., and Claverie J.-M.

Structural and Genetic Information Laboratory, CNRS-AVENTIS UMR-1889
Marseille, 13402 France (URL: http://igs-server.cnrs-mrs.fr)

Following the pioneering whole genome shotgun sequencing of Haemophilus influenzae [Fleischmann et al. Science 269:496 (1995)], bacterial genomes have accumulated steadily in public databases (see: www.tigr.org/tdb/mdb). More than 20 microorganism genome sequences are now available. Gram-negative proteobacteria are well represented, with 2 complete genomes for the gamma subdivision (H. influenzae and E. coli), one for the alpha subdivision (R. prowazekii), and one for the epsilon subdivision (two strains of H. pylori). Gram+ bacteria are also well sampled by 4 complete genomes of firmicutes (B. subtilis, M. tuberculosis, M. genitalium, M. pneumoniae), two of spirochetes (B. burgdorferi and T. pallidum), and two of chlamydiales (C. pneumoniae and C. trachomatis). Finally, the whole genomic sequences of Synechocystis (a cyanobacterium), A. aeolicus and T. maritima (two hyperthermophilic bacteria) complete an already broad survey of the eubacterial sequence universe. The two other basic domains of life are represented, on one hand, by 6 completed genomes of hyperthermophilic archaebacteria (Methanococcus, Methanobacterium, Archaeoglobus, Pyrococcus, Aeropyrum) and, on the other hand, by the complete yeast (S. cerevisiae) and nematode (C. elegans) genomes for the eukaryotes.
Given this large body of sequence data sampling from the 3 main phyla and a wide variety of life styles (aerobic, anaerobic, intracellular, mesophilic, hyperthermophilic, etc.), it seems paradoxical that each newly sequenced genome continues to reveal a significant fraction of unknown genes. At this time, the fraction of completely unassigned ORFs is, for instance, 40% for E. coli, 45% for Synechocystis, and 32% for M. genitalium. The corresponding figure for yeast is about 40%. This trend persists in the latest deciphered genome, that of T. maritima, where 46% of the ORFs are of unknown function. In the meantime, comparative genome analyses have revealed a quite chaotic picture of the molecular evolution of microorganisms. Large variations in genome size, as well as abundant evidence of lateral gene transfer, may eventually threaten previous classifications and taxonomy.
After reviewing the impact of whole genome sequence analysis in functional genomics, and the different approaches that can be taken, I will focus on the evidence for eukaryote gene transfer in intracellular bacteria, such as Rickettsia.
International Congress of Molecular Infectiology 2000
(Mar 29-31, 2000, Marseille, France)