RECONSTRUCTION AND PREDICTION OF BIOCHEMICAL PATHWAYS FROM GENOMIC SEQUENCES USING KEGG

Susumu Goto, Kazushige Sato, Hiroyuki Ogata, Wataru Fujibuchi and Minoru Kanehisa
Institute for Chemical Research, Kyoto University,
Gokasho, Uji, Kyoto 611-0011, Japan

KEGG (http://www.genome.ad.jp/kegg/) is our attempt to computerize functional aspects of living cells and organisms in terms of a network of interacting molecules. The PATHWAY database in KEGG currently contains most known metabolic pathways and some regulatory pathways, represented as graphical diagrams. In addition, KEGG has two molecular interaction databases (ENZYME for metabolic reactions and BRITE for regulatory interactions), two molecular component databases (GENES for catalogs of genes and gene products in individual organisms and COMPOUND for a catalog of chemical compounds in living cells), and the GENOME map database for the complete genomes and for human and mouse chromosomes. All the KEGG databases are tightly integrated with existing molecular biology databases through the DBGET/LinkDB system (http://www.genome.ad.jp/dbget/dbget.links.html).

We will demonstrate some of the most distinctive features of KEGG, including the pathway search and reconstruction capabilities. The graphical objects (enzymes, gene products, and compounds) in the PATHWAY database can be searched by EC number, by molecule identifier (gene accession or compound accession), and by sequence similarity. Pathway reconstruction is done either by matching against a reference pathway or by computing a network of binary relations. Here are some examples. (1) Using the KEGG standard metabolic pathway as a reference, the user can check whether a set of enzyme genes with pre-assigned EC numbers would form an amino acid synthesis pathway. (2) When the metabolic pathway is incompletely reconstructed from a given set of enzyme genes, the user can further examine whether alternative reaction paths exist between two specified compounds. (3) Similarly, the user can ask for all possible reaction paths that can be formed from a given set of enzyme genes. (4) Using the E. coli ABC transporter as a reference, the user can examine whether a set of genes located next to each other would form an ABC transporter according to sequence similarity, and can also predict the substrate based on the best matching transporter.
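The two reconstruction modes described above can be illustrated with a minimal Python sketch. It is not KEGG's implementation. The compound names, the EC-style labels (ec1 to ec5), the function names, and the treatment of each reaction as a directed substrate-to-product relation are all assumptions made for illustration. The first function marks which steps of a reference pathway are covered by a set of EC numbers, as in example (1). The second enumerates reaction paths between two compounds over a network of binary relations, as in examples (2) and (3).

```python
from collections import deque

# Hypothetical toy data: each binary relation is (substrate, EC label, product).
# Compound names and EC labels are placeholders, not real KEGG entries.
REFERENCE_PATHWAY = [
    ("A", "ec1", "B"),
    ("B", "ec2", "C"),
    ("C", "ec3", "D"),
]

# Hypothetical additional relations that lie outside the reference pathway.
ALL_REACTIONS = REFERENCE_PATHWAY + [
    ("B", "ec4", "E"),
    ("E", "ec5", "D"),
]


def match_reference(reference, ec_numbers):
    """Split the reference steps into those covered by ec_numbers and those missing."""
    found = [step for step in reference if step[1] in ec_numbers]
    missing = [step for step in reference if step[1] not in ec_numbers]
    return found, missing


def find_paths(reactions, ec_numbers, source, target):
    """List every cycle-free reaction path from source to target.

    Only reactions whose EC label is in ec_numbers are used. Each path is
    returned as the ordered list of EC labels along it.
    """
    # Build an adjacency list from the available binary relations.
    graph = {}
    for substrate, ec, product in reactions:
        if ec in ec_numbers:
            graph.setdefault(substrate, []).append((ec, product))

    paths = []
    # Each queue entry: (current compound, compounds visited so far, EC labels used so far).
    queue = deque([(source, [source], [])])
    while queue:
        compound, visited, ecs = queue.popleft()
        if compound == target:
            paths.append(ecs)
            continue
        for ec, nxt in graph.get(compound, []):
            if nxt not in visited:  # skip compounds already on this path
                queue.append((nxt, visited + [nxt], ecs + [ec]))
    return paths


# A hypothetical genome that lacks the enzyme labelled ec2.
genome_ecs = {"ec1", "ec3", "ec4", "ec5"}

found, missing = match_reference(REFERENCE_PATHWAY, genome_ecs)
alternatives = find_paths(ALL_REACTIONS, genome_ecs, "A", "D")

print("missing reference steps:", missing)
print("alternative paths A -> D:", alternatives)
```

In this toy case the reference pathway is incomplete because the step catalysed by ec2 is missing, yet a path from A to D still exists through the intermediate E. Exhaustive path enumeration is practical only for small networks; a real reaction network would need a bound on path length or a similar restriction.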

This work was supported in part by the Human Genome Program of the Ministry of Education, Science, Sports and Culture of Japan.
Meeting on "Genome Mapping, Sequencing & Biology"
(May 13-17, 1998, Cold Spring Harbor Laboratory, New York)