RECONSTRUCTION AND PREDICTION OF BIOCHEMICAL
PATHWAYS FROM GENOMIC SEQUENCES USING KEGG
Susumu Goto, Kazushige Sato, Hiroyuki Ogata, Wataru Fujibuchi and
Minoru Kanehisa
Institute for Chemical Research, Kyoto University,
Gokasho, Uji, Kyoto 611-0011, Japan
KEGG (http://www.genome.ad.jp/kegg/) is our attempt to computerize
functional aspects of living cells and organisms in terms of a network of
interacting molecules. The PATHWAY database in KEGG currently
contains most known metabolic pathways and some regulatory pathways
represented in graphical diagrams. In addition, KEGG has two molecular
interaction databases: ENZYME for metabolic reactions and BRITE for
regulatory interactions, two molecular component databases: GENES for
catalogs of genes and gene products in individual organisms and
COMPOUND for a catalog of chemical compounds in living cells, and the
GENOME map database for the complete genomes and human and mouse
chromosomes. All the KEGG databases are tightly integrated with
the existing molecular biology databases through the DBGET/LinkDB
system (http://www.genome.ad.jp/dbget/dbget.links.html).
We will demonstrate some of the distinctive features of KEGG, including
the pathway search and reconstruction capabilities. The graphical objects
(enzymes, gene products, and compounds) in the PATHWAY database
can be searched by the EC number, the molecule identifier (gene accession
and compound accession), and the sequence similarity. The pathway
reconstruction is done either by matching against a reference pathway or by
computing a network of binary relations. Here are some examples. (1)
Using the KEGG standard metabolic pathway as a reference, the user can
check if a set of enzyme genes with pre-assigned EC numbers would form
an amino acid synthesis pathway. (2) When the metabolic pathway is
incompletely reconstructed from a given set of enzyme genes, the user can
further examine if alternative reaction paths can exist between two specified
compounds. (3) Similarly, the user can ask for all possible reaction paths that
can be formed from a given set of enzyme genes. (4) Using the E. coli
ABC transporter as a reference, the user can examine if a set of genes
located next to each other would form an ABC transporter according to the
sequence similarity and also predict the substrate based on the best
matching transporter.
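The two reconstruction modes in examples (1) and (2) can be illustrated with a minimal sketch. It is not the KEGG implementation: the reference pathway, the enzyme labels ("EC-a" to "EC-d"), the compound names, and the functions match_reference and find_path are all hypothetical, and reactions are assumed to be directed substrate-to-product binary relations. The first function checks which steps of a reference pathway are covered by a set of EC assignments; the second searches for an alternative path between two compounds using only the enzymes that are present.

```python
from collections import deque

# Hypothetical toy reference pathway: (enzyme label, substrate, product).
# The labels stand in for EC numbers; none of these are real KEGG entries.
REFERENCE_PATHWAY = [
    ("EC-a", "A", "B"),
    ("EC-b", "B", "C"),
    ("EC-c", "C", "D"),
    ("EC-d", "A", "C"),
]


def match_reference(reference, genome_ecs):
    """Split reference reactions into those covered by the genome's
    enzyme assignments and those that are missing."""
    found = [r for r in reference if r[0] in genome_ecs]
    missing = [r for r in reference if r[0] not in genome_ecs]
    return found, missing


def find_path(reactions, genome_ecs, start, goal):
    """Breadth-first search over substrate -> product relations,
    restricted to reactions whose enzyme is present in the genome.
    Returns a list of reactions, or None if no path exists."""
    graph = {}
    for ec, substrate, product in reactions:
        if ec in genome_ecs:
            graph.setdefault(substrate, []).append((ec, product))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, path = queue.popleft()
        if compound == goal:
            return path
        for ec, nxt in graph.get(compound, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(ec, compound, nxt)]))
    return None


if __name__ == "__main__":
    genome_ecs = {"EC-b", "EC-c", "EC-d"}  # hypothetical assignments
    found, missing = match_reference(REFERENCE_PATHWAY, genome_ecs)
    print("missing steps:", missing)
    print("path A -> D:", find_path(REFERENCE_PATHWAY, genome_ecs, "A", "D"))
```

In this toy case the step from A to B is missing, yet a path from A to D still exists through the alternative reaction from A to C, which is the situation described in example (2).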
This work was supported in part by the Human Genome Program of the
Ministry of Education, Science, Sports and Culture of Japan.
Meeting on "Genome Mapping, Sequencing & Biology"
(May 13-17, 1998, Cold Spring Harbor Laboratory, New York)