AAAI conference (1998)

KEGG and DBGET/LinkDB: Integration of Biological Relationships in Divergent Molecular Biology Data

Wataru Fujibuchi, Kazushige Sato, Hiroyuki Ogata, Susumu Goto, and Minoru Kanehisa

A simple formulation for integrating various biological data is presented, based on the concept of links, which are classified into three types: factual, similarity, and biological. Factual links are cross-references between entries in molecular biology databases. Similarity links are neighbor relations between sequence entries, computed by sequence similarity search programs. Biological links are the novel and powerful relations being organized in KEGG, including molecular interactions in metabolic and regulatory pathways, physical proximity of genes on the genome, and other binary relations that can accommodate divergent biological phenomena. DBGET/LinkDB was originally designed to handle factual and similarity links in existing molecular biology databases, but it is now being extended to include biological links as well. Logical reasoning, for example the functional assignment of newly sequenced genes, can often be decomposed into a sequence of steps that combine different types of links; we therefore expect that it can be automated through the extension of the DBGET/LinkDB system and the database efforts of KEGG.
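The idea of decomposing a reasoning step into a chain of typed links can be illustrated with a minimal sketch. This is not the DBGET/LinkDB implementation or its interface; the `LinkStore` class, its methods, and all entry identifiers below are hypothetical and chosen only for illustration. The sketch assumes each link is a directed binary relation between two entry names, tagged with one of the three link types.

```python
from collections import defaultdict

# The three link types named in the abstract.
LINK_TYPES = ("factual", "similarity", "biological")


class LinkStore:
    """Hypothetical in-memory store of typed binary relations between entries."""

    def __init__(self):
        # link type -> source entry -> set of target entries
        self._links = {t: defaultdict(set) for t in LINK_TYPES}

    def add(self, link_type, source, target):
        """Record a directed link of the given type from source to target."""
        if link_type not in self._links:
            raise ValueError(f"unknown link type: {link_type}")
        self._links[link_type][source].add(target)

    def neighbors(self, link_type, source):
        """Return the entries reachable from source by one link of this type."""
        return set(self._links[link_type].get(source, ()))

    def follow(self, start, path):
        """Follow a sequence of link types from start; return the entries reached."""
        frontier = {start}
        for link_type in path:
            reached = set()
            for entry in frontier:
                reached |= self.neighbors(link_type, entry)
            frontier = reached
        return frontier


# Made-up entries: a new gene, a similar known gene, its enzyme, and a pathway.
store = LinkStore()
store.add("similarity", "new:gene1", "known:geneA")   # sequence similarity hit
store.add("factual", "known:geneA", "enzyme:E1")      # database cross-reference
store.add("biological", "enzyme:E1", "pathway:P1")    # membership in a pathway

result = store.follow("new:gene1", ["similarity", "factual", "biological"])
print(sorted(result))
```

In this toy example, a functional hypothesis for the new gene is obtained by following a similarity link, then a factual link, then a biological link. A real system would also need to weigh the evidence behind each link, which the sketch ignores.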