What is KEGG?

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a comprehensive database resource for understanding the high-level functions and utilities of biological systems, such as cells, organisms, and ecosystems, from genomic and molecular-level information. It consists of 16 databases, each of which belongs to one of four categories: Systems Information, Genomic Information, Chemical Information, and Health Information.

  1. KEGG PATHWAY - Systems Information

  2. KEGG BRITE - Systems Information

  3. KEGG MODULE - Systems Information

  4. KEGG ORTHOLOGY (KO) - Genomic Information

  5. KEGG GENES - Genomic Information

  6. KEGG GENOME - Genomic Information

  7. KEGG COMPOUND - Chemical Information

  8. KEGG GLYCAN - Chemical Information

  9. KEGG REACTION - Chemical Information

  10. KEGG RCLASS - Chemical Information

  11. KEGG ENZYME - Chemical Information

  12. KEGG NETWORK - Health Information

  13. KEGG VARIANT - Health Information

  14. KEGG DISEASE - Health Information

  15. KEGG DRUG - Health Information

  16. KEGG DGROUP - Health Information


Why is KEGG Important?

As we discussed in the Background lesson, the biological function of a living cell results from many interacting molecules. KEGG is an effort to build a computer model of biological systems, represented as molecular interaction and reaction networks. By integrating genomic, chemical, and systemic functional information, it facilitates the understanding of biological systems such as the cell, the organism, and the ecosystem. This makes KEGG a central resource in genomics, bioinformatics, and systems biology. One of its key contributions is the KEGG PATHWAY database, which maps molecular interaction and reaction networks, including metabolic pathways, regulatory pathways, and molecular complexes. These pathway maps help researchers identify potential targets for drug development and understand the biochemical basis of diseases, which can support more precise medical interventions. KEGG is also used to annotate genes and proteins in newly sequenced genomes, and it is a valuable resource for interpreting the biological significance of genome data.

Systems Information

  1. KEGG PATHWAY: KEGG PATHWAY contains pathway maps that depict molecular interaction and reaction networks, covering various biological processes such as metabolism, genetic information processing, and cellular processes. It aids researchers in understanding complex biological functions and how different molecular entities interact within a cell or organism.

  2. KEGG BRITE: KEGG BRITE is a collection of hierarchical classifications of biological entities, including proteins, genes, and compounds. It organizes data into functional hierarchies, ontologies, and taxonomies, facilitating the integrated analysis of biological data across different levels of biological organization.

  3. KEGG MODULE: KEGG MODULE comprises predefined functional units or modules that represent conserved gene sets and functional complexes. These modules are crucial for pathway annotation, helping researchers understand the systematic organization and modularity of cellular functions.

Genomic Information

  1. KEGG ORTHOLOGY (KO): KEGG ORTHOLOGY (KO) is a database of orthologous gene groups that link genomic information to higher-order functional information. KO entries are essential for assigning functions to genes and proteins across various organisms, enabling comparative genomics and evolutionary studies.

  2. KEGG GENES: KEGG GENES provides gene catalogs for completely sequenced genomes, including annotations and cross-references to other databases. This resource is vital for studies on gene function, regulation, and interaction.

  3. KEGG GENOME: KEGG GENOME contains information about the complete genomes of various organisms, offering genetic and genomic data essential for comparative genomics. It provides insights into the evolutionary relationships and functional annotations of genomes.

Chemical Information

  1. KEGG COMPOUND: KEGG COMPOUND includes information on chemical compounds involved in metabolic reactions. It covers details on their structures, chemical properties, and biological activities, aiding in the study of metabolism and biochemical pathways.

  2. KEGG GLYCAN: KEGG GLYCAN provides information on glycans, including glycan structures, glycan-related genes, enzymes, and pathways. It is essential for glycomics research, focusing on the role of carbohydrates in biological processes.

  3. KEGG REACTION: KEGG REACTION lists biochemical reactions, detailing substrates, products, and the enzymes involved. It is used to map reactions to metabolic pathways, helping researchers understand metabolic fluxes and transformations.

  4. KEGG RCLASS: KEGG RCLASS defines reaction classes, categorizing biochemical reactions based on their reaction mechanisms and the transformation patterns of substrates. This classification helps in the systematic analysis of biochemical reactions and metabolic networks.

  5. KEGG ENZYME: KEGG ENZYME provides detailed information on enzymes, their functions, and their roles in metabolic pathways. It integrates with pathway maps and gene catalogs, offering a comprehensive view of enzymatic activities and their biological significance.

Health Information

  1. KEGG NETWORK: KEGG NETWORK contains data on molecular interaction networks and their relationships with diseases and drugs. It helps in understanding the complex network of molecular interactions underlying various health conditions and therapeutic interventions.

  2. KEGG VARIANT: KEGG VARIANT focuses on genetic variants and their associations with diseases. It provides information on the impact of genetic variations on gene function and their role in disease susceptibility, aiding in personalized medicine and genomics research.

  3. KEGG DISEASE: KEGG DISEASE contains information on genetic and environmental factors associated with human diseases. It links molecular data to disease phenotypes, helping researchers understand disease mechanisms and identify potential therapeutic targets.

  4. KEGG DRUG: KEGG DRUG includes information about approved drugs, their molecular targets, and mechanisms of action. It serves as a valuable resource for drug discovery, development, pharmacology, and medical research, providing insights into drug interactions and therapeutic strategies.

  5. KEGG DGROUP: KEGG DGROUP groups related drugs, facilitating the comparative analysis of drug efficacy, safety, and pharmacokinetics. This database helps in understanding drug interactions, optimizing therapeutic strategies, and managing drug therapy in clinical settings.

The tutorial below guides you through accessing the KEGG PATHWAY database using a Python package.

Prerequisites

  • A recent version of Python installed
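
To give a feel for what accessing KEGG PATHWAY from Python involves, here is a minimal sketch that uses only the Python standard library and the public KEGG REST service at https://rest.kegg.jp. It is not the Python package this tutorial refers to, which is not named in this section. The function names (build_list_url, parse_list_response, fetch_pathways) are hypothetical helpers introduced for illustration, and the parser assumes the tab-delimited "identifier, description" lines that the list operation returns.

```python
import urllib.request

# Base address of the public KEGG REST service.
KEGG_REST_BASE = "https://rest.kegg.jp"


def build_list_url(database, organism=None):
    """Build a KEGG REST 'list' URL, for example .../list/pathway/hsa."""
    parts = [KEGG_REST_BASE, "list", database]
    if organism:
        parts.append(organism)
    return "/".join(parts)


def parse_list_response(text):
    """Parse tab-delimited 'list' output into (entry_id, description) pairs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only; descriptions may contain spaces.
        entry_id, _, description = line.partition("\t")
        entries.append((entry_id, description))
    return entries


def fetch_pathways(organism="hsa", timeout=30):
    """Download and parse the pathway list for one organism (needs network)."""
    url = build_list_url("pathway", organism)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_list_response(response.read().decode("utf-8"))
```

Calling fetch_pathways("hsa") requests the list of human pathway maps, where "hsa" is the KEGG organism code for Homo sapiens. The URL building and parsing steps are kept separate from the network call so that they can be checked without an internet connection.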

Credit: KEGG