Lesson on the Protein Data Bank (PDB) Database

Introduction to the PDB Database

What is the PDB?

The Protein Data Bank (PDB) is a globally accessible repository for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. Established in 1971, the PDB archives atomic coordinates together with supporting experimental data, such as structure factors, for biological macromolecules whose structures have been determined experimentally by techniques like X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy. The PDB is managed by the Worldwide Protein Data Bank (wwPDB), a consortium of organizations that includes the RCSB PDB (USA), PDBe (Europe), PDBj (Japan), and BMRB (USA).

Relevant Terms

  • Protein: Large, complex molecules made up of amino acids that play many critical roles in the body, including catalyzing metabolic reactions, replicating DNA, and transporting molecules.
  • Nucleic Acid: Biopolymers, including DNA and RNA, that are essential for all known forms of life. They carry genetic information and are involved in protein synthesis.
  • X-ray Crystallography: A technique used to determine the atomic and molecular structure of a crystal from the pattern of X-rays that it diffracts.
  • NMR Spectroscopy (Nuclear Magnetic Resonance): A technique that exploits the magnetic properties of certain nuclei to determine the structure of organic compounds and biomolecules.
  • Cryo-Electron Microscopy: A form of electron microscopy where samples are studied at cryogenic temperatures, providing structural information at near-atomic resolution.
  • Coordinate Files: Files containing the three-dimensional coordinates of atoms in a macromolecule.
  • Structure Factors: Data derived from X-ray diffraction experiments that are used to reconstruct the electron density map of a crystal.
  • Electron Density Map: A three-dimensional representation of the electron density within a crystal, used to model the atomic structure.
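
To make the idea of a coordinate file concrete, here is a minimal sketch of reading one atom line from a legacy PDB-format file, where each field sits in fixed columns. The names AtomRecord, parse_atom_record, and SAMPLE are hypothetical, the sample values are made up for illustration, and the column positions are the ones assumed here for the ATOM/HETATM record layout.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    element: str


def parse_atom_record(line: str) -> AtomRecord:
    """Parse one fixed-width ATOM/HETATM line (legacy PDB format)."""
    if line[:6].strip() not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    line = line.ljust(80)  # pad so short lines can still be sliced safely
    return AtomRecord(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38, in angstroms
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
        occupancy=float(line[54:60]),  # columns 55-60
        b_factor=float(line[60:66]),   # columns 61-66
        element=line[76:78].strip(),   # columns 77-78
    )


# Illustrative record with made-up values, built with explicit field widths
# so that every field lands in the expected columns.
SAMPLE = (
    f"{'ATOM':<6}{1:>5} {' N':<4} {'VAL':>3} {'A'}{1:>4}    "
    f"{10.720:8.3f}{19.523:8.3f}{6.163:8.3f}{1.00:6.2f}{21.36:6.2f}"
    f"{'':10}{'N':>2}"
)

atom = parse_atom_record(SAMPLE)
print(atom.res_name, atom.chain_id, atom.res_seq, atom.name, (atom.x, atom.y, atom.z))
```

For real work, an established parser (or the mmCIF format) is usually a safer choice than slicing columns by hand; the sketch is only meant to show what a coordinate record contains.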

Impact of the PDB

The PDB has had a profound impact on the fields of biology, medicine, and biotechnology. By providing detailed structural data, it enables researchers to:

  • Understand the molecular basis of diseases.
  • Design new drugs and therapeutics.
  • Engineer novel biomolecules with specific functions.
  • Advance fundamental knowledge in molecular biology and biochemistry.

How to Use the PDB

Accessing the PDB

You can access the PDB through the portals of the wwPDB member organizations, such as RCSB PDB, PDBe, and PDBj.

Searching for Data

  1. Basic Search: Use keywords such as protein names, PDB IDs, or authors.
  2. Advanced Search: Filter results by criteria like resolution, method of determination, or organism.
  3. Sequence Search: Find structures based on amino acid or nucleotide sequences.
  4. Structure Comparison: Compare 3D structures to find similarities and differences.
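
The basic and advanced searches above can also be run programmatically. The following is a minimal sketch that assumes the RCSB PDB Search API (v2) endpoint, its JSON query layout, and the attribute names rcsb_entry_info.resolution_combined and exptl.method; all of these should be checked against the current API documentation. The helper names build_query and run_search are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint of the RCSB PDB Search API (v2).
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_query(keyword, max_resolution=None, method=None, rows=10):
    """Build a request body: a keyword search plus optional filters."""
    nodes = [{
        "type": "terminal",
        "service": "full_text",
        "parameters": {"value": keyword},
    }]
    if max_resolution is not None:
        # Assumed attribute name for resolution (in angstroms).
        nodes.append({
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "rcsb_entry_info.resolution_combined",
                "operator": "less_or_equal",
                "value": max_resolution,
            },
        })
    if method is not None:
        # Assumed attribute name for the experimental method.
        nodes.append({
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "exptl.method",
                "operator": "exact_match",
                "value": method,
            },
        })
    if len(nodes) == 1:
        query = nodes[0]
    else:
        query = {"type": "group", "logical_operator": "and", "nodes": nodes}
    return {
        "query": query,
        "return_type": "entry",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


def run_search(request_body, timeout=30):
    """POST the request and return matching PDB IDs (needs network access)."""
    data = json.dumps(request_body).encode("utf-8")
    request = urllib.request.Request(
        SEARCH_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    if not body:  # assumed: an empty body means no matches
        return []
    payload = json.loads(body)
    return [hit["identifier"] for hit in payload.get("result_set", [])]
```

For example, run_search(build_query("hemoglobin", max_resolution=2.0, method="X-RAY DIFFRACTION")) would be expected to return a list of entry IDs. Sequence and structure searches use other services of the same API and are not covered by this sketch.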

Downloading Data

Once you find the desired structure, you can download:

  • Coordinate Files: Contain 3D coordinates of atoms.
  • Structure Factors: Include experimental data used for structure determination.
  • NMR Restraints: Provide information on distances and angles used in NMR studies.
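
Downloads can be scripted as well. This sketch assumes the RCSB PDB file server at files.rcsb.org and its naming scheme for coordinate files (.pdb, .cif) and structure factors (-sf.cif); verify both against the current file download documentation. The helper names download_url and download are hypothetical, and NMR restraint files are left out because no naming scheme is assumed for them here.

```python
import urllib.request
from pathlib import Path

# Assumed base URL of the RCSB PDB file download service.
FILES_BASE = "https://files.rcsb.org/download"

# Assumed file-name suffix for each kind of download.
SUFFIXES = {
    "pdb": ".pdb",                   # coordinates, legacy PDB format
    "mmcif": ".cif",                 # coordinates, PDBx/mmCIF format
    "structure_factors": "-sf.cif",  # experimental data (X-ray entries)
}


def download_url(pdb_id, kind="pdb"):
    """Return the download URL for a classic four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"unexpected PDB ID: {pdb_id!r}")
    if kind not in SUFFIXES:
        raise ValueError(f"unknown file kind: {kind!r}")
    return f"{FILES_BASE}/{pdb_id}{SUFFIXES[kind]}"


def download(pdb_id, kind="pdb", dest_dir="."):
    """Download one file and return its local path (needs network access)."""
    url = download_url(pdb_id, kind)
    dest = Path(dest_dir) / url.rsplit("/", 1)[1]
    urllib.request.urlretrieve(url, dest)
    return dest
```

Calling download("1A3N") would be expected to save 1A3N.pdb in the current directory.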

Features of the PDB

Visualization Tools

  • 3D Viewers: Tools like Jmol, NGL Viewer, or PyMOL allow you to visualize structures in 3D.
  • Annotations: Functional sites, domains, and motifs are annotated for ease of understanding.
  • Molecular Graphics: High-quality images and animations of molecular structures.

Validation Tools

  • Validation Reports: Each structure is accompanied by a validation report assessing the quality of the model.
  • Geometry Checks: Ensuring the accuracy of bond lengths, angles, and torsions.
  • Electron Density Maps: Verification against experimental data.
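
The geometry checks mentioned above boil down to measuring bond lengths, bond angles, and torsion angles from atomic coordinates and comparing them with expected values. The sketch below shows only the measurements, using plain tuples of (x, y, z) in angstroms; the function names are hypothetical, and real validation software applies far more elaborate statistics.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def bond_length(p1, p2):
    """Distance between two atoms, in the units of the coordinates."""
    d = _sub(p1, p2)
    return math.sqrt(_dot(d, d))


def bond_angle(p1, p2, p3):
    """Angle p1-p2-p3 in degrees, measured at the central atom p2."""
    v1, v2 = _sub(p1, p2), _sub(p3, p2)
    cos_angle = _dot(v1, v2) / (bond_length(p1, p2) * bond_length(p3, p2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def torsion_angle(p1, p2, p3, p4):
    """Signed torsion p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


print(bond_length((0, 0, 0), (3, 4, 0)))
print(bond_angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```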

Analysis Tools

  • Ligand Interaction: Analyzing how small molecules interact with proteins.
  • Mutational Analysis: Studying the effects of amino acid substitutions.
  • Homology Modeling: Predicting structures based on known homologs.
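
As an illustration of ligand interaction analysis, one simple approach is to list the protein residues that have any atom within a distance cutoff of a ligand atom. The sketch below assumes atoms have already been read into small records; the Atom type, the residues_near_ligand helper, the 4.0 angstrom default cutoff, and the coordinates in the example are all illustrative assumptions, not values taken from a real entry.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    chain: str
    res_name: str
    res_seq: int
    name: str
    xyz: tuple  # (x, y, z) in angstroms


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted (chain, res_seq, res_name) of residues in contact.

    A residue counts as "in contact" if any of its atoms lies within
    `cutoff` angstroms of any ligand atom (a simple distance criterion).
    """
    close = set()
    for p in protein_atoms:
        if any(math.dist(p.xyz, lig.xyz) <= cutoff for lig in ligand_atoms):
            close.add((p.chain, p.res_seq, p.res_name))
    return sorted(close)


# Made-up coordinates, for illustration only.
ligand = [Atom("A", "HEM", 142, "FE", (0.0, 0.0, 0.0))]
protein = [
    Atom("A", "HIS", 87, "NE2", (0.0, 0.0, 2.1)),   # close to the ligand
    Atom("A", "LEU", 10, "CD1", (15.0, 3.0, 8.0)),  # far away
]
print(residues_near_ligand(protein, ligand))
```

This brute-force loop compares every protein atom with every ligand atom, which is fine for a single structure; larger analyses typically use a spatial index instead.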

Preparation, Validation, and Analysis in the PDB

Preparation

  1. Experimental Determination: Using X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy to determine the structure.
  2. Data Processing: Processing the raw experimental data, then building and refining an atomic model to obtain 3D coordinates.
  3. Deposition: Submitting data to the PDB, including coordinates, structure factors, and metadata.

Validation

  1. Initial Checks: Automatic validation checks for errors in the deposited data.
  2. Quality Assessment: Using tools like MolProbity to assess model geometry and fit to experimental data.
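  3. Expert Review: wwPDB biocurators check the deposition, and the validation report can be shared with journal editors and referees during peer review of the associated manuscript.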

Analysis

  1. Functional Analysis: Studying the biological function and mechanism of the macromolecule.
  2. Comparative Analysis: Comparing with other known structures to find similarities and differences.
  3. Drug Design: Using structural information to design and optimize potential drugs.

Visualization in the PDB

3D Visualization Tools

  • NGL Viewer: Web-based tool for viewing and analyzing structures.
  • Jmol: Java-based viewer for interactive visualization.
  • PyMOL: Professional molecular visualization tool with extensive features.

Generating Images and Animations

  • Snapshots: Capture high-quality images of molecular structures.
  • Animations: Create movies to illustrate molecular dynamics and interactions.

Mini-Activity: Exploring the PDB

Objective

Learn how to search for, download, and visualize a protein structure using the PDB.

Steps

  1. Search for a Protein:

    • Go to RCSB PDB.
    • Search for "hemoglobin" or use PDB ID "1A3N".
    • Explore the summary page to learn about the structure.
  2. Download the Structure:

    • Download the coordinate file (PDB format) for "1A3N".
  3. Visualize the Structure:

    • Open the file in an online viewer such as NGL Viewer, or in a desktop program such as PyMOL.
    • Explore the 3D structure, focusing on the heme group and its binding sites.
  4. Analyze the Structure:

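    • Identify key features such as the alpha-helices that make up each globin chain (hemoglobin is almost entirely alpha-helical, so expect few or no beta-sheets) and functional sites such as the heme pockets.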
    • Use the validation report to assess the quality of the structure.
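
Part of the analysis step can be checked with a few lines of code once the coordinate file is downloaded. The sketch below assumes a legacy PDB-format file in which each HELIX record describes one helix, each SHEET record describes one strand, and non-polymer groups such as heme appear as HETATM records (heme is assumed to carry the residue name HEM, and water the name HOH). The function name summarize_pdb_text and the file name in the usage note are hypothetical.

```python
def summarize_pdb_text(text):
    """Count secondary-structure records and list non-water hetero groups."""
    helices = 0
    strands = 0
    hetero_groups = set()
    for line in text.splitlines():
        record = line[:6].strip()
        if record == "HELIX":
            helices += 1
        elif record == "SHEET":
            strands += 1
        elif record == "HETATM" and len(line) >= 26:
            res_name = line[17:20].strip()  # columns 18-20
            if res_name != "HOH":           # skip water molecules
                chain_id = line[21]         # column 22
                res_seq = int(line[22:26])  # columns 23-26
                hetero_groups.add((chain_id, res_seq, res_name))
    return {
        "helices": helices,
        "strands": strands,
        "hetero_groups": sorted(hetero_groups),
    }
```

With the file from step 2 saved locally, summarize_pdb_text(open("1A3N.pdb").read()) would report how many helix and strand records the entry contains and which chains carry a heme group, which can then be compared with what the 3D viewer shows.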

Discussion Questions

  1. What is the function of hemoglobin, and how is its structure related to its function?
  2. What are the key interactions between the heme group and the protein?
  3. How can structural information from the PDB be used in drug design?

Conclusion

Understanding the PDB and its functionalities is crucial for modern biological research. It provides invaluable resources for studying macromolecular structures, understanding their functions, and applying this knowledge to solve real-world problems in medicine and biotechnology.

For more detailed information, you can refer to the Guide to Understanding PDB Data on the PDB-101 website.