Proteomics: What, Why, and How
Proteomics is the study of the proteome: the complete set of proteins produced by an organism. As you may know, proteins are one of the four major classes of biological macromolecules (the other three are nucleic acids, carbohydrates, and lipids). Proteins are unique in the incredible diversity of structure and function they can exhibit. From the hair on your head to the digestive enzymes in your saliva to the antibodies in your immune system, proteins are vital to nearly every biological process. Proteins are the primary way the information stored in DNA is expressed as a physical structure. Indeed, it is through proteins that an organism's genetic makeup (genotype) is reflected in its physical characteristics (phenotype). For all these reasons, researchers are extremely interested in understanding how the proteins in an organism interact and change to give rise to various biological phenomena.
How proteomes are studied
There are two main methods for studying proteomes: immunoassays and antibody-free protein detection. Immunoassays use antibodies that bind specifically to a target molecule (antigen); detecting or measuring this binding reveals the presence or amount of the target in a sample. Antibody-free detection relies on various technologies, the most prominent of which is mass spectrometry. Researchers use antibody-free detection when they are interested in the sequence of a protein or when there is no known antibody for the protein of interest.
Technologies in Proteomics
2D Gel Electrophoresis
This technique separates proteins based on two properties: their electrical charge and their size. First, proteins are separated in one direction according to their isoelectric point, the pH at which they carry no net charge (isoelectric focusing). Then, they are separated perpendicular to the first direction based on their size (molecular weight). This creates a gel with spots representing individual proteins, allowing researchers to compare protein expression between different samples.
Mass spectrometry
Mass spectrometry is a powerful analytical tool that allows for antibody-free protein detection. Proteins are first broken down into smaller peptides, typically with a protease such as trypsin. These peptides are then ionized and sent through a mass analyzer, which measures the mass-to-charge ratio of these ions. The resulting mass spectrum is a characteristic fingerprint that can be used to identify and quantify proteins.
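The digest-then-measure idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular instrument or software package works: it assumes trypsin as the protease (cutting after lysine, K, or arginine, R, unless the next residue is proline, P), uses standard monoisotopic residue masses, and runs on a made-up toy sequence. The function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # added once per positive charge


def tryptic_digest(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mz(mass, charge):
    """Mass-to-charge ratio of a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    toy_protein = "MKTAYIAKQRPLSFVK"  # invented sequence, for illustration only
    for pep in tryptic_digest(toy_protein):
        m = peptide_mass(pep)
        print(pep, round(m, 4), round(mz(m, 2), 4))
```

Real workflows also account for missed cleavages, chemical modifications, and isotope patterns, all of which this sketch ignores.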
A gel shows the presence of proteins of varying electrical charge and size. Photo by Selbst Generiert, via Wikimedia Commons
A scientist examines a gel obtained using 2D electrophoresis. Photo by Ксения Верещагина, via Wikimedia Commons
Protein Microarrays
Similar to DNA microarrays, protein microarrays allow for the simultaneous analysis of thousands of proteins. Small amounts of purified proteins or antibodies are spotted onto a solid surface in an organized grid. These arrays can be used to study protein-protein interactions, enzyme-substrate relationships, or to identify antibodies in patient samples.
Why is proteomics research important?
Studying the proteome is complicated and difficult; unlike an organism’s genome, it is constantly changing (due to interactions between proteins, as well as post-translational modifications) and differs among cell types. If genomics research investigates genomes, which store all the information of the proteins they encode, why bother with proteomics? The answer is that DNA does not actually store all the information about the proteins it encodes. It does not account for:
Varying levels of mRNA degradation leading to varying levels of gene expression
Post-translational modifications of proteins that alter their structure and function
The need for some proteins to form complexes with other molecules in order to function
Protein degradation due to cell stress, apoptosis, or autophagy
All these processes and contexts matter when trying to understand how a living, breathing organism works. Analysis of genes alone simply does not provide researchers with a full understanding of how the proteins encoded by those genes operate. This is where proteomics comes in.
Applications of Computational Tools in Proteomics
The technologies described above generate a vast amount of data, which needs to be analyzed using computational tools. Here’s an overview of these tools:
Protein Identification
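Mass spectrometry produces peptide mass and fragmentation data, while protein microarrays produce a binding signal for each spot on the array.
Computational tools compare the peptide data with sequence databases (e.g., UniProt) to identify proteins in samples, and motif databases (e.g., PROSITE) help annotate them.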
Algorithms automate the process, providing faster and more accurate protein identification.
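The database comparison step can be illustrated with a small Python sketch. It assumes the peptide sequences have already been determined, and it uses a tiny invented FASTA-formatted database in place of a real one such as UniProt; the accessions PROT_A and PROT_B, the sequences, and the function names are all hypothetical. Production search engines score spectra directly against candidate peptides and are considerably more involved.

```python
# A made-up two-protein database in FASTA format (not real UniProt entries).
TOY_FASTA = """
>PROT_A toy protein A
MKTAYIAKQR
QISFVKSHFSR
>PROT_B toy protein B
MGSSHHHHHHSSGLVPR
GSHMKTAYIAK
"""

# Peptide sequences assumed to have been identified in a sample.
OBSERVED_PEPTIDES = ["TAYIAK", "QISFVK", "SHFSR", "GSHMK"]


def parse_fasta(text):
    """Return {accession: sequence} from FASTA-formatted text."""
    records, accession = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            accession = line[1:].split()[0]  # first word of the header
            records[accession] = ""
        elif accession is not None:
            records[accession] += line       # sequences may span lines
    return records


def match_peptides(peptides, database):
    """Map each peptide to every protein whose sequence contains it."""
    return {
        pep: [acc for acc, seq in database.items() if pep in seq]
        for pep in peptides
    }


def rank_proteins(hits):
    """Rank proteins by the number of observed peptides they explain."""
    counts = {}
    for accessions in hits.values():
        for acc in accessions:
            counts[acc] = counts.get(acc, 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    database = parse_fasta(TOY_FASTA)
    hits = match_peptides(OBSERVED_PEPTIDES, database)
    for peptide, accessions in hits.items():
        print(peptide, "->", accessions)
    print(rank_proteins(hits))
```

Note that the peptide TAYIAK occurs in both toy proteins. Shared peptides like this are one reason that deciding which proteins are actually present is harder than a simple lookup.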
Protein Structure Prediction
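Traditional experimental methods (X-ray crystallography, NMR spectroscopy) for protein structure determination have limitations: crystallography requires the protein to form crystals, and NMR works best for relatively small proteins.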
Computational models can predict protein structures based on amino acid sequences and known structural data.
This allows for modeling protein interactions and considering protein flexibility.
Post-Translational Modifications
Proteins undergo modifications after synthesis (post-translational modifications), affecting their structure and function.
Bioinformatics tools are being developed to predict and analyze these modifications.
Understanding modifications helps in studying protein behavior and their roles in biological processes.
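One of the simplest forms of modification-site prediction is a sequence motif scan. The Python sketch below looks for the classic N-linked glycosylation sequon: asparagine (N), then any residue except proline, then serine (S) or threonine (T). The example sequence is invented and the function name is hypothetical. A motif match only flags a candidate site; whether the residue is actually modified in the cell has to be confirmed experimentally.

```python
import re

# N-x-[S/T] with x != P. The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glycosylation_sites(sequence):
    """Return (1-based position of N, matched tripeptide) for each candidate."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MNGSANPTNKT"  # made-up sequence, for illustration only
    for position, motif in find_n_glycosylation_sites(toy_sequence):
        print(f"candidate site at N{position}: {motif}")
```

In the toy sequence, the tripeptide NPT is skipped because proline in the middle position rules out the sequon.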
Computational Methods for Biomarker Discovery
Biomarkers are indicators of biological processes or diseases.
Computational models analyze proteomic data to identify biomarkers.
For instance, studies use computational approaches to detect fetal proteins in maternal blood, aiding in non-invasive monitoring of fetal development.
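A very basic version of this kind of analysis compares protein abundance between two groups of samples. The Python sketch below ranks proteins by Welch's t-statistic after filtering on log2 fold change. The abundance values, the protein labels P1 to P3, the threshold, and the function names are all invented for illustration; a real study would use many more samples, compute p-values, and correct for testing thousands of proteins at once.

```python
import math
from statistics import mean, variance

# Made-up abundance values (arbitrary units) for three hypothetical proteins.
TOY_ABUNDANCE = {
    "P1": {"case": [40.0, 44.0, 36.0], "control": [10.0, 12.0, 8.0]},
    "P2": {"case": [20.0, 21.0, 19.0], "control": [20.0, 19.0, 21.0]},
    "P3": {"case": [5.0, 7.0, 3.0], "control": [20.0, 22.0, 18.0]},
}


def log2_fold_change(case, control):
    """log2 of the ratio of group means; positive means higher in cases."""
    return math.log2(mean(case) / mean(control))


def welch_t(case, control):
    """Welch's t-statistic, which does not assume equal group variances."""
    standard_error = math.sqrt(
        variance(case) / len(case) + variance(control) / len(control)
    )
    return (mean(case) - mean(control)) / standard_error


def rank_candidates(abundance, min_abs_log2fc=1.0):
    """Keep proteins with a large fold change, strongest t-statistic first."""
    rows = []
    for protein, groups in abundance.items():
        lfc = log2_fold_change(groups["case"], groups["control"])
        if abs(lfc) >= min_abs_log2fc:
            rows.append((protein, lfc, welch_t(groups["case"], groups["control"])))
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


if __name__ == "__main__":
    for protein, lfc, t in rank_candidates(TOY_ABUNDANCE):
        print(f"{protein}: log2FC={lfc:+.2f}, t={t:+.2f}")
```

Combining an effect-size filter with a test statistic is a common design choice because a large fold change alone can come from noisy measurements, and a strong statistic alone can reflect a difference too small to be useful.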
Applications of Bioinformatics in Proteomics
Medicine: Diagnosis and treatment based on protein biomarkers.
Disease research: Understanding disease mechanisms through protein analysis.
Biomarker identification: Discovering indicators of biological processes or conditions.
A convenient and expansive resource is ExPASy. It is developed by the Swiss Institute of Bioinformatics and provides access to current programs and databases used in proteomics research, as well as genomics, structural biology, systems biology, and more.