The Global Proteome Machine Organization

  Proteomics Database and Open Source Software
  www.thegpm.org

Welcome!

The Global Proteome Machine Organization (GPMO) was set up so that scientists who work in proteomics with tandem mass spectrometry could use their data to analyze proteomes. The projects supported by the GPMO have been selected to improve the quality of analysis, to make the results portable, and to provide a common platform for testing and validating proteomics results.

Latest GPM News

Data set of the week: (2010/02/07)
The value of using multiple proteases for large-scale mass spectrometry-based proteomics.

This dataset was transferred to GPMDB via ProteoExchange from TRANCHE. The data, composed of 15 LC/MS/MS runs, is from a study published in J. Proteome Research by Danielle L. Swaney, Craig D. Wenger and Joshua J. Coon (DOI: 10.1021/pr900863u).

The data is from experiments in which an S. cerevisiae whole cell lysate was digested with one of five enzymes (trypsin, LysC, ArgC, AspN, and GluC), in triplicate. The results clearly show that any of these proteases can be used very effectively with standard proteomics equipment, giving very similar protein identifications.

New database server added at Rockefeller University (2010/02/05)

Starting today, a new database server has been added to the GPMDB system, based at Rockefeller University in New York City. This new server joins the other servers at the University of Manitoba, the University of British Columbia and Beavis Informatics, which make up the GPMDB cloud system.

Data set of the week: (2010/01/31)
Identifying blood biomarkers and physiological processes that distinguish humans with superior performance under psychological stress.

This dataset was transferred to GPMDB via ProteoExchange from PRIDE (PRIDE accessions 10075-10092). The data (GPM77710000113-GPM77710000130) is from a study published in PLoS One by Cooksey AM, Momen N, Stocker R, and Burgess SC (PLoS One. 2009 Dec 18;4(12):e8371 PubMed).

The results show the plasma proteins that change in response to the Modular Egress Training psychological stress test, given to a group of naval aviation students. The data was obtained using an LCQ DECA XP Plus and analyzed using X! Hunter (annotated spectrum library searches).

GPM sites using the new X! Tandem (2010/01/27)

Starting today, the public GPM servers will be using the new release of X! Tandem and X! P3 (2010.01.01.1). Once live testing is complete, the release code for this new version will be made available.

Features new to 2010.01.01 are improved handling of protein N-termini and improved handling of phosphorylated peptides, through the detection of associated neutral losses. The new parameter set includes the following:

  1. quick acetyl - protein N-terminal modification detection,
  2. stP bias - interpretation of peptide phosphorylation models, and
  3. quick pyrolidone - peptide N-terminus cyclization detection.
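
As a rough illustration of how the three parameters listed above might appear in an X! Tandem input file, the sketch below builds a small parameter fragment. The "protein, " label prefix, the "yes" values and the helper name build_parameter_xml are assumptions made for this example only; the release documentation should be checked for the exact labels and permitted values.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the "protein, " prefix and the "yes" values are illustrative;
# the announcement above only names the parameters, not their full labels.
ASSUMED_PARAMETERS = {
    "protein, quick acetyl": "yes",      # protein N-terminal modification detection
    "protein, stP bias": "yes",          # peptide phosphorylation models
    "protein, quick pyrolidone": "yes",  # peptide N-terminus cyclization detection
}


def build_parameter_xml(parameters):
    """Return a <bioml> fragment with one <note type="input"> per parameter."""
    root = ET.Element("bioml")
    for label, value in parameters.items():
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_parameter_xml(ASSUMED_PARAMETERS)
print(xml_text)
```

The fragment would normally be merged into a complete input file that also names the spectra and sequence sources; those entries are omitted here.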

Data set of the week: (2010/01/24)
High quality catalog of proteotypic peptides from human heart

This dataset was transferred to GPMDB from the authors' web site, corresponding to the manuscript of the same name, Kline, KG, et al., J Proteome Res. 2008 Nov;7(11):5055-61. PubMed. This data is not currently available on other ProteoExchange repositories.

The data consists of 96 LCMS runs analyzed with a ThermoFinnigan LTQ mass spectrometer. It is a good example of the type of data that can be obtained from cardiac muscle using multidimensional chromatography directly on tissue lysate.

Data set of the week: (2010/01/17)
A Mitochondrial Protein Compendium Elucidates Complex I Disease Biology

This dataset was transferred to GPMDB from TRANCHE, corresponding to the manuscript of the same name, Pagliarini, DJ, et al., Cell 134:112-123 doi:10.1016/j.cell.2008.06.016.

The data consists of 26 individual data sets, composed of replicates of mitochondrial proteins obtained from a variety of mouse tissues (cerebellum, cerebrum, brainstem, spinal cord, kidney, liver, heart, skeletal muscle, testis and placenta). It is a good example of high quality proteomics data, obtained using a Thermo-Finnigan Orbitrap hybrid mass spectrometer.

Data set of the week: (2010/01/10)
Comparative analysis of the human and mouse placental transcriptome and proteome

This dataset was transferred to GPMDB from Peptidome via ProteoExchange, from the Peptidome entries PSM1063 (mouse) and PSM1064 (human). The cells in the tissue were separated from extracellular proteins and various subcellular fractions were analyzed separately. The data was originally published in Cox B, et al., Mol Syst Biol 2009;5:279. PMID: 19536202.

Note: the Peptidome entry misidentifies the mass spectrometry platform as "TRAP-FTMS", while it is actually a Thermo-Finnigan LTQ (with no additional hybrid component).

Data set of the week: (2010/01/03)
Large-scale phosphorylation analysis of mouse liver

This dataset was transferred to GPMDB from TRANCHE and it is not currently held in any other ProteoExchange database (see data). It is credited to Villén J, Beausoleil SA, Gerber SA, and Gygi SP, and it is described in Proc Natl Acad Sci U S A. 2007 Jan 30;104(5):1488-93.

This data set is a good example of the quality of phosphorylation data that can be obtained using SCX separation of a tissue extract, followed by IMAC phosphopeptide enrichment of each fraction, when using an LTQ-Orbitrap mass spectrometer. The data view that is obtained from the link above shows all of the detected phosphopeptides, with a peptide false positive rate of ~0.14%, i.e., about 10 times more stringent than the analysis in the original paper.

Data set of the week: (2009/12/28)
Community proteogenomics reveals insights into the physiology of phyllosphere bacteria

This dataset was transferred to GPMDB via ProteoExchange from PRIDE (see data). It is credited to Delmotte N, et al. and it is described in Proc Natl Acad Sci U S A. 2009 Sep 22;106(38):16428-33.

Data-set-of-the-week is a new feature for GPMDB, started with the intent of highlighting high quality data sets that have been made available via GPMDB and ProteoExchange. Data sets will be selected by a panel, but any suggestions of suitable data (email to dsotw@thegpm.org) will be considered.

The 1,000 most observed human proteins (2009/11/06)

This spreadsheet (human_top_1000.xls) is a list of protein sequences that have been observed most often by GPM users who used the "human" GPM search server. The columns in the spreadsheet are as follows:

  1. Column A: ENSEMBL protein accession number for the sequences;
  2. Column B: HUGO Gene Nomenclature Committee symbol for the associated gene;
  3. Column C: NCBI gene number for the associated gene;
  4. Column D: International Protein Index accession number for the sequence;
  5. Column E: SwissProt/Uniprot accession for the sequence;
  6. Column F: the probability that a protein will be found in a dataset (as a percentage);
  7. Column G: the base-10 log of the minimum expectation value found for that protein; and
  8. Column H: a text description of the protein.
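
For readers who want to work with the spreadsheet programmatically, the column layout above could be mirrored by a small record type. This is only a sketch: the class name TopProteinRow, the field names and the example row are hypothetical, and the row values are made up rather than taken from human_top_1000.xls.

```python
from dataclasses import dataclass


@dataclass
class TopProteinRow:
    ensembl_accession: str   # Column A
    gene_symbol: str         # Column B
    ncbi_gene_id: str        # Column C
    ipi_accession: str       # Column D
    uniprot_accession: str   # Column E
    percent_observed: float  # Column F, as a percentage
    log10_min_expect: float  # Column G, base-10 log of the minimum expectation value
    description: str         # Column H

    @property
    def min_expectation(self) -> float:
        """Undo the base-10 log stored in Column G."""
        return 10.0 ** self.log10_min_expect


def parse_row(cells):
    """Convert the eight cells of one spreadsheet row (A to H) into a record."""
    if len(cells) != 8:
        raise ValueError("expected exactly 8 cells (columns A to H)")
    a, b, c, d, e, f, g, h = (str(cell).strip() for cell in cells)
    return TopProteinRow(a, b, c, d, e, float(f), float(g), h)


# Made-up placeholder row, for illustration only.
example = parse_row(["ENSP_EXAMPLE", "GENE1", "0", "IPI_EXAMPLE",
                     "ACC_EXAMPLE", "42.5", "-3.0", "hypothetical protein"])
print(example.gene_symbol, example.percent_observed, example.min_expectation)
```

Keeping the accession columns as strings avoids losing leading zeros or prefixes when the spreadsheet is exported to text.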

The value in Column F was calculated by taking the number of times (n_i) that protein i was observed in the approximately 24,000 (N) datasets examined and doing the simple calculation:

p_i = 100 (n_i / N)

A "dataset" corresponds to a submitted set of MS/MS spectra, which results in a GPM result file, so it is roughly equivalent to the set of data from an LC/MS/MS run. A protein can only be observed once in a dataset.
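
The Column F calculation described above can be sketched in a few lines. The function name observation_percentages and the toy accessions are hypothetical; the real values were derived from roughly 24,000 GPM result files, not from data like this.

```python
from collections import Counter


def observation_percentages(datasets):
    """Return the percentage of datasets in which each protein was observed.

    `datasets` is a sequence of protein accession lists, one per dataset.
    Each list is reduced to a set first, because a protein can only be
    counted once per dataset.
    """
    unique_per_dataset = [set(proteins) for proteins in datasets]
    total = len(unique_per_dataset)  # N
    if total == 0:
        raise ValueError("at least one dataset is required")
    counts = Counter()               # n_i for each protein i
    for proteins in unique_per_dataset:
        counts.update(proteins)
    return {acc: 100.0 * n / total for acc, n in counts.items()}


# Toy input with made-up accessions: four datasets, three proteins.
toy_datasets = [["P1", "P2", "P1"], ["P1"], ["P2", "P3"], ["P1"]]
percentages = observation_percentages(toy_datasets)
print(percentages["P1"], percentages["P2"], percentages["P3"])  # 75.0 50.0 25.0
```

The repeated "P1" in the first toy dataset is counted once, matching the rule that a protein can only be observed once in a dataset.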



Copyright © 2008, The Global Proteome Machine Organization
