The Global Proteome Machine Organization
The Global Proteome Machine
The home of proteomics crowd-sourced "Big Data"
   News Archive
PaxDB links added to GPM (2013/5/5)
Links to the "Protein Abundance Across Organisms" project (PaxDB) have been added to the protein-specific display pages in GPM and GPMDB for appropriate species. These links can be found by clicking the green "Protein" links button at the top of the appropriate pages. The PaxDB system (Wang, M. et al. Mol Cell Proteomics 2012, doi:10.1074/mcp.O111.014704) is, in the words of the developers:
PaxDB is a comprehensive absolute protein abundance database, which contains whole genome protein abundance information across organisms and tissues. In PaxDB, the publicly available experimental data are imported and mapped onto a common namespace and, in the case of tandem mass spectrometry data, re-processed using our in-house standardized spectral counting pipeline.
Data set of the week: (2013/5/3)
Cell type-specific nuclear pores: a case in point for context-dependent stoichiometry of molecular machines.
Overall rating: excellent data (leading the field)
This data set consisted of 30 results, each representing a sub-cellular fractionation experiment analyzed by reversed-phase HPLC. The data files were made available through PASSEL (PASS00190). It was published by Ori A, Banterle N, Iskar M, Andrés-Pons A, Escher C, Khanh Bui H, Sparks L, Solis-Mezarino V, Rinner O, Bork P, Lemke EA, and Beck M in Mol Syst Biol. 2013 Mar 19; 9:648 (PubMed).
This study represents a thorough examination of the proteins associated with an important intracellular structure, the human nuclear pore. A variety of methods were applied to the problem, including protein chemistry isolations, proteomics, and high resolution optical and electron microscopy. The study tries to synthesize all of these measurements to obtain a model of how nuclear pores differ between common cell lines (SK-MEL5, RKO, HEK 293 and HeLa). The proteomics data is excellent and demonstrates how well these techniques can be used as a component in the examination of a complex problem.
Data set of the week: (2013/4/25)
The Protein Interaction Landscape of the Human CMGC Kinase Group.
Overall rating: excellent data (worth study)
This data set consisted of 114 results, each representing an affinity-purification experiment analyzed by reversed-phase HPLC. The data files were made available through PASSEL (PASS00226). It was published by Varjosalo M, Keskitalo S, Van Drogen A, Nurkkala H, Vichalkovski A, Aebersold R and Gstaiger M in Cell Reports 2013 25 April, 3(4):1306–20 (PubMed).
The data presented with this manuscript will give any interested investigator significant insight into the work necessary to turn a set of identifications into a high-quality protein-protein interaction map. The proteomics experiments were consistently done, with attention to the often overlooked chromatographic details. Anyone interested in performing this type of experiment or analyzing this type of data would be well served by examining these experiments prior to planning their own. One feature of the data not commented on in the manuscript was the presence of clear signals for the Human adenovirus C E1A and E1B proteins (these proteins are coded on the viral DNA responsible for the original HEK 293 transformation). These proteins are commonly observed in HEK 293 proteomics data, but their presence in specific pull-down experiments here may shed some light on their role in CMGC kinase signalling, which to our knowledge has not yet been examined.
Data set of the week: (2013/4/18)
A Chemical Proteomics Approach to Profiling the ATP-Binding Proteome of Mycobacterium tuberculosis.
Overall rating: very good data (specialist interest)
This data set consisted of 33 results, each representing chemical modifications, pull-downs and reversed-phase HPLC peptide separations. The data files were made available through ProteomeXchange (PXD000141). It was published by Wolfe LM, Veeraraghavan U, Idicula-Thomas S, Schurer S, Wennerberg K, Reynolds R, Besra GS, and Dobos KM in Mol Cell Proteomics. 2013 Mar 5 (PubMed).
This data represents a particularly interesting class of proteomics experiments: the use of specific chemical probes to facilitate protein and peptide purification techniques that very specifically target particular pathways and/or biochemistry. In this case, it is the incorporation of desthiobiotin into proteins that bind ATP. This type of experiment does not require cutting-edge high throughput instruments; rather, its value is associated with developing a coherent model of how well the modification reagent is functioning in the experimental system and then evaluating the results in the context of that model and the known biochemistry of the system.
Data set of the week: (2013/4/11)
The human leukocyte antigen-presented ligandome of B lymphocytes.
Overall rating: very good data (specialist interest)
This data set consisted of 191 results, each representing a combination of peptide separation methods and reversed-phase HPLC peptide separations. The data files were made available through PASSEL (PASS00211). It was published by Hassan C, Kester MG, Ru AH, Hombrink P, Drijfhout JW, Nijveen H, Leunissen JA, Heemskerk MH, Falkenburg JH, and Veelen PA in Mol Cell Proteomics. 2013 Mar 19 (PubMed).
This study was an interesting investigation of how to purify and detect a special class of endogenous peptides: those presented on the surface of leukocytes by human leukocyte antigen class I molecules. These peptides are relatively short (~9 residues) and are generated by ubiquitin-mediated proteolysis in the cell's proteasome. The peptides are also present at relatively low concentrations in normal cells and, since they have no special properties, they can be difficult to isolate. The protocols described in this paper do a good job of purifying these peptides and demonstrate the merits of different peptide separation methods when applied to this situation. Any group interested in the large-scale analysis of HLA class I peptides should examine the data in this work carefully.
Data set of the week: (2013/4/4)
Extensive Mass Spectrometry-Based Analysis of the Fission Yeast Proteome: The S. pombe PeptideAtlas.
Overall rating: very good data (general interest)
This data set consisted of 384 results, each representing a combination of protein separation methods and reversed-phase HPLC peptide separations. The data files were made available through PASSEL (PASS00069). It was published by Gunaratne J, Schmidt A, Quandt A, Neo SP, Sarac OS, Gracia T, Loguercio S, Ahrne E, Li Hai Xia R, Tan KH, Loessner C, Bahler J, Beyer A, Blackstock W, and Aebersold R in Mol Cell Proteomics. 2013 Mar 5 (PubMed).
The experiments described in this paper were an attempt to characterize the proteome of the yeast Schizosaccharomyces pombe using modern instruments and standard sample preparation methods. The laboratory model organism S. pombe is descended from an environmental species and it has been extensively studied for determining the mechanisms behind its cell cycle and genetics. S. pombe has been less studied by proteomics methods than its other beer-related relative, S. cerevisiae, and this study does a lot to address that deficit.
X! Tandem turns X! (2013/4/3)
Yesterday (April 2, 2013) marked the tenth anniversary of the first public release of the proteomics search engine X! Tandem. At the time, it was the first open source protein identification software project to become available (a previous academic effort had been attempted but was shut down because of patent concerns). The release of the X! Tandem code, followed a few months later by the release of the NCBI-sponsored open source search engine OMSSA, ushered in the modern era of proteomics informatics and computational biology.
Data set of the week: (2013/3/28)
Rapid phosphoproteomic and transcriptomic changes in the rhizobia-legume symbiosis.
Overall rating: excellent data (worth study)
This data set consisted of 382 results, each representing a fraction from multidimensional chromatography experiments, using iTRAQ derivatives for quantitation. The data files were made available through PASSEL (PASS00056). It was published by Rose CM, Venkateshwaran M, Volkening JD, Grimsrud PA, Maeda J, Bailey DJ, Park K, Howes-Podoll M, den Os D, Yeun LH, Westphall MS, Sussman MR, Ané JM, and Coon JJ in Mol Cell Proteomics 2012 11:724-44 (PubMed).
This study represented the first major effort to characterize the proteome of the model plant species Medicago truncatula, a legume species in the same genus as the commercially important forage crop Medicago sativa. The study takes advantage of the known genome of the organism and provides some interesting insights into the role of protein phosphorylation in metabolic processes associated with the interaction between the plant and the symbiotic nitrogen-fixing bacteria that it hosts in root nodules. The data is of excellent quality and provides the best resource available for obtaining phosphodomain annotation for legume proteins.
Data set of the week: (2013/3/21)
Interlaboratory reproducibility of large-scale human protein-complex analysis by standardized AP-MS.
Overall rating: very good data (specialist interest)
This data set consisted of 288 results, each representing a single affinity purification experiment. The data files were made available through PASSEL (PASS00117). It was published by Varjosalo M, Sacco R, Stukalov A, van Drogen A, Planyavsky M, Hauri S, Aebersold R, Bennett KL, Colinge J, Gstaiger M and Superti-Furga G in Nat Methods 2013 Mar 3 (PubMed).
These experiments represent probably the best resource available for data that can be used to teach the practice of computational biology as it applies to proteomics. The raw data, as well as the information generated by two different peptide identification systems, can be directly downloaded and used. The preliminary analyses available (on which the manuscript was based) provide a nice insight into the current generic practices used for peptide and protein identification by many laboratories. Given the data and information available in these files, it is practical to create a series of tutorial subsets of the files that can be manipulated directly by students to answer many questions relevant to practical proteomics data analysis, for example:
  1. What are the limits of reproducibility of any particular affinity purification?
  2. How do false positives and false negatives affect the conclusions that can be reached from such a study?
  3. Are the splice variants, residue modifications and amino acid variants consistent between affinity purification identifications of the same proteins?
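As a starting point for the first question, a student exercise might compare the sets of proteins identified in two replicate purifications of the same bait. The following is a minimal sketch, assuming each replicate has already been reduced to a set of protein accessions; the function name and the placeholder accessions are hypothetical and are not taken from the data set.

```python
def jaccard(ids_a, ids_b):
    """Return the Jaccard index (shared / total distinct) of two ID collections."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    # Two empty replicates are treated as identical by convention.
    return len(a & b) / len(union) if union else 1.0


# Placeholder accessions for illustration only (not real identifications).
replicate_1 = {"protein_A", "protein_B", "protein_C", "protein_D"}
replicate_2 = {"protein_A", "protein_B", "protein_C", "protein_E"}

overlap = jaccard(replicate_1, replicate_2)
print(f"Jaccard overlap between replicates: {overlap:.2f}")
```

A value near 1 would indicate that the two replicates identified nearly the same proteins. This simple measure ignores abundance and identification confidence, so it is only a first approximation.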
Data set of the week: (2013/3/14)
Physiological Adaptation of the Rhodococcus jostii RHA1 Membrane Proteome to Steroids as Growth Substrates.
Overall rating: excellent data (worth study)
This data set consisted of 220 results, made up of single HPLC runs from multidimensional chromatography experiments and experiment summaries. The data files were made available through ProteomeXchange (PXD000016). It was published by Haussmann U, Wolters DA, Fraenzel B, Eltis LD, and Poetsch A in J Proteome Res. 2013 12:1188-98 (PubMed).
The organism being studied here (Rhodococcus jostii) belongs to a genus of largely non-pathogenic prokaryotes in the large suborder Corynebacterineae, which contains Mycobacteria and Corynebacteria. It is known to be very good at utilizing a wide variety of substrates for growth, including many types of aromatic compounds that are not easily metabolized by other bacteria. This study does an excellent job of characterizing changes in the membrane proteins present in the organism in response to changing its carbon source from mainly pyruvate to cholesterol and cholate. The experiments were consistently well done and the results provide an excellent resource for investigating how well membrane proteins can be recovered using standard proteomics techniques. An interesting feature of the results is the apparent presence of a threonine/serine specific, protein-N-acetyl transferase, which results in protein N-terminal acetylation at threonine residues (with reduced activity for serine).
X! Tandem 2013.02.01 on-line testing complete (2013/3/3)
The latest version of X! Tandem (2013.02.01) has completed its on-line testing and should be considered a stable release. This new version of the open source search engine has several new features, including enhanced compatibility with the SRM/MRM project Skyline (thanks to Brendan Maclean and Brett Phinney). It also has a new sequence stacking mechanism that effectively produces non-redundant sequence databases on the fly. This can significantly improve performance when searching the proteomes of multiple strains of a prokaryote species, or eukaryote protein sequences that correspond to alternate splice variants differing only in the mRNA's UTRs.
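The general idea of collapsing identical sequences can be illustrated with a short sketch. This is not X! Tandem's actual implementation (which is not described here); it simply assumes that entries with identical amino acid sequences are grouped so that each distinct sequence need only be searched once. The function names and the example FASTA entries are hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def stack_sequences(records):
    """Map each distinct sequence to the list of headers that share it."""
    stacked = {}  # dicts preserve insertion order in Python 3.7+
    for header, seq in records:
        stacked.setdefault(seq, []).append(header)
    return stacked


# Made-up entries: two strains carrying an identical protein, plus one unique protein.
EXAMPLE = """>strainA_P1
MKTAYIAK
>strainB_P1
MKTAYIAK
>strainA_P2
MSLLNQ
"""

stacked = stack_sequences(parse_fasta(EXAMPLE))
for seq, headers in stacked.items():
    print(seq, headers)
```

In this sketch the three input entries reduce to two distinct sequences, with both strain accessions retained for the shared one so that identifications can still be reported against every original entry.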
Human protein sequence update (2013/2/12)
The main GPM system has been updated to use the latest version of the human proteome — ENSEMBL v. 70.37 — which was based on the human genome sequence GRCh37.p10, Feb 2009. All of the relevant resources (including annotated spectral library and proteotypic peptides) have been updated to the new sequence set. The annotation file for human SNAPs (Single Nucleotide Amino acid Polymorphisms) has been updated to dbSNP 137 (1,690,969 SNAPs), using ENSEMBL's Biomart interface for the DNA-to-protein coordinate and allele mapping.
JPR Special Issue on cHPP released (2013/1/7)
The Journal of Proteome Research has released its special issue associated with the launch of the chromosome-based Human Proteome Project. This issue is freely available on-line and it includes articles from many of the cHPP national groups, as well as technical papers associated with the creation of boutique databases, laboratory techniques and visualization systems specifically designed for use by the cHPP. There are also several general articles on the current status of experimental evidence for the existence of human proteins, discussing the gaps in our knowledge at the moment.
Copyright © 2013, The Global Proteome Machine Organization