Proteomics crowdsourced "Big Data". The GPM is an experimental project to create knowledge from proteomics data & reuse it to solve biomedical research problems.
Data set of the week: (2013/5/16)
Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq.
Overall rating: excellent data (leading the field)
This data set consisted of 28 results that comprised a single multidimensional chromatography experiment. The data files were made available through PASSEL (PASS00215). It was published by Sheynkman GM, Shortreed MR, Frey BL and Smith LM in Mol Cell Proteomics 2013 Apr 29 (PubMed).
This data set defines the state of the art with respect to "deep" proteomics of a human cell line (Jurkat cells). The combination of a first dimension using high pH HPLC followed by low pH HPLC produced a very well separated collection of peptides. The use of HCD coupled with high resolution fragment ion measurements using an Orbitrap led to very high confidence peptide assignments. Anyone interested in detecting relatively rare post-translational modifications or determining splice variants would be well served by performing their analysis on this data set first.
In addition to the human phosphorylation annotation released on Sunday, we have also prepared annotation in the same format for a set of model species commonly used in proteomics experiments. Annotation for the following species is now available: C. elegans, D. melanogaster, M. musculus and S. cerevisiae.
As part of our contribution to the Human Proteome Project, we have compiled a comprehensive list of all human protein phosphorylation sites represented by good quality data in GPMDB. This list has been subdivided on a chromosome-by-chromosome basis, using ENSEMBL v. 70 as the source of the protein and gene sequences. All of the splice variants listed by ENSEMBL have been annotated.
The files associated with the annotation for each chromosome (and a merged list of all chromosomes) are now available by FTP. A description of the format of these files (README.txt) is in the same directory. A short summary of the number of phospho-proteins, genes and sites is given here. For unique protein sequences in the proteome, the overall totals are as follows:
Data set of the week: (2013/5/10)
Proteogenomic Analysis of Human Colon Carcinoma Cell Lines LIM1215, LIM1899, and LIM2405.
Overall rating: very good data (specialist interest)
This data set consisted of 136 results composed of individual SDS-PAGE gel slices and experiment summaries. The data files were made available through ProteomeXchange (PXD000120). It was published by Fanayan S, Smith JT, Lee LY, Yan F, Snyder M, Hancock WS and Nice E in J Proteome Res. 2013 Mar 13 (PubMed).
The data reported here was a good example of what can be done with whole cell lysates analyzed using SDS-PAGE protein separations and low resolution (LTQ) mass spectrometry. The experiments elucidate an interesting biological issue: "How different were the protein concentrations in three related cell lines and how were those changes generated by differences in RNA concentration?" This data would be useful for anyone interested in practical difficulties associated with combining protein molecular mass information with peptide identifications when using SDS-PAGE gels for protein separations.
As some readers may have noticed, the logo for GPM and GPMDB has changed recently (thanks to noted electronic artist KD Thornton). This change is part of a general redesign of the site to conform to more modern web page coding trends, simplify page navigation and improve the usefulness of the overall site on smaller screens and mobile platforms. If you have any suggestions regarding things you would like to see in a new design (or things that really bug you about the current one), please let us know at email@example.com.
The definition of the GPMDB REST interface has been expanded to include a new method, allowing the rapid calculation of peptide ω frequencies for any set of peptides and protein accession numbers stored in GPMDB. These frequencies are useful when comparing observed peptides to those previously observed: a technical definition is given here. The description of this method and an example have been added to the GPMDB Wiki page for the REST interface.
Copyright © 2013, The Global Proteome Machine Organization.