|
The Global Proteome Machine (GPM) was set up so that scientists working
in proteomics with tandem mass spectrometry could use their data to analyze
proteomes. The following references to the peer-reviewed literature are
suggested to assist scientists in preparing publications that
use the system.
-
The Biopolymer Markup Language, David Fenyö, Bioinformatics, 1999, 15, 339-40.
This is the best reference for the underlying XML that
is used throughout the GPM system.
-
Informatics and data management in proteomics, David Fenyö and Ronald C. Beavis,
Trends Biotechnol. 2002, 20, S35-8.
This is a good reference for our underlying philosophy
of how bioinformatics and proteomics go together.
-
A Method for Assessing the Statistical Significance of Mass Spectrometry-Based
Protein Identifications Using General Scoring Schemes, David Fenyö and Ronald C. Beavis,
Anal. Chem., 2003, 75, 768-774.
This reference describes how peptides are scored by X! TANDEM. The
expectation values on the individual peptides are calculated using this method.
-
TANDEM: matching proteins with tandem mass spectra, Robertson Craig and Ronald C. Beavis, Bioinformatics,
2004, 20, 1466-7.
This is the official reference for X! TANDEM as open source software.
-
A Method for Reducing the Time Required to Match Protein Sequences with Tandem Mass Spectra, Robertson Craig and Ronald C. Beavis, Rapid Commun. Mass Spectrom., 2003, 17, 2310-2316.
This contains most of the technical details of how X! TANDEM speeds up searches.
-
Probity: A Protein Identification Algorithm with Accurate Assignment of the Statistical
Significance of the Results,
Jan Eriksson and David Fenyö, J. Proteome Res., 2004, 3, 32-36.
This reference describes the statistical model
used to calculate protein expectation values from a selected group of peptides. It is cited in references 4 and 5
(although it was not yet in print at the time). The expectation values for proteins are calculated with this method,
using the expectation values of the individual peptides.
-
Evaluation of Multidimensional Chromatography Coupled with Tandem Mass Spectrometry
(LC/LC-MS/MS) for Large-Scale Protein Analysis: The Yeast Proteome,
J. Peng, J.E. Elias, C.C. Thoreen, L.J. Licklider and S. P. Gygi, J. Proteome Res., 2003, 2, 43-50.
This reference describes the idea of using reversed sequences to validate large collections
of protein identifications. The GPM has this approach built in as an optional method for validation.
-
An Open Source System for Analyzing, Validating and Storing Protein Identification Data,
Robertson Craig, John P. Cortens and Ronald C. Beavis,
J. Proteome Res., 2004, 3, 1234-42.
This reference describes the underlying technical aspects of how the GPMDB was constructed
and some of its potential uses in proteomics.
-
An improved model for prediction of retention times of tryptic peptides in ion-pair reverse phase
HPLC; its application to protein peptide mapping by off-line HPLC-MALDI MS,
O. V. Krokhin, R. Craig, V. Spicer, W. Ens, K. G. Standing, R. C. Beavis, J. A. Wilkins,
Mol. Cell. Proteomics, 2004, 3, 908-919.
The GPM uses calculated reversed-phase HPLC retention times to create synthetic HPLC displays. This
article describes the theoretical and practical aspects of how the system does this calculation.
-
The use of proteotypic peptide libraries for protein identification,
R. Craig, J.P. Cortens and R.C. Beavis,
Rapid Commun. Mass Spectrom., 2005, 19, 1844-1850.
This is the initial publication describing how proteotypic peptide libraries are constructed and
used to improve protein identifications in the GPM.
-
Using Annotated Peptide Mass Spectrum Libraries for Protein Identification,
R. Craig, J.P. Cortens, D. Fenyö and R.C. Beavis,
J. Proteome Res., 2006, doi:10.1021/pr0602085.
This is the initial publication describing the use of Annotated Spectrum Libraries (ASLs) and
X! Hunter to identify proteins.
-
Determining the overall merit of protein identification data sets: rho-diagrams and rho-scores,
D. Fenyö, B.S. Phinney and R.C. Beavis,
J. Proteome Res., 2007, 6, 1997-2004.
This publication describes rho-scoring (ρ-scoring) and how it can be used to evaluate the relative merit of a proteomics data set.
|