The Global Proteome Machine Organization
   GPMDB Frequently Asked Questions
  1. When is GPMDB populated with new results?
  2. Where can I read more about GPMDB?
  3. What do the various diagrams represent?
  4. What do the G|P|X links on the accession results page link to?

1. When is GPMDB populated with new results?

Each day at approximately 6:00 GMT, every public GPM server sends the results created in the past 24 hours to the GPMDB server. The database is then populated with the results received in that session. All the results are publicly available by approximately 12:00 GMT the same day.


2. Where can I read more about GPMDB?

GPMDB and associated GPM features have been published in a set of manuscripts that can be accessed here. If you are more interested in technical details, we have compiled a set of documents on the GPM wiki.


3. What do the various diagrams represent?

The coverage diagram is displayed at the top of the protein page, as a list on the accession number page, and on the protein validation page. The red sections represent the peptides that have been identified for the result and their positions along the entire length of the protein. The opacity of each section is based on the expectation value for that peptide: the darker the color, the better (lower) the expectation value.
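
As a rough illustration of the shading rule, the sketch below maps a peptide's log10 expectation value to an opacity between 0.1 and 1.0. The function name, the bounds of -1 and -10, and the linear mapping are assumptions made for this example; the scale actually used to draw the diagram may differ.

```python
def opacity_from_log_e(log_e, weakest=-1.0, strongest=-10.0):
    """Map a log10 expectation value to an opacity in [0.1, 1.0].

    Hypothetical linear mapping: log_e at or above `weakest` gives the
    faintest shade, log_e at or below `strongest` gives full opacity.
    """
    # Clamp the value into the [strongest, weakest] interval.
    clamped = max(strongest, min(weakest, log_e))
    # Fraction of the way from weakest (0.0) to strongest (1.0).
    fraction = (weakest - clamped) / (weakest - strongest)
    return 0.1 + 0.9 * fraction


for log_e in (-1.0, -5.5, -12.0):
    print(log_e, round(opacity_from_log_e(log_e), 2))
```

With these assumed bounds, a peptide with log(e) = -5.5 is drawn at roughly half strength, and anything at or below -10 is fully opaque.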

The spectrum diagram is displayed on the peptide page. It represents the mass spectrum data for the current peptide identification, using vertical lines to represent peaks. The y-axis is the relative intensity. It is calculated by taking the value of the "spectrum, dynamic range" input parameter and normalizing the spectrum so that the most intense peak has that value. See the API documentation for more details on this parameter. The x-axis is the mass range of the spectrum data. The different colors represent the ion types:

  • Red: y ions
  • Yellow: y-17 ions
  • Blue: b ions
  • Green: b-17 ions
  • Black: unassigned ions
  • Mauve: trivial neutral loss ions
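
The relative-intensity scaling described above can be sketched as follows. This is a minimal illustration, assuming a "spectrum, dynamic range" value of 100 and simple linear scaling; the function name and the sample peaks are invented for the example.

```python
def normalize_intensities(peaks, dynamic_range=100.0):
    """Scale (mass, intensity) pairs so the most intense peak equals dynamic_range."""
    if not peaks:
        return []
    max_intensity = max(intensity for _, intensity in peaks)
    if max_intensity <= 0:
        raise ValueError("spectrum has no positive intensities")
    # Assumed linear scaling: every peak is multiplied by the same factor.
    scale = dynamic_range / max_intensity
    return [(mass, intensity * scale) for mass, intensity in peaks]


# Made-up peaks: (mass, raw intensity)
peaks = [(175.119, 2500.0), (262.151, 10000.0), (375.235, 500.0)]
for mass, relative in normalize_intensities(peaks):
    print(f"{mass:9.3f}  {relative:6.1f}")
```

After scaling, the tallest line in the diagram sits at the dynamic range value and every other peak is drawn in proportion to it.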

The diagram above the spectrum diagram (showing the sequence with vertical lines between the residues) represents the b and y ion breakdown from the spectrum for assigned ions. The length of each vertical line represents the intensity of the ion peak at that point in the sequence.

The diagram to the left of the spectrum is the delta scatter diagram. The position of the red and blue 'dots' is based on the identified ion mass (y-axis) and the difference, in Daltons, between the observed and calculated masses (x-axis). Hover the mouse over the 'dots' to see the ion mass.
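
A minimal sketch of how the points in such a scatter plot could be derived. The ion masses below are invented, the tuple layout is an assumption rather than the GPM data format, and the y coordinate is assumed to be the observed mass.

```python
def delta_points(ions):
    """Return (ion_type, delta, mass) for each assigned ion.

    delta = observed - calculated mass in Daltons (x-axis);
    mass  = observed ion mass (y-axis, an assumption of this sketch).
    """
    return [
        (ion_type, round(observed - calculated, 4), observed)
        for ion_type, observed, calculated in ions
    ]


# Hypothetical assigned ions: (ion type, observed mass, calculated mass)
ions = [("y", 175.121, 175.119), ("b", 262.148, 262.151)]
for ion_type, delta, mass in delta_points(ions):
    color = "red" if ion_type == "y" else "blue"  # colors follow the ion list above
    print(f"{color:4s}  x={delta:+.3f} Da  y={mass:.3f}")
```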

The diagrams on the details|supporting evidence page are:

  • Hyperscore Expectation Function and Convolution Survival Function. These diagrams represent the values that were used in scoring the spectrum. Details on the formulae used can be found in the paper: A Method for Assessing the Statistical Significance of Mass Spectrometry-Based Protein Identifications Using General Scoring Schemes, David Fenyö and Ronald C. Beavis, Anal. Chem., 2003, 75, 768-774.
  • y and b ion Histograms. The y-axis is the number of peptides and the x-axis is the number of ions. So in the b ion example below, the first vertical line means that there were 456 peptides with 0 b ions, the second line is 743 peptides with 1 b ion, the third line is 229 peptides with 2 b ions and so on. The last value that is shown is the largest non-zero value. In this case, 3 peptides had 10 b ions.
  • Spectra Histogram. A simplified version of the spectrum diagram as shown above.
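
The way the y and b ion histograms are tallied can be sketched as below. The input list is made up for illustration (one entry per peptide, giving its number of matched b ions), and the function name is hypothetical.

```python
from collections import Counter


def ion_histogram(ion_counts):
    """Return a list where index n holds the number of peptides with n matched ions.

    The list stops at the largest ion count that occurs at least once,
    mirroring the rule that the last value shown is the largest non-zero one.
    """
    tally = Counter(ion_counts)
    if not tally:
        return []
    largest = max(tally)
    return [tally.get(n, 0) for n in range(largest + 1)]


# Made-up data: number of b ions matched for each of seven peptides
b_ion_counts = [0, 0, 1, 1, 1, 2, 4]
print(ion_histogram(b_ion_counts))  # [2, 3, 1, 0, 1]
```

Here two peptides had 0 b ions, three had 1, one had 2, none had 3 and one had 4, so the histogram has five bars.
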
Example:


4. What do the G|P|X links on the accession results page link to?

The different links allow the user to analyze the results from different perspectives. The 'G' link displays all the proteins in the current result file, sorted by expectation value. The 'P' link displays only the current protein from the current result file, along with all the peptides identified from the spectra for that protein. The 'X' link displays all the data from the current result file on a spectrum-by-spectrum basis.
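
The three perspectives can be pictured as three views over the same result file. The sketch below is only an analogy: the data layout, accessions, peptide sequences and function names are all hypothetical, and sorting best (lowest) expectation value first is an assumption.

```python
# Hypothetical result file:
# (protein accession, protein log(e), [(peptide sequence, spectrum id), ...])
results = [
    ("PROT_A", -303.1, [("PEPTIDEK", 12), ("SAMPLER", 40)]),
    ("PROT_B", -12.4, [("ANOTHERK", 7)]),
    ("PROT_C", -45.0, [("THIRDPEPR", 25)]),
]


def g_view(results):
    """'G': all proteins in the result file, lowest log(e) first (assumed order)."""
    return sorted(results, key=lambda protein: protein[1])


def p_view(results, accession):
    """'P': the peptides identified for one protein only."""
    for acc, _, peptides in results:
        if acc == accession:
            return peptides
    return []


def x_view(results):
    """'X': every identification in the file, ordered by spectrum id."""
    rows = [(spectrum, acc, seq) for acc, _, peps in results for seq, spectrum in peps]
    return sorted(rows)


print([protein[0] for protein in g_view(results)])
print(p_view(results, "PROT_A"))
print(x_view(results))
```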

Example:

#    log(e)    model        coverage
1.   -303.1    G | P | X
Copyright © 2004, The Global Proteome Machine Organization