The Global Proteome Machine Organization
   GPMDB URL Conventions

GPMDB contains a large amount of information that can be accessed using an HTTP CGI interface. This type of interface produces URLs that look like the following:

  1. http://gpmdb.thegpm.org/thegpm-cgi/dblist_keyword.pl?db=all+species&db_index=24&keyword=chloroplast+kinase
    (retrieves all proteins with "chloroplast kinase" in their associated descriptions);
  2. http://gpmdb.thegpm.org/thegpm-cgi/dblist_label.pl?label=ENSP00000369887&proex=-1
    (retrieves all observations of the protein with accession number "ENSP00000369887"); or
  3. http://gpmdb.thegpm.org/thegpm-cgi/dblist_pep.pl?seq=SPSSVEPVADMLMGLFFR
    (retrieves all observations of the peptide sequence "SPSSVEPVADMLMGLFFR").

GPMDB uses a system known as URL rewriting to simplify the construction of these query lines for some of the most common types of data request. Using this simplified system, the three URLs above can be rewritten as:

  1. http://gpmdb.thegpm.org/protein/keyword/chloroplast kinase;
  2. http://gpmdb.thegpm.org/protein/accession/ENSP00000369887; or
  3. http://gpmdb.thegpm.org/peptide/sequence/SPSSVEPVADMLMGLFFR.

These simplified URLs are easier to write and remember and they are less likely to be affected by internal technology changes in the GPMDB system. The following description outlines the simplified URL types that are currently available through GPMDB.
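
The three rewritten URLs above share one pattern: a category, a field and a value. A minimal Python sketch of that pattern follows. The function name simple_url is hypothetical, and the percent-encoding of the value (for example "%20" for a space) is an assumption about what the server accepts, not something this page states.

```python
from urllib.parse import quote

BASE = "http://gpmdb.thegpm.org"

def simple_url(category, field, value):
    """Build a simplified GPMDB URL of the form BASE/category/field/value."""
    # Assumption: the server decodes percent-encoded values, so a space
    # in a keyword is sent as %20 rather than as a literal space.
    return f"{BASE}/{category}/{field}/{quote(value, safe='')}"

print(simple_url("protein", "keyword", "chloroplast kinase"))
print(simple_url("protein", "accession", "ENSP00000369887"))
print(simple_url("peptide", "sequence", "SPSSVEPVADMLMGLFFR"))
```

Most browsers perform this encoding automatically when a URL containing a space is typed into the address bar; a script usually has to do it explicitly.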

1. Accessing a data set by its accession number.

Format: "http://gpmdb.thegpm.org/data/accession/GPMddddddddddd"
where GPMddddddddddd is a valid GPMDB data set accession number (GPM + 11 digits).

OR

Format: "http://gpmdb.thegpm.org/GPMddddddddddd"
where GPMddddddddddd is a valid GPMDB data set accession number (GPM + 11 digits).
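
Because the accession format is fixed (GPM + 11 digits), it can be checked before a URL is built. The following Python sketch does that and returns both URL forms; the function name data_set_urls is hypothetical, and the check only tests the format, not whether the data set exists in GPMDB.

```python
import re

BASE = "http://gpmdb.thegpm.org"

# "GPM" followed by exactly 11 digits, as described above.
GPM_ACCESSION = re.compile(r"GPM[0-9]{11}")

def data_set_urls(accession):
    """Return the long and short URL forms for a data set accession."""
    if not GPM_ACCESSION.fullmatch(accession):
        raise ValueError(f"not a GPM + 11 digit accession: {accession!r}")
    return (f"{BASE}/data/accession/{accession}", f"{BASE}/{accession}")

long_form, short_form = data_set_urls("GPM32010005747")
print(long_form)
print(short_form)
```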

2. Finding data sets via a keyword search.

Format: "http://gpmdb.thegpm.org/data/keyword/SEARCH TERM"
where SEARCH TERM is any word or words that may be found in the text description of a GPMDB data set.

3. Accessing all observations of a protein by its accession number.

Format: "http://gpmdb.thegpm.org/protein/accession/NUMBER"
where NUMBER is a protein accession number in any of the formats used by GPMDB. Examples of these accession numbers are as follows:

  1. ENSEMBL: ENSP00000369887;
  2. NCBI gi: gi|132147| (the "|" symbols are required);
  3. IPI: IPI00021440; or
  4. FLYBASE: FBpp0100125.

Shortcut: if the accession number is easy to recognize (ENSEMBL or NCBI), simply "http://gpmdb.thegpm.org/NUMBER" will work.

4. Finding proteins via a keyword search.

Format: "http://gpmdb.thegpm.org/protein/keyword/SEARCH TERM"
where SEARCH TERM is any word or words that may be found in the text description of a protein.

Shortcut: if the keywords cannot be confused for a GPM, ENSEMBL or NCBI accession number, simply "http://gpmdb.thegpm.org/SEARCH TERM" will work.
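
The shortcut rules in sections 1, 3 and 4 imply that a bare term after the host name is read as a data set accession, a protein accession or a keyword search. The Python sketch below is one way to predict which reading applies before using a shortcut. The patterns are assumptions made for illustration (in particular the ENSEMBL and NCBI gi patterns); the rules actually used by the GPMDB server are not given on this page, and the function name shortcut_meaning is hypothetical.

```python
import re

# Assumed patterns, for illustration only.
GPM_DATA = re.compile(r"GPM[0-9]{11}")          # GPM + 11 digits
ENSEMBL = re.compile(r"ENS[A-Z]*P[0-9]{11}")    # e.g. ENSP00000369887
NCBI_GI = re.compile(r"gi\|[0-9]+\|")           # e.g. gi|132147|

def shortcut_meaning(term):
    """Guess how http://gpmdb.thegpm.org/<term> would be interpreted."""
    if GPM_DATA.fullmatch(term):
        return "data set accession"
    if ENSEMBL.fullmatch(term) or NCBI_GI.fullmatch(term):
        return "protein accession"
    # Anything else falls through to a protein keyword search.
    return "protein keyword"

for term in ("GPM32010005747", "ENSP00000369887", "gi|132147|", "chloroplast kinase"):
    print(term, "->", shortcut_meaning(term))
```

Under these assumed patterns, IPI and FLYBASE accession numbers are classified as keywords, which matches the note in section 3 that only ENSEMBL and NCBI numbers have a shortcut; for those formats the full "/protein/accession/NUMBER" form is the safer choice.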

5. Finding all observed phosphorylation sites in a protein by its accession number.

Format: "http://gpmdb.thegpm.org/protein/psyt/NUMBER"
where NUMBER is a protein accession number in any of the formats used by GPMDB.

6. Finding all observed SNAPs in a protein by its accession number.

Format: "http://gpmdb.thegpm.org/protein/snap/NUMBER"
where NUMBER is a protein accession number in any of the formats used by GPMDB.

7. Finding all PTMs observed in a protein by its accession number and PTM type.

Format: "http://gpmdb.thegpm.org/protein/modification/dd.d@XYZ/NUMBER"
where NUMBER is a protein accession number in any of the formats used by GPMDB, dd.d is the mass of the PTM and XYZ is a list of the single-letter abbreviations for the amino acid residues that may carry this modification. Examples of modification specifications are as follows:

  1. /protein/modification/42@K/NUMBER will find acetylations at lysine (K) residues;
  2. /protein/modification/16@P/NUMBER will find hydroxyproline modifications (P); or
  3. /protein/modification/1@NQ/NUMBER will find deamidations at asparagine (N) or glutamine (Q) residues.
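
The "mass@residues" specification can be assembled from its parts. The following Python sketch does so for the three examples above; the function name modification_url is hypothetical, the residue check is only a basic sanity test, and the accession number is inserted verbatim without any encoding.

```python
BASE = "http://gpmdb.thegpm.org"

def modification_url(mass, residues, accession):
    """Build a /protein/modification/ URL from a PTM mass and residue list."""
    # Residues are single-letter amino acid codes, e.g. "K" or "NQ".
    if not (residues.isalpha() and residues.isupper()):
        raise ValueError(f"expected upper-case residue letters: {residues!r}")
    # The "g" format writes 42 as "42" and 15.99 as "15.99".
    spec = f"{mass:g}@{residues}"
    return f"{BASE}/protein/modification/{spec}/{accession}"

print(modification_url(42, "K", "ENSP00000369887"))   # acetylation at K
print(modification_url(16, "P", "ENSP00000369887"))   # hydroxyproline
print(modification_url(1, "NQ", "ENSP00000369887"))   # deamidation at N or Q
```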

8. Finding all observations of a peptide sequence.

Format: "http://gpmdb.thegpm.org/peptide/sequence/ACDEFGHIK"
where ACDEFGHIK is any peptide sequence that may have been observed.

9. Other links.

Many of the display pages in the GPM system can be referred to directly using a URL that specifies the CGI script used to create the page. These URLs have the general form:

http://gpmdb.thegpm.org/thegpm-cgi/SCRIPT_NAME.pl?p1=XX&p2=XX ...

where "SCRIPT_NAME.pl" is the Perl script being called and the values following the "?" are the CGI parameters. This type of URL can be simplified by the following recipe:

  • replace "thegpm-cgi" with "~" and
  • replace "SCRIPT_NAME.pl?" with "SCRIPT_NAME/".

For example, the following full URL:

http://gpmdb.thegpm.org/thegpm-cgi/plist.pl?path=/gpm/archive/320/GPM32010005747.xml&proex=-1

becomes the modestly more legible:

http://gpmdb.thegpm.org/~/plist/path=/gpm/archive/320/GPM32010005747.xml&proex=-1
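
The two-step recipe is mechanical, so it can be applied with plain string operations. The Python sketch below is one way to do it; the function name simplify_cgi_url is hypothetical, and the function assumes the input contains both "/thegpm-cgi/" and a ".pl?" script suffix.

```python
def simplify_cgi_url(url):
    """Rewrite a full thegpm-cgi URL into the shorter "~" form."""
    # Step 1: split at "thegpm-cgi", which is replaced by "~".
    head, found, rest = url.partition("/thegpm-cgi/")
    if not found:
        raise ValueError("URL does not contain /thegpm-cgi/")
    # Step 2: replace "SCRIPT_NAME.pl?" with "SCRIPT_NAME/".
    script, found, params = rest.partition(".pl?")
    if not found:
        raise ValueError("URL does not contain a .pl? script call")
    return f"{head}/~/{script}/{params}"

full = ("http://gpmdb.thegpm.org/thegpm-cgi/plist.pl"
        "?path=/gpm/archive/320/GPM32010005747.xml&proex=-1")
print(simplify_cgi_url(full))
```

Applied to the full URL in the example above, this produces the simplified URL shown there.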

Copyright © 2010, The Global Proteome Machine Organization