|
Annotated modifications added (2008/01/20)
Starting with the 2008.02.01 release of X! Tandem, the search engine can set
the potential modifications being tested on a protein-by-protein basis.
This new feature can be activated in either the first round of searching or the subsequent
refinement rounds by setting the "Use sequence annotations" control to "yes".
The feature works as follows. A file is constructed that contains a list of sequence
accession numbers and potential sequence modification specifications, each written as a
mass shift followed by "@" and the residue it applies to. For example,
several lines from the human modification file look like the following:
<protein label="ENSP00000166244" pmods="79.966331@Y" />
<protein label="ENSP00000350614" pmods="79.966331@S" />
<protein label="ENSP00000372956" pmods="79.966331@S,79.966331@T" />
<protein label="ENSP00000372947" pmods="79.966331@S,79.966331@T" />
<protein label="ENSP00000363773" pmods="15.994915@P" />
The first line indicates that the protein sequence ENSP00000166244 is known to
be tyrosine phosphorylated. If the "Use sequence annotations" feature is turned on,
then that sequence will be tested for tyrosine phosphorylation, in addition to the
potential modifications you have specified for all protein sequences in your search.
Similarly, ENSP00000350614
will be checked for serine phosphorylation, ENSP00000372956 and ENSP00000372947 will
be checked for phosphorylation at serine and threonine residues, while ENSP00000363773 will be
tested for the possible presence of hydroxyproline residues.
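The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not X! Tandem's own code: the function names parse_annotations and modifications_for are hypothetical, the regular expression assumes each entry sits on one line with the label attribute before pmods, and the global modification used in the usage lines (15.994915@M) is an assumed example of a search-wide setting.

```python
import re

# Sample entries copied from the human modification file excerpt above.
ANNOTATION_LINES = """\
<protein label="ENSP00000166244" pmods="79.966331@Y" />
<protein label="ENSP00000350614" pmods="79.966331@S" />
<protein label="ENSP00000372956" pmods="79.966331@S,79.966331@T" />
<protein label="ENSP00000372947" pmods="79.966331@S,79.966331@T" />
<protein label="ENSP00000363773" pmods="15.994915@P" />
"""

# Assumption: attributes always appear in the order label, then pmods.
PROTEIN_RE = re.compile(r'<protein\s+label="([^"]+)"\s+pmods="([^"]*)"\s*/>')


def parse_annotations(text):
    """Map each accession to a list of (mass_shift, residue) tuples."""
    annotations = {}
    for label, pmods in PROTEIN_RE.findall(text):
        specs = []
        for spec in pmods.split(","):
            spec = spec.strip()
            if not spec:
                continue  # skip empty entries such as pmods=""
            mass, residue = spec.split("@")
            specs.append((float(mass), residue))
        annotations[label] = specs
    return annotations


def modifications_for(label, annotations, global_mods):
    """Return the global potential modifications plus any annotated
    for this particular protein, without duplicates."""
    combined = list(global_mods)
    for mod in annotations.get(label, []):
        if mod not in combined:
            combined.append(mod)
    return combined


annotations = parse_annotations(ANNOTATION_LINES)
# Hypothetical search-wide potential modification: oxidation of methionine.
global_mods = [(15.994915, "M")]
annotated = modifications_for("ENSP00000372956", annotations, global_mods)
unannotated = modifications_for("ENSP00000000000", annotations, global_mods)
```

In this sketch a protein with an annotation entry is tested for the search-wide modifications plus its own, while a protein without an entry (the made-up accession ENSP00000000000) receives only the search-wide list.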
Files of this type have been constructed for human, mouse, rat, chicken and brewer's yeast
proteomes, using publicly available annotation sources such as UniProt and GPMDB.
This capability is now available on all of the public GPM search servers. The annotation files
are available here.
|