protein, cleavage site

Syntax

This parameter is a formatted text string with three fields. The first and third fields are pairs of square braces, [], or French braces, {}, containing single-letter amino acid residue symbols. The second field is a vertical line, |, that separates the other two, e.g., [KR]|{P}.

Notes

Braces are required in the first and third fields.
Only one set of braces is allowed per field.
The character representing any residue is "X".
Multiple cleavage site rules may be specified by separating them with a comma, e.g. [KR]|{P},[X]|[D].
Multiple rules are applied individually in the order written, from left to right. If the conditions of any one of the rules are met, then a cleavage occurs.
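
The syntax and notes above can be sketched as a small parser. This is only an illustration, not the program's own implementation: the names parse_field and parse_cleavage_site are hypothetical, and the sketch assumes upper-case residue symbols and tolerates spaces around the commas, neither of which is stated above.

```python
import re

# One field: exactly one pair of square or French braces around residue symbols.
# Assumption: residue symbols are upper-case letters.
_FIELD = r"(\[[A-Z]+\]|\{[A-Z]+\})"
_RULE = re.compile(_FIELD + r"\|" + _FIELD)


def parse_field(field):
    """Return (blocking, residues) for one braced field.

    blocking is True for French braces {} and False for square braces [].
    """
    return field.startswith("{"), set(field[1:-1])


def parse_cleavage_site(value):
    """Split a comma-separated parameter value into a list of rules.

    Each rule is a pair (first_field, third_field), kept in the order written.
    """
    rules = []
    for text in value.split(","):
        match = _RULE.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"malformed cleavage rule: {text!r}")
        rules.append((parse_field(match.group(1)), parse_field(match.group(2))))
    return rules
```

With this sketch, parse_cleavage_site("[KR]|{P},[X]|[D]") yields two rules, while a value with missing braces (KR|P) or more than one set of braces in a field ([KR][ST]|{P}) raises ValueError.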

Description

Short peptides are generated from longer protein sequences by either chemical or enzymatic cleavage. Both types of cleavage tend to have preferred cleavage sites, determined by the residues on either side of the peptide bond to be cleaved. Proteomics experiments frequently use enzymes with very strong sequence specificity, to limit the number of peptides generated and to ensure that a reasonable number of peptides fall in the length range most useful for protein identification.

The value of this parameter describes this sequence specificity. A list of residues in square braces means that one of those residues is required in that position, while a list of residues in French braces means that those residues will block cleavage. For example,

[KR]|{P} means cleavage C-terminal to every lysine or arginine, except when followed by a proline;
[X]|[D] means cleavage N-terminal to every aspartic acid residue; and
[X]|[X] means cleavage at every peptide bond.
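
The following is a minimal sketch of how such rules could be applied to a protein sequence. It is not the program's own code: the function names are hypothetical, and it assumes a complete digest (no missed cleavages), upper-case residue symbols, and that "X" inside French braces blocks every residue.

```python
import re

# A rule is two braced fields separated by a vertical line, e.g. [KR]|{P}.
_FIELD = r"(\[[A-Z]+\]|\{[A-Z]+\})"
_RULE = re.compile(_FIELD + r"\|" + _FIELD)


def field_accepts(field, residue):
    """True if the residue satisfies one braced field such as [KR] or {P}."""
    residues = field[1:-1]
    listed = "X" in residues or residue in residues
    # Square braces require a listed residue; French braces forbid one.
    return listed if field[0] == "[" else not listed


def digest(sequence, value):
    """Cut the sequence at every bond where any rule in value is satisfied."""
    rules = []
    for text in value.split(","):
        match = _RULE.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"malformed cleavage rule: {text!r}")
        rules.append(match.groups())

    peptides, start = [], 0
    # The bond after position i joins sequence[i] and sequence[i + 1].
    for i in range(len(sequence) - 1):
        left, right = sequence[i], sequence[i + 1]
        if any(field_accepts(a, left) and field_accepts(b, right) for a, b in rules):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


print(digest("MKAWRPDGKL", "[KR]|{P}"))          # ['MK', 'AWRPDGK', 'L']
print(digest("MKAWRPDGKL", "[X]|[D]"))           # ['MKAWRP', 'DGKL']
print(digest("MKAWRPDGKL", "[KR]|{P},[X]|[D]"))  # ['MK', 'AWRP', 'DGK', 'L']
```

In the first call the bond between R and P is left intact because proline blocks cleavage, while the bonds after the two lysines are cut. The last call shows that a bond is cleaved when any one of the comma-separated rules matches.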