The X! search engine project

X! Search Engine Development

  X! TANDEM Spectrum Modeler

X! Tandem is open source software that can match tandem mass spectra with peptide sequences, in a process that has come to be known as protein identification.

This software has a very simple, unsophisticated application programming interface (API): it takes an XML file of instructions on its command line and writes the results to an XML file whose name is specified in the input XML file.
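
As a minimal sketch of this file-in, file-out interface, the Python fragment below writes an instruction file and assembles the matching command line. The bioml/note layout and the parameter labels are given here as assumptions and should be verified against the default_input.xml and input.xml files shipped with the distribution. The helper names write_input and tandem_command, the taxon "yeast", and the file name my_input.xml are hypothetical.

```python
import xml.etree.ElementTree as ET


def write_input(path, params):
    """Write an instruction file with one <note type="input"> per parameter."""
    root = ET.Element("bioml")
    for label, value in params.items():
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def tandem_command(executable, input_path):
    """Return the argument list: the executable plus the single XML file."""
    return [executable, input_path]


# Assumed parameter labels; check each one against default_input.xml.
params = {
    "list path, default parameters": "default_input.xml",
    "list path, taxonomy information": "taxonomy.xml",
    "protein, taxon": "yeast",            # hypothetical taxon label
    "spectrum, path": "spectrum_1.pkl",   # test spectrum file from the package
    "output, path": "output.xml",
}

# Typical use from the bin folder (not executed here):
#   write_input("my_input.xml", params)
#   subprocess.run(tandem_command("./tandem.exe", "my_input.xml"), check=True)
```

Keeping the command as a plain argument list reflects the point made above: the only thing passed to the program is the path of the instruction file, and everything else, including the output location, lives inside that file.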

This version of X! Tandem is a full release and it includes extensive source level documentation.

   Installation instructions
Please read the artistic license under which this software is distributed.
 
Introduction
  The X! Tandem MS Protein Sequence Modeler is a software package that includes the following C++ source, XML, and test files:
  • all C++ source code required to build new versions of the executable
  • a compiled C++ executable (tandem.exe)
  • an XML "template" (default_input.xml)
  • an XML taxonomy input file (taxonomy.xml)
  • an XML test input file (input.xml)
  • a spectrum input test file (spectrum_1.pkl)
 
Download
1. Download and extract the compressed file
  • tandem-win32-{release-date}.zip,
  • tandem-linux-{release-date}.tar.z or
  • tandem-osx-{release-date}.tar.z
from here into a temporary folder. If you are using Linux or OS X, use the following command to extract the files:
       tar -xzf tandem-[linux|osx]-{release-date}.tar.z
This will create a folder named tandem that contains three subfolders:
  • src - contains the source files for tandem

  • bin - contains the binary for tandem, as well as example files and mass spectra.
    You can run tandem from the command line in this directory, by typing:

          tandem input.xml (Windows) or
          ./tandem.exe input.xml (Linux or OS X)

    The output from the program will appear in the output file specified in input.xml, with a date stamp appended to the filename. If the program fails to run or produces no output, check the path names in taxonomy.xml and input.xml. Read the default_input.xml file for more information on how the input parameter file format works. For more information on these XML files, please read the FAQ.

  • fasta - contains an example FASTA protein sequence list file. This type of file is often referred to as a "database" file or simply a "database". However, it is not a database of any type: it is simply a file with a simple (but poorly specified) format. X! Tandem does not search databases of protein sequences.
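
The advice to check the path names in taxonomy.xml can be sketched as a small Python helper that reports every sequence file the taxonomy file points to but that does not exist on disk. It assumes the taxonomy file uses taxon elements with a label attribute and nested file elements with a URL attribute; confirm this against the taxonomy.xml shipped in the bin folder. The function name missing_taxonomy_files and the base_dir parameter are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def missing_taxonomy_files(taxonomy_path, base_dir="."):
    """Return (taxon label, path) pairs whose sequence file does not exist.

    Relative paths are resolved against base_dir, which is assumed to be
    the directory tandem is run from (normally the bin folder).
    """
    missing = []
    root = ET.parse(taxonomy_path).getroot()
    for taxon in root.iter("taxon"):
        for entry in taxon.iter("file"):
            url = entry.get("URL", "")
            resolved = url if os.path.isabs(url) else os.path.join(base_dir, url)
            if not os.path.isfile(resolved):
                missing.append((taxon.get("label"), url))
    return missing


# Typical use from the bin folder (not executed here):
#   for label, path in missing_taxonomy_files("taxonomy.xml"):
#       print("taxon", label, "points to a missing file:", path)
```

An empty result means every listed FASTA file was found; any pair that is returned names the taxon and the path to correct.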
 
Altering the Program
  The "bin" folder contains the tandem.exe program which runs the search algorithm. If you are using Windows to create a new version of tandem, either double click on the file called x-bang-msms.sln or create a new project in Visual Studio. Add all the .cpp and .h files from the "src" folder to the newly created project.
The Windows version of tandem.exe program was originally compiled using Microsoft Visual Studio 7.0 on Windows XP.
If you are using Linux or OS X, from the command prompt, navigate to the src folder. Type the command "make" (without the quotes). This will create a new tandem.exe in the bin folder. The Linux version of tandem.exe was originally compiled on RedHat Linux 9.0 with the g++ compiler version 3.2.2.
The OS X version of tandem.exe was originally compiled using Apple Computer's GCC version 1175, based on gcc 3.1 20020420.
  If you have any trouble installing or running X! Tandem, please consult the Tandem FAQ.
If you are using a version of Linux other than Red Hat, please consult the Linux FAQ.
   Literature

Here are some journal articles that describe ideas associated with Tandem:

  1. The Biopolymer Markup Language; David Fenyö
    Bioinformatics. 1999; 15:339-40.
  2. Informatics and data management in proteomics; David Fenyö and Ronald C. Beavis
    Trends Biotechnol. 2002; 20:S35-8.
  3. A Method for Assessing the Statistical Significance of Mass Spectrometry-Based Protein Identifications Using General Scoring Schemes; David Fenyö and Ronald C. Beavis
    Anal. Chem. 2003; 75:768-774.
  4. Further references ...

Copyright © 2004-2011, The Global Proteome Machine Organization