The Global Proteome Machine
The home of proteomics crowd-sourced "Big Data"
QUACK: Quality Assurance & Control Knowledge base
Most of the research into proteomics data analysis algorithms has centered on trying
to extract as much biologically relevant information as possible from the results of an
experiment. Significant improvements in laboratory technique and instrument performance
have made it possible to extract far more information than was available even a few years ago.
The increase in data volume and improvements in laboratory methods have opened up a new field for
informatics research in proteomics, which can be broadly classified as the
development of Quality Assurance and Quality Control (QA/QC) algorithms. The purpose of these
algorithms is not to extract biological information: it is to provide rapid feedback to
experimental groups about problems in experimental design, laboratory technique or instrument
measurement stability that make results unsuitable for their intended purpose. These algorithms
should promote a "virtuous circle", in which the informatics and
experimental groups have the tools they need so that precious experimental and computational resources are not
wasted on generating and processing sub-par experimental results. Informatics fixes for bad data are never as
productive as a commitment to generating the best data possible.
The purpose of this raw data repository is to provide real experimental data to facilitate the development of QA/QC algorithms.
The data will not be the pristine, high-quality data that supports published research: instead it
is data with fatal flaws, caused either by single blunders or by an accumulation of smaller problems
that collectively render the data unsuitable for use. Painstaking analysis of this often highly
complex data is not the aim. Instead, the goal should be a simple report back to the analyst that
highlights what is wrong.
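As a rough illustration of what such a "simple report" might look like, the Python sketch below checks a few per-run summary values and lists problems in plain language. Everything in it is an assumption made for illustration: the `RunSummary` fields, the function names and the threshold values are hypothetical and do not come from this site or from any existing GPM tool.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical per-run summary; field names are illustrative only."""
    name: str
    spectra: int                   # number of MS/MS spectra acquired
    identified: int                # spectra assigned to a peptide
    median_mass_error_ppm: float   # median parent ion mass error


def simple_report(run, min_id_rate=0.10, max_abs_error_ppm=10.0):
    """Return a list of short, plain-language problem statements.

    The default thresholds are placeholder assumptions, not recommended values.
    """
    problems = []
    if run.spectra == 0:
        # Nothing else can be checked without spectra.
        problems.append("no spectra were acquired")
        return problems
    id_rate = run.identified / run.spectra
    if id_rate < min_id_rate:
        problems.append(f"only {id_rate:.0%} of spectra were identified")
    if abs(run.median_mass_error_ppm) > max_abs_error_ppm:
        problems.append(
            f"median parent ion mass error is {run.median_mass_error_ppm:+.1f} ppm"
        )
    return problems


def format_report(run, problems):
    """Turn the problem list into a few lines of text for the analyst."""
    if not problems:
        return f"{run.name}: no problems flagged"
    lines = [f"{run.name}: {len(problems)} problem(s) flagged"]
    lines += [f"  - {p}" for p in problems]
    return "\n".join(lines)


if __name__ == "__main__":
    run = RunSummary("run_01", spectra=1000, identified=50,
                     median_mass_error_ppm=25.0)
    print(format_report(run, simple_report(run)))
```

The design deliberately stops at naming the problem. It makes no attempt to rescue or reinterpret the data, in keeping with the idea that the report should prompt a fix at the bench or the instrument.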
Not all data flaws are of equivalent importance. A flaw that may be of critical importance when
using a data set for quantitation may be merely an annoyance when the data is being used for
parent ion mass calibration. The issues associated with each data set will be annotated on
a three-point scale, depending on the overall context of the associated experiment:
Note: only data explicitly meant for algorithm development will be included on this site. No data
associated with biological or biomedical experiments or associated publications will be permitted. If you have
some data that you would like to contribute, please contact Ron Beavis.
Copyright © 2012, The Global Proteome Machine Organization.