Peptide Sequence Tag

Briefly, a peptide sequence tag is a piece of information about a peptide, obtained by tandem mass spectrometry, that can be used to identify the peptide in a protein database. To understand peptide sequence tags more thoroughly, however, we first need some background on sequence databases, mass spectrometry, mass spectrometers, and tandem mass spectrometry.

The “Sequence Tag” protein identification technique was developed by Matthias Mann and Matthias Wilm in the mid-1990s, while they were in the Protein and Peptide Group at the EMBL in Heidelberg, Germany. In the spirit of open research and the early internet, Dr. Mann made the search program “Peptide Search” freely available, along with regularly updated databases distributed through EBI. This program is no longer available (2010) as a standalone Macintosh application from Matthias Wilm at the EMBL. The search technique was far ahead of its time and leverages the idea of search constraints to its maximum potential. Sequence Tag employs MS/MS data produced by tandem MS methods. In this search strategy, the peptide fragment spectrum is searched for obvious sequence tags. A sequence tag is a short string of amino acids read from the mass differences between peaks in the fragment spectrum.

In the field of bioinformatics, a sequence database is a type of biological database composed of a large collection of digital nucleic acid sequences, protein sequences, or other polymer sequences. The UniProt database is an example of a protein sequence database. As of 2013 it contained over 40 million sequences, and it is growing at an exponential rate. Historically, sequences were published in paper form, but as the number of sequences grew, this storage method became unsustainable.

Sequence databases can be searched using a variety of methods. The most common usage is probably searching for sequences similar to a certain target protein or gene whose sequence is already known to the user. The BLAST program is a popular method of this type.

Most mass spectrometrists have fond memories of this technique, since it was one of the first freely available programs and helped many scientists whose bioinformatics resources were limited. The original program does suffer from some surmountable problems: the manual nature of calling the sequence tag, the manual nature of determining the neutral peptide mass, and the problem of not knowing whether one is calling a “b” or “y” ion series. Hence, as we have entered the era of high-throughput, proteomics-style peptide sequencing and identification, this technique has declined in popularity. One can envision a rebirth and dominance of this technique once an automated de novo sequencing program is able to call the sequence tag automatically with high probability. Again, it is the manual nature of this technique that has limited its popularity in recent years.

What’s tandem mass spectrometry? Tandem mass spectrometry, also known as MS/MS or MS2, involves multiple steps of mass spectrometry selection, with some form of fragmentation occurring in between the stages.

Mass spectrometry (MS) is an analytical technique that ionizes chemical species and sorts the ions based on their mass-to-charge ratio. In simpler terms, a mass spectrum measures the masses within a sample. Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures.
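
Because the instrument reports a mass-to-charge ratio rather than a mass, the neutral mass of a peptide has to be worked out from the observed m/z and the charge state. The following is a minimal sketch of that arithmetic, assuming positive-ion mode where each charge comes from one added proton. The function names and the example mass are hypothetical, chosen only for illustration.

```python
PROTON = 1.00728  # mass of a proton in daltons


def mz_from_mass(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and an assumed charge state."""
    return charge * (mz - PROTON)


# A hypothetical peptide of neutral mass 802.40071 Da, observed doubly protonated.
observed = mz_from_mass(802.40071, 2)
print(round(observed, 4))
print(round(mass_from_mz(observed, 2), 4))
```

The charge state itself is an input here. In practice it has to be inferred from the spectrum, for example from the spacing of the isotope peaks.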

In general, peptides can be identified by fragmenting them in a mass spectrometer. For example, during collision-induced dissociation peptides collide with a gas within the mass spectrometer and break into pieces at their peptide bonds. The resulting fragment ions (called b-ions and y-ions) have mass differences corresponding to the residue masses of the respective amino acids. Thus, a tandem mass spectrum contains partial information about the amino acid sequence of the peptide. The peptide sequence tag approach, developed by Matthias Wilm and Matthias Mann at the EMBL, uses this information to identify the peptide in a database. Briefly, a couple of masses are extracted from the spectrum in order to obtain the peptide sequence tag. This peptide sequence tag is a unique identifier of a specific peptide and can be used to find it in a database containing all possible peptide sequences.
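
The idea of reading residues from mass differences and then using them as a search constraint can be sketched in a few lines. This is an illustration only, not the Peptide Search program. It assumes the peaks form a ladder of singly charged b-ions, it uses only the mass preceding the tag as a constraint (the original approach also uses the mass following the tag and the peptide mass), and the function names, tolerance, and tiny peptide list are hypothetical.

```python
# Monoisotopic residue masses in daltons.
# Leucine and isoleucine have identical masses, so "L" stands for both here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # mass of a proton in daltons


def call_tag(peaks, tol=0.02):
    """Read a sequence tag from an ascending ladder of fragment m/z values."""
    tag = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - diff) <= tol]
        if len(hits) != 1:
            return None  # the gap does not match exactly one residue
        tag.append(hits[0])
    return "".join(tag)


def match_tag(peaks, peptides, tol=0.02):
    """Return peptides containing the tag at the position implied by the first peak.

    Assumption: the peaks are singly charged b-ions, so the first peak minus
    one proton equals the summed residue mass that precedes the tag.
    Peptides are assumed to contain only the standard residues listed above.
    """
    tag = call_tag(peaks, tol)
    if tag is None:
        return []
    prefix_mass = peaks[0] - PROTON
    found = []
    for peptide in peptides:
        seq = peptide.replace("I", "L")  # I and L are indistinguishable by mass
        start = seq.find(tag)
        while start != -1:
            before = sum(RESIDUE_MASS[aa] for aa in seq[:start])
            if abs(before - prefix_mass) <= tol:
                found.append(peptide)
                break
            start = seq.find(tag, start + 1)
    return found


# Four peaks computed as b2..b5 of the made-up peptide SAMPLER.
peaks = [159.07642, 290.11691, 387.16967, 500.25373]
print(call_tag(peaks))                                       # MPL
print(match_tag(peaks, ["SAMPLER", "GAMPLER", "MPLSARE"]))   # ['SAMPLER']
```

All three candidate peptides contain the tag MPL, but only one has the right mass in front of it. That is the point of the approach: a short tag plus its flanking masses is a far stronger constraint than the tag alone.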

A notation has been developed for indicating peptide fragments that arise from a tandem mass spectrum. Peptide fragment ions are indicated by a, b, or c if the charge is retained on the N-terminus and by x, y or z if the charge is maintained on the C-terminus. The subscript indicates the number of amino acid residues in the fragment. Prime symbols indicate the number of protons or hydrogens added to the fragment to form the observed ion. For example, y” denotes the singly charged ion analogous to a protonated peptide, (y”’)2+ is a doubly charged ion analogous to a doubly protonated peptide.
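
To make the notation concrete, here is a small sketch that computes the m/z of b- and y-ions for a given peptide. It assumes the common convention in which a singly charged b-ion is the sum of its residue masses plus one proton, and a singly charged y-ion is the sum of its residue masses plus water plus one proton. The function name and the example peptide are hypothetical.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # monoisotopic mass of H2O in daltons
PROTON = 1.00728   # mass of a proton in daltons


def fragment_mz(peptide, series, n, charge=1):
    """m/z of the b_n or y_n ion of `peptide` at the given charge state.

    b-ions keep the N-terminal n residues; y-ions keep the C-terminal n
    residues plus the elements of water.
    """
    if series not in ("b", "y"):
        raise ValueError("series must be 'b' or 'y'")
    if not 1 <= n < len(peptide):
        raise ValueError("n must be between 1 and len(peptide) - 1")
    if series == "b":
        base = sum(RESIDUE_MASS[aa] for aa in peptide[:n])
    else:
        base = sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER
    return (base + charge * PROTON) / charge


peptide = "SAMPLER"  # made-up example sequence
for n in range(1, len(peptide)):
    print(n, round(fragment_mz(peptide, "b", n), 4), round(fragment_mz(peptide, "y", n), 4))
```

Consecutive values within one series differ by exactly one residue mass, which is what allows a tag to be read off the spectrum. Whether a given ladder is a b or a y series determines the direction in which the tag reads.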
