In general, spectral library searching, unlike sequence database searching, involves finding the best match for an acquired MS/MS spectrum in a library of previously searched spectra whose peptide sequences have already been determined. This approach can be hundreds of times faster than traditional sequence database searching, with comparable or better accuracy.
A peptide spectral library is a curated, annotated, and non-redundant collection of LC-MS/MS peptide spectra. One essential use of a peptide spectral library is to provide consensus templates that support peptide and protein identification, based on the correlation between the templates and experimental spectra.
Spectral libraries have been used to identify small-molecule mass spectra since the 1980s. In the early years of shotgun proteomics, pioneering investigations suggested that a similar approach might be applicable to peptide and protein identification.
One potential application of peptide spectral libraries is the identification of new, as yet unidentified mass spectra. Here, the spectra in the library are compared to the new spectra and, if a match is found, the unknown spectrum is assigned the identity of the known peptide in the library.
Sequence database searching is currently the most widely used approach for protein identification based on mass spectra. In this approach, a protein sequence database is used to calculate all putative peptide candidates under the given settings (proteolytic enzymes, missed cleavages, post-translational modifications). The sequence search engines use various heuristics to predict the fragmentation pattern of each peptide candidate. These predicted patterns are used as templates to find a sufficiently close match among the experimental mass spectra, which serves as the basis for peptide and protein identification. Many tools have been developed for this purpose, and they have enabled many discoveries.
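The candidate-generation step described above can be illustrated with a minimal Python sketch of in silico tryptic digestion. It assumes a simplified cleavage rule (cut after K or R unless the next residue is P) and ignores modifications; the function names cleave and tryptic_digest are hypothetical and do not correspond to any particular search engine.

```python
def cleave(protein):
    """Split a protein sequence after K or R, unless the next residue is P.

    This is a common simplification of trypsin specificity (an assumption here).
    """
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])  # C-terminal piece without K/R
    return pieces


def tryptic_digest(protein, missed_cleavages=1, min_len=1):
    """Return all peptide candidates with up to `missed_cleavages` missed sites."""
    pieces = cleave(protein)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(pieces[i:j + 1]))
    return {p for p in peptides if len(p) >= min_len}


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for peptide in sorted(tryptic_digest("ACDKEFGRHIK")):
        print(peptide)
```

Even for this toy sequence, allowing one missed cleavage raises the number of candidates from three to five, which hints at how quickly the search space grows once variable modifications are also considered.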
You will also notice that there are matches to peptides modified with Carbamidomethyl and Oxidation, even though no modifications were specified for the search, because these modifications were present in the library entries. Likewise, there are some matches to non-tryptic peptides (e.g. N.CLAPLAK.V) even though the enzyme in the search form was trypsin. Only a few search parameters are relevant to a library search; the most important are the precursor and fragment mass tolerances.
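As a rough illustration of how the two tolerances act in a library search, the Python sketch below first selects library entries whose precursor m/z lies within a ppm window of the query, then counts library fragment peaks that have a query peak within a fixed Da window. The dictionary layout, the function names, and the default tolerance values are assumptions made for this example, not the behaviour of any specific search engine.

```python
def within_ppm(observed_mz, reference_mz, tol_ppm):
    """True if observed_mz is within tol_ppm parts per million of reference_mz."""
    return abs(observed_mz - reference_mz) <= reference_mz * tol_ppm / 1e6


def candidate_entries(query_precursor_mz, library, tol_ppm=10.0):
    """Keep only library entries whose precursor m/z matches the query."""
    return [
        entry for entry in library
        if within_ppm(query_precursor_mz, entry["precursor_mz"], tol_ppm)
    ]


def count_matched_peaks(query_mzs, library_mzs, tol_da=0.02):
    """Count library fragment peaks that have a query peak within tol_da."""
    matched = 0
    for lib_mz in library_mzs:
        if any(abs(q - lib_mz) <= tol_da for q in query_mzs):
            matched += 1
    return matched


if __name__ == "__main__":
    # Hypothetical entries; the m/z values are made up for illustration.
    library = [
        {"peptide": "entry_A", "precursor_mz": 500.000},
        {"peptide": "entry_B", "precursor_mz": 500.010},
    ]
    hits = candidate_entries(500.002, library, tol_ppm=10.0)
    print([entry["peptide"] for entry in hits])
```

Because candidates are selected by mass and then compared peak by peak, the enzyme and modification settings never enter the comparison, which is consistent with modified and non-tryptic library entries being matched.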
Spectral library searching is not applicable when the goal is to discover novel peptides or proteins, because only peptides already represented in the library can be matched. Fortunately, more and more high-quality mass spectra are being acquired through the collective contribution of the scientific community, which will continue to expand the coverage of peptide spectral libraries.
Spectral library searching offers two main advantages over sequence database searching. First, a greatly reduced search space decreases the search time. Second, by taking full advantage of all spectral features, including relative fragment intensities, neutral losses from fragments, and various additional specific fragments, spectral searching becomes more specific and generally provides better discrimination between true and false matches.
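To show why relative fragment intensities help discrimination, the following Python sketch scores two spectra with a normalized dot product (cosine similarity) over m/z bins. This is a generic similarity measure chosen for illustration; the bin width and function names are assumptions, and actual library search tools may weight or transform intensities differently.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks that fall in the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        binned[index] = binned.get(index, 0.0) + intensity
    return binned


def normalized_dot_product(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two binned spectra, in the range 0 to 1."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up peak lists: same fragment m/z values, different intensity patterns.
    library_spectrum = [(175.1, 100.0), (262.2, 20.0), (375.3, 60.0)]
    similar_query = [(175.1, 50.0), (262.2, 10.0), (375.3, 30.0)]
    shuffled_query = [(175.1, 20.0), (262.2, 100.0), (375.3, 60.0)]
    print(normalized_dot_product(library_spectrum, similar_query))
    print(normalized_dot_product(library_spectrum, shuffled_query))
```

Both queries share every fragment m/z with the library spectrum, so a score based only on matched peak positions could not separate them; the intensity-aware score is clearly lower for the query whose intensity pattern differs.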
Modern tandem MS instruments combine fast duty cycles, exquisite sensitivity, and unprecedented mass accuracy, which makes tandem mass spectrometry an ideal match for large-scale protein identification and quantification in complex biological systems. In a shotgun proteomics approach, proteins in a complex mixture are digested by proteolytic enzymes such as trypsin. Subsequently, one or more chromatographic separations are applied to resolve the resulting peptides, which are then ionized and analyzed in a mass spectrometer. To acquire a tandem mass spectrum, a particular peptide precursor is isolated and fragmented in the mass spectrometer, and the mass spectrum corresponding to the fragments of the peptide precursor is recorded. Tandem mass spectra contain specific information about the sequence of the peptide precursor, which can aid the identification of peptides and proteins.
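The link between a peptide sequence and its tandem mass spectrum can be sketched by computing the m/z values of singly charged b and y fragment ions from monoisotopic residue masses. The Python example below is a minimal sketch for unmodified peptides; it ignores other ion types, higher charge states, and neutral losses, and the function name fragment_ions is hypothetical.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # mass of a proton


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z lists, b1..b(n-1) and y1..y(n-1)."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal fragment
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def precursor_mz(peptide, charge):
    """m/z of the intact, unmodified peptide at the given positive charge."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    b_ions, y_ions = fragment_ions("PEPTIDE")
    print("b:", [round(mz, 4) for mz in b_ions])
    print("y:", [round(mz, 4) for mz in y_ions])
    print("[M+2H]2+:", round(precursor_mz("PEPTIDE", 2), 4))
```

Each consecutive pair of b ions (or y ions) differs by one residue mass, which is why the fragment series carries sequence information.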
For a peptide spectral library, reaching maximal coverage is a long-term goal, even with the support of the scientific community and ever-improving proteomic technologies. However, optimizing a particular module of the library is a more manageable goal, e.g. the proteins in a particular organelle or those relevant to a particular biological phenotype. For example, a researcher studying the mitochondrial proteome will likely focus on the protein modules within the mitochondria. A peptide spectral library focused on a particular research community can therefore support that community's targeted research comprehensively.