Key Dates

* First Call for Papers: September 28, 2009
* Paper submissions begin: November 15, 2009
* Paper submissions deadline: December 15, 2009
* Paper decisions announced: February 1, 2010
* Conference dates: March 26-28, 2010

Program Committee Chairs

Nuno Bandeira, University of California, San Diego, USA

Oliver Kohlbacher, University of Tuebingen, Germany

Martin McIntosh, Fred Hutchinson Cancer Research Center, USA

Tutorials

Tutorials will be offered on Friday, March 26th, 2010.


Proteogenomics: using mass spectrometry for genome and proteome annotation.

Presented by Natalie Castellana, Dept of Computer Science and Engineering, University of California, San Diego.

Download Slides

Abstract:

Genome annotation is the process of determining the function, if any, of every base pair in the genome, including determining the structure and location of all genes. The most common methods for annotating genes combine evidence from multiple sources, including ab initio gene predictions, cDNA and EST sequencing, homology mapping with related species, and manual curation. These methods depend on canonical genomic signals or sampling of the transcriptome to determine transcription and translation boundaries, as well as exon boundaries and splice junctions.

The field of proteogenomics has emerged as a complementary approach to gene annotation. Proteogenomic studies are methodologically similar to high-throughput proteomics experiments. However, instead of searching a database of known protein sequences, they search a database of predicted proteins anchored on the genome. In this way, the identified peptides can be used to validate hypothetical or predicted protein sequences and their coordinates on the genome. Most studies proceed by creating specialized databases containing putative proteins. These databases may be built directly from the genome or from transcript data. Tandem mass spectra can then be identified using database-search tools. Each peptide-spectrum match provides evidence of translation at one or more locations on the genome. The peptides can then be used to validate known genes, identify novel genes, or improve known gene models.

This tutorial presents a typical proteogenomic workflow, including considerations for building databases, identifying spliced peptides, and assigning confidence to novel genes and gene improvements, as well as the general goals and limitations of the methods. The tutorial focuses predominantly on eukaryotic gene annotation using proteogenomic methods; however, much of the information is relevant to prokaryotic organisms as well.
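As a rough illustration of building a database directly from the genome, the sketch below translates a nucleotide sequence in all six reading frames and keeps each stop-free stretch as a putative protein entry. It is a minimal Python sketch, not the presenter's pipeline: the function names, the `min_length` cutoff, and the decision to ignore splicing and start codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_orfs(genome, min_length=7):
    """Yield (strand, frame, offset, peptide) for every stop-free stretch.

    `offset` is the nucleotide position of the stretch on the translated
    strand; for the "-" strand it is measured on the reverse complement.
    `min_length` is a hypothetical cutoff, not a recommended value.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            start = 0  # index of the current segment within `protein`
            for segment in protein.split("*"):
                if len(segment) >= min_length:
                    yield strand, frame, frame + 3 * start, segment
                start += len(segment) + 1  # skip the stop codon as well


if __name__ == "__main__":
    for entry in six_frame_orfs("ATGGCCAAATAAGGG", min_length=3):
        print(entry)
```

Keeping the strand, frame, and offset with each entry is what allows an identified peptide to be mapped back to genomic coordinates. A eukaryotic workflow would additionally need entries that span splice junctions, which this sketch does not attempt.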

Statistical design and analysis of quantitative mass spectrometry-based proteomic experiments

Presented by Olga Vitek, Depts of Statistics and Computer Science, Purdue University.

Download Slides

Abstract:

Mass spectrometry is a method of choice for quantitative proteomics, due to its sensitivity and the versatility of its instrumentation. Despite recent progress in analytical capabilities, data from proteomic experiments exhibit a substantial amount of variation and uncertainty. For example, in clinical investigations that compare protein abundance between patients with and without disease, the measurements reflect the natural variation of protein abundance between patients, as well as the technical variation introduced during sample collection and spectral acquisition. Statistical reasoning allows us to draw objective and reproducible conclusions in the presence of such variation, and should be applied at least twice: when designing a new experiment, and when deriving conclusions from the acquired spectra. However, the choice of methods and their appropriate use are not yet well established in the broader proteomic community.

The tutorial reviews best practices for experimental design and for the interpretation of data generated with bottom-up mass spectrometry-based quantitative proteomics. First, we discuss the fundamental principles of statistical experimental design, in particular translating the goal of the study into statistical hypotheses, selecting biological samples for the study from the underlying populations, and allocating experimental resources for sample handling and spectral acquisition. Second, we discuss specifications of probabilistic models that describe the major sources of variation and allow us to detect proteins with differential abundance, estimate protein abundance in individual samples, and test pre-defined groups of proteins for enrichment in differential abundance.

We illustrate the discussion using a case study, which applies statistical reasoning to the design and analysis of a label-free LC-MS-based investigation of patients with coronary artery disease. We discuss the motivation underlying various statistical analysis steps and provide examples of computer code in open-source statistical software. We also point out alternative statistical methods that can be applied for these tasks, and possible extensions for experiments with labeling workflows and/or targeted profiling.
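To make the idea of detecting differential abundance concrete, the sketch below computes Welch's unequal-variance t statistic for one protein from log2-transformed intensities in two patient groups. This is a deliberately simple stand-in and not the probabilistic models covered in the tutorial; the function name, the input layout (one summarized intensity per patient), and the log2 transform are assumptions made for illustration.

```python
from math import log2, sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's t-test on log2 intensities of one protein.

    `group_a` and `group_b` are lists of positive intensities, one value per
    biological sample (hypothetical input layout). Each group needs at least
    two samples with non-identical values.
    """
    a = [log2(x) for x in group_a]
    b = [log2(x) for x in group_b]
    # Squared standard errors of the two group means.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


if __name__ == "__main__":
    t, df = welch_t([2.0, 4.0, 8.0], [8.0, 16.0, 32.0])
    print(f"t = {t:.3f}, df = {df:.1f}")
```

A test of this kind treats each protein independently and uses only between-patient variation. Testing many proteins at once would also call for a multiple-testing correction, which is not shown here.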

Dereplication and Sequencing of Cyclic Non-Ribosomal Peptides Using Mass Spectrometry

Presented by Julio Ng, Bioinformatics Program, University of California, San Diego.

Download Slides

Abstract:

De novo sequencing of natural products remains a bottleneck in the pharmaceutical industry's efforts to discover bioactive compounds. Nonribosomal peptides (NRPs) are arguably the most important class of natural products, with an unparalleled record in the pharmaceutical industry: cephalosporin, vancomycin, cyclosporin, luzopeptin A, and bleomycin are among the best-selling drugs derived from NRPs. Because nonribosomal peptides have been optimized over millions of years to control growth, self-defense, and predation, they often display drug-like features that are rarely found in synthetic molecules. Unfortunately, recent advances in next-generation sequencing do not bring us any closer to sequencing NRPs, since the amino acid sequences of NRPs are not encoded in the genomes of the producing organisms. We have developed a set of tools to aid natural product researchers in the study of cyclic NRPs using mass spectrometry.

Assigning statistical significance to peptide/protein identifications

Presented by Sangtae Kim, Dept of Computer Science and Engineering, University of California, San Diego.

Download Slides

Abstract:

One of the major problems in MS/MS database search is estimating the error rates of peptide-spectrum matches (PSMs). Several approaches that assign statistical significance to PSMs have been introduced, such as 1) spectrum-specific E-value approximation (e.g. X!Tandem and OMSSA), 2) probabilistic score assignment using machine learning methods (e.g. PeptideProphet and Percolator), and 3) the target-decoy search strategy. In the tutorial, we will review these existing approaches and discuss their advantages and shortcomings. We then introduce the recently proposed generating function approach for computing rigorous p-values of PSMs and compare it with the other approaches. Lastly, we critique the "two-peptide rule" often used for protein identifications and present an idea for extending the generating function approach to evaluate matches between a protein and an entire spectral data set.
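The target-decoy strategy mentioned in this abstract can be sketched in a few lines. The Python example below assumes each PSM is given as a `(score, is_decoy)` pair with higher scores being better, that the decoy database is the same size as the target database, and that the false discovery rate is estimated as decoys divided by targets above a threshold. The function names and this particular estimator are illustrative assumptions, and the example does not cover the generating function approach.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) pairs. The estimate is
    (# decoy hits) / (# target hits), which assumes equal-size target
    and decoy databases.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr.

    Returns None if no threshold satisfies the requested FDR.
    """
    best = None
    # Walk the distinct scores from best to worst, remembering the most
    # permissive threshold that still meets the FDR requirement.
    for score in sorted({s for s, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, score) <= max_fdr:
            best = score
    return best


if __name__ == "__main__":
    example = [(10, False), (9, False), (8, False),
               (7, True), (6, False), (5, True)]
    print(threshold_for_fdr(example, max_fdr=0.25))
```

Note that this estimates an error rate for a whole set of PSMs, whereas E-values and p-values attach a significance to each individual match.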