Colloquium - Paul Medvedev

(University of California, San Diego)

"Algorithms for Reconstructing Genomes Using High-Throughput Sequencing"

What: Colloquium
When: Feb 13, 2012, from 10:00 am to 11:00 am
Where: 333 IST Bldg.
Contact Name: Raj Acharya
Contact Phone: 865-0301

Whole-genome shotgun sequencing is an experimental technique for obtaining information about a genome's sequence: the genome is broken up into millions of short segments (called reads), whose sequences are then determined. Recent technological advances hold the potential for tremendous biomedical discovery, enabling us to characterize the genomes of thousands of species, to discover the wide spectrum of human genetic variation, and to identify the wide range of bacteria making up the human microbiome. However, the challenges posed by the novelty and sheer quantity of the data are increasingly computational. A long-standing problem, called genome assembly, is how to infer the genomic sequence of an unknown species from its reads. In addition, even within the same species the genomes of two individuals differ, and the problem of detecting such variation has received a lot of attention in the last few years.

In this talk, we will describe algorithms for assembling genomes, discovering structural variants, and correcting errors in the reads. Our methods are based on genome graphs, which capture the structure of a genome even when its sequence is not fully known (as is the case with sequencing data). We show how traditional genome graph models can be extended to capture mate-pair information (pairs of reads at a known distance apart), which is crucial for improving the quality of assembly.

We also show how genome graphs can be used for detecting structural variation through a method called CNVer. CNVer is based on a maximum likelihood model, which is optimized using a reduction to the bidirected network-flow problem. We demonstrate CNVer's performance on a Yoruban human individual, showing a high degree of accuracy and an improvement over previous methods. No prior knowledge of biology is required.
