Resources for Exome sequencing annotation

With the recent tsunami of sequencing data and the falling cost of whole-exome capture technologies such as RainDance, Agilent SureSelect and NimbleGen, a lot of data is bound to end up with bioinformaticians. Once the reads are mapped and the variants in the sample are called, the first task is to annotate those variants: whether each one falls in a coding region, whether it is synonymous or not, and so on.
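To make the "synonymous or not" distinction concrete, here is a minimal Python sketch that classifies a single-base change within one codon using the standard genetic code. It is only an illustration and not how any of the tools below work internally; the function name `classify_coding_snv` and its arguments are hypothetical, and it assumes the codon is already given on the coding strand with a 0-based offset for the changed base.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snv(ref_codon, offset, alt_base):
    """Classify a single-base substitution inside one codon.

    ref_codon: reference codon on the coding strand, e.g. "GAG"
    offset:    0-based position of the changed base within the codon
    alt_base:  the alternate base, one of T, C, A, G
    """
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if len(ref_codon) != 3 or not set(ref_codon) <= set(BASES):
        raise ValueError("ref_codon must be three bases from TCAG")
    if offset not in (0, 1, 2):
        raise ValueError("offset must be 0, 1 or 2")
    if len(alt_base) != 1 or alt_base not in BASES:
        raise ValueError("alt_base must be a single base from TCAG")

    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"      # same amino acid (or stop stays stop)
    if alt_aa == "*":
        return "stop_gained"     # amino acid becomes a stop codon
    if ref_aa == "*":
        return "stop_lost"       # stop codon becomes an amino acid
    return "missense"            # one amino acid replaced by another


if __name__ == "__main__":
    print(classify_coding_snv("GAA", 2, "G"))  # GAA -> GAG, Glu -> Glu
    print(classify_coding_snv("GAG", 1, "T"))  # GAG -> GTG, Glu -> Val
```

A real annotation also has to work out which transcript and reading frame a genomic position belongs to, which is the part the dedicated tools take care of.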

Three tools seem to cover most of what such an annotation needs: SIFT, SeattleSeq Annotation and BEDTools. The first two annotate a variant call, report allele frequencies for it, and help filter the variants down to a manageable list; their functionality overlaps in places. BEDTools is a very useful suite for quickly merging and intersecting mapped reads, variant calls, etc., with BED annotation files. A single 'intersect' command on a BAM alignment file and a BED annotation file returns all the reads that map on-target in a capture sequencing run.
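The on-target idea is simple enough to sketch in a few lines of Python. The snippet below is not BEDTools and does not read BAM or BED files; it assumes reads and targets have already been parsed into `(chrom, start, end)` tuples using BED-style 0-based, half-open coordinates, and the helper names `build_index` and `is_on_target` are made up for this example. A read counts as on-target if it overlaps any target interval by at least one base.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(targets):
    """Group target intervals by chromosome, then sort and merge them."""
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end:
                # Overlapping or touching: extend the previous interval.
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        index[chrom] = merged
    return index


def is_on_target(index, chrom, start, end):
    """True if the half-open read [start, end) overlaps any target."""
    intervals = index.get(chrom)
    if not intervals:
        return False
    # Find the last merged target that starts before the read ends.
    i = bisect_right(intervals, (end - 1, float("inf"))) - 1
    # It overlaps the read only if it also ends after the read starts.
    return i >= 0 and intervals[i][1] > start


if __name__ == "__main__":
    # Hypothetical capture targets and mapped reads.
    targets = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    reads = [
        ("chr1", 90, 110),   # overlaps the first target
        ("chr1", 300, 350),  # starts exactly where the target ends
        ("chr2", 10, 50),    # ends exactly where the target starts
        ("chr3", 1, 10),     # chromosome without targets
        ("chr1", 250, 260),  # inside the merged chr1 target
    ]
    index = build_index(targets)
    on_target = [r for r in reads if is_on_target(index, *r)]
    print(len(on_target), "of", len(reads), "reads are on-target")
```

Merging the targets first keeps each lookup to a single binary search, which matters once the read list runs into the millions.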

There are of course some shortcomings, and more work to be done, so this post will be updated soon!

