ASHG-2012 SESSION 29 – Next-Generation Sequencing: Methods and Applications

90/11:00 Accurate haplotype estimation using phase informative sequencing reads.

in the 1000 Genomes Project – 1/3 of heterozygous genotypes are covered by phase informative reads (PIRs)
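
A minimal sketch of what "covered by a PIR" means, assuming a read pair counts as phase informative when its mates together span at least two heterozygous sites of the sample. The function names and the interval layout are hypothetical and are not taken from SHAPEIT2; a real pipeline would also look at base and mapping quality.

```python
from bisect import bisect_left, bisect_right

def het_sites_covered(intervals, het_positions):
    """Return the heterozygous sites covered by one read pair.

    intervals: list of (start, end) inclusive ranges, one per mate (assumed layout).
    het_positions: sorted list of the sample's heterozygous site positions.
    """
    covered = set()
    for start, end in intervals:
        lo = bisect_left(het_positions, start)
        hi = bisect_right(het_positions, end)
        covered.update(het_positions[lo:hi])
    return covered

def fraction_hets_in_pirs(read_pairs, het_positions):
    """Fraction of het sites covered by at least one phase informative read pair."""
    if not het_positions:
        return 0.0
    phased = set()
    for intervals in read_pairs:
        sites = het_sites_covered(intervals, het_positions)
        if len(sites) >= 2:  # assumption: a PIR spans two or more het sites
            phased |= sites
    return len(phased) / len(het_positions)
```

In this sketch a pair whose mates each cover one het site still counts, which is how a longer insert size can link sites that a single read cannot.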

SHAPEIT1 – GWAS, SHAPEIT2 (=SHAPEIT1 + IMPUTE2) – NGS
– uses PIR and LD info between SNPs and across multiple samples
– can do whole-chr phasing
test case
– parents of euro trio (100bp reads, 300bp inserts) merged with 382 euro from 1000g
– BEAGLE gives error-free segments (EFS) of 115kb on avg, while SHAPEIT gives 180kb (without using any reads) to 200kb (using 5x coverage)
Effects of read-length, insert size and base-error
– 200bp SR reads – 5% longer and 500bp SR reads – 28% longer EFS
– 500bp insert – 13% longer and 1kb insert – 20% longer EFS; using a mixture of insert sizes gives high gains
– base error does not affect EFS much
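
The EFS figures above can be illustrated with a small sketch. The talk's exact definition is not in these notes, so this assumes an error-free segment is a maximal run of heterozygous sites with no switch error relative to a trusted (e.g. trio-derived) phasing, and that its length is the distance between the first and last site of the run. The function name is hypothetical.

```python
def error_free_segments(positions, truth, inferred):
    """Lengths (in bp) of segments phased without a switch error.

    positions: sorted positions of heterozygous sites.
    truth, inferred: allele (0/1) carried by the first haplotype at each site,
                     in the trusted phasing and in the estimated phasing.
    """
    if not positions:
        return []
    lengths = []
    seg_start = 0
    for i in range(1, len(positions)):
        prev_match = truth[i - 1] == inferred[i - 1]
        curr_match = truth[i] == inferred[i]
        # A switch error is a change in agreement between adjacent het sites.
        if prev_match != curr_match:
            lengths.append(positions[i - 1] - positions[seg_start])
            seg_start = i
    lengths.append(positions[-1] - positions[seg_start])
    return lengths
```

Because only changes in agreement are counted, an estimate with both haplotypes swapped end to end is treated as error-free, which matches how switch errors are usually scored.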

91/11:15 An LD-based method for genotype calling and phasing using low-coverage sequencing reads and a haplotype scaffold

assumes individuals are both genotyped and low-coverage sequenced
haplotype scaffold is constructed using a phasing algorithm on genotyped SNP (pedigree aware)
parallelizable method that works on 1 position at a time
applied on phase-1 1000g
for chr20: all tools had excellent concordance
created haplotype sets for the 4 tools – MVNcall, Thunder, SNPTools, Beagle
very close curves, but MVNcall performs better on low-freq SNPs
using African haplotypes gives a performance boost, because the African samples are trio-phased

94/12:00 Methods for noninvasive prenatal determination of fetal genomes. M. W. Snyder

maternal plasma = maternal- and fetal-derived molecules
deep seq of the maternal plasma + haplotype-resolved genome seq of parents = prenatally predict inheritance of parental alleles and identify de novo mutations
developed HMM – uses allelic imbalance across parental haplotype blocks to predict transmission based on sequence data from maternal plasma – identify recomb, fix parental phasing errors
test case = trio (WG seq of parents and haplotype resolved maternal genome + deep seq maternal plasma)
99.3% accuracy of inferred inherited alleles at 1.1 million phased maternal-only hets
99.7% accuracy in top 95% haplotype block-length ranked sites (hap-blocks help mitigate sampling noise)
prediction accuracy decreases with population minor allele freq
HMM predicted 72 blocks to be partially transmitted – likely recomb events
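
The HMM idea noted above can be sketched as a two-state Viterbi decoder. This is an illustration under stated assumptions, not the speakers' model: it works per site instead of per haplotype block, it assumes a known fetal fraction, a fixed switch probability between adjacent sites, a binomial read-count model, and a father who is homozygous at every site. All names and default values are hypothetical.

```python
import math

def expected_fraction_a(transmitted_a, father_carries_a, fetal_fraction):
    """Expected plasma fraction of the allele on maternal haplotype A."""
    # Fetus carries 0, 1 or 2 copies of the hap-A allele (father assumed homozygous).
    fetal_dosage = int(transmitted_a) + int(father_carries_a)
    return 0.5 * (1 - fetal_fraction) + fetal_fraction * fetal_dosage / 2

def log_binom(k, n, p):
    # Binomial coefficient omitted: it is the same for both states.
    return k * math.log(p) + (n - k) * math.log(1 - p)

def viterbi_transmission(sites, fetal_fraction=0.1, switch_prob=1e-4):
    """Most likely transmitted maternal haplotype ('A' or 'B') at each site.

    sites: list of (count_a, count_b, father_carries_a) at maternal-only hets,
           where count_a/count_b are plasma reads for the hap-A/hap-B allele.
    """
    if not sites:
        return []
    states = (True, False)  # True = haplotype A transmitted
    log_stay, log_switch = math.log(1 - switch_prob), math.log(switch_prob)
    prev, back = None, []
    for count_a, count_b, father_carries_a in sites:
        emit = {s: log_binom(count_a, count_a + count_b,
                             expected_fraction_a(s, father_carries_a, fetal_fraction))
                for s in states}
        if prev is None:
            current = {s: math.log(0.5) + emit[s] for s in states}
        else:
            current, pointers = {}, {}
            for s in states:
                trans = {r: prev[r] + (log_stay if r == s else log_switch)
                         for r in states}
                best = max(states, key=trans.get)
                current[s] = trans[best] + emit[s]
                pointers[s] = best
            back.append(pointers)
        prev = current
    state = max(states, key=prev.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return ["A" if s else "B" for s in path]
```

A change from "A" to "B" along the returned path corresponds to what the notes call a partially transmitted block, i.e. a candidate recombination event or a parental phasing error.
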
de novo fetal mutations = high-quality call in plasma inconsistent with Mendelian inheritance; 89% accurate, but 25 million calls = low specificity (mostly due to seq errors)
– downstream (ad-hoc) filtering of these de novo calls led to 4000 variants
– 17 of these were predicted and validated as true de novo mutations
– now developing SVM filtering approach trained with known inherited paternal variants
– may yield a generalizable method
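
The first-pass de novo criterion above (a plasma call inconsistent with Mendelian inheritance) can be sketched as a simple check at one site. The function name and genotype representation are hypothetical; as the notes say, most calls passing such a check are sequencing errors, so this is only the starting point before filtering.

```python
def is_candidate_de_novo(plasma_allele, mother_genotype, father_genotype):
    """True if an allele seen in plasma is carried by neither parent.

    mother_genotype, father_genotype: tuples of the two alleles at the site,
    e.g. ("A", "A"). Assumes parental genotypes are called without error.
    """
    parental_alleles = set(mother_genotype) | set(father_genotype)
    return plasma_allele not in parental_alleles

# Usage: a "T" in plasma where both parents are homozygous "C" is a candidate.
candidate = is_candidate_de_novo("T", ("C", "C"), ("C", "C"))
```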
