Probabilistic Inference of Viral Quasispecies Subject to Recombination

Denis Jacob Machado

May 14, 2013, 2:49:16 PM
to sequ...@googlegroups.com

ABSTRACT
RNA viruses exist in their hosts as populations of different but related strains. The virus
population, often called quasispecies, is shaped by a combination of genetic change and
natural selection. Genetic change is due to both point mutations and recombination events.
We present a jumping hidden Markov model that describes the generation of viral
quasispecies and a method to infer its parameters from next-generation sequencing data. The
model introduces position-specific probability tables over the sequence alphabet to explain
the diversity that can be found in the population at each site. Recombination events are
indicated by a change of state, allowing a single observed read to originate from multiple
sequences. We present a specific implementation of the expectation maximization (EM)
algorithm to find maximum a posteriori estimates of the model parameters and a method to
estimate the distribution of viral strains in the quasispecies. The model is validated on
simulated data, showing the advantage of explicitly taking the recombination process into
account, and applied to reads obtained from a clinical HIV sample.
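
To make the generative idea in the abstract concrete, here is a minimal sketch of how a single read could be sampled from a toy jumping hidden Markov model. It is not the authors' implementation. The names `make_table` and `sample_read`, the uniform choice of the starting generator, and the single constant jump probability are all assumptions made for illustration; the abstract does not specify these details.

```python
import random

ALPHABET = "ACGT"

def make_table(seq, error=0.01):
    """Build position-specific probability tables around one sequence.

    Hypothetical helper: each site puts 1 - error on the given base and
    spreads the remaining mass evenly over the other three bases.
    """
    return [
        {b: (1.0 - error) if b == base else error / 3 for b in ALPHABET}
        for base in seq
    ]

def sample_read(tables, start, length, jump_prob, rng):
    """Sample one read from a toy jumping HMM.

    tables[k][j] maps each base to its probability for generator k at
    site j. Returns (read, path), where path records which generator
    emitted each site. A change of generator along the path plays the
    role of a recombination event.
    """
    n_generators = len(tables)
    # Assumption: the starting generator is chosen uniformly at random.
    state = rng.randrange(n_generators)
    read, path = [], []
    for j in range(start, start + length):
        # Assumption: one constant jump probability between adjacent sites.
        if j > start and n_generators > 1 and rng.random() < jump_prob:
            state = rng.choice([k for k in range(n_generators) if k != state])
        probs = tables[state][j]
        base = rng.choices(ALPHABET, weights=[probs[b] for b in ALPHABET])[0]
        read.append(base)
        path.append(state)
    return "".join(read), path

if __name__ == "__main__":
    rng = random.Random(2013)
    tables = [make_table("ACGTACGTAC"), make_table("ACCTACGAAC")]
    read, path = sample_read(tables, start=2, length=6, jump_prob=0.1, rng=rng)
    print(read, path)
```

With `jump_prob` set to 0 every read comes from a single generator, which corresponds to ignoring recombination; raising it lets one read mix several generators. Inference as described in the abstract would go the other way, estimating the tables and jump behaviour from observed reads, which this sketch does not attempt.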