Beta 2 Release of 0.8.0

aroth

Feb 24, 2012, 3:45:36 PM

to JointSNVMix User Group

The second beta of the 0.8.0 version is now posted on the downloads
page http://code.google.com/p/joint-snv-mix/downloads/list.

This version brings several changes, most notably:

1) JointSNVMix will now be under the GPL2 license.
2) MutationSeq-like post-processing is now included via the
--post_process flag. See below for more details.
3) Default output for classify is the screen. To output to a file use
the --out_file flag.
4) Training and classifying by chromosome is possible with the
--chromosome flag.
5) Fisher, Threshold and Independent SNVMix models are removed.
6) A BetaBinomial model is now available. Models can be selected via
the --model flag for both training and classifying. Note that not all
models are JointSNVMix-type models.
7) Pysam and Cython are no longer dependencies. ALGLIB is now a
dependency.

This version is still a beta because of a lack of testing and
documentation. Installation information is available on the wiki
http://code.google.com/p/joint-snv-mix/wiki/Installation. If you want
to try the beta, help can be obtained by running 'jsm.py train -h' and
'jsm.py classify -h'.

The most important feature is the new post-processing module. For a
description of the basic idea see the MutationSeq paper "Feature-based
classifiers for somatic mutation detection in tumour–normal paired
sequencing data" by J. Ding. The current implementation in JointSNVMix
uses a random forest, hence the new dependency on ALGLIB. If the
--post_process flag is passed, an extra column is appended to the
normal JointSNVMix output containing the random forest's probability
that the site is somatic. This value should be more accurate than the
JointSNVMix probabilities because it uses many features beyond count
data, such as the presence of homopolymer runs. Passing sites through
the post-processor is a bit slow, so it is recommended to set
--somatic_threshold to at least 0.01. With a threshold of 0.01, only
sites with a JointSNVMix somatic probability of 0.01 or higher are
post-processed and printed to screen.
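
To illustrate why the threshold saves time, here is a minimal Python sketch of the filtering idea. It is not the JointSNVMix code: the Site tuple layout, the post_process_sites function and the score callback are all hypothetical names made up for this illustration. The only behaviour taken from the description above is that sites below the threshold never reach the slow post-processor.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical site record: (chromosome, position, JointSNVMix somatic probability).
Site = Tuple[str, int, float]


def post_process_sites(
    sites: Iterable[Site],
    score: Callable[[Site], float],
    somatic_threshold: float = 0.01,
) -> Iterator[Tuple[Site, float]]:
    """Yield (site, post-processed probability) for sites at or above the threshold.

    `score` stands in for the slow random forest step; it is an assumption,
    not the real post-processor.
    """
    for site in sites:
        p_somatic = site[2]
        if p_somatic < somatic_threshold:
            # Cheap skip: the expensive scorer is never called for this site.
            continue
        yield site, score(site)
```

Because the comparison is a single float test, raising the threshold directly reduces the number of calls to the expensive step.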

The examples below assume the reference genome is ref_genome.fa, the
normal BAM is normal.bam and the tumour BAM is tumour.bam.

Example: To train JointSNVMix2 on chromosome 22 only, subsampling
every 10th site, and save the parameters in params_22.cfg:

jsm.py train ref_genome.fa normal.bam tumour.bam params_22.cfg --chromosome=22 --model=snvmix2 --skip_size=10

Example: To classify using JointSNVMix1 on chromosome 22, using a
custom parameters file and post-processing sites with p_somatic >= 0.2:

jsm.py classify ref_genome.fa normal.bam tumour.bam --parameters_file=params_22.cfg --chromosome=22 --post_process --somatic_threshold=0.2
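
For anyone scripting runs over several chromosomes, a small Python helper along these lines could assemble the classify command. This is only a sketch: build_classify_command is a hypothetical helper, the calls_<chromosome>.tsv output names are made up, and passing --out_file in the --flag=value form is an assumption based on the style of the examples above. It uses only the flags mentioned in this post.

```python
import shlex
from typing import List, Optional


def build_classify_command(
    ref_genome: str,
    normal_bam: str,
    tumour_bam: str,
    chromosome: Optional[str] = None,
    parameters_file: Optional[str] = None,
    post_process: bool = False,
    somatic_threshold: Optional[float] = None,
    out_file: Optional[str] = None,
) -> List[str]:
    """Return the argument list for one 'jsm.py classify' run (hypothetical helper)."""
    cmd = ["jsm.py", "classify", ref_genome, normal_bam, tumour_bam]
    if parameters_file is not None:
        cmd.append("--parameters_file={}".format(parameters_file))
    if chromosome is not None:
        cmd.append("--chromosome={}".format(chromosome))
    if post_process:
        cmd.append("--post_process")
    if somatic_threshold is not None:
        cmd.append("--somatic_threshold={}".format(somatic_threshold))
    if out_file is not None:
        # Assumption: --out_file accepts the same --flag=value form as the other flags.
        cmd.append("--out_file={}".format(out_file))
    return cmd


if __name__ == "__main__":
    # Print one shell-quoted command per chromosome; nothing is executed here.
    for chrom in ["21", "22"]:
        cmd = build_classify_command(
            "ref_genome.fa", "normal.bam", "tumour.bam",
            chromosome=chrom,
            parameters_file="params_22.cfg",
            post_process=True,
            somatic_threshold=0.2,
            out_file="calls_{}.tsv".format(chrom),  # made-up output name
        )
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The helper only builds and prints the commands, so they can be checked before being handed to a shell or a job scheduler.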

I would appreciate any feedback.

Cheers,
Andy

RG

Apr 5, 2012, 10:51:33 AM

to JointSNVMix User Group

Hi Andy,
I just gave the second beta a try. Installation went smoothly, no
issues at all.
I used the --post_process flag and I am seeing a lot of these messages:

Exception ZeroDivisionError: 'float division' in 'joint_snv_mix.post_processing.feature_extractor.FeatureExtractor._get_normalised_ratio' ignored

Cheers,
~Raad


aroth

Apr 5, 2012, 2:30:24 PM

to JointSNVMix User Group

Thanks for pointing this out, Raad. Does the classification step still
finish, though?

The problem seems to manifest when computing features for
classification at sites with zero coverage in the normal. This might
cause problems for the post-processed probabilities at such sites, but
generally sites with no coverage should be ignored anyway.

I will try to get an updated version out with a fix once I've
confirmed that this is the only thing causing the error.
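
As a rough illustration of the kind of guard that avoids this error, here is a minimal Python sketch. It is not the actual _get_normalised_ratio code; the function name, its arguments and the default value of 0.0 are assumptions chosen only to show a division that tolerates zero coverage.

```python
def normalised_ratio(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator / denominator, or `default` when the denominator is zero.

    Hypothetical stand-in for a feature computed from read counts; the
    fallback value is an assumption, not the behaviour of JointSNVMix.
    """
    if denominator == 0:
        # Zero coverage: no meaningful ratio exists, so return the fallback
        # instead of raising ZeroDivisionError.
        return default
    return numerator / denominator
```

Whether a fallback value or skipping the site altogether is the better choice depends on how the classifier treats that feature.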

Cheers,
Andy

RG

Apr 9, 2012, 4:56:46 PM

to JointSNVMix User Group

Hi Andy,
Yes, the classification step finished. I examined the output file and
can't see anything unusual.

Cheers,
~Raad
