Hi Erik,
Thanks for the tutorial. I like that Freebayes is "pretty vanilla" for beginners until they are confident enough to explore more. I am working on Tumor-Normal pairs and I followed your advice on a biostar thread where you said that one can call somatic variants in a tumor-normal pair as:
freebayes -f ref.fa --pooled-discrete --genotype-qualities tumor.bam normal.bam | vcfsamplediff -s VT normal tumor -
So I used the same command and the resulting VCF file named, 'freebayes_pooled-discrete.vcf' had
##INFO=<ID=SSC,Number=1,Type=Float,Description="Somatic variant score
(phred-scaled probability that the somatic variant call is
correct).">
However, looking at some of the data lines and using 'grep SSC
freebayes_pooled-discrete.vcf', none of the called variants had the
Somatic variant score (SSC). Could you be knowing why this is so?
The exact command line I used was:
freebayes --no-indels --no-mnps --no-complex -f hg19.fa \
--pooled-discrete --genotype-qualities normal.bam tumor.bam | \
./vcfsamplediff -s VT normal.bam tumor.bam - | ./vcffilter \
-f "QUAL > 20" > freebayes_pooled-discrete.vcf
Also, I am using muTect and it has the PASS and SOMATIC defined as:
##FILTER=<ID=PASS,Description="Accept as a confident somatic mutation">
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic event">
How do these compare with freebayes as I used it? I am trying to
determine which parameters will best make freebayes and muTect
comparable when it comes to calling somatic point mutations.
The muTect command I used is:
java -Xmx2g -jar muTect-1.1.4.jar --analysis_type MuTect \
--reference_sequence hg19.fa --dbsnp dbsnp_137.hg19.vcf \
--input_file:normal normal.bam --input_file:tumor tumor.bam \
--out mutect_results.txt --coverage_file mutect_coverage.wig \
--enable_extended_output \
--vcf mutect_results_extended.vcf
Would really appreciate your advice!
Johnbosco