Because my strain, Lactobacillus pentosus SLC13, is not listed in the snpEff database, I decided to build a snpEff database for it and add it to my bcbio custom genome.
With that in place, the analysis works fine.
I am sharing my workflow below. If you have any suggestions, please let me know.
1. Download the GenBank file.
2. Check the GenBank file. Make sure that the sequence name in the FASTA produced by gb2genome.py is the same as the sequence name in the CONTIG line.
If the names do not match, "ERROR_CHROMOSOME_NOT_FOUND" messages will appear in the VCF files.
(Note: gb2genome.py uses NZ_CP022130.1 as the sequence name.)
Genbank file (Original header):
Genbank file (Original sequence header):
Genbank file (modified header; I suspect it is not essential to modify this part):
Genbank file (Modified sequence header):
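The name check in step 2 can be done from the command line. This is only a rough sketch: the helper names gb_names and fasta_names are my own, and it assumes the GenBank names of interest sit in the second field of the LOCUS and VERSION lines and that the FASTA name is the first word after ">".

```shell
# Hypothetical helpers for comparing sequence names (not part of snpEff or bcbio).

# Print the names found on the LOCUS and VERSION lines of a GenBank file.
gb_names() {
    awk '$1 == "LOCUS" || $1 == "VERSION" { print $1 ": " $2 }' "$1"
}

# Print the first word of every FASTA header, without the leading ">".
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}
```

For example, run "gb_names sequence.gb" and "fasta_names genome.fa" (both file names are placeholders) and compare the output by eye. The CONTIG line is not parsed here, so it is still worth checking with grep.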
3. Create the snpEff database
mkdir -p data/Lactobacillus_pentosus_SLC13
mv /path/of/modified/sequence.gb data/Lactobacillus_pentosus_SLC13/genes.gbk
The database name "Lactobacillus_pentosus_SLC13" is the same as the name of my custom genome build.
Then modify snpEff.config.
Open snpEff.config, add the lines shown in the figure, and save.
(If the sequence is not from a bacterium or a plasmid, the "(database name).(Sequence name).codonTable : Bacterial_and_Plant_Plastid" line should not be required.)
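Since the figure does not come through in plain text, here is a sketch of what the added snpEff.config lines could look like, written as a shell snippet that appends them. It assumes the usual snpEff entry format ("name.genome : description"), the sequence name NZ_CP022130.1 from step 2, and a snpEff.config in the current directory. The description text and the SNPEFF_CONFIG variable are my own placeholders.

```shell
# Assumed values: adjust to your own database name, sequence name and config path.
db=Lactobacillus_pentosus_SLC13
seq=NZ_CP022130.1
config=${SNPEFF_CONFIG:-snpEff.config}

# Append the genome entry and the codon table entry to the config file.
cat >> "$config" <<EOF

# ${db} (custom genome)
${db}.genome : Lactobacillus pentosus SLC13
${db}.${seq}.codonTable : Bacterial_and_Plant_Plastid
EOF
```

Running it twice adds the entries twice, so check the end of snpEff.config afterwards.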
Finally, build the snpEff database:
snpEff build -genbank -v Lactobacillus_pentosus_SLC13
If you want to check the chromosome names in the built snpEff database, you can run:
snpEff Lactobacillus_pentosus_SLC13 -v
4. Add the snpEff database to bcbio
Then add the snpEff setting to seq/Lactobacillus_pentosus_SLC13-resources.yaml.
Also, the documentation has a typo:
The directory to create is "snpeff", not "snpEff".
Because I created a directory named snpEff and stored the snpEff data there, I got an error.
5. Run the analysis, and you get a well-annotated VCF file.
On Wednesday, February 7, 2018 at 10:11:27 PM UTC+8, Brad Chapman wrote: