Bug#1029202: snippy: Error when using snpeff 5.1

Andreas Tille
Jan 19, 2023, 9:43:30 AM
Package: snippy
Version: 4.6.0+dfsg-1
Severity: important
X-Debbugs-Cc: Pierre Gruet <p...@debian.org>

Hi,

I was informed that snippy does not behave correctly in all cases when
snpeff 5.1 is used. A colleague, by contrast, uses it successfully with
snpeff 5.0. You can verify this with the following test script:


#!/bin/sh
# create tmp dir
TMPDIR=$(mktemp -d /tmp/snippyXXXXX)
cd "$TMPDIR" || exit 1
# download public read data
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR201/004/SRR2014554/SRR2014554_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR201/004/SRR2014554/SRR2014554_2.fastq.gz
# download reference data
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.gbff.gz
gunzip GCF_000005845.2_ASM584v2_genomic.gbff.gz
mkdir tmp

set -x
snippy --cpus 4 --ram 20 --tmpdir ./tmp --reference GCF_000005845.2_ASM584v2_genomic.gbff --R1 SRR2014554_1.fastq.gz --R2 SRR2014554_2.fastq.gz --outdir results --report --mincov 20


This script ends with:

...
[15:25:13] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log
[15:25:13] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log
build -c reference/snpeff.config -dataDir . -gff3 ref
[15:25:16] Error running command, check results/snps.log


Checking the end of the log gives:

$ tail results/snps.log
transl_table : 11
translation : MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYANGSIHIGHSVNKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAAEFRAKCREYAATQVDGQRKDFIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHKGAKPVHWCVDCRSALAEAEVEYYDKTSPSIDVAFQAVDQDALKAKFAVSNVNGPISLVIWTTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVESVMQRIGVTDYTILGTVKGAELELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYGLETANPVGPDGTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQWFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHKDTEELHPRTLELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGSTHSSVVDVRPEFAGHAADMYLEGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQGRKMSKSIGNTVSPQDVMNKLGADILRLWVASTDYTGEMAVSDEILKRAADSYRRIRNTARFLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDILKAYEAYDFHEVVQRLMRFCSVEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFTADEVWGYLPGEREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAVTLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEKCPRCWHYTQDVGKVAEHAEICGRCVSNVAGDGEKRKFA
type : CDS
. File '/tmp/snippyBRsOJ/results/reference/./ref/genes.gff' line 30 'NC_000913 snippy CDS 22391 25207 . + 0 ID=b0026;eC_number=6.1.1.5;Name=ileS;codon_start=1;db_xref=UniProtKB/Swiss-Prot:P00956,ASAP:ABE-0000094,ECOCYC:EG10492,GeneID:944761;gene=ileS;gene_synonym=ECK0027%3B ilvS;locus_tag=b0026;product=isoleucine--tRNA ligase;protein_id=NP_414567.1;transl_table=11;translation=MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYANGSIHIGHSVNKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAAEFRAKCREYAATQVDGQRKDFIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHKGAKPVHWCVDCRSALAEAEVEYYDKTSPSIDVAFQAVDQDALKAKFAVSNVNGPISLVIWTTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVESVMQRIGVTDYTILGTVKGAELELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYGLETANPVGPDGTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQWFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHKDTEELHPRTLELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGSTHSSVVDVRPEFAGHAADMYLEGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQGRKMSKSIGNTVSPQDVMNKLGADILRLWVASTDYTGEMAVSDEILKRAADSYRRIRNTARFLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDILKAYEAYDFHEVVQRLMRFCSVEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFTADEVWGYLPGEREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAVTLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEKCPRCWHYTQDVGKVAEHAEICGRCVSNVAGDGEKRKFA'
WARNING_TRANSCRIPT_NOT_FOUND: Too many 'WARNING_TRANSCRIPT_NOT_FOUND' warnings, no further warnings will be shown.
WARNING_TRANSCRIPT_ID_DUPLICATE: Transcript 'b4616' already added. File '/tmp/snippyBRsOJ/results/reference/./ref/genes.gff' line 3839 'NC_000913 snippy ncRNA 3853118 3853190 .- . ID=b4616;Name=istR;db_xref=ECOCYC:G0-10201,GeneID:5061525;gene=istR;gene_synonym=ECK4425%3B istR-1%3B istR-2%3B psrA19;locus_tag=b4616;ncRNA_class=other;product=small regulatory RNA IstR-1'
WARNING_FRAMES_ZERO: All frames are zero! This seems rather odd, please check that 'frame' information in your 'genes' file is accurate.
ERROR: CDS check file '/tmp/snippyBRsOJ/results/reference/./ref/cds.fa' not found.
ERROR: Protein check file '/tmp/snippyBRsOJ/results/reference/./ref/protein.fa' not found.
ERROR: Database check failed.
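
The long translation lines make the tail of the log hard to read. A minimal sketch for pulling out only the diagnostic lines, assuming (as in the output above) that snpEff starts them with ERROR: or WARNING_...:. The function name is made up for illustration and is not part of snippy or snpEff:

```shell
#!/bin/sh
# Hypothetical helper (not part of snippy or snpEff): print only the
# ERROR and WARNING_* lines of a log, truncated to 120 characters so
# embedded protein translations do not flood the terminal.
snpeff_log_summary() {
    grep -E '^(ERROR|WARNING_[A-Z_]+):' "$1" | cut -c1-120
}

# Usage, assuming the --outdir from the script above:
# snpeff_log_summary results/snps.log
```

Continuation lines of a warning (such as the one starting with ". File" above) are dropped on purpose, so the full log is still needed for details.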



In contrast, my colleague reported that the script works when snippy
is run from a conda environment:

# create the environment
mamba create -n snippy
# install snippy (pinning snpeff to 5.0 is essential to avoid a bug when using
# GenBank files as reference input)
mamba activate snippy
mamba install -y -c conda-forge -c bioconda -c defaults snpeff=5.0 snippy
# check installation
snippy --check
# deactivate environment
mamba deactivate
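
To tell quickly which kind of installation one is looking at, a small check on the version string may help. This is only a sketch: the function name is invented, and the pattern covers just the series reported as failing in this bug (5.1 and 5.2), nothing more.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the version string is in
# a series reported as failing in this thread (5.1 or 5.2), fail otherwise.
# The pattern is deliberately simple and reflects only versions named here.
snpeff_reported_broken() {
    case "$1" in
        5.1|5.1[!0-9]*|5.2|5.2[!0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch; assumes `snpEff -version` prints a line containing the
# version number (the exact output format is an assumption):
# ver=$(snpEff -version 2>&1 | grep -Eo '[0-9]+\.[0-9]+[a-z]?' | head -n 1)
# snpeff_reported_broken "$ver" && echo "snpEff $ver: GenBank references may fail" >&2
```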


So we somehow need to find out the difference between snpeff 5.0 and 5.1
to make snippy work nicely with the Debian-packaged snpeff. Maybe
this could be discussed with upstream, but I simply wanted to leave a
record here.

Kind regards
Andreas.
-- System Information:
Debian Release: bookworm/sid
APT prefers testing
APT policy: (501, 'testing'), (50, 'buildd-unstable'), (50, 'unstable'), (5, 'experimental'), (1, 'buildd-experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 6.0.0-2-amd64 (SMP w/4 CPU threads; PREEMPT)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE=de_DE:de
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages snippy depends on:
ii any2fasta 0.4.2-2
ii bcftools 1.16-1
ii bedtools 2.30.0+dfsg-2
ii bwa 0.7.17-7
ii freebayes 1.3.6-2
ii libbio-perl-perl 1.7.8-1
ii libvcflib-tools 1.0.3+dfsg-2
ii minimap2 2.24+dfsg-3
ii parallel 20221122+ds-2
ii perl 5.36.0-7
ii samclip 0.4.0-4
ii samtools 1.16.1-1
ii seqtk 1.3-4
ii snp-sites 2.5.1-2+b1
ii snpeff 5.1+d+dfsg-2
ii vt 0.57721+ds-3

snippy recommends no packages.

snippy suggests no packages.

-- no debconf information

Andreas Tille
Jan 21, 2023, 9:00:04 AM
Hi Pierre,

Am Sat, Jan 21, 2023 at 10:00:32AM +0100 schrieb Pierre Gruet:
> I will provide the upstream of snpeff with a minimal non-working example, as
> I am unfortunately not able to understand it myself.

Thanks a lot, that's actually the help I was hoping for.

Andreas.

PS: Please add the tags upstream and forwarded to keep a close connection to this bug once you have contacted upstream.

--
http://fam-tille.de

Andreas Tille
Jan 25, 2023, 11:50:04 AM
Hi Pierre,

given that this bug might affect a number of reverse dependencies,
do you think it makes sense to upload a copy of version 5.0 under a
version like

5.1+d+dfsg+really+5.0

at least while upstream needs time to respond? We might also turn your
example into an autopkgtest to avoid future regressions.
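
The point of such a "really" version is that it must sort above the version already in the archive even though the upstream code is older. A quick way to check a candidate is dpkg --compare-versions. The wrapper name below is invented for illustration, the version strings are the ones mentioned in this bug, and the "-1" revision is an assumption:

```shell
#!/bin/sh
# Hypothetical wrapper: succeed if version $1 sorts strictly after $2
# according to Debian version ordering.
supersedes() {
    dpkg --compare-versions "$1" gt "$2"
}

# 5.1+d+dfsg-2 is the snpeff version listed in the original report;
# the "-1" Debian revision on the candidate is assumed for illustration.
if supersedes '5.1+d+dfsg+really+5.0-1' '5.1+d+dfsg-2'; then
    echo "candidate would replace the archive version"
fi
```

A plain 5.0 version would not pass this check, which is why the longer name is needed.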

Kind regards
Andreas.

--
http://fam-tille.de

Andreas Tille
Feb 7, 2023, 11:10:03 AM
Hi,

since upstream is not very quick to answer questions in issues, I
had a look at the upstream repository myself and checked when the
version string was bumped for the last release of the 5.0 series, which
is 5.0f. I found the commit[1] where REVISION was bumped to 'f'. The
git log messages are mostly unhelpful: "Project updated" is the
most frequently used commit message in the whole log.

I simply checked out that commit[1], created a tarball from it and
injected it into a new packaging branch 5.0f[2]. If we succeed in
adapting the patches we use, we should be able to build a Debian
release 5.1+d+dfsg-really5.0+f-1 and check whether the bug is fixed
with it.
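
For the record, one way to produce such a snapshot tarball is git archive. This is only a sketch: the function name, the directory prefix and the output file name are made up for illustration, and it is not necessarily how the tarball in the 5.0f branch was created.

```shell
#!/bin/sh
# Hypothetical helper: write a gzip-compressed tarball of one commit.
#   $1 = path to a git clone
#   $2 = commit to export
#   $3 = top-level directory name inside the tarball
#   $4 = absolute path of the output file
snapshot_tarball() {
    git -C "$1" archive --format=tar.gz --prefix="$3/" -o "$4" "$2"
}

# Usage, assuming a clone of https://github.com/pcingola/SnpEff in ./SnpEff
# (the tarball name is an illustrative choice):
# snapshot_tarball ./SnpEff e4f2c6b3d snpeff-5.0f "$PWD/snpeff_5.0f.orig.tar.gz"
```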

Kind regards
Andreas.

[1] https://github.com/pcingola/SnpEff/commit/e4f2c6b3d
[2] https://salsa.debian.org/med-team/snpeff/-/tree/5.0f

--
http://fam-tille.de

Andreas Tille
Aug 24, 2023, 5:30:05 AM
Hi Pierre,

I just noticed that snpeff upstream has tagged a new release. I've
injected the new tarball into Salsa Git but have not yet worked on the
quilt patches that need to be adapted. If you throw an ENOTIME error
I could see how far I get with the changes. If you find some
spare time for this issue I'd leave further changes to you.

Andreas Tille
Aug 26, 2023, 9:40:05 AM
Hi Pierre,

writing over a weak connection while traveling.

Am Fri, Aug 25, 2023 at 02:30:45PM +0200 schrieb Pierre Gruet:

> I found some time :-D
> Upstream changed the location of the build-time tests, putting them in a
> more canonical place. I updated the patches and d/rules accordingly. This is
> all pushed to the Salsa repo.

Thanks a lot.

> However, the script you provided in the bug report [0] is still not working
> with this newly packaged version 5.1+f+dfsg (tested locally), so probably we
> should do as you suggested there [1] and provide 5.1+d+dfsg-really5.0+f in
> unstable, as the script is working with version 5.0+f.
>
> Admittedly this is a step back for the library, but at least one would have
> the use case of your colleagues working. If you still think this is a good
> idea, I offer to finalize this step back.

The network here is too weak to check the upstream tracker. I would add
the action we plan to take to the tracker, and if we do not get any
response we go ahead and roll back to 5.0+f.


Thanks a lot for caring
Andreas.

PS: You probably will keep the current version in some branch on Salsa ...

> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029202#5
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029202#37




--
http://fam-tille.de

Andreas Tille
Nov 28, 2023, 12:00:04 PM
Hi,

since version 5.2.b is no worse than 5.1 (the upstream upgrade produces
the very same problematic output as version 5.1), I decided to update
to the latest upstream version and pinged upstream again[1].

Kind regards
Andreas.

[1] https://github.com/pcingola/SnpEff/issues/455

--
http://fam-tille.de

Andreas Tille
Nov 28, 2023, 12:00:04 PM
Hi,

we tried to package snippy, which has uncovered a bug in SnpEff that we
have reported upstream[1] with an easily verifiable data set and
command line. Unfortunately we have not had any response from upstream.
We would even consider reverting SnpEff to a working version without
this regression (like 5.0e). However, we would like to have the correct
code base for this release, but even our request for proper tagging of
the code was ignored in the issue.

I wonder whether some people from the community might be able to join
our attempt to get this issue fixed and raise their voice in the
GitHub issue (or use other channels) to get this finally fixed. I guess
the problem does not only occur in the snippy test suite but also in other
pipelines, so the community might be affected by this issue.

Kind regards and thanks for your support

Nilesh Patra
Dec 5, 2023, 3:40:05 PM
On Tue, Nov 28, 2023 at 05:54:30PM +0100, Andreas Tille wrote:
> I wonder whether some people from the community might be able to join
> our attempt to get this issue fixed and raise their voice inside the
> GitHub issue (or use other channels) to get this finally fixed. I guess
> the problem does not only occur in snippy test suite but also in other
> pipelines so the community might be affected by this issue.
>
>
> [1] https://github.com/pcingola/SnpEff/issues/455

Sometimes just tagging upstream author does wonders :)

They have replied and the (upstream) bug has been closed.
BTW, are you able to still reproduce (without any fixes for snpeff/snippy) #1031465?

For me, it is working fine in an unstable chroot (including autopkgtests)
and maybe could be closed.

Best,
Nilesh

Andreas Tille
Dec 6, 2023, 11:40:05 AM
Hi Nilesh,

Am Wed, Dec 06, 2023 at 02:00:52AM +0530 schrieb Nilesh Patra:
> Sometimes just tagging upstream author does wonders :)

I hope I'll keep that trick in mind! ;-)

> They have replied and the (upstream) bug has been closed.
> BTW, are you able to still reproduce (without any fixes for snpeff/snippy) #1031465?

Confirming that snippy builds nicely and the bug can be closed.

> For me, it is working fine in an unstable chroot (including autopkgtests)
> and maybe could be closed.

I've just pushed a patch for snippy that works with the options
suggested by snpEff upstream. I need to do some further tests but I
think the problem is solved.

Andreas Tille
Dec 6, 2023, 3:40:05 PM
Control: reassign -1 snippy
Control: retitle -1 Wrong calls of snpEff
Control: tags -1 upsteam
Control: forwarded -1 https://github.com/pcingola/SnpEff/issues/510

Thanks to snpEff upstream I've found a solution[1] for the problem
reported above. I've also added a concise test case[2] which exposed
that problem (now solved), but a later call to snpEff with a different
command runs into a new problem.

I have reported this to snpEff[3] since I hope to get quick help there
for another patch to snippy that simply gets rid of a single line in
some output file. If all else fails we can remove this line manually
inside the snippy Perl code, but I'm striving for a clean solution here.

Kind regards
Andreas.

[1] https://salsa.debian.org/med-team/snippy/-/blob/master/debian/patches/snpeff_v5.1%2B.patch
[2] https://salsa.debian.org/med-team/snippy/-/blob/master/debian/tests/test_for_working_snpeff_v5.1%2B
[3] https://github.com/pcingola/SnpEff/issues/510

--
http://fam-tille.de

Andreas Tille
Dec 6, 2023, 4:40:05 PM
Control: tags -1 upstream