problem running shortbred-quantify


Daniel Barich

Nov 15, 2019, 11:03:48 AM
to shortbr...@googlegroups.com, Joan Slonczewski
Hello,

It seems that shortbred_quantify won't recognize my FASTA files. I was able to run the same markers with the wgs.fns example file, but they won't work with my data. Can someone tell me what the problem is? Below are the first two lines of the file I tried, followed by the output from ShortBRED:

(python2) barichd@JOAN9000:~/mouse/shortbred$ head -2 ../trimmomatic/output_1P.fa
>D00420:134:HC2LTBCXY:2:1101:1249:2139 1:N:0:CGTACTAG+GTAAGGAG
CCTATACCACATCGGACGGTCAGCGGTTTCGGGATAACCGATTACTCGATGAAAAATATTTAAAAACCAATATGGAGGAATTATTTGCTTATGAATATGAAGCAGTTAAAACAAATCAATCAAATACAAACGATAACCGAGCAGACAGTAG

(python2) barichd@JOAN9000:~/mouse/shortbred$ shortbred_quantify.py --markers ShortBRED_CARD_2017_markers.faa --wgs ~/mouse/trimmomatic/output_1P.fa --results outputresults.txt --tmp output_quantify
Tested usearch. Appears to be working.
('Using this version of usearch: ', 'v9.2.64')
Treating input as a wgs file...
usearch v9.2.64_i86linux32, 4.0Gb RAM (528Gb total), 56 cores
(C) Copyright 2013-16 Robert C. Edgar, all rights reserved.
http://drive5.com/usearch

License: bus...@kenyon.edu

00:00 40Mb    100.0% Reading ShortBRED_CARD_2017_markers.faa
00:00 6.8Mb   100.0% Masking (fastnucleo)
00:00 48Mb    100.0% Word stats
00:00 48Mb    100.0% Alloc rows
00:00 49Mb    100.0% Build index
00:00 36Mb   Buffers (3237 seqs)
00:00 53Mb    100.0% Seqs

List of files in WGS set:/home/barichd/mouse/trimmomatic/output_1P.fa

List of files in WGS set (after unpacking tarfiles):/home/barichd/mouse/trimmomatic/output_1P.fa

Working on file 1 of 1
Traceback (most recent call last):
  File "/home/barichd/miniconda3/envs/python2/bin/shortbred_quantify.py", line 524, in <module>
    for seq in SeqIO.parse(streamWGS, strFormat):
  File "/home/barichd/miniconda3/envs/python2/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 659, in parse
    raise ValueError("Unknown format '%s'" % format)
ValueError: Unknown format 'unknown'
(python2) barichd@JOAN9000:~/mouse/shortbred$

Thanks.

--
Daniel Barich
Computing and Web Solutions Associate
Biology Department
Kenyon College
Gambier, OH 43022

Eric Franzosa

Nov 18, 2019, 4:06:33 PM
to Daniel Barich, shortbr...@googlegroups.com, Joan Slonczewski
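I believe the code uses the file's extension to determine its format, and it expects "fasta" or "fastq" specifically, since those are the format names recognized by the SeqIO package in Biopython. Your file ends in ".fa", so the format comes out as "unknown", which is what the ValueError in your traceback reports. Renaming the file (or symlinking it) so that it ends in ".fasta" should get you past this error.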
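
To illustrate the idea, here is a minimal sketch of extension-based format guessing and of the symlink workaround. It is not ShortBRED's actual code: the `KNOWN_EXTENSIONS` mapping and the `guess_format` and `link_with_fasta_extension` helpers are hypothetical names made up for this example, and the real check inside shortbred_quantify.py may accept other extensions or work differently.

```python
import os
import shutil
import tempfile

# Hypothetical mapping, for illustration only; ShortBRED's real check may differ.
KNOWN_EXTENSIONS = {".fasta": "fasta", ".fastq": "fastq"}


def guess_format(path):
    """Guess a SeqIO-style format name from the file extension alone."""
    ext = os.path.splitext(path)[1].lower()
    return KNOWN_EXTENSIONS.get(ext, "unknown")


def link_with_fasta_extension(path):
    """Create a symlink next to `path` whose name ends in .fasta; return its path."""
    target = os.path.splitext(path)[0] + ".fasta"
    if not os.path.lexists(target):
        os.symlink(os.path.abspath(path), target)
    return target


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    try:
        reads = os.path.join(tmp, "output_1P.fa")
        with open(reads, "w") as handle:
            handle.write(">read1\nACGT\n")
        print(guess_format(reads))    # unknown
        linked = link_with_fasta_extension(reads)
        print(guess_format(linked))   # fasta
    finally:
        shutil.rmtree(tmp)
```

A symlink avoids copying a large read file; the link could then be passed to --wgs in place of the original .fa path.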

Thanks,
Eric




Joan Slonczewski

Nov 19, 2019, 1:13:09 PM
to shortbred-users
Eric,
Thanks so much for helping out! I have a lab of 15 undergrads starting to run resistome analyses on several projects, from mouse gut to waterways. We got an NSF grant for our hypothesis that uncoupler drugs reverse drug resistance in microbial communities. We're hoping ShortBRED will find the most ARGs for us.
Joan

Joan L. Slonczewski
Professor of Biology
Higley Hall, 202 N. College Road
Kenyon College
Gambier, OH 43022