Prinseq standalone input error


millipede...@gmail.com

May 10, 2018, 5:55:33 PM
to Edwards Lab Tools
I downloaded SRA data from NCBI using fastq-dump, and the resulting FASTQ files cause a couple of errors when I pass them to prinseq-lite.

The most common one: "ERROR: input file for -fastq is in UNKNOWN format not in FASTQ format."

The other less common error: "Use of uninitialized value $qual in scalar chomp at /apps/prinseq/0.20.4/bin/prinseq-lite.pl line 2583, <FILE> line 31141578."

Here is an example of my input files for 'read_1':

@FCC4LTMACXX:1:1101:2339:1998/1
NGATAATTAGAACTATAACCCCCTTCCTGCTCTATAGATAAGATTTGATAATTCTGACCATATACCAGAACCCCCCATTCCGTATTATTAG
+SRR5040251.1 FCC4LTMACXX:1:1101:2339:1998 length=91
#1=DDDDDEDDDDIIIBE?CF@A)CBE>CBCD*:C@@?9**??*?B*?DD99D?B44*?DC@C###########################@
@FCC4LTMACXX:1:1101:3060:1995/1
NTGCTTCTCAAGGTGGCCATCAAATTGTTAAGTTGTTCCTTGTAAGAGGAAGATACGGTGGCGAAGCCACCACCCTTCTTTCCACGGCCAT
+SRR5040251.2 FCC4LTMACXX:1:1101:3060:1995 length=91
#1=DFFFFHHHHHEHIJJJJJJJJJJJJJJJJIJJGJJJJJJIGGJIJJIJJJHIFDEFHIGIGJHGGFFFFDDCACDDDDDDEDBDDDDC
@FCC4LTMACXX:1:1101:3278:1996/1
NTTATTTGTTCAAACTACTTCTGATTGGAGATTCTGGAGTAGGGAAATCGTGCTTATTGTTGAGATTTGCGGATGATGCTTATTCTGAAAG
+SRR5040251.3 FCC4LTMACXX:1:1101:3278:1996 length=91
#4BDFFFFHHHHHJJJIJJJJJJJJJJJJJIJJJJJJHJFHIJJHJIJJJHIJJIJJJJHIJIIHJJJJJJHHFFEEEEEEEEEEFEDDDC
@FCC4LTMACXX:1:1101:4171:1998/1
NGTCCCCAAACCCCAGATCAAATAGTACCGGACCGTTAAAACACTCTGTAATCATTTTTTGGTATAACTGTGTTTTATTTTGAAGACATGG
+SRR5040251.4 FCC4LTMACXX:1:1101:4171:1998 length=91
#1=DDFFFHHHHHJJJHJJJJJJJIIIIJJJJJIJGIIJJJJJJJJIJJJJIJGIJHHHFFDAEEDDDDDCDACDCDDDEEDDDDDDDDDC
@FCC4LTMACXX:1:1101:5115:1991/1
NGACCACAGACGCTTAGCTCTCCAGAGCCCGGTGAAGTTGAAGAGTCATTGGATGCGCCTTTCGCCATGAGCCAAACAGAATCACCAGCTC
+SRR5040251.5 FCC4LTMACXX:1:1101:5115:1991 length=91
#4=DFFFFHHHHHJJJJJJJJJJJJJJJJJJJDHHIJHIGIIJJICHIJJIJJJIJJHHHFFFFDDDDDDDDDDDCBDDDDDDDDDDDDDB
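
In the sample above, the '+' line of every record carries an identifier (SRR5040251.N ...) that differs from the '@' header. Whether that mismatch is what triggers the "UNKNOWN format" error is an assumption; nothing in this thread confirms it. A minimal Python sketch for listing such records, assuming strict four-line records (the function name is made up for illustration):

```python
from itertools import islice


def mismatched_plus_lines(lines):
    """Yield (record_number, header_id, plus_id) for every record whose
    '+' line repeats an identifier that differs from the '@' header.

    Assumes strict 4-line FASTQ records (no wrapped sequences).
    A trailing incomplete record is ignored here.
    """
    it = iter(lines)
    record = 0
    while True:
        chunk = [ln.rstrip("\r\n") for ln in islice(it, 4)]
        if len(chunk) < 4:
            break
        record += 1
        header, _seq, plus, _qual = chunk
        header_id = header[1:]  # text after '@'
        plus_id = plus[1:]      # text after '+', may be empty
        if plus_id and plus_id != header_id:
            yield record, header_id, plus_id
```

Passing an open file handle for the 'read_1' file, e.g. `next(mismatched_plus_lines(fh), None)`, would show the first mismatching record, or None if there is none.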

For the second file, I don't see any noticeable issue at the referenced line (the last line of the file):

@SRR5040251.7785394.2 FCC4LTMACXX:1:1304:7314:31814 length=89
AGAGGAATTCGAAAACACCAGGAAGAACCACCAACGAGCTCTCGATTCCATGCAAGCGTCATTGGAAGCTGAAGCAAAGGGTAAGGCTG
+SRR5040251.7785394.2 FCC4LTMACXX:1:1304:7314:31814 length=89
@CFFFFFHHHHHJJJJJJJJJJJJJJJJJGJJIHJJDIJJJJJJJJJJJHIJJJGIJHHHFFFFFEEEDEDDDDDDDDDDDACDDDDDD
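
The line number in the second error (31141578) is not a multiple of four, which might point to an incomplete record somewhere in the file, though that is only a guess. A rough Python sketch for locating the first structurally broken record, assuming strict four-line records; the function name and the checks are illustrative and are not taken from prinseq-lite:

```python
def first_fastq_problem(lines):
    """Return (line_number, message) for the first structural problem
    found, or None if every record looks like 4-line FASTQ.

    Checks only structure: '@' header, '+' separator, equal sequence
    and quality lengths, and a complete final record.
    """
    buf = []
    line_no = 0
    for line_no, line in enumerate(lines, start=1):
        buf.append(line.rstrip("\r\n"))
        if len(buf) < 4:
            continue
        header, seq, plus, qual = buf
        start = line_no - 3  # line number of this record's header
        if not header.startswith("@"):
            return start, "header does not start with '@'"
        if not plus.startswith("+"):
            return start + 2, "separator does not start with '+'"
        if len(seq) != len(qual):
            return start + 3, "sequence and quality lengths differ"
        buf = []
    if buf:
        # File ended in the middle of a record.
        return line_no, f"incomplete record: {len(buf)} of 4 lines"
    return None
```

Calling it on an open file handle for the second read file reads the file once and stops at the first problem, so it should be usable even on files with tens of millions of lines.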

millipede...@gmail.com

May 10, 2018, 7:59:27 PM
to Edwards Lab Tools
I actually re-ran fastq-dump and got FASTQ files that work with prinseq, to some degree. I'm going to start a separate thread for the new issue I've encountered.