Error report in insilico_read_normalization

YanYan

Jan 21, 2021, 3:04:04 AM
to trinityrnaseq-users
I got an error at the insilico_read_normalization step:
"quals and seq lines dont match in length"
The files were downloaded from NCBI SRA.

Here is the terminal output:
(base) y@y-ThinkStation-P320:~$ Trinity --seqType fq --left /home/y/SRR/SRR6767056_1.fastq --right /home/y/SRR/SRR6767056_2.fastq --CPU 4 --max_memory 10G --output /home/y/trinity_out/


     ______  ____   ____  ____   ____  ______  __ __
    |      ||    \ |    ||    \ |    ||      ||  |  |
    |      ||  D  ) |  | |  _  | |  | |      ||  |  |
    |_|  |_||    /  |  | |  |  | |  | |_|  |_||  ~  |
      |  |  |    \  |  | |  |  | |  |   |  |  |___, |
      |  |  |  .  \ |  | |  |  | |  |   |  |  |     |
      |__|  |__|\_||____||__|__||____|  |__|  |____/

    Trinity-v2.11.0



Left read files: $VAR1 = [
          '/home/y/SRR/SRR6767056_1.fastq'
        ];
Right read files: $VAR1 = [
          '/home/y/SRR/SRR6767056_2.fastq'
        ];
Trinity version: Trinity-v2.11.0
** NOTE: Latest version of Trinity is v2.11.0, and can be obtained at:

Thursday, January 21, 2021: 15:48:18 CMD: java -Xmx64m -XX:ParallelGCThreads=2  -jar /home/y/miniconda3/opt/trinity-2.11.0/util/support_scripts/ExitTester.jar 0
Thursday, January 21, 2021: 15:48:18 CMD: java -Xmx4g -XX:ParallelGCThreads=2  -jar /home/y/miniconda3/opt/trinity-2.11.0/util/support_scripts/ExitTester.jar 1
Thursday, January 21, 2021: 15:48:18 CMD: mkdir -p /home/y/trinity_out/chrysalis


----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads  ---------------------
----------------------------------------------------------------------------------

---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 200 Coverage --
---------------------------------------------------------------

# running normalization on reads: $VAR1 = [
          [
            '/home/y/SRR/SRR6767056_1.fastq'
          ],
          [
            '/home/y/SRR/SRR6767056_2.fastq'
          ]
        ];


Thursday, January 21, 2021: 15:48:18 CMD: /home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_normalization.pl --seqType fq --JM 10G  --max_cov 200 --min_cov 1 --CPU 4 --output /home/y/trinity_out/insilico_read_normalization --max_CV 10000  --left /home/y/SRR/SRR6767056_1.fastq --right /home/y/SRR/SRR6767056_2.fastq --pairs_together  --PARALLEL_STATS  
-prepping seqs
Converting input files. (both directions in parallel)CMD: seqtk-trinity seq -A -R 1  /home/y/SRR/SRR6767056_1.fastq >> left.fa
CMD: seqtk-trinity seq -A -R 2  /home/y/SRR/SRR6767056_2.fastq >> right.fa
Error encountered just after sequence entry[1]: 1/1', quals and seq lines dont match in length:

... corrupt file?Thread 1 terminated abnormally: Error, cmd: seqtk-trinity seq -A -R 1  /home/y/SRR/SRR6767056_1.fastq >> left.fa died with ret 768 at /home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_normalization.pl line 793.
Error encountered just after sequence entry[1]: 1/2', quals and seq lines dont match in length:

... corrupt file?Thread 2 terminated abnormally: Error, cmd: seqtk-trinity seq -A -R 2  /home/y/SRR/SRR6767056_2.fastq >> right.fa died with ret 768 at /home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_normalization.pl line 793.
Error, conversion thread failed at /home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_normalization.pl line 336.
Error, cmd: /home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_normalization.pl --seqType fq --JM 10G  --max_cov 200 --min_cov 1 --CPU 4 --output /home/y/trinity_out/insilico_read_normalization --max_CV 10000  --left /home/y/SRR/SRR6767056_1.fastq --right /home/y/SRR/SRR6767056_2.fastq --pairs_together  --PARALLEL_STATS   died with ret 7424 at /home/y/miniconda3/bin/Trinity line 2826.
main::process_cmd("/home/y/miniconda3/opt/trinity-2.11.0/util/insilico_read_norm"...) called at /home/y/miniconda3/bin/Trinity line 3379
main::normalize("/home/y/trinity_out/insilico_read_normalization", 200, ARRAY(0x559d47eb27e8), ARRAY(0x559d47eb27d0)) called at /home/y/miniconda3/bin/Trinity line 3319
main::run_normalization(200, ARRAY(0x559d47eb27e8), ARRAY(0x559d47eb27d0)) called at /home/y/miniconda3/bin/Trinity line 1372
(base) y@y-ThinkStation-P320:~$ 

less -S /home/y/SRR/SRR6767056_1.fastq

'@1/1'
NATATTGTGATTAGACAGGGACAGAGTCATTGTGATTTATTTAATAGGAGTCGCCAAGGAAATGAATATCCTCTTTCATAAACAAC>
'+1/1'
#07B<<<BB0<B<<B<<<'<BBBBBB<B<<B<B<<<<00<<<<<0<<<<<B<7<<B777<<BBB<<<B7<BB<<<BBB<BB<<<<<>
'@2/1'
NATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAGAACTGCCGCCAA>
'+2/1'
#0<FFFFFFFFFBFBFFIIBFF<FFIIFFFFFIIIIIBI7FIFFBFBFFFFBFF'<<BFFFFBFFFFFBBBB##############>
'@3/1'
GTTTCTTCGAATCATAAACAAATGTATTAAATTTCATGTAAACGCAGGAAAACTGCATTTGTATTGGCATATCCTGATCCGATGCT>
'+3/1'
BBBFFFFFFFFFFIIFIIIIIIIIFFIIIIIIIIIIIFFIIFIIIFFFFFIIIIIIIIIIIIFIIIFFFFIIIFFFFFFFFFFFFF>
'@4/1'
NGGCCAAATAAACAGGGGCGCCTGCACCAACTCTTTCAGCGTAGTTGCCTTTACGCAATAGTCGATGAATACGACCAACAGGGAAT>
'+4/1'
#0<FFFFFFFFFFIFFFIIIIFFFFIIIIIIIIIFFFIFFIIIIFFFFBFFFFFFFFFFFBFFFFFFFFBBFBFFFBFFFFB<BFB>
'@5/1'
CGTTGCTACTCCACAATTATTAGTTCCATATTTCATTTTCATAAAACCACCTTCTCCCCATGTCGTTCCATATGAGTTTTTAACAA>
'+5/1'
BB<BFFFFF<FFFFFFFBIFFFFFIIFIFFFBFFFBFFFIIIIFFFFIBFB7BFFBFFFFI<FFFBFFIFFFF<BB77BFFFFB<B>

less -S /home/y/SRR/SRR6767056_2.fastq
'@1/2'
GCTCATTCTCTGTTAGCGGTGTTTGATTACCTTCTTTTGACTACGATATTGAGGATTCGTCGTTGTTTATGAAAGAGGA>
'+1/2'
B<BFFFFFFFBFBFF<BFF70BFFFFFIIFFFBBFFIBFIIFFBFFFFFIFBFFFFIFBB<BB<B<BBBB<BBB<7<B<>
'@2/2'
GATGAGGAAAGGGGCCGGGGAGGGAAAAGGGGAGGGGGAAGGGGGGGGTCGCGGGGGTGGCCCGGTAATATAAAAAAAA>
'+2/2'
###############################################################################>
'@3/2'
GCAGTTATTTATAGAGGATTACCGATGGATAACTTAATTTTCTGTAAAAAAAATCTGATATATCCCCTTAGCATTATGA>
'+3/2'
BBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFFFFFFFFFFFF>
'@4/2'
GTTCGTACTTTCTTTCAAATCTTTCTTACAATAATGTCTGGTCGTGGAAAAGGCGGGAAGGTAAAGGCTAAGGCCAAGA>
'+4/2'
BBBFFFFFFFFFFIIBFIIIIIIIIIIIIIIIIIIIBFFIIIIIIFIIIIIBFIIIFBBBF'<BB<BBFBFBBBFBB7B>
'@5/2'
TACAACCCAAGCATCTGAAGCTGATTTGAAGAATAAAGTAGGAACAGTAGGCCCTATTAGTGTAGGAATAAATGGAGAT>
'+5/2'
00<BFB<FFFFFFFBBFFF7BFBFFBBBBFFBBFFFIFFFFBBFFFB7BBFFFFBF<BFF7<<B77B<FFFBFFFFBFB>
'@6/2'
GTAGGACTATCAGGCATGTCATAAGTTGATTTAACAGAAACACGTGGTGCTTCATTATCAATTTGCTGGCTAAATACTT>
'+6/2'


Brian Haas

Jan 21, 2021, 9:21:01 AM
to YanYan, trinityrnaseq-users
hi,

Is there really a '>' character at the end of each sequence line?
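
One way to answer this without a pager getting in the way is to print the length and the first and last character of each line. The following is only a minimal sketch; the function name `line_summary` and the demo record are hypothetical, not part of Trinity or seqtk.

```python
from itertools import islice


def line_summary(lines, n=8):
    """Return (length, first_char, last_char) for the first n lines."""
    rows = []
    for line in islice(lines, n):
        text = line.rstrip("\r\n")  # drop the line terminator only
        rows.append((len(text), text[:1], text[-1:]))
    return rows


# Hypothetical demo record, not the poster's data.
demo = ["@1/1\n", "NATATTG\n", "+1/1\n", "#07B<<<\n"]
for row in line_summary(demo):
    print(row)
```

To run it on a real file, pass an open handle, for example `with open("/home/y/SRR/SRR6767056_1.fastq") as fh: print(line_summary(fh))`. If the last character of the sequence lines is a base rather than '>', the '>' was added by the viewer and is not in the file.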

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

YanYan

Jan 21, 2021, 7:13:44 PM
to trinityrnaseq-users
Sorry, the '>' only appears because I didn't fully expand the terminal window; it is not in the file.
What is the problem, then? Is it in the header lines?

'@1/1'
NATATTGTGATTAGACAGGGACAGAGTCATTGTGATTTATTTAATAGGAGTCGCCAAGGAAATGAATATCCTCTTTCATAAACAACGACGAATCCTCAATATCGTAGTCA
'+1/1'
#07B<<<BB0<B<<B<<<'<BBBBBB<B<<B<B<<<<00<<<<<0<<<<<B<7<<B777<<BBB<<<B7<BB<<<BBB<BB<<<<<<<<<<7<7<0<<<7<<<<777<<<
'@2/1'
NATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAGAACTGCCGCCAACACGGGCCACGCCAGAGGAGACCA
'+2/1'
#0<FFFFFFFFFBFBFFIIBFF<FFIIFFFFFIIIIIBI7FIFFBFBFFFFBFF'<<BFFFFBFFFFFBBBB######################################
'@3/1'
GTTTCTTCGAATCATAAACAAATGTATTAAATTTCATGTAAACGCAGGAAAACTGCATTTGTATTGGCATATCCTGATCCGATGCTATATATCAAAGCATTCCATGTCTG
'+3/1'
BBBFFFFFFFFFFIIFIIIIIIIIFFIIIIIIIIIIIFFIIFIIIFFFFFIIIIIIIIIIIIFIIIFFFFIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFF
'@4/1'
NGGCCAAATAAACAGGGGCGCCTGCACCAACTCTTTCAGCGTAGTTGCCTTTACGCAATAGTCGATGAATACGACCAACAGGGAATTGCAATCCTGCACGGGATGAACGA
'+4/1'
#0<FFFFFFFFFFIFFFIIIIFFFFIIIIIIIIIFFFIFFIIIIFFFFBFFFFFFFFFFFBFFFFFFFFBBFBFFFBFFFFB<BFBBFFBFFFFFFFFFB<BBFFFFFBB
'@5/1'
CGTTGCTACTCCACAATTATTAGTTCCATATTTCATTTTCATAAAACCACCTTCTCCCCATGTCGTTCCATATGAGTTTTTAACAATCCAGTACTTGTCCCTTGTTGTAA
'+5/1'
BB<BFFFFF<FFFFFFFBIFFFFFIIFIFFFBFFFBFFFIIIIFFFFIBFB7BFFBFFFFI<FFFBFFIFFFF<BB77BFFFFB<B<<BBB<<<<BBBBBBBBB<BB<BB
'@6/1'
CTTGGAGAAGATGAAAATGAAGTAATTGAAACTACAGATGAACATGGTGTTATTAAAAGGGTAATTCGTCATCTAATGGCACCTCCAGATATTCATTCTGTAACTTTTAC
'+6/1'
0<7B7BBB<B0B<<0BFF000B<BBFF0<BB<BBB7BFBFFF00<''B''7BF0B0BF0'''0<BBB7B7BB<'7B<7<<B<BBB0B<BBB<BB0<<<<<7<<B<B<<<<
'@7/1'
GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAGAGAATACATTTGAGAAGAAGGATGATTCAAGAAGGT
'+7/1'
BBBFFFFFFF0BFFFFFFFIFFFIBFFFFBFFFBBFIBFBFF0BFF<'7BBFIFBBBFF<F<<BBFBFF<B#######################################

screen.png

Brian Haas

Jan 21, 2021, 9:17:32 PM
to YanYan, trinityrnaseq-users

Hi,

You might try running the 'fastqc' software on your input fastqs.  If there's something peculiar with respect to formatting, I would hope that it would report it.  If it doesn't find any problems, maybe you could give me access to the files that are causing the problem. I'll need to look at them directly to troubleshoot the issue.
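
If fastqc is not at hand, the same kind of formatting check can be sketched in a few lines. This is an illustration only, assuming strict four-line FASTQ records (no wrapped sequences); `check_fastq` is a hypothetical helper and is not how fastqc or seqtk-trinity are implemented.

```python
def check_fastq(lines, max_problems=10):
    """Return (record_number, message) tuples for the first format problems found."""
    problems = []
    record = []
    count = 0
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue  # keep collecting until a full 4-line record is read
        count += 1
        header, seq, sep, qual = record
        record = []
        if not header.startswith("@"):
            problems.append((count, "ID line does not start with '@'"))
        if not sep.startswith("+"):
            problems.append((count, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append(
                (count, f"sequence length {len(seq)} != quality length {len(qual)}")
            )
        if len(problems) >= max_problems:
            break
    else:
        # Only reached when the whole input was read without hitting the limit.
        if record:
            problems.append((count + 1, "truncated record at end of file"))
    return problems


# Hypothetical record with quoted ID lines and a short quality string.
print(check_fastq(["'@1/1'\n", "ACGT\n", "'+1/1'\n", "III\n"]))
```

On a real file, pass an open handle: `with open(path) as fh: print(check_fastq(fh))`. An empty list means none of these three checks failed; it does not prove the file is valid in every other respect.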

best,

~b

YanYan

Jan 22, 2021, 3:03:43 AM
to trinityrnaseq-users
(base) y@y-ThinkStation-P320:~$ fastqc -o /home/y/fastqc -t 4 /home/y/SRR/SRR6767056_1.fastq /home/y/SRR/SRR6767056_2.fastq
Failed to process /home/y/SRR/SRR6767056_1.fastq
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:89)
at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:129)
at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:102)
at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
Failed to process /home/y/SRR/SRR6767056_2.fastq
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:89)
at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:129)
at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:102)
at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
(base) y@y-ThinkStation-P320:~$ 

Does this refer to the @ at the beginning of each ID line? Should it be replaced, and if so, with what?

Brian Haas

Jan 22, 2021, 6:53:52 AM
to YanYan, trinityrnaseq-users
hi,

From the head of your fastq, it looks like there's a quote character at the beginning of the ID line.  Is this for real, or is this something that your viewer/pager was introducing?  If the first character is not exactly a '@', then there's something unusual about your fastqs and you might revisit how they were produced.
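
To tell a real quote character apart from a viewer artifact, the raw bytes at the start of the file can be inspected directly. The sketch below is one possible way to do that; `inspect_head` is a hypothetical name, and the sample bytes are made up to mirror the pasted output.

```python
def inspect_head(data):
    """Report whether the raw bytes start with '@' and show the first line unaltered."""
    first_line = data.split(b"\n", 1)[0]
    return {
        "first_byte_ok": data[:1] == b"@",
        # repr() makes quotes, carriage returns and other stray bytes visible.
        "first_line": repr(first_line),
    }


# Hypothetical sample mirroring the pasted head output.
print(inspect_head(b"'@1/1'\nNATATTG\n"))
```

For a real file, read a few bytes in binary mode: `with open(path, "rb") as fh: print(inspect_head(fh.read(200)))`. If `first_byte_ok` is `False`, the leading character is really in the file and was not added by the pager.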
