ERR=134 tr2ncrna.pl


Iuri Silva

Aug 28, 2023, 2:19:15 PM
to EvidentialGene
Dear Dr. Gilbert,

First of all, thank you very much for Evidential Gene! It has been a very useful tool in my research.

Right now I'm trying to remove redundancy from a transcriptome. While tr2aacds.pl ran successfully, an error occurred when running tr2ncrna.pl (ERR=134).

I ran the scripts with the test data (http://arthropods.eugenes.org/EvidentialGene/plants/arabidopsis/evigene_tr2aacds_test2021/arath_TAIR10_20101214up.cdna.gz) and no error occurred. I suspect that there is a problem/conflict between my dataset and the software, but I was not able to solve it.

This transcriptome was assembled with three assemblers: trinity, rnaSPAdes and Trans-ABySS.

This is what I have already tried:

- Updating the dependencies (no effect)

- Running the blastn step outside the script (ran without errors)

- Subsampling the dataset (no effect)
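
For the second item in the list, a minimal sketch of re-running the logged blastn command outside the pipeline, keeping its exit status and stderr for inspection, might look like this. The flags are copied from the CMD line in the log further down; the function names `blastn_argv` and `run_and_report` are hypothetical and not part of EvidentialGene.

```python
import subprocess

def blastn_argv(db, query, out, threads=34):
    """Build a blastn argument list with the same options as the logged CMD line."""
    return [
        "blastn",
        "-perc_identity", "98",
        "-evalue", "1e-19",
        "-dust", "no",
        "-num_threads", str(threads),
        "-db", db,
        "-query", query,
        "-outfmt", "6",
        "-out", out,
    ]

def run_and_report(argv):
    """Run a command to completion and return (returncode, stderr text)."""
    proc = subprocess.run(argv, stderr=subprocess.PIPE, text=True)
    # In Python, a negative returncode -N means the child was killed by signal N.
    return proc.returncode, proc.stderr

# Example call (requires blastn on PATH and the files from the log):
# rc, err = run_and_report(blastn_argv(
#     "okayset/transcriptome.okay.mrna",
#     "transcriptome.nodropbigcds.tr",
#     "transcriptome.mrnaperf.blastn",
# ))
```

Capturing stderr separately is useful here because the pipeline log only records the numeric ERR value, not the message blastn printed.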

I appreciate any advice that you may have on this issue. Thank you!

Kind regards,
Iuri

Here is the log:
#ncrna: EvidentialGene tr2ncrna, VERSION 2020.02.25
#ncrna: tr2ncrna  -trset transcriptome.fasta -mrna okayset/transcriptome.okay.mrna -ncpu 34 -log -debug
#ncrna: app=fastanrdb, path=/miniconda3/envs/evigene/bin/fastanrdb
#ncrna: app=blastn, path=/miniconda3/envs/evigene/bin/blastn
#ncrna: app=makeblastdb, path=/miniconda3/envs/evigene/bin/makeblastdb
#ncrna: BEGIN with input= transcriptome.fasta
#ncrna: tr2ncrna( transcriptome.fasta, okayset/transcriptome.okay.mrna)
#ncrna: ncRNA public ID: PanEVm,1573000
#ncrna: remove_mrna_oids kept=37369218/40590780, transcriptome.nomrna.tr
#ncrna: remove_bigcdsdrops kept=33493585/40590780, transcriptome.nodropbigcds.tr
#ncrna: CMD= /miniconda3/envs/evigene/bin/fastanrdb -i -f transcriptome.nodropbigcds.tr > transcriptome.nodropbigcds.tr.nr
#ncrna: CMD= /softwares/evigene/scripts/prot/make_consensus.pl transcriptome.nodropbigcds.tr
#ncrna: CMD= /miniconda3/envs/evigene/bin/makeblastdb -dbtype nucl -in okayset/transcriptome.okay.mrna -logfile /dev/null
#ncrna: CMD= /miniconda3/envs/evigene/bin/blastn -perc_identity 98 -evalue 1e-19 -dust no  -num_threads 34 -db okayset/transcriptome.okay.mrna -query transcriptome.nodropbigcds.tr -outfmt 6 -out transcriptome.mrnaperf.blastn
#ncrna: ERR=134  /miniconda3/envs/evigene/bin/blastn -perc_identity 98 -evalue 1e-19 -dust no  -num_threads 34 -db okayset/transcriptome.okay.mrna -query transcriptome.nodropbigcds.tr -outfmt 6 -out transcriptome.mrnaperf.blastn
#ncrna: remove_mrna_aligned kept=/40590780,
#ncrna: ERR: openRead
#ncrna: ERR: openRead
#ncrna: FAILED at step: long_seqs(>=300) kept=/40590780, 0
#ncrna: tidy: n= 4 ncrnaset/transcriptome.nomrna.tr ncrnaset/transcriptome.nodropbigcds.tr ncrnaset/transcriptome.nodropbigcds.tr.consensus ncrnaset/transcriptome.mrnaperf.blastn
#ncrna: tidy: n= 3 ncrnaset_failtmp/transcriptome.okay.mrna.nsq ncrnaset_failtmp/transcriptome.okay.mrna.nin ncrnaset_failtmp/transcriptome.okay.mrna.nhr

Don Gilbert

Aug 28, 2023, 3:47:32 PM
to Iuri Silva, EvidentialGene
Iuri,

Your log error almost makes sense :) It failed during blastn with an error code. I can't find a label for that ERR=134 (my system error numbers stop at 133), but it might mean out-of-memory during blastn. Do you know how much memory is available? Sometimes blastn will chew up a lot of memory on repetitive sequences.

My suggestion is to re-run this using fewer CPUs to reduce memory use: -NCPU=24 instead of -NCPU=34

#ncrna: ERR=134  /miniconda3/envs/evigene/bin/blastn -perc_identity 98 -evalue 1e-19 -dust no  -num_threads 34 -db okayset/transcriptome.okay.mrna -query transcriptome.nodropbigcds.tr -outfmt 6 -out transcriptome.mrnaperf.blastn
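
As a side note on reading the number itself: the thread does not show how tr2ncrna.pl derives the value it prints after ERR=, so the sketch below only covers two common conventions and both are assumptions. One treats 134 as a raw POSIX wait status (as Perl's `$?` holds after `system`), the other as a shell-style exit code of 128 plus a signal number. The helper names are hypothetical.

```python
import signal

def decode_wait_status(status):
    """Split a raw POSIX wait status (e.g. Perl's $?) into its fields.

    Simplification: stopped/continued statuses are not handled.
    """
    sig = status & 0x7F               # low 7 bits: terminating signal (0 = normal exit)
    core = bool(status & 0x80)        # bit 7: core-dump flag
    exit_code = (status >> 8) & 0xFF  # next byte: exit code when sig == 0
    return sig, core, exit_code

def signal_name(sig):
    """Best-effort signal name on the current platform, or None if unknown."""
    try:
        return signal.Signals(sig).name
    except ValueError:
        return None

# Reading 1: 134 as a raw wait status.
sig, core, exit_code = decode_wait_status(134)
print(sig, core, exit_code, signal_name(sig))

# Reading 2: shell convention, exit codes above 128 mean "killed by signal (code - 128)".
print(134 - 128, signal_name(134 - 128))
```

Under either reading, 134 corresponds to signal 6, which is SIGABRT on Linux. That says the process aborted; it does not by itself say why.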

- Don Gilbert

--
don gilbert - www.bio.net - bioinformatics - indiana.u.

Iuri Silva

Aug 30, 2023, 6:07:31 PM
to EvidentialGene
Dear Dr. Gilbert,

Thank you for your quick reply.

Regarding the memory available, I have access to 512 GB.
I re-ran with the suggested modification and the result was the same as before. I monitored memory usage during the run: interestingly, CPU usage was intense, but memory usage stayed very low from the beginning until the error message.
I then ran the makeblastdb and blastn steps outside the script, using the same command options, just to obtain the logs.
As expected, the error occurred again and was reported as "2848362 segmentation fault (core dumped)" in the log file. It seems you were spot on about the out-of-memory issue.
However, I couldn't find the reason for the low memory usage when there is memory available. I don't have this problem with other software, so I suspect a problem with the installed blastn.
I will run a few tests with other versions of blastn, outside miniconda, and hopefully return with good news.
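
One way to put a number on the memory observation above is to run the command as a child process and read its peak resident set size afterwards. This is a minimal sketch using Python's standard `resource` module (Unix only); the function name `run_with_peak_rss` is hypothetical, and this is not how the monitoring in this thread was actually done.

```python
import resource
import subprocess

def run_with_peak_rss(argv):
    """Run a command to completion and return (returncode, peak RSS of children).

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    RUSAGE_CHILDREN covers every child this process has already waited for and
    reports the largest one, so call this from a fresh interpreter for a clean number.
    """
    proc = subprocess.run(argv)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss

# Example call (requires blastn on PATH and the files from the log):
# rc, peak = run_with_peak_rss(["blastn", "-db", "okayset/transcriptome.okay.mrna",
#                               "-query", "transcriptome.nodropbigcds.tr",
#                               "-outfmt", "6", "-out", "transcriptome.mrnaperf.blastn"])
```

A small peak value together with a crash would be consistent with the low memory usage described above, and would point away from the machine simply running out of RAM.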

Thanks for the help, Dr. Gilbert!

Kind regards,
Iuri

Iuri Silva

Sep 22, 2023, 3:00:16 PM
to EvidentialGene
Dear Dr. Gilbert,

I was able to overcome the previous issue by splitting the input into multiple FASTA files. However, at the end of the tr2ncrna.pl script, another error appeared:

#ncrna: CMD=evigene/scripts/prot/cdshexprob.pl -onlyfick -test transcriptome.longnok.tr -out  transcriptome.longnok.tr.codepot
#ncrna: FAILED at step: make_allevdtab nevd=0/118 in  transcriptome.longnok.allevd.tab
#ncrna: tidy: n= 12 ncrnaset/ transcriptome.nomrna.tr ncrnaset/ transcriptome.notokmrna.tr ncrnaset/ transcriptome.nomrna.tr.consensus ncrnaset/ transcriptome.mrnaperf.blastn ncrnaset/ transcriptome.longnok.tr
#ncrna: tidy: n= 3 ncrnaset_failtmp/ transcriptome.longnok.tr.nsq ncrnaset_failtmp/ transcriptome.longnok.tr.nin ncrnaset_failtmp/ transcriptome.longnok.tr.nhr
#ncrna: DONE at date= 21 set 2023 21:42:32 -03
#ncrna: ======================================
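
The workaround mentioned at the top of this message, splitting the input into multiple FASTA files, can be sketched roughly as follows. The thread does not say which file was split, how many pieces were used, or how the per-piece results were combined, so the chunk size, the output naming scheme, and the function names here are all hypothetical.

```python
def read_fasta(path):
    """Yield (header_line, sequence_lines) for each record in a FASTA file."""
    header, seq = None, []
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                if header is not None:
                    yield header, seq
                header, seq = line, []
            elif header is not None:
                seq.append(line)
    if header is not None:
        yield header, seq

def split_fasta(path, records_per_file, prefix):
    """Write consecutive chunks of at most records_per_file records; return the paths."""
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    paths, out, count = [], None, 0
    try:
        for header, seq in read_fasta(path):
            # Start a new output file for the first record and whenever a chunk is full.
            if out is None or count == records_per_file:
                if out is not None:
                    out.close()
                name = f"{prefix}.part{len(paths) + 1:03d}.fasta"
                out = open(name, "w")
                paths.append(name)
                count = 0
            out.write(header)
            out.writelines(seq)
            count += 1
    finally:
        if out is not None:
            out.close()
    return paths

# Example call (hypothetical chunk size and prefix):
# parts = split_fasta("transcriptome.nodropbigcds.tr", 1_000_000, "transcriptome.query")
```

Splitting on record boundaries rather than by byte or line count keeps every sequence whole, which matters for multi-line FASTA records.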

Sorry to bother you again on this matter, but could you help me understand what is going on?

It seems to me that this error is related to the construction of the table, because transcriptome.longnok.allevd.tab is created with the column names but no data rows are added.
Again, this error did not occur with the test data, so I suppose it is something related to my dataset.
Sadly, I have no clue what is behind this error. However, cdshexprob.pl doesn't seem to be the problem, because it generates its output and I was able to run the script outside the pipeline with no errors.
I tried a couple of things: reducing the number of sequences to 1,000, replacing the original IDs with short ones, and increasing/decreasing the number of CPUs. None of them changed the outcome.
If you have any ideas or suggestions for understanding this issue, please don't hesitate to share them. I'm willing to do any testing that may be needed.
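
A quick way to confirm the symptom described above (a table with column names but no rows) is to count the data lines in each intermediate table. This is a minimal sketch that assumes the first non-empty line is the header; the real layout of transcriptome.longnok.allevd.tab is not shown in this thread, and the function name is hypothetical.

```python
def count_data_rows(path):
    """Count non-empty lines after the first non-empty line (assumed to be the header)."""
    rows = 0
    seen_header = False
    with open(path) as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            if not seen_header:
                seen_header = True  # assumption: first non-empty line holds column names
                continue
            rows += 1
    return rows

# Example call:
# print(count_data_rows("transcriptome.longnok.allevd.tab"))
```

Running the same count on the files that feed the table, such as the .codepot output from cdshexprob.pl, may help narrow down which input arrives empty or with IDs that do not match.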

Kind regards,
Iuri