Hello,
I have been trying to run concatenated alignments through the ETE3 Toolkit since last week, without success.
I have already posted this on GitHub, but I am also asking here in case it is easier to get feedback.
I tried two different strategies and got two different errors, and I could not find any hints online on how to overcome them.
Below are the two commands and their error output.
I can send the entire COG file and multi-FASTA file if needed. Thanks in advance to anyone who can help or guide me.
Command 1:
ete3 build -w mafft_default-none-none-none -m sptree_fasttree_all -o provakaijudef --cogs coglist.txt -a multifasta.fa --clearall --cpu 40 --noimg
Output error:
Traceback (most recent call last):
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 207, in get_status
self.finish()
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/task/concat_alg.py", line 188, in finish
ConcatAlg.store_data(self, fasta, phylip, txt_partitions)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 552, in store_data
db.add_task_data(self.taskid, DATATYPES.concat_alg_fasta, fasta)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/db.py", line 227, in add_task_data
("%s", "%s") """ %(duplicates, data_id, zencode(data, data_id))
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/db.py", line 92, in zencode
pdata = six.moves.cPickle.dumps(x)
OverflowError: cannot serialize a string larger than 4GiB
None
ERR - Errors found in ConcatAlgTask (36881 species, 40 COGs, ConcatAlg, /cog_all-al...ttree_full)
Traceback (most recent call last):
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 207, in get_status
self.finish()
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/task/concat_alg.py", line 188, in finish
ConcatAlg.store_data(self, fasta, phylip, txt_partitions)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 552, in store_data
db.add_task_data(self.taskid, DATATYPES.concat_alg_fasta, fasta)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/db.py", line 227, in add_task_data
("%s", "%s") """ %(duplicates, data_id, zencode(data, data_id))
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/db.py", line 92, in zencode
pdata = six.moves.cPickle.dumps(x)
OverflowError: cannot serialize a string larger than 4GiB
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/scheduler.py", line 257, in schedule
task.status = task.get_status(qstat_jobs)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 210, in get_status
raise TaskError(self, e)
ete3.tools.ete_build_lib.errors.TaskError: cannot serialize a string larger than 4GiB
INFO - Waiting 2 seconds
INFO - Launched 0 jobs. 0(R), 0(W). Cores usage: 0/40
ERR - Thread cog_all-alg_concat_default-fasttree_full contains errors:
ERR - ** ConcatAlgTask (36881 species, 40 COGs, ConcatAlg, /cog_all-al...ttree_full)
ERR - -> da8b0df7bd2a070cef7b49a045238c61
ERR - -> cannot serialize a string larger than 4GiB
ERR - Done with ERRORS
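My current reading of this first error (an assumption on my side, not confirmed by the ETE developers): the traceback ends in a plain `cPickle.dumps(x)` call, and under Python 3.6 the default pickle protocol is 3, which cannot store a single string of 4 GiB or more. Protocol 4 lifts that limit. The sketch below is only a rough size estimate to see how long the concatenated alignment can get before hitting the limit; the helper names and the `header_len` value are hypothetical and are not part of ETE3.

```python
import pickle

# Pickle protocols below 4 cannot store a single string of 4 GiB or more.
FOUR_GIB = 2 ** 32


def estimated_fasta_bytes(n_sequences, n_columns, header_len=20):
    """Rough size of a concatenated FASTA alignment held as one string.

    Each record is counted as '>' + header + newline + sequence + newline.
    header_len is a guessed average header length (hypothetical value).
    """
    return n_sequences * (header_len + n_columns + 2)


def max_columns(n_sequences, header_len=20):
    """Largest alignment length that still stays below the 4 GiB limit."""
    return (FOUR_GIB - 1) // n_sequences - header_len - 2


def dumps_large(obj):
    """Serialize with protocol 4, which supports objects larger than 4 GiB."""
    return pickle.dumps(obj, protocol=4)


if __name__ == "__main__":
    n_species = 36881  # number of species reported in the error message
    print("default pickle protocol:", pickle.DEFAULT_PROTOCOL)
    print("max alignment columns before the limit:", max_columns(n_species))
```

If this estimate is right, trimming the alignment or reducing the number of species would shrink the string below the limit, whereas changing the pickle protocol would mean patching the `zencode` function shown in the traceback, which I have not tried.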
Command 2:
ete3 build -w clustalo_default-trimal01-none-none -m sptree_fasttree_all -o provakaijuas_article --cogs coglist.txt -a multifasta.fa --clearall --cpu 44 --noimg
Output error:
ERR - Job error reported: Job (clustalo---threads-1, 590869)
ERR - Errors found in ConcatAlgTask (36881 species, 40 COGs, ConcatAlg, /cog_all-al...ttree_full)
Traceback (most recent call last):
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/scheduler.py", line 257, in schedule
task.status = task.get_status(qstat_jobs)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 198, in get_status
self.job_status = self.get_jobs_status(sge_jobs)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 285, in get_jobs_status
st = j.get_status(sge_jobs)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 198, in get_status
self.job_status = self.get_jobs_status(sge_jobs)
File "/share/apps/anaconda3.9/envs/ete3/lib/python3.6/site-packages/ete3/tools/ete_build_lib/master_task.py", line 306, in get_jobs_status
raise TaskError(j, "Job execution error %s" %errorpath)
ete3.tools.ete_build_lib.errors.TaskError: Job execution error /mnt/hydra/ubs/shared/projects/Microbioma/16S_vs_Shotgun/analysis/progenomes_tree/markerGenes/prova/provakaijuas_article/tasks/59086998eb4812df0ec04aad9b59541c
INFO - Waiting 2 seconds
INFO - Launched 0 jobs. 22(R), 0(W). Cores usage: 22/44
ERR - Thread cog_all-alg_concat_default-fasttree_full contains errors:
ERR - ** ConcatAlgTask (36881 species, 40 COGs, ConcatAlg, /cog_all-al...ttree_full)
ERR - -> Job (clustalo---threads-1, 590869)
ERR - -> /mnt/hydra/ubs/shared/projects/Microbioma/16S_vs_Shotgun/analysis/progenomes_tree/markerGenes/prova/provakaijuas_article/tasks/59086998eb4812df0ec04aad9b59541c
ERR - -> Job execution error /mnt/hydra/ubs/shared/projects/Microbioma/16S_vs_Shotgun/analysis/progenomes_tree/markerGenes/prova/provakaijuas_article/tasks/59086998eb4812df0ec04aad9b59541c
ERR - Done with ERRORS
Data Error: Errors found in some tasks
Killing 22 running jobs...