Error in the classify_metagenome.sh step from the README file

Divya Inamdar

Jan 17, 2020, 1:19:36 PM
to CLARK Users
I am trying to run all the steps in the README file for CLARK-S. I started with Step 0: I ran the set_targets.sh script and got the bacteria species data in the folder by running the following command:
bash /mnt/HA/groups/rosenGrp/opt/CLARKSCV1.2.6.1/set_targets.sh /lustre/scratch/dsi33/clark/clark-s bacteria --species

For the next step, when I run the command for building the database mentioned in Step 1 of the README file, I get an error.
Command:
bash /mnt/HA/groups/rosenGrp/opt/CLARKSCV1.2.6.1/classify_metagenome.sh -O  /lustre/scratch/dsi33/clark/clark-s/Bacteria/GCF_902502775.1_D03_1.3_genome.fna -R /lustre/scratch/dsi33/clark/clark-s result

I get the following error message:
Please set the targets (cf. the script 'set_targets.sh') before running the classification.

How do I add the database path so that the classify_metagenome.sh script has the location of both the database and the input FASTA sequence?

-Divya

Rachid

Feb 5, 2020, 1:20:19 PM
to CLARK Users
Hi Divya!

Have you tried running the scripts "set_targets.sh" and "classify_metagenome.sh" with the CLARK directory as your working directory?
Try this:
$ cd /mnt/HA/groups/rosenGrp/opt/CLARKSCV1.2.6.1/

then

$ ./set_targets.sh /lustre/scratch/dsi33/clark/clark-s bacteria --species
then

$ ./classify_metagenome.sh -O /lustre/scratch/dsi33/clark/clark-s/Bacteria/GCF_902502775.1_D03_1.3_genome.fna -R /lustre/scratch/dsi33/clark/clark-s result
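
The same two steps can be chained so that the classification only starts if the targets were set successfully. This is a minimal sketch, not part of CLARK: the function name run_clark_steps and its four arguments are hypothetical, and it assumes both scripts are executable inside the CLARK directory.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with CLARK): runs both steps from the
# CLARK directory and stops at the first failure.
# Arguments: CLARK directory, database directory, input FASTA, results path.
run_clark_steps() {
    local clark_dir=$1 db_dir=$2 objects=$3 results=$4
    (
        # Work inside a subshell so the caller's working directory is unchanged.
        cd "$clark_dir" || exit 1
        # Step 0: set the targets (bacteria, species rank).
        ./set_targets.sh "$db_dir" bacteria --species || exit 1
        # Step 1: classify; only reached if Step 0 succeeded.
        ./classify_metagenome.sh -O "$objects" -R "$results"
    )
}

# Example call (RESULTS is a placeholder for your results path):
#   run_clark_steps /mnt/HA/groups/rosenGrp/opt/CLARKSCV1.2.6.1 \
#       /lustre/scratch/dsi33/clark/clark-s \
#       /lustre/scratch/dsi33/clark/clark-s/Bacteria/GCF_902502775.1_D03_1.3_genome.fna \
#       "$RESULTS"
```

The function returns the exit status of whichever step ran last, so a non-zero status from Step 0 shows up immediately instead of as a "Please set the targets" message later.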

Please let me know how it goes.

Best,
Rachid

Divya Inamdar

Feb 6, 2020, 11:47:41 AM
to CLARK Users
Hey,
Yes, I made sure they are in the same directory.
Now I am facing a problem in the exe directory. I guess the executable file is not running.
Could you help me with that?
This is the error message that I get after running the classify_metagenome.sh script:
[dsi33@proteusi01 clark]$ tail step_2_classify_metagenome.sh.o526544
CLARK version 1.2.6.1 (UCR CS&E. Copyright 2013-2019 Rachid Ounit, roun...@cs.ucr.edu)
The program did not find the database files for the provided settings and reference sequences (4958 targets). The program will build them.
Starting the creation of the database of targets specific 31-mers from input files...
 Progress report: (10773/16459)    /lustre/scratch/dsi33/clark/classify_metagenome.sh: line 177: 55936 Segmentation fault      $LDIR/exe/CLARK $PARAMS

Rachid

Feb 6, 2020, 12:12:36 PM
to CLARK Users
Hi Divya,

It seems your system does not have enough memory/RAM. Could you share your server settings? 
Please make sure to follow instructions for reporting bugs/errors provided in this post:

Thank you,
Best,
Rachid

Divya Inamdar

Feb 9, 2020, 6:26:48 PM
to CLARK Users
This was allocated when I ran the script:
### a hard limit 8 GB of memory per slot - if the job grows beyond this, the job is killed
#$ -l h_vmem=160G
### want nodes with at least 6 GB of free memory per slot
#$ -l m_mem_free=158G

Rachid

Feb 9, 2020, 7:36:34 PM
to CLARK Users
Thank you for sharing, Divya.

It seems to me that 160GB/158GB is too low. Please try iteratively with a higher amount, such as 200GB, and see if it works. If not, increase to 250GB, and so on, until it works.
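
One way to step through those values is sketched below. It only prints the qsub commands so each attempt can be submitted by hand once the previous one has failed. The resource names h_vmem and m_mem_free come from the job script shared above, and the 2 GB gap between them mirrors the 160G/158G pair. The function name print_qsub_attempts is hypothetical, and whether command-line requests override the #$ lines inside the script depends on your cluster setup, so treat that as an assumption to verify.

```shell
#!/usr/bin/env bash
# Sketch: print one qsub command per memory step (in GB), smallest first.
# Arguments: job script, then one or more memory sizes in GB.
print_qsub_attempts() {
    local job_script=$1
    shift
    local gb
    for gb in "$@"; do
        # m_mem_free is set 2 GB below h_vmem, as in the original 160G/158G request.
        echo "qsub -l h_vmem=${gb}G -l m_mem_free=$((gb - 2))G $job_script"
    done
}

# Example:
#   print_qsub_attempts step_2_classify_metagenome.sh 200 250 300
```

Submit the first printed command, wait for the job to finish, and move to the next one only if the segmentation fault appears again.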

Best,
Rachid

Divya Inamdar

Feb 9, 2020, 7:37:26 PM
to CLARK Users
Hey,
My server settings are as follows:
#$ -l h_vmem=160G
### want nodes with at least 6 GB of free memory per slot
#$ -l m_mem_free=158G

Divya

Divya Inamdar

Feb 13, 2020, 2:48:16 PM
to CLARK Users
Hey,
I tried running with 200G memory allocation but I still got this error:
CLARK version 1.2.6.1 (UCR CS&E. Copyright 2013-2019 Rachid Ounit, roun...@cs.ucr.edu)
The program did not find the database files for the provided settings and reference sequences (4958 targets). The program will build them.
Starting the creation of the database of targets specific 31-mers from input files...
 Progress report: (10893/16459)    /cm/local/apps/sge/var/spool/ac01n02/job_scripts/527089: line 34: 51773 Segmentation fault      /lustre/scratch/dsi33/clark/exe/CLARK -k 31 -T /lustre/scratch/dsi33/clark/targets.txt -D /lustre/scratch/dsi33/clark/ -O /lustre/scratch/dsi33/clark/Bacteria/GCF_003193965.1_ASM319396v1_genomic.fna -R /lustre/scratch/dsi33/clark/results

Also, I checked how much memory is actually used in this step: it comes out to 66.849G. So despite the higher memory allocation, the run still fails with a segmentation fault. Could you help me with this?
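
For reference, on a Grid Engine cluster a peak-memory figure like this can be read from the job accounting records. The sketch below assumes that accounting is enabled and that `qacct -j` reports a `maxvmem` field. The post does not say how the 66.849G figure was actually obtained, and the function name max_vmem_from_qacct is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extract the peak virtual memory (maxvmem) from `qacct -j <job_id>` output.
# Reads the accounting text on stdin, so it also works on saved output.
max_vmem_from_qacct() {
    awk '$1 == "maxvmem" { print $2 }'
}

# Example (job ID taken from the job_scripts path in the error above):
#   qacct -j 527089 | max_vmem_from_qacct
```

Comparing this value with the h_vmem request shows whether the job came anywhere near its memory limit before it crashed.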

Divya

Rachid

Feb 13, 2020, 4:52:51 PM
to CLARK Users
Hi Divya,

This is strange. Have you tried reporting this to your helpdesk?
Have you repeated this and consistently found this issue? Could you show it to us using several examples?
Thanks for sharing this!

Best,
Rachid

Rachid

Mar 11, 2020, 10:51:51 AM
to CLARK Users
Hi Divya,
Is this issue still happening to you? Could you share with us some updates?
Thank you for your help!