Setting custom target database: "cannot execute binary file" error


Rami Baghdan

Jun 4, 2020, 9:37:16 PM
to CLARK Users
Hello,

After following the instructions to set a custom target database, I get a "cannot execute binary file" warning after running the following command:
./set_targets.sh DIR_DB custom

My custom database contains single reference fasta files in the "accession.fasta" format.
How may I resolve this?

More specifically, this warning comes up after "uncompressing files...", for both the getAccssnTaxID and getfilesToTaxNodes files.


Rachid

Jun 4, 2020, 10:09:43 PM
to CLARK Users
Hi Rami,
Thank you for sharing this!
Before executing "set_targets.sh ..." did you run the installation command?
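
In general, the shell prints "cannot execute binary file" when a file exists but is not in a format the current machine can run, for example when it was built for a different architecture or was never built locally. Whether that is the cause here is an assumption, but it is quick to rule out. A minimal sketch, where `check_binary` is a hypothetical helper name and the path you pass is wherever the program in question lives in your CLARK directory:

```shell
# Report whether a program file exists, is executable, and what format it has.
# Usage: check_binary PATH
check_binary() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 1
    fi
    # "file" describes the binary format (e.g. which architecture it targets), if installed.
    if command -v file >/dev/null 2>&1; then
        file "$1"
    fi
    # Compare against the architecture of the machine you are running on.
    echo "machine: $(uname -m)"
}
```

For example, `check_binary path/to/getAccssnTaxID`. If the reported format does not match the machine, re-running the installation on that machine is the first thing to try.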

Rami Baghdan

Jun 11, 2020, 1:01:36 AM
to CLARK Users
Hello,

I have resolved the issue, thank you.

After running CLARK-S, some of the outputs contain NA for both first and second assignments although scores are present. 

The following commands were run:

./set_targets.sh DIR_DB viruses --species

./buildSpacedDB.sh

./classify_metagenome.sh -O samples.txt -R samples.txt --spaced

where samples.txt contains the file names of the fasta files I am trying to classify.

To combine all the outputs:

cat sample*.csv > sampleALL.csv

I get the following output:



My concern is, why would I get the 2nd assignment for some and not the first assignment, and why do I get both assignments as NA in some cases but there is a value for gamma score and score1 for instance?

Thank you in advance.

Rachid

Jun 11, 2020, 1:09:34 AM
to CLARK Users
Hi Rami,
Please see my answers below. I believe this post can be closed.
Thank you!


On Wednesday, June 10, 2020 at 10:01:36 PM UTC-7, Rami Baghdan wrote:
Hello,

I have resolved the issue, thank you.


Great!
 
After running CLARK-S, some of the outputs contain NA for both first and second assignments although scores are present. 

This can be expected; please make sure to review the provided README file.
 

The following commands were run:

./set_targets.sh DIR_DB viruses --species

./buildSpacedDB.sh

./classify_metagenome.sh -O samples.txt -R samples.txt --spaced

where samples.txt contains the file names of the fasta files I am trying to classify.

To combine all the outputs:

cat sample*.csv > sampleALL.csv

I would be very careful with this method of concatenation: each CLARK results file contains its own header line, so a plain cat copies every one of those headers into the combined file.
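
One way to handle the headers is to keep the header from the first file only and skip it in the rest. This is a sketch, not something from the CLARK documentation: `merge_csv` is a hypothetical helper name, and it assumes every results file has exactly one header line as its first line.

```shell
# Merge CSV files that each start with the same single header line.
# Usage: merge_csv OUTPUT INPUT...
merge_csv() {
    out=$1
    shift
    # FNR == 1 is the first line of each input file; NR == 1 is the first line overall.
    # Print the header once (from the first file) and skip it in every later file.
    awk 'FNR == 1 && NR != 1 { next } { print }' "$@" > "$out"
}
```

For example, `merge_csv merged_results.csv sample*.csv`. The output name here deliberately does not match `sample*.csv`, so that a combined file left over from an earlier run is not picked up as an input.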
 

I get the following output:



My concern is, why would I get the 2nd assignment for some and not the first assignment, and why do I get both assignments as NA in some cases but there is a value for gamma score and score1 for instance?


This is expected given the settings/filtering you have applied (whether they're custom or default), as CLARK-S only reports classifications whose confidence/gamma scores are above the threshold settings.
If the scores for a read do not pass these settings, the result is overridden to NA to void it. Please make sure to review the README file for more.
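
To see how many reads were voided this way, you can count the rows whose assignment column is NA. A minimal sketch: `count_na` is a hypothetical helper name, and the column name you pass (for example `1st_assignment`) is an assumption that must match the header line of your own results file.

```shell
# Count data rows whose value in the named column is exactly "NA".
# Usage: count_na FILE COLUMN_NAME
count_na() {
    awk -F',' -v col="$2" '
        NR == 1 {
            # Locate the column by its header name, ignoring surrounding whitespace.
            for (i = 1; i <= NF; i++) {
                h = $i
                gsub(/^[ \t]+|[ \t\r]+$/, "", h)
                if (h == col) c = i
            }
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        {
            v = $c
            gsub(/^[ \t]+|[ \t\r]+$/, "", v)
            if (v == "NA") n++
        }
        END { if (c) print n + 0 }
    ' "$1"
}
```

For example, `count_na sample1.csv 1st_assignment`, where `sample1.csv` stands for one of your results files.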

 
Thank you in advance.

Anytime!

Best,
Rachid 