Calculate abundances for Custom databases

Jens

Jan 27, 2017, 10:06:41 AM

to CLARK Users

Hi!

I made a custom database with a taxonomy file that looks like:

PATH species_1

PATH species_2

...

The CLARK step works fine. Then I want to calculate the abundances.

1st run: It complained that names.dmp doesn't exist. For this particular database I didn't download the full NCBI taxonomy into the taxonomy folder, though I can do that if needed. Instead, I made an empty names.dmp file and ran again.

2nd run: I get these warning messages:

Loading nodes of taxonomy tree... done

Start retrieving lineage for each target identified (2375)...

Failed to identify species_62236: Unknown taxonomy id given the provided taxonomy database.

...

The program will estimates abundance per taxonomy id.

And I thought, that's fine! However, when I look in the output I get:

Name,TaxID,Lineage,Count,Proportion_All(%),Proportion_Classified(%)

UNKNOWN,UNKNOWN,6173800,100,-

Now, how do I trick CLARK into summarizing by these home-made species categories, which are not related to the NCBI taxonomy? I assume I may have to make my own version of the names.dmp file. What should it look like?

Thanks!

Rachid

Jan 27, 2017, 8:36:20 PM

to CLARK Users

Hello Jens,

You can do this by not using the option "-D" when you call "estimate_abundance.sh". Please see the definition of the parameters in the README file.

Cheers,

Rachid
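
For readers who want to sanity-check per-target counts independently of the taxonomy, here is a minimal Python sketch of the same idea: tally reads per assigned label and report the columns shown in the output header above (Count, Proportion_All, Proportion_Classified). It is not CLARK's own implementation. It assumes the results file is a CSV with a header containing an "Assignment" column and that unclassified reads are labelled "NA"; both are assumptions not confirmed in this thread, and the function name tally_assignments is hypothetical.

```python
import csv
import io
from collections import Counter


def tally_assignments(csv_text, column="Assignment", unclassified="NA"):
    """Count reads per assigned label and compute proportions.

    Assumptions (not confirmed by this thread): the CSV has a header row
    with a column named `column`, and unclassified reads carry the label
    given by `unclassified`.
    Returns a list of (label, count, pct_all, pct_classified) tuples,
    sorted by descending count; pct_classified is None for the
    unclassified label.
    """
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    counts = Counter(row[column].strip() for row in reader)

    total = sum(counts.values())
    classified = total - counts.get(unclassified, 0)

    rows = []
    for label, n in counts.most_common():
        pct_all = 100.0 * n / total if total else 0.0
        if label == unclassified or classified == 0:
            pct_classified = None
        else:
            pct_classified = 100.0 * n / classified
        rows.append((label, n, pct_all, pct_classified))
    return rows


# Made-up example data using the custom labels from the question.
EXAMPLE = """Object_ID,Length,Assignment
read1,100,species_1
read2,100,species_1
read3,100,species_2
read4,100,NA
"""

if __name__ == "__main__":
    print("Name,Count,Proportion_All(%),Proportion_Classified(%)")
    for label, n, pct_all, pct_cls in tally_assignments(EXAMPLE):
        cls = "-" if pct_cls is None else f"{pct_cls:.2f}"
        print(f"{label},{n},{pct_all:.2f},{cls}")
```

Because the labels are counted exactly as they appear in the results, custom names such as species_1 need no names.dmp entry in this sketch.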
