Making sense of classification output from CLARK

Sanjeev Sariya

May 11, 2017, 2:30:13 PM

to CLARK Users

Hi Rachid,

Thanks for the robust support and for the CLARK tool. I'm using CLARK version 1.2.3 to classify 16S (V3-V4 region) reads. The tool runs under the SLURM job scheduler on a node with 250 GB of RAM.

First, I created the bacteria database with the set_targets.sh script, using the defaults with no additional parameters or flags.

Next, I classified my FASTA file with the classify_metagenome.sh script. Upon completion I get a results.csv file with three columns: sequence identifier, length, and assignment.
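For illustration, a file with the three columns described above can be tallied per assignment with a short script. This is only a minimal sketch: it assumes a comma-separated file with one header row and the assignment in the third column. The header names, the function name `tally_assignments`, and the sample rows are made up for the example and are not taken from a real CLARK run.

```python
import csv
import io
from collections import Counter


def tally_assignments(handle):
    """Count reads per assignment in a 3-column CSV (id, length, assignment)."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row (assumed to be present)
    counts = Counter()
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        counts[row[2].strip()] += 1
    return counts


# Made-up sample rows, only to show the assumed shape of the input.
sample = io.StringIO(
    "Object_ID,Length,Assignment\n"
    "read1,250,1280\n"
    "read2,251,1280\n"
    "read3,249,NA\n"
)
counts = tally_assignments(sample)
for taxon, n in counts.most_common():
    print(taxon, n)
```

With a real file, the `io.StringIO` object would be replaced by `open("results.csv", newline="")`. The tally gives read counts per assignment only; it does not provide names or confidence values.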

I'd like to know which species and genera the tool assigned reads to, along with the confidence at each taxonomic rank (kingdom, phylum, class, order, genus, species). Can I retrieve this information from the .csv file?

Looking forward to hearing from you.

Cheers!

Sanjeev

Rachid

May 12, 2017, 7:21:26 PM

to CLARK Users

Hi Sanjeev,

Thank you for your interest! Please consider using "estimate_abundance.sh" to get the taxonomic information for the CLARK results (please consult the README file).

If you have a specific need or feature that "estimate_abundance.sh" does not include then please let us know in detail what it is.

Thank you!

Cheers,

Rachid

Sanjeev Sariya

May 15, 2017, 9:17:36 AM

to CLARK Users

Hi Rachid,

Thanks for your quick response.

I used "estimate_abundance.sh", but it gives me a Krona file for visualization. I'm more interested in getting the taxonomy (scientific names) for results.csv as a text file, with the confidence at each taxonomic rank (phylum, class, order, etc.). Is that something you plan to add to your enhancement list?

Best,

Sanjeev

----------------

Sanjeev Sariya

May 15, 2017, 9:26:21 AM

to CLARK Users

Hi Rachid,

Edit:

Earlier I ran estimate_abundance.sh with the flag for the Krona file. With the results CSV file as input, the script gives me Target_ID (two columns), count, proportion, and proportion classified. Attached.

Is there any way to retrieve the names for each Target_ID (the complete taxonomy) along with the confidence?

Thanks,

Sanjeev

------------------------

On Friday, 12 May 2017 19:21:26 UTC-4, Rachid wrote:

estimate.csv

Sanjeev Sariya

May 15, 2017, 10:42:05 AM

to CLARK Users

I think I got the scientific names with this command: bash estimate_abundance.sh -F result.csv -D db_dir >> estimate.txt

This is great. :)

The default confidence is 0.5.

How do I get the confidence of each assignment in the output of estimate_abundance.sh?

Thanks,

Sanjeev

-------
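On the confidence question: the thread does not say how the score is computed, so the following is only a sketch under a stated assumption. It assumes that a per-read confidence is derived from the hit counts of the two best-matching targets as h1 / (h1 + h2), which would be consistent with 0.5 being the lowest possible value (when both targets tie). The function names and hit counts below are hypothetical; the CLARK README should be checked for the actual definition and for which output mode reports these counts.

```python
def confidence(h1, h2):
    """Return h1 / (h1 + h2), or None when there are no hits at all.

    h1 and h2 are the hit counts of the best and second-best targets.
    The formula itself is an assumption, not confirmed by the thread.
    """
    total = h1 + h2
    if total == 0:
        return None
    return h1 / total


def filter_by_confidence(reads, threshold):
    """Keep (read_id, score) for reads whose score is at least `threshold`."""
    kept = []
    for read_id, h1, h2 in reads:
        score = confidence(h1, h2)
        if score is not None and score >= threshold:
            kept.append((read_id, score))
    return kept


# Hypothetical hit counts, not taken from any real run.
reads = [("read1", 30, 10), ("read2", 12, 12), ("read3", 8, 0)]
kept = filter_by_confidence(reads, threshold=0.75)
print(kept)
```

Under this assumption, a read whose two best targets tie scores exactly 0.5, and a read with hits to a single target scores 1.0.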
