extracting target specific kmers


Juma Ajodeh

Apr 14, 2016, 7:33:43 AM
to CLARK Users

Hey,

I would like to extract target-specific k-mers from the target-specific k-mer databases that CLARK creates; that is, I am interested in finding the k-mers unique to each species. I understand that some k-mers may be shared between the species I have.
With the --tsk option during classification, these target-specific k-mer databases are generated. However, the naming confuses me. For instance, I have three targets (M. floridensis, M. hapla, and M. incognita), occurring in this order in the custom database of targets.
When I request creation of target-specific k-mers, the newly created databases are named 298350_k21.ht, 6305_k21.ht, and 6306_k21.ht, appearing in this order. How can I tell which target-specific k-mer database corresponds to which target?

Regards,
John

Rachid

Apr 14, 2016, 1:58:15 PM
to CLARK Users
Hello John,

This is an excellent question. 
CLARK identifies targets by their taxonomy IDs. It therefore uses each target's taxonomy ID to name the file in which that target's specific k-mers are stored.
For instance, the target-specific k-mers (here you used k=21) for Meloidogyne incognita (taxid = 6306) will be stored in the file 6306_k21.ht.
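
As a rough illustration of that naming scheme, here is a minimal Python sketch that splits a file name such as 6306_k21.ht into its taxonomy ID and k value. The function names and the lookup table are hypothetical and not part of CLARK; only the 6306 entry is taken from this thread, and the other taxonomy IDs would need to be looked up in the NCBI taxonomy.

```python
import re
from pathlib import Path

# File names like "6306_k21.ht" follow the pattern <taxid>_k<k>.ht
HT_NAME = re.compile(r"^(\d+)_k(\d+)\.ht$")

# Hypothetical lookup table; only the 6306 entry is confirmed in this thread.
TAXID_TO_NAME = {6306: "Meloidogyne incognita"}


def parse_ht_name(path):
    """Return (taxid, k) parsed from a file name, or None if it does not match."""
    match = HT_NAME.match(Path(path).name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def label_files(paths, names=TAXID_TO_NAME):
    """Map each matching file name to a species name, or to a placeholder."""
    labels = {}
    for path in paths:
        parsed = parse_ht_name(path)
        if parsed is None:
            continue  # skip files that do not follow the naming pattern
        taxid, _k = parsed
        labels[Path(path).name] = names.get(taxid, f"unknown taxid {taxid}")
    return labels


if __name__ == "__main__":
    for name, label in label_files(["298350_k21.ht", "6305_k21.ht", "6306_k21.ht"]).items():
        print(name, "->", label)
```

Filling TAXID_TO_NAME with the taxonomy IDs of all your targets would label every file at once.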

Cheers,
Rachid

Juma Ajodeh

Apr 15, 2016, 7:45:40 AM
to CLARK Users
Hey Rachid,

Thanks. I later realized that the taxonomy IDs do indeed correspond to the species, after running Krona on the output of estimate_abundance.sh. I need to compare the target-specific k-mers that are present or absent in the species I am analyzing, i.e., get the unique k-mers for each species.
I am not sure how to do it. I will probably need a script for the task, perhaps a Python script that compares the k-mers and extracts the unique ones for each species. I would welcome a suggestion on how to approach it.

Regards,
John

Rachid Ounit

Apr 15, 2016, 1:16:50 PM
to Juma Ajodeh, CLARK Users
Hello John,

I am not sure I fully understand your challenge. What do you need to compare? Could you elaborate?
If you are looking for unique/specific k-mers per species, then these files (with the *.ht extension) contain what you need.
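
To make "specific" concrete, here is a minimal Python sketch of the set logic: a k-mer is specific to a target when it occurs in that target and in no other. It assumes the k-mers are already available as plain Python sets; reading the *.ht files is not covered, since their format is not described in this thread. The function name and the toy 4-mers are made up for illustration.

```python
def specific_kmers(kmers_by_target):
    """For each target, keep only the k-mers that occur in no other target."""
    result = {}
    for target, kmers in kmers_by_target.items():
        # Collect every k-mer seen in any of the other targets
        others = set()
        for other, other_kmers in kmers_by_target.items():
            if other != target:
                others |= set(other_kmers)
        result[target] = set(kmers) - others
    return result


if __name__ == "__main__":
    # Toy 4-mers, invented for this example (not real data)
    toy = {
        "target_A": {"ACGT", "CGTA", "GTAC"},
        "target_B": {"CGTA", "TTTT"},
        "target_C": {"GTAC", "GGGG"},
    }
    for target, kmers in specific_kmers(toy).items():
        print(target, sorted(kmers))
```

With the --tsk output this step should not be needed, because each file already holds only the k-mers specific to its target.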

Best,
Rachid