I've been preparing a custom database for use with CLARK-S and I had a few questions after going through the README.
1.) How are accession numbers in the FASTA data used? The majority of the genomes I'm using have GenBank accession IDs, but a few only have JGI GOLD IDs or no standard database ID yet. In those cases, will it be fine to use any unique identifier?
">accession.number ..." or ">gi|number|ref|accession.number| ..."
2.) For targets_addresses.txt the README recommends NCBI taxonomy IDs as the labels for each genome file. It mentions that any label is fine, but I'm wondering if there will be an issue mixing NCBI IDs and, for example, "Genus_species" labels within the same targets_addresses file.
3.) For setting up a custom database, the README specifies "...one fasta file per reference sequence". The database I'm working with provides a single file containing all the FASTA records for all the genomes, most of which are split into multiple contigs. I'm working on processing the file for use with CLARK, and I just wanted to verify whether CLARK will expect one file per contig, or if I could group multiple contigs for one genome in the same file.
Thanks,
Shareef
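For the preprocessing step in question 3, here is a minimal Python sketch of grouping the records of one combined FASTA file by genome, so that each group could then be written to its own file. It assumes that grouping several contigs in one file is acceptable, which is exactly the open question and is not confirmed anywhere in this thread. The function names and the "GENOMEID|contig" header convention are hypothetical; replace genome_from_prefix with whatever rule matches the real headers.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_fasta_by_genome(
    lines: Iterable[str],
    genome_of: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group FASTA lines by the genome key that genome_of(header) returns."""
    groups: Dict[str, List[str]] = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header starts a new record; decide which genome it belongs to.
            current = genome_of(line[1:])
        elif current is None:
            raise ValueError("sequence data found before the first FASTA header")
        groups[current].append(line)
    return dict(groups)


def genome_from_prefix(header: str) -> str:
    # Hypothetical convention: "GENOMEID|contig_7 free text".
    return header.split()[0].split("|")[0]


# Made-up records standing in for the combined database file.
combined = [
    ">G1|contig_1 example", "ACGT",
    ">G2|contig_1", "TTGA",
    ">G1|contig_2", "GGCC",
]
grouped = group_fasta_by_genome(combined, genome_from_prefix)
for genome_id, records in grouped.items():
    n_contigs = sum(1 for r in records if r.startswith(">"))
    print(genome_id, n_contigs, "contig(s)")
```

Each value in grouped can be written out with "\n".join(records) to a file such as ./genomes/G1.fa. If it turns out that one file per contig is required instead, the same function works with a genome_of that returns the full contig ID.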
If you know the species name or species taxonomy ID of your JGI sequences, can you look up the NCBI accession number? Once you have the accession number, you can overwrite the header of these sequences with it, using the format mentioned.
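As a rough illustration of overwriting the headers, the Python sketch below rewrites each FASTA header into the ">accession.number ..." form, given a lookup table from the existing ID to an accession number. The function name, the table, and the IDs in the example are all made up for illustration; headers without an entry in the table are left unchanged.

```python
from typing import Dict, Iterable, Iterator


def relabel_headers(
    lines: Iterable[str],
    accession_for: Dict[str, str],
) -> Iterator[str]:
    """Yield FASTA lines with the first header token replaced by an accession."""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # Split the header into its first token (the ID) and the description.
            parts = line[1:].split(maxsplit=1)
            old_id = parts[0] if parts else ""
            rest = parts[1] if len(parts) > 1 else ""
            # Fall back to the existing ID when no accession is known.
            new_id = accession_for.get(old_id, old_id)
            yield f">{new_id} {rest}".rstrip()
        else:
            yield line


# Placeholder IDs only; neither is a real JGI or NCBI identifier.
lookup = {"Ga0000001_11": "NC_000000.1"}
records = [
    ">Ga0000001_11 Some organism contig 1",
    "ACGT",
    ">unmapped_id",
    "TTGA",
]
relabeled = list(relabel_headers(records, lookup))
print("\n".join(relabeled))
```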
When you define your targets, you define them for a given taxonomic rank (species by default, if you can), so one identifier (a number, a text string, etc.) is sufficient. But what do you mean by "mixing NCBI IDs"? Can you provide a real case of such a situation?
./genomes/genome1.fa NCBI_ID_1
./genomes/genome2.fa NCBI_ID_2
./genomes/genome3.fa Genus_species
./genomes/genome4.fa NCBI_ID_3
But even if I constructed the targets file myself...
Thanks,
Shareef
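To make the mixed-label case concrete, here is a small Python sketch that builds "path label" lines in the layout shown above and reports which labels are purely numeric (taxonomy-ID-like) and which are text. It only flags the mix; it does not say whether CLARK handles mixed labels, since that is not settled in this thread. The function name and the example labels are hypothetical.

```python
from typing import Dict, List, Tuple


def build_targets(
    entries: List[Tuple[str, str]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return targets-file lines plus the labels sorted into numeric and text."""
    lines: List[str] = []
    kinds: Dict[str, List[str]] = {"numeric": [], "text": []}
    for path, label in entries:
        # The layout is whitespace-separated, so a label cannot contain spaces.
        if not label or any(ch.isspace() for ch in label):
            raise ValueError(f"label for {path!r} is empty or contains whitespace")
        lines.append(f"{path} {label}")
        kinds["numeric" if label.isdigit() else "text"].append(label)
    return lines, kinds


# Made-up labels: two numeric placeholders and one "Genus_species" style label.
entries = [
    ("./genomes/genome1.fa", "1001"),
    ("./genomes/genome2.fa", "1002"),
    ("./genomes/genome3.fa", "Genus_species"),
]
target_lines, label_kinds = build_targets(entries)
print("\n".join(target_lines))
if label_kinds["numeric"] and label_kinds["text"]:
    print("mixed label styles:", label_kinds["text"])
```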
You received this message because you are subscribed to the Google Groups "CLARK Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clarkusers+unsubscribe@googlegroups.com.
To post to this group, send email to clark...@googlegroups.com.
Visit this group at https://groups.google.com/group/clarkusers.
To view this discussion on the web visit https://groups.google.com/d/msgid/clarkusers/37daf4eb-b16f-45d9-90f0-23d87ed15322%40googlegroups.com.