Dear Joany,
You are right: the provided scripts for setting up the database (set_targets.sh) and classifying samples (classify_metagenome.sh) are specific to the NCBI/RefSeq databases. In your case, you want to use a different database.
You can still use CLARK, though, if you skip these scripts and directly call the executable "CLARK" located in the "exe" folder (created during installation).
To use CLARK with data from the SILVA database, you would need to follow these steps (I am assuming you want to work at the species level):
1) Download all the files/sequences into a dedicated directory,
2) Extract every sequence from each downloaded file and store each sequence in its own file in a dedicated directory, called "DIR_DB",
3) Build a two-column file "targets.txt": the first column contains the filenames of the sequences stored in DIR_DB, and the second column contains the identifier associated with each sequence (or its scientific name, with all words of the name joined by a '-' or '_', for example). For example, from the terminal:
$ cat targets.txt
<DIR_DB/SEQUENCE1> <ID_1>
<DIR_DB/SEQUENCE2> <ID_2>
<DIR_DB/SEQUENCE3> <ID_3>
...
Note that <ID_1> and <ID_2> can be the same identifier if <SEQUENCE1> and <SEQUENCE2> have the same identifier at the species level.
I believe that in order to do this, you need to use the SILVA taxonomy definitions.
These are essentially the main steps performed by the script "set_targets.sh", but for NCBI/RefSeq sequences.
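Steps 2) and 3) can be sketched in Python as follows. This is only a minimal sketch, not part of CLARK: the function names, the "seq_000001.fa" file naming, and above all the rule used to derive the species label are my assumptions. Here I assume each FASTA header looks like ">ACCESSION path;of;taxa;Genus species" and take the first two words after the last ';' as the label; please check this against the SILVA taxonomy files and adapt label_from_header accordingly.

```python
import os
import re


def label_from_header(header):
    """Hypothetical labelling rule (an assumption, not a SILVA or CLARK API):
    take the text after the last ';' in the header description and join its
    first two words with '_', e.g. 'Escherichia_coli'."""
    parts = header.split(None, 1)
    description = parts[-1] if parts else ""
    name = description.split(";")[-1].strip()
    words = re.findall(r"[A-Za-z0-9.]+", name)
    return "_".join(words[:2]) if words else "unknown"


def build_targets(fasta_path, db_dir, targets_path, label_fn=label_from_header):
    """Write each record of a multi-FASTA file to its own file in db_dir and
    write one '<file> <label>' line per record to targets_path.
    Returns the number of sequences written."""
    os.makedirs(db_dir, exist_ok=True)
    count = 0
    out = None
    try:
        with open(fasta_path) as fasta, open(targets_path, "w") as targets:
            for line in fasta:
                if line.startswith(">"):
                    # A new record starts: close the previous file, open a new one.
                    if out is not None:
                        out.close()
                    count += 1
                    seq_path = os.path.join(db_dir, "seq_%06d.fa" % count)
                    out = open(seq_path, "w")
                    label = label_fn(line[1:].strip())
                    targets.write("%s %s\n" % (seq_path, label))
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return count
```

For instance, build_targets("silva.fasta", "DIR_DB", "targets.txt") would fill DIR_DB and create targets.txt, where "silva.fasta" stands for whichever file you downloaded. Sequences that receive the same label end up sharing an identifier in the second column, as described in the note above.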
Then run CLARK with default settings. For example, to classify the sample <sampleA.fa> and store the results in <resultsA>:
$ ./exe/CLARK -T targets.txt -D DIR_DB -O <sampleA.fa> -R <resultsA>
CLARK will build the database first if it has not been created yet.
Cheers,
Rachid