what files are okay to delete from taxonomy/?


Drishti Kaul

Mar 8, 2018, 4:06:34 PM
to CLARK Users
Hi Rachid,

I was trying to delete a couple of files from the taxonomy/ directory after the analysis, once the desired database (Bacteria in the example below) has been created, and I was wondering whether I can remove nucl_accss, nucl_gb.accession2taxid, and nucl_wgs.accession2taxid. These files take up a lot of disk space, and it seems that the files needed for the downstream abundance estimation (names.dmp and nodes.dmp, among others) are already archived in taxdump.tar.gz.

So, is it okay to remove everything except the taxdump.tar.gz archive? I couldn't find this in the readme, and the bash scripts I looked at don't reference any files in taxonomy/ other than names.dmp and nodes.dmp, so I wanted to ask before deleting something that might be needed to reproduce the results in the future.

-bash-4.1$ ls db
Bacteria  bacteria_0  files_excluded.txt  targets.txt  taxonomy
-bash-4.1$ ls db/taxonomy/
citations.dmp  delnodes.dmp  division.dmp  gc.prt  gencode.dmp  merged.dmp  names.dmp  nodes.dmp  nucl_accss  nucl_gb.accession2taxid  nucl_wgs.accession2taxid  readme.txt  taxdump.tar.gz

Thanks,
Drishti 

Rachid

Mar 9, 2018, 4:14:29 PM
to CLARK Users
You will need taxdump.tar.gz, as well as nucl_accss.
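
One cautious way to act on this is to move the other files aside instead of deleting them, so the step can be undone if a later run turns out to need one of them. The sketch below is not part of CLARK: the function name `set_aside_taxonomy_extras` and the backup directory are hypothetical, and the only assumption taken from this thread is that taxdump.tar.gz and nucl_accss must stay in place.

```shell
#!/usr/bin/env bash
# Sketch: set aside everything in a taxonomy directory except the two files
# named above. Files are moved, not deleted, so the step is reversible.
# The function name and the backup location are hypothetical.
set_aside_taxonomy_extras() {
    local tax_dir="$1"       # e.g. db/taxonomy
    local backup_dir="$2"    # e.g. db/taxonomy_backup (must be outside tax_dir)
    local f name
    mkdir -p "$backup_dir" || return 1
    for f in "$tax_dir"/*; do
        [ -e "$f" ] || continue               # nothing matched: empty directory
        name=$(basename "$f")
        case "$name" in
            taxdump.tar.gz|nucl_accss) ;;     # keep these two in place
            *) mv -- "$f" "$backup_dir"/ && echo "moved: $name" ;;
        esac
    done
}
```

For example, `set_aside_taxonomy_extras db/taxonomy db/taxonomy_backup` leaves only the two kept files in db/taxonomy. If a rerun of the analysis still works, the backup directory can then be removed to reclaim the space.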

Drishti Kaul

Mar 12, 2018, 4:29:35 PM
to CLARK Users
So it's okay to delete the rest then, everything apart from the zipped taxdump and nucl_accss?