Dear Gavin,
I currently have two issues with PICRUSt2.
1) I would like to predict 16S gene copy numbers using rrnDB instead of your default database, because my analyses indicated that the RRN values in your reference database might be less accurate than those in rrnDB. For this purpose I ran the following command:
picrust2_pipeline.py -s test.fasta -i test.asv.tsv --no_pathways --min_samples 0 -o pic.rrnDB -p 32 --marker_gene_table rrnDB_customREF/16S_rrnDB.txt --ref_dir rrnDB_customREF/ --skip_minpath --edge_exponent 0
The output for the RRN predictions is created as intended, but the command nevertheless ends with the following error message:
Error running this command:
hsp.py --tree /bio/Analysis_data/IOWseq000048_gesifus.field/snakemake_test_20240824/Intermediate_results/metab/pic.rrnDB/tmp/out.tre --output /bio/Analysis_data/IOWseq000048_gesifus.field/snakemake_test_20240824/Intermediate_results/metab/pic.rrnDB/tmp/EC_predicted.tsv.gz --observed_trait_table /bio/Software/anaconda3/envs/picrust2-2.5.1/lib/python3.8/site-packages/picrust2/default_files/prokaryotic/ec.txt.gz --hsp_method mp --edge_exponent 0.0 --seed 100 --processes 32
Apparently, the default EC reference file (ec.txt.gz) does not match the species in my reference tree. As I am not interested in EC predictions, I have so far ignored this error. However, I am currently integrating the PICRUSt2 predictions into a multi-step Snakemake pipeline, and the pipeline stops because of this error. I hoped to fix the problem by adding the flag --no_pathways, but the error is still produced. I get this error with PICRUSt2 versions 2.4.2 and 2.5.1. As I do not have admin rights on our server, I currently cannot try the latest PICRUSt2 version, which is not installed. Any idea how to disable the EC prediction in my command, so that the error message is avoided and the command works within my Snakemake pipeline?
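As a stopgap on the Snakemake side, one option would be to call the pipeline through a small wrapper that accepts a non-zero exit code only when the files actually needed downstream exist. The Python sketch below is an illustration under stated assumptions, not a PICRUSt2 or Snakemake feature: the function name `run_tolerating_failure` and the output path in the usage comment are hypothetical, and masking an exit code this way can hide real failures, so it should only cover outputs that are checked explicitly.

```python
import subprocess
import sys
from pathlib import Path


def run_tolerating_failure(cmd, required_outputs):
    """Run cmd; tolerate a non-zero exit code only if all required outputs exist.

    cmd              -- command as a list of strings (no shell involved)
    required_outputs -- paths that must exist as files after the run (hypothetical names)
    Returns the original exit code of the command.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)

    # Fail loudly if anything needed downstream is missing.
    missing = [str(p) for p in required_outputs if not Path(p).is_file()]
    if missing:
        raise RuntimeError(
            f"Command exited with code {result.returncode}; "
            f"missing outputs: {missing}\n{result.stderr}"
        )

    # Outputs are present: report the error instead of aborting.
    if result.returncode != 0:
        print(
            f"Warning: command exited with code {result.returncode}, "
            "but all required outputs exist.",
            file=sys.stderr,
        )
    return result.returncode


# Hypothetical usage (the output path is a placeholder, not a confirmed file name):
# run_tolerating_failure(
#     ["picrust2_pipeline.py", "-s", "test.fasta", "-i", "test.asv.tsv",
#      "-o", "pic.rrnDB", "--ref_dir", "rrnDB_customREF/"],
#     required_outputs=["pic.rrnDB/<marker prediction file>"],
# )
```

Called from a Snakemake `run:` block or a helper script, such a wrapper would let the rule succeed when the RRN predictions are present and still stop the pipeline when they are not.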
2) I am also running further trait predictions with the hsp.py command. This works well with version 2.3.0_b. Versions 2.4.2 and 2.5.1 produce the same output but take far longer to finish. Could there be a bug in the parallelization of the hsp.py script in newer PICRUSt2 versions?
All the best,
Sara