Error while running picrust2 in qiime2

Sreevatshan Srinivasan

Mar 21, 2021, 1:43:37 AM
to picrust-users

Hi everyone, I am trying to run a PICRUSt2 analysis on my data using the custom-tree-pipeline. I successfully constructed the SEPP tree… and then followed the steps given in the PICRUSt2 tutorial.

qiime picrust2 custom-tree-pipeline --i-table Pyro-dada2-denoise/pyro-LogMPIE-table.qza --i-tree Downloads/Galaxy24-[picrust-tree.qza].qza --p-threads 8 --output-dir PiCRUST2-pyro-LOgMPIE-results --p-max-nsti 2 --p-highly-verbose --verbose
Running the below commands:
hsp.py -i 16S -t /tmp/tmpqtcoxpy6/placed_seqs.tre -p 1 -n -o /tmp/tmpqtcoxpy6/picrust2_out/16S_predicted.tsv.gz -m mp --verbose

Rscript /home/sreevatshan/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/picrust2/Rscripts/castor_nsti.R /tmp/tmpqtcoxpy6/placed_seqs.tre /tmp/tmp5q4j9t3g/known_tips.txt /tmp/tmp5q4j9t3g/nsti_out.txt

Rscript /home/sreevatshan/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/picrust2/Rscripts/castor_hsp.R /tmp/tmpqtcoxpy6/placed_seqs.tre /tmp/tmpgm21nsf1/subset_tab_0 mp FALSE FALSE /tmp/tmpfiluqyri/predicted_counts.txt /tmp/tmpfiluqyri/predicted_ci.txt 100

hsp.py -i EC -t /tmp/tmpqtcoxpy6/placed_seqs.tre -p 8 -n -o /tmp/tmpqtcoxpy6/picrust2_out/EC_predicted.tsv.gz -m mp --verbose

Error running this command:
hsp.py -i EC -t /tmp/tmpqtcoxpy6/placed_seqs.tre -p 8 -n -o /tmp/tmpqtcoxpy6/picrust2_out/EC_predicted.tsv.gz -m mp --verbose

Standard output of the above failed command:

Standard error of the above failed command:
Rscript /home/sreevatshan/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/picrust2/Rscripts/castor_nsti.R /tmp/tmpqtcoxpy6/placed_seqs.tre /tmp/tmpoc1a9qer/known_tips.txt /tmp/tmpoc1a9qer/nsti_out.txt.

Hope to hear from you guys soon...!

Gavin Douglas

Mar 22, 2021, 11:01:13 AM
to picrus...@googlegroups.com
Hi there,

I’m not sure what the problem is, but if you don’t mind sharing your input files with me directly then I can help troubleshoot!


Cheers,

Gavin


Sreevatshan Srinivasan

Mar 22, 2021, 11:17:49 AM
to picrust-users
Hey Gavin, thanks for the quick reply. It would be great if you could share the email address I should send the files to.

Gavin Douglas

Mar 22, 2021, 11:19:19 AM
to picrus...@googlegroups.com
For sure, you can email it to the email in the attached image. 


Cheers,

Gavin

Sreevatshan Srinivasan

Mar 22, 2021, 11:52:49 AM
to picrust-users

Gavin, I have sent the files by email. Please take a look.

Best,
Sreevatshan Srinivasan

Gavin Douglas

Mar 22, 2021, 6:16:37 PM
to picrus...@googlegroups.com
Hey there,

I had a chance to run the command that caused the error and I wasn’t able to reproduce the error.

However, I noticed your BIOM table has ~250,000 OTUs and a fraction of non-zero values of 0.003. I think the error is probably related to issues handling that many OTUs. I would recommend that you filter out rare OTUs, which I think will greatly reduce the number of input OTUs and simplify all of your analyses. I think that the error will be avoided with such a filtered table.
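
As a rough illustration of the filtering suggested above, here is a minimal Python sketch that reports the fraction of non-zero values in a feature table and drops rare features. It assumes the table has been exported to tab-separated text with one feature per row. The function names and the thresholds (`min_total`, `min_samples`) are hypothetical and chosen only for illustration; they are not values from this thread and not part of PICRUSt2 or QIIME 2.

```python
import csv
import io


def read_table(text):
    """Parse a tab-separated feature table: header row, then one feature per row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = [(rec[0], [float(x) for x in rec[1:]]) for rec in reader if rec]
    return header, rows


def nonzero_fraction(rows):
    """Fraction of table cells that are non-zero (the table's density)."""
    cells = sum(len(counts) for _, counts in rows)
    nonzero = sum(1 for _, counts in rows for c in counts if c != 0)
    return nonzero / cells if cells else 0.0


def filter_rare(rows, min_total=10, min_samples=2):
    """Keep features with enough total reads that occur in enough samples.

    min_total and min_samples are hypothetical thresholds, not recommendations.
    """
    kept = []
    for feature_id, counts in rows:
        n_present = sum(1 for c in counts if c > 0)
        if sum(counts) >= min_total and n_present >= min_samples:
            kept.append((feature_id, counts))
    return kept


if __name__ == "__main__":
    # Tiny made-up table: 4 features (ASVs/OTUs) across 3 samples.
    demo = (
        "feature\tS1\tS2\tS3\n"
        "ASV1\t50\t30\t0\n"
        "ASV2\t1\t0\t0\n"
        "ASV3\t0\t0\t2\n"
        "ASV4\t5\t6\t7\n"
    )
    header, rows = read_table(demo)
    kept = filter_rare(rows)
    print(len(rows), "features before filtering,", len(kept), "after")
    print("non-zero fraction before:", round(nonzero_fraction(rows), 3))
    print("non-zero fraction after:", round(nonzero_fraction(kept), 3))
```

On a real dataset the same idea would normally be applied to the BIOM table with existing tooling, and the representative sequences would need to be filtered to the same set of feature IDs.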


Cheers,

Gavin

Sreevatshan Srinivasan

Mar 23, 2021, 2:33:49 PM
to picrust-users
Hey Gavin, I filtered out the rare ASVs as you suggested, but I am still getting the same error. I won't be able to run the full pipeline, as it requires much more memory. Is there any way around this?

Regards,
 
 Sreevatshan Srinivasan

Gavin Douglas

Mar 23, 2021, 3:33:23 PM
to picrus...@googlegroups.com
Hey there,

It took ~24 GB of RAM to run the original test command I tried, and I’m sorry to hear that reducing the number of input OTUs hasn’t resolved your issue. Without access to more RAM (if that is indeed the issue), your best bet is probably to use a different metagenome prediction method, like Piphillin, which might be easier to run with that many OTUs.


All the best,

Gavin

Sreevatshan Srinivasan

Mar 30, 2021, 12:49:43 PM
to picrust-users
Hey Gavin, Update 1 - I tried the Piphillin server, but it reported that the maximum file size was exceeded. I tried removing the rare ASVs from both the sequences and the table, but it shows the same error. Is there any other way to do this analysis? And if possible, could you run the analysis?

Best,

Sreevatshan

Sreevatshan Srinivasan

Mar 30, 2021, 3:56:39 PM
to picrust-users
Hey Gavin, I just have a question. I generated the placement and phylogeny tree using SEPP in QIIME 2 with the reference database given in the tutorial. Can we use the placement tree produced by SEPP and run hsp.py separately? For example, running hsp.py -i 16S -t placed_seqs.tre -o marker_nsti_predicted.tsv.gz -p 1 -n first, and once that completes, running the next commands for EC and KO individually before carrying on to the next step?

Gavin Douglas

Mar 31, 2021, 9:04:03 AM
to picrus...@googlegroups.com
Hey there,

You can use a tree made from an outside pipeline like SEPP, but that’s actually what we were doing originally when we were running hsp.py, so that won’t fix the problem.

Sorry to hear that Piphillin also doesn’t work for your use case. It seems like in either case (running PICRUSt2 or Piphillin) you would need to subset your input files into different sets. For instance you could probably run PICRUSt2 fine if you subset the input FASTA and BIOM into sets of 10,000 ASVs at a time and then ran PICRUSt2 separately on each set.
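
The subsetting idea above could look roughly like the following Python sketch, which splits a set of sequences and the matching table rows into sets of a fixed size. It assumes the FASTA has been read as text lines and the table is held as a mapping from ASV ID to per-sample counts. All function names here are hypothetical helpers, not part of PICRUSt2, and only the set size of 10,000 comes from the suggestion above.

```python
def read_fasta(lines):
    """Yield (seq_id, sequence) pairs from FASTA-formatted lines."""
    seq_id, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(parts)
            # Use the first whitespace-delimited token of the header as the ID.
            seq_id, parts = line[1:].split()[0], []
        else:
            parts.append(line)
    if seq_id is not None:
        yield seq_id, "".join(parts)


def split_into_sets(seqs, table, set_size=10000):
    """Split sequences and their table rows into matching sets.

    seqs: list of (seq_id, sequence); table: dict of seq_id -> per-sample counts.
    Sequences missing from the table are skipped so each pair stays in sync.
    """
    shared = [(sid, seq) for sid, seq in seqs if sid in table]
    sets = []
    for start in range(0, len(shared), set_size):
        batch = shared[start:start + set_size]
        sets.append((batch, {sid: table[sid] for sid, _ in batch}))
    return sets


def to_fasta(batch):
    """Format a list of (seq_id, sequence) pairs as FASTA text."""
    return "".join(f">{sid}\n{seq}\n" for sid, seq in batch)


if __name__ == "__main__":
    # Made-up data: 25 ASVs in 2 samples, split into sets of 10 for the demo.
    seqs = [(f"ASV{i}", "ACGT") for i in range(25)]
    table = {sid: [i, 2 * i] for i, (sid, _) in enumerate(seqs)}
    for n, (batch, sub_table) in enumerate(split_into_sets(seqs, table, set_size=10)):
        print(f"set {n}: {len(batch)} sequences, {len(sub_table)} table rows")
```

Each set would then be written out and run through PICRUSt2 separately. How the per-set outputs should be combined afterwards is not covered in this thread and would need to be checked against the PICRUSt2 documentation.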


Cheers,

Gavin

Sreevatshan Srinivasan

Apr 1, 2021, 9:50:14 AM
to picrust-users
Hey Gavin,

Update-2 - I have generated the final output files for the EC and KO metagenomes. Under EC_metagenome, these files were generated: pred_metagenome_contrib.tsv.gz, seqtab_norm.tsv.gz and pred_metagenome_unstrat.tsv.gz.
Under KO_metagenome, these files were generated: seqtab_norm.tsv.gz and pred_metagenome_unstrat.tsv.gz. Could you confirm that these are the only files that will be generated? Note that pred_metagenome_contrib.tsv.gz was not generated in KO_metagenome.

Best,

Sreevatshan

Gavin Douglas

Apr 1, 2021, 5:00:13 PM
to picrus...@googlegroups.com
Hey there,

Those are the expected output files, but “pred_metagenome_contrib.tsv.gz” would also have been expected in the KO output folder. I think you likely ran out of RAM when running that part of the command, as it can create a very large object.


Cheers,

Gavin
