Picrust Analysis with Nanopore long reads


Vaibhav Wagh

Mar 23, 2022, 7:38:07 AM
to picrust-users
Hi,
I am looking for a way to run functional analysis on 16S long reads. For abundance classification we use the Centrifuge tool and have its output in kreport format; we can also generate a table with TaxID in the first column and per-sample abundance in the remaining columns. Is there a way to get COG and KEGG functional annotations from this data?
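
The TaxID-by-sample table described above could be assembled from per-sample kreport files roughly as follows. This is a minimal sketch, not part of PICRUSt or Centrifuge: it assumes each report is the tab-separated Kraken-style layout (percentage, clade reads, directly assigned reads, rank code, TaxID, name), and the function names `read_kreport`, `merge_kreports`, and `write_table` are hypothetical. It only builds the table; it does not perform any functional annotation.

```python
from collections import defaultdict
from pathlib import Path


def read_kreport(path):
    """Return {taxid: reads assigned directly to that taxon} for one report.

    Assumes Kraken-style columns: percent, clade reads, direct reads,
    rank code, taxid, name (tab-separated).
    """
    counts = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 6 or not fields[2].strip().isdigit():
                continue  # skip blank, malformed, or header lines
            direct_reads = int(fields[2])
            taxid = fields[4].strip()
            if direct_reads > 0:
                counts[taxid] = counts.get(taxid, 0) + direct_reads
    return counts


def merge_kreports(paths):
    """Build a header and rows of [TaxID, count_sample1, count_sample2, ...]."""
    samples = [Path(p).stem for p in paths]  # sample name = file name without extension
    table = defaultdict(lambda: [0] * len(paths))
    for i, path in enumerate(paths):
        for taxid, n in read_kreport(path).items():
            table[taxid][i] = n
    header = ["TaxID"] + samples
    rows = [[taxid] + counts for taxid, counts in sorted(table.items())]
    return header, rows


def write_table(header, rows, out_path):
    """Write the merged table as tab-separated text."""
    with open(out_path, "w") as handle:
        handle.write("\t".join(header) + "\n")
        for row in rows:
            handle.write("\t".join(str(x) for x in row) + "\n")
```

Using the directly assigned read count (third column) rather than the clade count avoids counting the same read at several taxonomic levels; whether that is the right choice depends on the downstream tool.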

Any help is appreciated!
Thanks in advance!

Best,
Vaibhav

Gavin Douglas

Mar 23, 2022, 9:52:51 AM
to picrus...@googlegroups.com
Hey Vaibhav,

I haven’t tried the pipeline with nanopore long reads, but the pipeline is agnostic to how the sequences are created: it will take in any FASTA of 16S rRNA genes. So it should work if you just follow the standard tutorial.


Cheers,

Gavin


Vaibhav Wagh

Mar 24, 2022, 3:36:00 AM
to picrust-users
Hi Gavin,
Thanks for the quick reply.
With the long reads I am seeing poor alignment.
Below are the command and output from a test run on 1000 sequences.
Without the --min_align option the alignment was very poor, so I set it to 0.5, but I faced the same issue.

Another follow-up: the input FASTA sequence headers must not contain spaces (I changed them for the test run). Is there a workaround for that? These are adapter-trimmed FASTA files, so could you also suggest a way to deal with bigger FASTA files?

###
$ place_seqs.py -s ../test.fasta -o out.tre -p 100 --intermediate intermediate/place_seqs --min_align 0.5
Warning - 142 input sequences aligned poorly to reference sequences (--min_align option specified a minimum proportion of 0.5 aligning to reference sequences). These input sequences will not be placed and will be excluded from downstream steps.

Thanks !
Best,
Vaibhav

Gavin Douglas

Mar 24, 2022, 8:08:53 AM
to picrus...@googlegroups.com
Hi there,

You can see here for a quick fix for how to get one field per fasta header:

Is there a reason why that wouldn’t work for your fasta file?
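
For reference, getting one field per FASTA header can be sketched along these lines. This is an illustrative example only, not the linked fix: the function name `clean_fasta_headers` is hypothetical, and it assumes the sequence ID is the first whitespace-delimited field of each header. It streams the file line by line, so it should also cope with larger FASTA files.

```python
def clean_fasta_headers(in_path, out_path):
    """Copy a FASTA file, keeping only the first whitespace-delimited
    field of each header line. Returns the number of records written.
    """
    n_records = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:  # streamed, so memory use stays small
            if line.startswith(">"):
                fields = line[1:].split()
                seq_id = fields[0] if fields else ""
                dst.write(">" + seq_id + "\n")
                n_records += 1
            else:
                dst.write(line)  # sequence lines are copied unchanged
    return n_records
```

Note that the sketch does not check whether the shortened IDs are still unique; if two headers share the same first field, the output would need a further fix.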


Cheers,

Gavin

Vaibhav Wagh

Mar 25, 2022, 3:03:34 AM
to picrust-users
Hi,
I think the issue is with the alignment of the long reads; we have tried changing min_align and also the FASTA headers. The alignment still seems to be poor, and only around 40% of reads were classified with min_align 0.5.
Is there a way I could use a taxonomy abundance table (i.e. TaxIDs in the first column and absolute read abundance per sample in the remaining columns) for the functional annotation and classification?
Thanks!

Best,
Vaibhav

Gavin Douglas

Mar 25, 2022, 9:20:33 AM
to 'cervant...@licifug.ugto.mx' via picrust-users
Hi there,

I see, that is definitely an issue then, and I'm not sure how to get around it with the current reference sequences. You could try an alternative tool such as Piphillin instead, to see if the alignment works. PanFP is a tool for getting metagenomic predictions based on taxonomic labels, but I'm not sure whether it has been updated recently.


All the best,

Gavin

Vaibhav Wagh

Mar 28, 2022, 12:50:16 AM
to picrust-users
Thank you for your response!
I have been a PICRUSt user for a while and really appreciate it.
Looking forward to an upgrade soon!
Wishing you the best,
Vaibhav