Is there a way to extract the names of closest references in the tree?


Adi Lavy

Feb 26, 2019, 2:08:02 PM
to pplacer users
Hi All!
I've been using pplacer and guppy to place a set of sequences on a reference tree. The reference tree does not contain any taxonomy information, just names of organisms.
I would like to know whether pplacer can create a table (or list) that shows, for each query sequence, the N closest reference sequences.

Thank you!
Adi

Greg G

Feb 26, 2019, 8:06:11 PM
to pplace...@googlegroups.com
Hello,

I had this same question, and there is a function in guppy called "to_csv" that does the job by creating a table of best hits.

Example usage:

~/pplacer-Linux-v1.1.alpha19/guppy to_csv Input.jplace -o Output_table.csv

It helps to first run pplacer and have it output only the top hit for each sequence (rather than up to 4 matches, as is the default). That can be done like so:

~/pplacer-Linux-v1.1.alpha19/pplacer -c example.refpkg aln.fasta* --keep-at-most 1

~Greg


Adi Lavy

Mar 8, 2019, 7:36:45 PM
to pplacer users
Hi Greg,
Thank you for your answer.
I followed your advice, but I did not restrict pplacer to a single match, since I can make use of all 4 matches.
The output table has the following format:

origin,name,multiplicity,edge_num,like_weight_ratio,post_prob,likelihood,marginal_like,distal_length,pendant_length,classification,map_ratio,map_overlap,map_identity
query.15rp,Query_PLM0_60_coex_sep16_Abawaca1_113,1,2201,1,1,-5.89562e+06,-5.89563e+06,0.189797,0.192744,NA,NA,NA,NA

How can I get the actual names of the best matches from this table?

Thanks!

Adi

Greg G

Mar 10, 2019, 9:59:05 AM
to pplace...@googlegroups.com
Sorry, you're right, I see now that I have not fully addressed your question. Unfortunately I do not have an answer, as I resorted to manually replacing the branch numbers in the tree with taxon names of interest.

Wish I could be of more help,

~Greg
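
The manual step Greg describes (replacing branch numbers with taxon names) can in principle be scripted. Below is a minimal Python sketch, not an official pplacer or guppy feature. It assumes the .jplace file follows the published jplace JSON layout, with a "tree" string in which every edge carries its number in curly braces (for example A:0.2{0}), a "fields" list that contains "edge_num" and "like_weight_ratio", and a "placements" list whose entries have "p" plus "nm" or "n". It also assumes leaf names are unquoted and contain no parentheses, commas, colons, or braces. The function names edge_to_leaves and closest_references and the tiny example data are hypothetical, made up for illustration.

```python
import re

# One Newick label: optional name, optional ":length", optional "{edge_num}".
LABEL = re.compile(r"^(?P<name>[^:{]*)(?::[^{]*)?(?:\{(?P<edge>\d+)\})?$")
PUNCT = {"(", ")", ",", ";"}


def edge_to_leaves(tree: str) -> dict[int, list[str]]:
    """Map each jplace edge number to the leaf names in the subtree below it."""
    result: dict[int, list[str]] = {}
    stack: list[list[str]] = []   # one leaf list per currently open "("
    closed = None                 # leaves of the subtree just closed by ")"

    def attach(leaves, edge):
        if edge is not None:
            result[int(edge)] = leaves
        if stack:
            stack[-1].extend(leaves)

    for token in re.findall(r"[(),;]|[^(),;]+", tree.strip()):
        if token in PUNCT:
            if closed is not None:        # internal node that had no label
                attach(closed, None)
                closed = None
            if token == "(":
                stack.append([])
            elif token == ")":
                closed = stack.pop()
            continue
        token = token.strip()
        if not token:
            continue
        match = LABEL.match(token)
        if match is None:
            raise ValueError(f"unsupported Newick label: {token!r}")
        leaves = closed if closed is not None else [match["name"]]
        closed = None
        attach(leaves, match["edge"])
    return result


def closest_references(jplace: dict) -> list[tuple]:
    """Return (query, edge_num, like_weight_ratio, leaf names) rows, best LWR first."""
    fields = jplace["fields"]
    i_edge = fields.index("edge_num")
    i_lwr = fields.index("like_weight_ratio")
    leaves = edge_to_leaves(jplace["tree"])
    rows = []
    for placement in jplace["placements"]:
        names = [name for name, _ in placement.get("nm", [])]
        names += list(placement.get("n", []))
        ranked = sorted(placement["p"], key=lambda p: p[i_lwr], reverse=True)
        for p in ranked:
            for name in names:
                rows.append((name, p[i_edge], p[i_lwr], leaves[p[i_edge]]))
    return rows


# Hypothetical toy data, only to show the expected shapes.
example = {
    "tree": "((A:0.2{0},B:0.09{1}):0.7{2},C:0.5{3}){4};",
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {"p": [[1, -100.0, 0.3, 0.01, 0.02], [2, -99.0, 0.7, 0.1, 0.05]],
         "nm": [["query_1", 1]]},
    ],
}

for row in closest_references(example):
    print(*row, sep="\t")
```

For a real run, the dictionary could be loaded with json.load from the same Input.jplace that was passed to guppy to_csv. Two caveats: a placement on a leaf edge names exactly one reference, but a placement on an internal edge yields every reference in the clade below that edge rather than a ranked list of N nearest sequences, and which side counts as "below" depends on how the tree in the file is rooted.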