logprob ranking for partition returns


Evi Van Itallie

Dec 7, 2022, 10:05:00 AM
to partis
Hello!
I have a question about how the partition command returns the likely partitions and annotation. 

The documentation on the partition annotations says:

"The partition action writes a list of the most likely partitions and their relative likelihoods, as well as annotations for each cluster in the most likely partition." 

The documentation also says: "Partitioning results in a list of partitions, with one line for the most likely partition (the one with the lowest logprob), as well as a number of lines for the surrounding less-likely partitions."

However, when we use the partition command, both the yaml and csv outputs show a disconnect between the partition that is returned in the annotation and the one that would be considered "most likely" or "lowest" based on the logprob values.

Here is an example. The first row has the lowest logprob; however, the partition returned in the annotation is the one with a logprob of "-866.083336." We compared the result with a different clonal partitioning method, and the clones returned in the annotation are the same ones that this other method returns. They are also what makes sense when we look at the sequences. Thus, there seems to be a disconnect between the logprob numbers and what is actually considered the most likely partition.

logprob         n_clusters    partition
-896.113272     4             [[M01438_176_000000000-KGJG2_1_1106_22055_7729...
-882.433441     3             [[M01438_176_000000000-KGJG2_1_1106_22055_7729...
-866.088336     2             [[M01438_176_000000000-KGJG2_1_1106_22055_7729...

If we try another example that returns more partitions, we again see that the partition that is annotated is the one with the least negative logprob, not the most negative logprob.

We would love an explanation of what is happening here! Do you mean something different by "lowest" than the most negative? Probabilities closer to 1 should give larger log probabilities than numbers closer to 0.

Thank you so much!
Elizabeth Van Itallie, PhD


Duncan Ralph

Dec 7, 2022, 11:28:17 AM
to Evi Van Itallie, partis
Whoops! Yes, you're completely right, "lowest" is incorrect, it should say "highest", since as you say that's the one with the highest probability. I just pushed the fix to dev, but it'll take a bit to propagate to the main branch.
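
To make the ordering concrete, here is a minimal Python sketch using the logprob values from the table in the question. It is not partis code and does not read partis output files; the variable names are hypothetical, and it assumes the logprobs are natural logarithms.

```python
import math

# (logprob, n_clusters) pairs copied from the table in the question.
rows = [
    (-896.113272, 4),
    (-882.433441, 3),
    (-866.088336, 2),
]

# The most likely partition has the highest (least negative) logprob,
# since log(p) increases with p.
best_logprob, best_n_clusters = max(rows, key=lambda row: row[0])

# Likelihood of each partition relative to the best one,
# assuming natural-log values: exp(logprob - best_logprob).
relative = [math.exp(logprob - best_logprob) for logprob, _ in rows]

print(best_logprob, best_n_clusters)
print(relative)
```

Under these assumptions the 2-cluster partition is selected, which matches the partition that the annotation reports.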

Thanks!
