Best way to handle a probabilistic data structure.

86 views
Skip to first unread message

thomas...@gmail.com

unread,
Oct 18, 2024, 11:31:46 AM10/18/24
to tlaplus
I wanted to simulate something like a Bloom Filter. For those unaware, a Bloom Filter is kind of like a set, but when looking up a value in a set, there's a risk of a false positive, but never a risk of a false negative.

I'm not terribly concerned with the actual implementation of this, so I just wanted to make an operator like "IsInFilter(value, filter)", where filter is actually just a set, and if value is in the set it might return TRUE or FALSE, but if it's not in the set it always returns FALSE.

I'm having a little trouble expressing the non-determinism with this.  My initial implementation was something like this:


IsInSet(value, mySet) ==
    \E result \in {TRUE, FALSE} :
        (value \in mySet => result) /\ (value \notin mySet => FALSE)

But I think the existential quantifier in this case is tautological.  Does anyone have any ideas on the best way of handling this?

thomas...@gmail.com

unread,
Oct 18, 2024, 12:17:05 PM10/18/24
to tlaplus
I just realized that this might be a malformed question; it probably makes sense to express this in the "next-state" level instead of the operator level.

Andrew Helwer

unread,
Oct 18, 2024, 1:16:40 PM10/18/24
to tlaplus
If you want to analyze the actual probability properties of something like this, you may be more interested in PRISM. It supports modeling both Discrete-Time Markov Chains (DTMCs) which have a probabilistic state transition out of every state, and a mixed nondeterminism/probabilistic transition model called a Markov Decision Process (MDP). I wrote about the differences between these two models here: https://ahelwer.ca/post/2020-09-11-probabilistic-distsys/

If you are more interested in exploring this problem with TLA+ I recommend taking a look at this spec authored by Markus Kuppe and Ron Pressler: https://github.com/tlaplus/Examples/tree/master/specifications/KnuthYao

Andrew Helwer

A. Jesse Jiryu Davis

unread,
Oct 18, 2024, 1:32:24 PM10/18/24
to tla...@googlegroups.com

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/5f6efaa9-10bd-48a6-aa00-be4975d75741n%40googlegroups.com.

thomas...@gmail.com

unread,
Oct 18, 2024, 2:01:52 PM10/18/24
to tlaplus
I don't care about the probabilities, I just want to express that false positives are possible to ensure my model works correctly when dealing with them, so I don't think I need to use PRISM or anything like that. As I mentioned, I think trying to avoid any kind of "next-state" semantics on this was probably the wrong approach anyway.

Most of my model is using PlusCal so I think I can express what I need with an `either` or a `with`. 
Reply all
Reply to author
Forward
0 new messages