Hi all,
I have a PState that maps a term to a set of document IDs (basically an inverted index). I would like to perform set arithmetic (intersection, union) on two or more subindexed sets belonging to the same PState.
MicrobatchTopology indexing = topologies.microbatch("indexing");
indexing.pstate("$$termToDocIdPostings", PState.mapSchema(String.class, PState.setSchema(String.class)).subindexed());
How can I achieve this?
One rough idea I have is to use a query topology's in-memory PState to perform the set operations incrementally. I'm not sure whether this is the right approach. Here's what I've tried so far:
topologies.query("intersectPostings", "*term1", "*term2").out("*res")
    .hashPartition("*term1")
    .localSelect("$$termToDocIdPostings", Path.collect(
        Path.multiPath(
            Path.key("*term1"),
            Path.key("*term2"))
        .all())).out("*allPostings")
    .localTransform("$$intersectPostings$$", Path.termVal("*allPostings"))
    .each(Ops.PRINTLN, "Intersection:", "*allPostings");
I'm not sure how to proceed from here to compute the set intersection of two or more sets.
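To make the intended semantics concrete, here is a minimal plain-Java sketch of the result I'm after, using ordinary in-memory sets rather than anything Rama-specific. The class and method names (PostingSets, intersect, union) are hypothetical and only for illustration; this says nothing about how the operation should be expressed in a query topology.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Rama: shows the desired set arithmetic only.
final class PostingSets {
    // Intersection of any number of posting sets.
    // Starts from the smallest set so the working result stays small.
    static Set<String> intersect(List<Set<String>> postings) {
        if (postings.isEmpty()) {
            return new HashSet<>();
        }
        List<Set<String>> sorted = new ArrayList<>(postings);
        sorted.sort((a, b) -> Integer.compare(a.size(), b.size()));
        Set<String> result = new HashSet<>(sorted.get(0));
        // Stop early once the intersection is empty.
        for (int i = 1; i < sorted.size() && !result.isEmpty(); i++) {
            result.retainAll(sorted.get(i));
        }
        return result;
    }

    // Union of any number of posting sets.
    static Set<String> union(List<Set<String>> postings) {
        Set<String> result = new HashSet<>();
        for (Set<String> p : postings) {
            result.addAll(p);
        }
        return result;
    }
}
```

For example, intersecting the postings {d1, d2, d3}, {d2, d3} and {d3, d4} should give {d3}, and their union should give {d1, d2, d3, d4}. What I can't work out is how to get the same result from the subindexed sets without materializing them all in memory first.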
Thanks,
Sashi