Hi,
I have a hierarchical clustering problem with three levels of clustering: clusters of within-document mentions (lowest level), cross-document links between these clusters (middle level), and document clustering based on the previous two levels (topmost level). I want to implement this in an alternating sampling fashion, similar to what was described in this paper. (Hopefully this is doable in factorie.)
I can run the code for HierCoref using the instructions here. The problem is that I am not very familiar with factorie's primitives, and understanding the code is somewhat difficult (my Scala is shaky at best).
In particular,
1. What exactly is a canopy? Is it akin to a cluster ID?
2. What is the difference between CanopyPairGenerator and DeterministicPairGenerator?
3. What is a cubbie?
4. Why do you have nonexistentEnts in the CanopyPairGenerator? This seems strange, as I cannot see when a node would be a nonexistent mention.
5. How do I propagate features from child to parent?
6. How do I learn the model parameters and then use them for inference?
Is there someone who is currently maintaining the coreference code who can guide me? I will be extremely grateful for your help.
Thanks