First of all, thanks for the useful tool.
I have been experimenting with CLAVIN. We have address texts for various companies, and I am trying to resolve a location from each of them. Some examples:
1: 200 Gardner Steel Conference Center Thackeray & O'Hara Streets , Pittsburgh , United States
2: 30765 Pacific Coast Highway , Suite 435, Malibu, Los Angeles, Ca 90265 , California City
3: 5th Avenue , New York , United States
As expected, CLAVIN resolves multiple locations from such texts, e.g. in case 1 it extracts Pittsburgh and United States. To get the most specific lat-long, I choose the first location resolved, i.e. I would choose Pittsburgh in case 1 since it is more specific than United States.
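For reference, the selection rule I describe amounts to roughly the following. This is only a minimal sketch: the `Loc` class and the `pickFirst` method are hypothetical stand-ins made up for illustration, not CLAVIN classes, and the character offsets are the ones CLAVIN reports for text 1.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class FirstLocationPicker {

    // Hypothetical stand-in for one resolved location; NOT a CLAVIN class.
    static final class Loc {
        final String name;
        final int position; // character offset of the match in the input text

        Loc(String name, int position) {
            this.name = name;
            this.position = position;
        }
    }

    // Returns the location that appears earliest in the text, if there is one.
    static Optional<Loc> pickFirst(List<Loc> resolved) {
        return resolved.stream().min(Comparator.comparingInt(l -> l.position));
    }

    public static void main(String[] args) {
        // The two locations expected for text 1, deliberately listed out of order.
        List<Loc> resolved = List.of(
                new Loc("United States", 78),
                new Loc("Pittsburgh", 65));
        // Prints "Pittsburgh", the match with the smallest offset.
        System.out.println(pickFirst(resolved).map(l -> l.name).orElse("none"));
    }
}
```

The rule relies purely on the order of matches in the text, so it is only as good as the earliest match.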
There is a problem in some cases, though, such as text 1. CLAVIN resolves three locations from it:
Resolved "Gardner" as: "Gardner" {Gardner (United States, MA) [pop: 20228] <4937557>}, position: 4, confidence: 1.000000, fuzzy: false
Resolved "Pittsburgh" as: "Pittsburgh" {Pittsburgh (United States, PA) [pop: 305704] <5206379>}, position: 65, confidence: 1.000000, fuzzy: false
Resolved "United States" as: "United States" {United States (United States, 00) [pop: 310232863] <6252001>}, position: 78, confidence: 1.000000, fuzzy: false
As you can see, the first location extracted is wrong, so my preference for the most specific location means I end up choosing the wrong one. I would like to verify the hierarchy. Is there any way I can do that?
I explored the CLAVIN classes and found a relevant package, "multipart". I tried the following methods with fuzzy set to true:
public ResolvedLocation resolveLocation(String loc, boolean fuzzy)
public ResolvedLocation resolveLocation(boolean fuzzy,String... locationParts)
The first method does not resolve any location, and the second method only returns "United States" for both texts.
Can anyone point me to how this issue can be solved?