Resolving locations from comma-separated address text


pratik patel

Jan 22, 2015, 10:47:14 AM
to clavin...@googlegroups.com
First of all, thanks for the useful tool.

I have been experimenting with CLAVIN. We have address text for various companies, and I am trying to resolve a location from text such as the following.

1: 200 Gardner Steel Conference Center Thackeray & O'Hara Streets , Pittsburgh , United States
2: 30765 Pacific Coast Highway , Suite 435, Malibu, Los Angeles, Ca 90265 , California City
3: 5th Avenue , New York , United States

As expected, CLAVIN resolves multiple locations from such text. For example, in case 1 it extracts Pittsburgh and United States. To get the most specific lat-long, I choose the first location resolved, i.e., I would choose Pittsburgh in case 1 since it's more specific than United States.

There is one problem, though, in cases like text 1: CLAVIN resolves three locations from it.

Resolved "Gardner" as: "Gardner" {Gardner (United States, MA) [pop: 20228] <4937557>}, position: 4, confidence: 1.000000, fuzzy: false
Resolved "Pittsburgh" as: "Pittsburgh" {Pittsburgh (United States, PA) [pop: 305704] <5206379>}, position: 65, confidence: 1.000000, fuzzy: false
Resolved "United States" as: "United States" {United States (United States, 00) [pop: 310232863] <6252001>}, position: 78, confidence: 1.000000, fuzzy: false

As you can see, the first location extracted is wrong, so my preference for the first (supposedly most specific) location ends up choosing the wrong one. I would like to verify the hierarchy. Is there any way I can do that?

I explored CLAVIN's classes and found a relevant package, "multipart". I tried the following methods with fuzzy set to true.

public ResolvedLocation resolveLocation(String loc, boolean fuzzy)
public ResolvedLocation resolveLocation(boolean fuzzy,String... locationParts)

The first method does not resolve any location, and the second method returns only "United States" for both texts.

Can anyone point me to how this issue can be solved?
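
As an illustration of the hierarchy check asked about above, here is a minimal Java sketch. It does not use CLAVIN's API: `Candidate` and `AddressHeuristic` are hypothetical names, and the fields simply mirror what the resolver output above prints (name, country, admin code, position). It assumes the address is written from specific to general, and that an admin code of "00" marks a country-level entry, as it does in the output above. Under those assumptions it keeps only candidates in the same country as the country-level match and picks the one mentioned latest in the text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class AddressHeuristic {

    // Hypothetical stand-in for one resolved location; not a CLAVIN class.
    static final class Candidate {
        final String name;
        final String country;
        final String admin1;  // "00" is assumed to mark a country-level entry
        final int position;   // character offset of the match in the input text

        Candidate(String name, String country, String admin1, int position) {
            this.name = name;
            this.country = country;
            this.admin1 = admin1;
            this.position = position;
        }

        boolean isCountry() {
            return "00".equals(admin1);
        }
    }

    // Returns the latest-mentioned sub-country candidate that agrees with the
    // country-level candidate, falling back to the country itself if none does.
    static Optional<Candidate> mostSpecific(List<Candidate> candidates) {
        Candidate country = null;
        for (Candidate c : candidates) {
            if (c.isCountry()) {
                country = c;
            }
        }
        Candidate best = null;
        for (Candidate c : candidates) {
            if (c.isCountry()) {
                continue;
            }
            // Drop candidates that contradict the country named in the text.
            if (country != null && !c.country.equals(country.country)) {
                continue;
            }
            if (best == null || c.position > best.position) {
                best = c;
            }
        }
        return Optional.ofNullable(best != null ? best : country);
    }

    public static void main(String[] args) {
        // The three candidates reported for text 1 above.
        List<Candidate> resolved = Arrays.asList(
            new Candidate("Gardner", "United States", "MA", 4),
            new Candidate("Pittsburgh", "United States", "PA", 65),
            new Candidate("United States", "United States", "00", 78));
        mostSpecific(resolved).ifPresent(c -> System.out.println(c.name));
    }
}
```

For text 1 this selects Pittsburgh rather than Gardner, but only because Pittsburgh sits closer to the country name. It is a positional heuristic, not a real address parser, and it would still be fooled by a street name that appears right before the country.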

Makiko Shukunobe

Jan 22, 2015, 11:11:47 AM
to pratik patel, clavin...@googlegroups.com
Hi Pratik,

I think what you need is a geocoder instead of CLAVIN, since you have full addresses to map to locations.

About Geocoding:

Geocoders (many more are available; the links below are just a few):




--
Research Associate
NC State University,
Center for Geospatial Analytics
Box 7106, Jordan Hall 5117
Raleigh, NC 27695
Makiko Shukunobe

Charlie Greenbacker

Jan 22, 2015, 11:14:35 AM
to clavin...@googlegroups.com, prat...@gmail.com
Yes, Makiko is correct. It sounds like Pratik needs a geocoder, not a geoparser like CLAVIN.

CLAVIN's multipart package is designed to work with tables/spreadsheets that have separate columns for city, province, country, etc. -- not fully-specified street addresses.

- Charlie

pratik patel

Jan 23, 2015, 12:41:28 AM
to clavin...@googlegroups.com, prat...@gmail.com
What we are trying to process is not necessarily structured address text like the examples I gave in the original post. We have semantic data and want to resolve locations from unstructured text, but there is one specific field that I know will contain a structured street address. So I guess I am going to need CLAVIN anyway. I will play with it a little more and see if it can work for structured street address. If not, then I will use CLAVIN for unstructured data and some geocoder for structured data.

Thanks for the quick help!

Charlie Greenbacker

Jan 23, 2015, 3:10:48 PM
to pratik patel, clavin...@googlegroups.com
Pratik,

> I will play with it a little more and see if it can work for structured street address.

It won't work -- CLAVIN is not designed to resolve street addresses. At best, it will give you the coordinates of the correct city. At worst, it will try to resolve the street name to some geopolitical entity (such as in your example).

> If not, then I will use CLAVIN for unstructured data and some geocoder for structured data.

This is definitely the way to go. Give structured street addresses to a geocoder. Give unstructured data (i.e., sentences, or city/state/country tuples) to CLAVIN.

- Charlie
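
The split recommended above (street addresses to a geocoder, everything else to CLAVIN) could be wired up roughly as follows. This is only a sketch: `LocationRouter` and `FieldKind` are hypothetical names, and the two `Function` arguments stand in for whatever geocoder client and CLAVIN wrapper are actually used. It assumes each input field is already known to be either a street address or free text, which matches the earlier remark that one specific field always holds a structured address.

```java
import java.util.function.Function;

final class LocationRouter {

    // How the caller labels each input field (hypothetical classification).
    enum FieldKind { STREET_ADDRESS, FREE_TEXT }

    // Placeholders for the real services; both map input text to some result string.
    private final Function<String, String> geocoder;
    private final Function<String, String> geoparser;

    LocationRouter(Function<String, String> geocoder,
                   Function<String, String> geoparser) {
        this.geocoder = geocoder;
        this.geoparser = geoparser;
    }

    // Structured street addresses go to the geocoder, free text to the geoparser.
    String resolve(FieldKind kind, String text) {
        return kind == FieldKind.STREET_ADDRESS
            ? geocoder.apply(text)
            : geoparser.apply(text);
    }

    public static void main(String[] args) {
        // Stub backends that only tag the input, so the routing is visible.
        LocationRouter router = new LocationRouter(
            text -> "geocoder:" + text,
            text -> "geoparser:" + text);
        System.out.println(router.resolve(
            FieldKind.STREET_ADDRESS, "5th Avenue , New York , United States"));
        System.out.println(router.resolve(
            FieldKind.FREE_TEXT, "The company opened an office in Pittsburgh."));
    }
}
```

Keeping the two backends behind plain functions means either one can be swapped without touching the routing logic.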
