Managing geocoordinates

20 views
Skip to first unread message

Parthasarathi Mukhopadhyay

unread,
May 24, 2021, 10:19:16 PM5/24/21
to openr...@googlegroups.com
Dear all

I'm facing a problem in formatting geocoordinate data available in decimal degrees (fetched from Nominatim and Geonames) for indian place name according to MARC 21 instruction (it says data must be in the format hddd.dddddd (hemisphere-degrees.decimal degrees).

The data as obtained varies like this (longitude for example) -

Extracted


To be formatted


93.9

93.900000

92.78333

92.783330

93

93.000000

92.9155

92.915500

92.65234

92.652340

92.5

92.500000

77.05424

77.054240

78.02168

78.021680

76.9776

76.977600

77.60813

77.608130

78.58704

78.587040

77.77038

77.770380

77.136327899

77.136327


Following a primitive way in "cell transfor" repeatedly to get rid of this issue like

if(value.split(".")[1].length()==1, value+"00000",value)

if(value.split(".")[1].length()==2, value+"0000",value)

if(value.split(".")[1].length()==3, value+"000",value)

if(value.split(".")[1].length()==4, value+"00",value)

if(value.split(".")[1].length()==5, value+"0",value)


But I still need to do some hand-crafted editing when there are more than 6 places after a decimal and when there are no decimal values at all.

Obviously, not a way to deal with more than half-a-million Indian place names.


Any suggestions for doing it more elegantly?


Best regards



-----------------------------------------------------------------------
Parthasarathi Mukhopadhyay
Professor, Department of Library and Information Science,
University of Kalyani, Kalyani - 741 235 (WB), India
-----------------------------------------------------------------------

Yael Netzer

unread,
May 25, 2021, 2:13:44 AM5/25/21
to openr...@googlegroups.com
In principle, you can edit the cells with the expression - which will add zeros completing 6 digits

value + "000000".substring(value.split(".")[1].length())


then need to take care of the two exceptions - when there is no dot and when there are more than 6 digits after it.

value.split(".")[0] + "." + value.split(".")[1].substring(0,6)

and all the cases where there are no dot at all
value + ".000000"

Putting it all together:

if (value.contains("."),
    if (value.split(".")[1].length() > 6,
        value.split(".")[0] + "." + value.split(".")[1].substring(0,6),
         value + "000000".substring(value.split(".")[1].length())),
    value + ".000000")



image.png

I hope this is understandable and helpful

Yael


On Tue, 25 May 2021 at 05:19, Parthasarathi Mukhopadhyay <psmukho...@gmail.com> wrote:
Dear all

I'm facing a problem in formatting geocoordinate data available in decimal degrees (fetched from Nominatim and Geonames) for indian place name according to MARC 21 instruction (it says data must be in the format hddd.dddddd (hemisphere-degrees.decimal degrees).

The data as obtained varies like this (longitude for example) -
e

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openrefine/CAGM_5uY%3D%2BUw7xAb41moSSbJcL%2ByVmLT6gPccM9axJ7S%2B14Nv2g%40mail.gmail.com.

Parthasarathi Mukhopadhyay

unread,
May 25, 2021, 6:13:46 AM5/25/21
to openr...@googlegroups.com
Thanks Yael for the kind help.
It's now through.

Best regards

Owen Stephens

unread,
May 27, 2021, 3:24:53 AM5/27/21
to OpenRefine
If you convert these to numbers first, you can then convert back to a string with a specified format. For example

value.toNumber().toString("%.6f")

The formatter "%.6f" is a Java formatter and not the most intuitive (in my opinion) but here the "6f" means six decimal places.

Best wishes

Owen

Parthasarathi Mukhopadhyay

unread,
May 27, 2021, 3:55:10 AM5/27/21
to openr...@googlegroups.com
Thanks Owen. A smart solution really with only one issue - it is rounding off in some cases (see result no. 6)

1.28.613895428.613895
6.11.13703888888911.137039
11.13.08388888888913.083889
16.11.012511.012500
21.11.7511.750000
26.12.12612.126000
31.10.410.400000
 
I need to know whether rounding off is allowed or not in case of geocoordinate values (though I personally think it has little effect if we are rounding off in the sixth position but I'm not a geographer; a comment from a geographer (if in the list) is highly appreciated).

Heartfelt thanks and Best regards



Owen Stephens

unread,
May 27, 2021, 7:35:47 AM5/27/21
to OpenRefine
If you don't need to round it off, you can do some work to truncate the number to the right of the decimal separator before doing the number/string conversion:

with(value.split("."),a,a[0]+"."+if(isError(a[1]),"0",a[1].substring(0,6))).toNumber().toString("%.6f")

This expression also deals with an issue I saw where a number with no decimal separator/places would cause an error in the conversion (because behind the scenes there is a difference between how integers and floating point numbers are handled)

Parthasarathi Mukhopadhyay

unread,
May 27, 2021, 7:58:17 AM5/27/21
to openr...@googlegroups.com
Ahhh... now perfectly producing results.

1.28.613895428.613895
6.11.13703888888911.137038
11.13.08388888888913.083888
16.11.012511.012500
21.11.7511.750000
26.12.12612.126000
31.10.410.400000

Best regards

Reply all
Reply to author
Forward
0 new messages