[XML-SIG] parsing XML with minidom


kimmyaf

Apr 25, 2010, 6:24:53 PM
to xml...@python.org

Hello. I've only done a little bit of parsing with minidom before but I'm
having trouble getting my values out of this xml. I need the latitude and
longitude values in bold. I've tried several things. I think that I am
getting into the location tag but maybe the getAttribute function is not
correct for this example?



<GeocodeResponse>
<status>OK</status>
<result>
<type>street_address</type>
<formatted_address>50 Oakland St, Wellesley, MA 02481, USA</formatted_address>
<address_component>
<long_name>50</long_name>
<short_name>50</short_name>
<type>street_number</type>
</address_component>
<address_component>
<long_name>Oakland St</long_name>
<short_name>Oakland St</short_name>
<type>route</type>
</address_component>
<address_component>
<long_name>Wellesley</long_name>
<short_name>Wellesley</short_name>
<type>locality</type>
<type>political</type>
</address_component>
<address_component>
<long_name>Wellesley</long_name>
<short_name>Wellesley</short_name>
<type>administrative_area_level_3</type>
<type>political</type>
</address_component>
<address_component>
<long_name>Norfolk</long_name>
<short_name>Norfolk</short_name>
<type>administrative_area_level_2</type>
<type>political</type>
</address_component>
<address_component>
<long_name>Massachusetts</long_name>
<short_name>MA</short_name>
<type>administrative_area_level_1</type>
<type>political</type>
</address_component>
<address_component>
<long_name>United States</long_name>
<short_name>US</short_name>
<type>country</type>
<type>political</type>
</address_component>
<address_component>
<long_name>02481</long_name>
<short_name>02481</short_name>
<type>postal_code</type>
</address_component>
<geometry>
<location>
<lat>42.3118520</lat>
<lng>-71.2632680</lng>
</location>
<location_type>ROOFTOP</location_type>
<viewport>
<southwest>
<lat>42.3093524</lat>
<lng>-71.2665476</lng>
</southwest>
<northeast>
<lat>42.3156476</lat>
<lng>-71.2602524</lng>
</northeast>
</viewport>
</geometry>
</result>
</GeocodeResponse>


Code:

body = dom.getElementsByTagName('GeocodeResponse')[0]

for item in body.getElementsByTagName('location'):
    lat = item.getAttribute('lat')
    lng = item.getAttribute('lng')
--
View this message in context: http://old.nabble.com/parsing-XML-with-minidom-tp28359328p28359328.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.

_______________________________________________
XML-SIG maillist - XML...@python.org
http://mail.python.org/mailman/listinfo/xml-sig

--
You received this message because you are subscribed to the Google Groups "Python: XML SIG" group.
To post to this group, send email to python-...@googlegroups.com.
To unsubscribe from this group, send email to python-xml-si...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/python-xml-sig?hl=en.

Stefan Behnel

Apr 26, 2010, 12:44:49 AM
to kimmyaf, xml...@python.org
kimmyaf, 26.04.2010 00:24:
> Hello. I've only done a little bit of parsing with minidom before but I'm
> having trouble getting my values out of this xml. I need the latitude and
> longitude values in bold.

I don't see anything 'bold' in your mail, but your example tells me what
data you mean.

Here is some untested code using xml.etree.cElementTree:

import xml.etree.cElementTree as ET
tree = ET.parse("thefile.xml")
for tag in tree.getiterator("location"):
    print tag.findtext("lat"), tag.findtext("lng")

Note that cElementTree is both faster and simpler than minidom.
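For readers on Python 3, a minimal sketch of the same idea follows. It assumes Python 3.9 or later, where xml.etree.cElementTree and getiterator() have been removed, so it uses xml.etree.ElementTree and iter() instead. The SAMPLE bytes are a trimmed copy of the response quoted in this thread, and coords_from_file is a hypothetical helper name, not something from the original posts.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed copy of the geocoder response from the thread (illustration only).
SAMPLE = b"""<GeocodeResponse>
  <status>OK</status>
  <result>
    <geometry>
      <location>
        <lat>42.3118520</lat>
        <lng>-71.2632680</lng>
      </location>
      <location_type>ROOFTOP</location_type>
      <viewport>
        <southwest>
          <lat>42.3093524</lat>
          <lng>-71.2665476</lng>
        </southwest>
      </viewport>
    </geometry>
  </result>
</GeocodeResponse>"""


def coords_from_file(source):
    """Return (lat, lng) text pairs for every <location> element.

    `source` is a filename or a binary file-like object, as ET.parse() expects.
    """
    tree = ET.parse(source)
    # iter() walks the whole tree; findtext() reads a direct child's text.
    return [(loc.findtext("lat"), loc.findtext("lng"))
            for loc in tree.iter("location")]


# BytesIO stands in for a real file on disk such as "thefile.xml".
print(coords_from_file(io.BytesIO(SAMPLE)))
```

Only the element named exactly "location" is matched, so the viewport corners and the location_type element are left out of the result.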

Stefan

kimmyaf

Apr 26, 2010, 5:14:57 PM
to xml...@python.org

Thanks Stefan. I tried this but it's not getting into the for block for some
reason.
I'll keep trying!

Stefan Behnel

Apr 27, 2010, 7:35:32 AM
to kimmyaf, xml...@python.org
kimmyaf, 26.04.2010 23:14:
> Stefan Behnel-3 wrote:
>> kimmyaf, 26.04.2010 00:24:
>>> Hello. I've only done a little bit of parsing with minidom before but I'm
>>> having trouble getting my values out of this xml. I need the latitude and
>>> longitude values in bold.
>>
>> I don't see anything 'bold' in your mail, but your example tells me what
>> data you mean.
>>
>> Here is some untested code using xml.etree.cElementTree:
>>
>> import xml.etree.cElementTree as ET
>> tree = ET.parse("thefile.xml")
>> for tag in tree.getiterator("location"):
>>     print tag.findtext("lat"), tag.findtext("lng")
>
> Thanks Stefan. I tried this but it's not getting into the for block for some
> reason.

Maybe the document uses namespace declarations that you forgot to show us?
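To illustrate why a namespace declaration would produce exactly this symptom, here is a small sketch in Python 3. The namespace URI is made up for the example; the response quoted in this thread does not show one. With a default namespace on the root, a plain tag name matches nothing and the loop body never runs, while the qualified name in {uri}tag form does match.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this illustration.
NS = "http://example.com/geo"

NAMESPACED = (
    '<GeocodeResponse xmlns="http://example.com/geo">'
    "<result><geometry><location>"
    "<lat>42.3118520</lat><lng>-71.2632680</lng>"
    "</location></geometry></result>"
    "</GeocodeResponse>"
)

root = ET.fromstring(NAMESPACED)

# The unqualified name finds nothing, so a for loop over it is skipped.
plain = list(root.iter("location"))

# ElementTree spells qualified names as "{namespace-uri}localname".
qualified = list(root.iter("{%s}location" % NS))
coords = [(loc.findtext("{%s}lat" % NS), loc.findtext("{%s}lng" % NS))
          for loc in qualified]

print(len(plain), coords)
```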

Stefan

kimmyaf

Apr 27, 2010, 5:30:39 PM
to xml...@python.org

I don't really know... Here's the whole story.

I am retrieving the xml by calling this link.

http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&sensor=true



Here's the entire function:

addr = '50+Oakland+St,Wellesley,MA,02481'

def geocode_addr(addr):
    hostname = 'http://maps.google.com/maps/api/geocode/xml?'
    prefix = 'address='
    sensor = '&sensor=true'
    url = hostname + prefix + addr + sensor

    print url

    handler = urllib2.urlopen(url)

    xml_response = handler.read()
    print xml_response
    #dom = minidom.parseString(xml_response)
    handler.close()

    tree = ET.parse("GeocodeResponse.xml")
    print 'here'
    for tag in tree.getiterator("location"):
        print 'here1'
        print tag.findtext("lat")
        tag.findtext("lng")


*** I actually just pasted the xml from the shell where I printed
xml_response and saved it into an xml file in my folder called
GeocodeResponse.xml to test this... before going through the work of saving
the xml into a file. I got the "here" but not the "here1".

I'm attaching my actual file.

Sorry! I appreciate the help! This is the last piece of functionality I need
to get working for my programming assignment!








http://old.nabble.com/file/p28382321/GeocodeResponse.xml GeocodeResponse.xml

kimmyaf

Apr 27, 2010, 5:32:39 PM
to xml...@python.org

Now that I look at my file it does not look well formed. Do I have to use a
file? I tried to do

tree = ET.parse(xml_response)

but I got a file IO error...

Luis Miguel Morillas

Apr 27, 2010, 6:34:54 PM
to kimmyaf, xml...@python.org
2010/4/27 kimmyaf <flahe...@hotmail.com>:
>
> I don't really know... Here's the whole story.
>
> I am retrieving the xml by calling this link.
>
> http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&sensor=true
>
>
>
> Here's the entire function:
>
> addr = '50+Oakland+St,Wellesley,MA,02481'
>
> def geocode_addr(addr):
>    hostname =  'http://maps.google.com/maps/api/geocode/xml?'
>    prefix = 'address='
>    sensor = '&sensor=true'
>    url = hostname + prefix + addr + sensor
>

I prefer amara:

>>> from amara import bindery
>>> doc = bindery.parse("http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&sensor=true")
>>> locations = doc.xml_select(u'//location')
>>> for loc in locations:
...     print loc.lat, loc.lng
...
42.3118520 -71.2632680

;)

--lm

Peter Bigot

Apr 27, 2010, 7:37:33 PM
to Luis Miguel Morillas, kimmyaf, xml...@python.org
I'd have to concur with that recommendation. Google is uninterested in defining schemas for their APIs, so you need to process the XML manually and hope they don't change their interface.

BTW: The lat and lng components of the location are elements, not attributes. getAttribute() only reads attributes on a start tag, so it returns an empty string here. For minidom, use:
    lat = item.getElementsByTagName('lat')[0]
    lat.normalize()
    print lat.firstChild.data

Much easier when you can generate a proper binding from a schema, or use something like Amara that does so without a schema.
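A fuller sketch of the minidom route, written for Python 3 and run against a trimmed copy of the response from this thread. The helper names child_text and location_coords are made up for the example; the minidom calls are the same ones shown above.

```python
from xml.dom import minidom

# Trimmed copy of the geocoder response from the thread (illustration only).
SAMPLE = """<GeocodeResponse>
  <status>OK</status>
  <result>
    <geometry>
      <location>
        <lat>42.3118520</lat>
        <lng>-71.2632680</lng>
      </location>
      <viewport>
        <southwest>
          <lat>42.3093524</lat>
          <lng>-71.2665476</lng>
        </southwest>
      </viewport>
    </geometry>
  </result>
</GeocodeResponse>"""


def child_text(parent, tag_name):
    """Return the text of the first <tag_name> element below parent."""
    node = parent.getElementsByTagName(tag_name)[0]
    node.normalize()  # merge adjacent text nodes before reading
    return node.firstChild.data


def location_coords(xml_text):
    """Return (lat, lng) text pairs for every <location> element."""
    dom = minidom.parseString(xml_text)
    return [(child_text(loc, "lat"), child_text(loc, "lng"))
            for loc in dom.getElementsByTagName("location")]


print(location_coords(SAMPLE))
```

The values come back as strings; wrap them in float() if numbers are needed.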

Peter

Fred Drake

Apr 28, 2010, 1:09:59 AM
to Peter Bigot, XML-SIG
On Tue, Apr 27, 2010 at 7:37 PM, Peter Bigot <big...@acm.org> wrote:
> Google is uninterested in defining schema for their APIs, so you need to
> process the XML manually and hope they don't change their interface.

And indeed, they do change their schemas without real concern for backward
compatibility. The sitemaps are in the middle of changing even now.


-Fred

--
Fred L. Drake, Jr. <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

Stefan Behnel

Apr 28, 2010, 1:43:32 AM
to kimmyaf, xml...@python.org
kimmyaf, 27.04.2010 23:32:
>> handler = urllib2.urlopen(url)
>> xml_response = handler.read()
>> handler.close()
>>
>> tree = ET.parse("GeocodeResponse.xml")

>> Do I have to use a file? I tried to do
>>
>> tree = ET.parse(xml_response)

parse() is meant for parsing files. Use fromstring() to parse from a string.

This works for me:

>>> import xml.etree.cElementTree as ET
>>> tree = ET.parse('gmap.xml')
>>> print [ (el.findtext('lat'), el.findtext('lng'))
... for el in tree.getiterator('location') ]
[('42.3118520', '-71.2632680')]
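For the string case, here is a minimal sketch assuming Python 3 and xml.etree.ElementTree. The xml_response literal stands in for whatever handler.read() returned (on Python 3 that is bytes, which fromstring() also accepts). Note that fromstring() returns the root element rather than a tree object, and that wrapping the data in a file-like object is an alternative when a tree is wanted.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for the text returned by handler.read() in the thread.
xml_response = (
    "<GeocodeResponse><status>OK</status><result><geometry>"
    "<location><lat>42.3118520</lat><lng>-71.2632680</lng></location>"
    "</geometry></result></GeocodeResponse>"
)

# fromstring() parses XML held in memory and returns the root Element.
root = ET.fromstring(xml_response)
coords = [(el.findtext("lat"), el.findtext("lng"))
          for el in root.iter("location")]

# parse() wants a filename or file object, so wrap the data if a tree is needed.
tree = ET.parse(io.BytesIO(xml_response.encode("utf-8")))
same = [(el.findtext("lat"), el.findtext("lng"))
        for el in tree.iter("location")]

print(coords)
```

Passing the XML text itself to parse() makes it treat the text as a filename, which is consistent with the file IO error reported earlier in the thread.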

Stefan

kimmyaf

Apr 28, 2010, 5:37:13 PM
to xml...@python.org

Thanks all for the help. This gives me a lot of good options and I have a few
working.... I learned a lot!