Aug 26, 2020, 10:32:51 AM
to nokogiri-talk

My knowledge of Ruby and XML is shallow.

I need to read/edit KML files, but am going nowhere fast :-/

require 'nokogiri'

#============ 1. Read KML
xmldoc = Nokogiri::XML(File.read("track.kml"))

#============ 2. Set //Document/Name to filename, regardless if it exists or not
puts File.basename(__FILE__)

#============ 3. Loop through all <Placemark>…</Placemark> blocks
xmldoc.xpath("//Placemark").each do |link|
  #If current placemark contains LineString, add <some>element</some> to block
end

#============ 4. Write KML to new file.

Would someone have a small example I could use to get started?

Thank you.

Mike Dalessio

Aug 26, 2020, 10:44:48 AM
to nokogiri-talk

Thanks for reaching out, and I'm sorry that Nokogiri doesn't make it more obvious how to do this kind of document manipulation.

Can I suggest that you start by looking at:

If those aren't sufficient, can you help me understand where you're blocked by asking more-specific questions?

Aug 26, 2020, 10:59:04 AM
to nokogiri-talk
Thanks. I read those, but I'm stuck.

1. How to read and loop through all <Placemark> blocks?

xmldoc.xpath("//Placemark").each do |link|
  #If current placemark contains LineString, add <some>element</some> to block
end

2. How to check if the current block holds a <LineString> element. If not; how to add one?

3. How to save what's in RAM into a new KML file?

I'm not clear on the difference between CSS and XPath, to give you an idea of where I stand.

Thank you.

Aug 26, 2020, 11:00:08 AM
to nokogiri-talk
"2. How to check if the current block holds a <LineString> element. If not; how to add one?"
should read:
"2. How to check if the current block holds a <LineString> element. If not; how to add a child element <some>blah</some>?"

Mike Dalessio

Aug 26, 2020, 11:34:02 AM
to nokogiri-talk
OK, thanks for clarifying. If you could provide a short example "input" XML snippet and the desired "output" XML snippet, that would help me to find a good path to get from one to the other ... is that something you're able to provide?

Aug 26, 2020, 12:24:08 PM
to nokogiri-talk
Sure. Here's what a KML file looks like:

<?xml version="1.0" encoding="UTF-8"?> <kml xmlns=""> <Document> <name>Document.kml</name> <Placemark> <name>Waypoint</name> <Point> <coordinates>-122.371,37.816,0</coordinates> </Point> </Placemark> <Placemark> <name>Track</name> <LineString> <coordinates>-0.376291,43.296237,199.75 -0.376299,43.296237,199.75</coordinates> </LineString> </Placemark> </Document> </kml> 

I simply need to insert/replace the document's name (if there's any), and add a <some>blah</some> to any Placemark block that contains a LineString. 

Mike Dalessio

Aug 26, 2020, 1:41:50 PM
to nokogiri-talk
Here's an attempt to answer your questions with working code:

For posterity, the code looks like:

#! /usr/bin/env ruby

require "nokogiri"

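xml = <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="">
  <Document>
    <name>Document.kml</name>
    <Placemark>
      <name>Waypoint</name>
      <Point>
        <coordinates>-122.371,37.816,0</coordinates>
      </Point>
    </Placemark>
    <Placemark>
      <name>Track</name>
      <LineString>
        <coordinates>-0.376291,43.296237,199.75 -0.376299,43.296237,199.75</coordinates>
      </LineString>
    </Placemark>
  </Document>
</kml>
EOF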

doc = Nokogiri::XML(xml)

# 1. insert/replace the document's name. 
# First, let's find the name node.
# For explanation of why the "xmlns" is needed, check out:
# >
name_node = doc.at_xpath("/xmlns:kml/xmlns:Document/xmlns:name")

# You could also use a CSS query which will (mostly) ignore namespaces. This is exactly the same search.
name_node = doc.at_css("kml > Document > name")

# Modify the contents of the <name/> node
name_node.content = "New Document Name"

# 2. a. find all Placemark blocks that contain a LineString
# I can think of two ways to do this. The first way is to search for
# LineString nodes within a Placemark node, and then get those nodes' parents:
placemarks_with_linestring = doc.xpath("//xmlns:Placemark/xmlns:LineString").map(&:parent)

# This first approach would work with CSS as well:
placemarks_with_linestring = doc.css("Placemark > LineString").map(&:parent)

# The second approach is to just use an XPath query to express that
# you want Placemarks that contain a LineString:
placemarks_with_linestring = doc.xpath("//xmlns:Placemark[xmlns:LineString]")

# Then you can add a new child node to that Placemark:
placemarks_with_linestring.each do |placemark|
  # the string passed into add_child is parsed just like any other XML fragment
  placemark.add_child "<some>blah</some>"
end

# The end result:
puts doc.to_xml
# >> <?xml version="1.0" encoding="UTF-8"?>
# >> <kml xmlns="">
# >>   <Document>
# >>     <name>New Document Name</name>
# >>     <Placemark>
# >>       <name>Waypoint</name>
# >>       <Point>
# >>         <coordinates>-122.371,37.816,0</coordinates>
# >>       </Point>
# >>     </Placemark>
# >>     <Placemark>
# >>       <name>Track</name>
# >>       <LineString>
# >>         <coordinates>-0.376291,43.296237,199.75 -0.376299,43.296237,199.75</coordinates>
# >>       </LineString>
# >>     <some>blah</some></Placemark>
# >>   </Document>
# >> </kml>

# 3. Write this to a file
# Use normal Ruby idioms for opening a file and writing to it, and use #to_xml to serialize the doc:
File.open("output.kml", "w") do |file|
  file.write doc.to_xml
end

Aug 26, 2020, 7:24:24 PM
to nokogiri-talk
Thanks very much!

require "nokogiri"

doc = Nokogiri::XML(File.read("input.kml"))

#I'm only using a single namespace, so let's keep things simple
doc.remove_namespaces!

name_node = doc.at_xpath("/kml/Document/name")
name_node.content = "New Document Name"

placemarks_with_linestring = doc.xpath("//Placemark[LineString]")
placemarks_with_linestring.each do |placemark|
  placemark.add_child "<some>blah</some>"
end

puts doc.to_xml

File.open("output.kml", "w") do |file|
  file.write doc.to_xml
end

Out of curiosity, why is "<some>blah</some>" not lined up, with the </Placemark> trailing after it, and is there a way to remedy this?


Mike Dalessio

Aug 27, 2020, 3:23:43 PM
to nokogiri-talk

> doc.remove_namespaces!

Please keep in mind that your output document will then not have any namespaces, which may be a problem for anybody consuming your edited KML file.

> why is "<some>blah</some>" not lined up

Whitespace is a text node like any other, though usually it carries no semantic meaning in XML documents. If you'd like to eliminate whitespace altogether, you could parse this way:

doc = Nokogiri::XML(xml) do |config|
  config.noblanks
end

Aug 28, 2020, 5:23:57 AM
to nokogiri-talk
Thank you very much.

Mike Dalessio

Aug 28, 2020, 8:44:14 AM
to nokogiri-talk
Just as an aside, this question and use case are really well-suited to be an example in the Nokogiri tutorials. I've created an issue to drive that addition:

Thanks again for asking!

