Access KML file downloaded by NetworkLink

Robert Cazaciuc

Feb 1, 2011, 9:18:34 AM
to KML Developer Support - Advanced Support for KML
Hi everyone,

I am trying to find a way to access the KML retrieved by a Network
Link, either manually or using JavaScript and the GE plug-in. I have
posted this in two different groups because I am not sure which
category it fits better.

The short story: I want to save on my computer every single instance
of the KML file downloaded by Google Earth which is specified in the
Network Link.

The long story:

Some websites allow you to view delayed information of all the
commercial ships in the world. They provide a KML file which contains
all the information needed by Google Earth to then download the actual
data and display it. From my understanding, with the Network Link
functionality, it automatically downloads the file specified under the
<href> tag every x seconds. I am interested in saving this KML file
locally, and so far it has been pretty easy for me to do so: simply
take the link from the initial KML file downloaded (specified in the
<NetworkLink><Link> tag) and do a bit of Java coding to download this
file on a regular basis. Everything was nice and smooth.
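
For illustration, the "take the link from the initial KML" step might look like the minimal JavaScript sketch below. The function name extractNetworkLinkHref is made up for this example, and the regular expressions are an assumption that only holds for simple files like the one shown later in this post; a real XML parser would be more robust. Downloading the result on a timer is left out.

```javascript
// Hypothetical helper: returns the <href> of the first <NetworkLink><Link>
// in a KML string, or null if there is none.
// Regex-based sketch only; it does not decode XML entities such as &amp;.
function extractNetworkLinkHref(kmlText) {
  var link = /<NetworkLink[\s>][\s\S]*?(<Link[\s>][\s\S]*?<\/Link>)/.exec(kmlText);
  if (!link) {
    return null;
  }
  var href = /<href>([\s\S]*?)<\/href>/.exec(link[1]);
  // Strip whitespace in case the URL was wrapped across lines.
  return href ? href[1].replace(/\s+/g, '') : null;
}
```

The returned string is the URL that would then be fetched on whatever interval the <refreshInterval> element specifies.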

Recently, I subscribed to a website which has a wider coverage of the
data that I am interested in, and they pretty much have the same
solution to providing the data and displaying it in Google Earth
except a few things.

The most important difference is that the URL inside the
<NetworkLink> of the KML file that I download from their website is
randomly generated every time I download the file.

Say, for example, that the KML file is located at http://www.mywebsite.com/ge.kml.
The file structure is the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<kml xmlns="http://earth.google.com/kml/2.1">
  <Document>
    <NetworkLink>
      <description>from </description>
      <name>Live Data</name>
      <Link>
        <href>http://www.mywebsite.com:80/earth/user_ME_XXXXXXX.kmz</href>
        <httpQuery>gv=[clientVersion]&kv=[kmlVersion]&l=[language]</httpQuery>
        <refreshInterval>60.0</refreshInterval>
        <refreshMode>onInterval</refreshMode>
        <viewBoundScale>2.0</viewBoundScale>
      </Link>
    </NetworkLink>
    <ScreenOverlay>
      <description>Logo</description>
      <name>Logo</name>
      <color>FFFFFFFF</color>
      <drawOrder>0</drawOrder>
      <Icon>
        <href>/images/Logo_Google_earth.png</href>
      </Icon>
      <overlayXY yunits="fraction" y="0.05" xunits="fraction" x="0.02" />
      <screenXY yunits="fraction" y="0.05" xunits="fraction" x="0.02" />
      <size yunits="fraction" y="0.0" xunits="fraction" x="0.3" />
    </ScreenOverlay>
  </Document>
</kml>


With the first two websites I would simply take the link from the
<Link><href> tag (in this case
http://www.mywebsite.com:80/earth/user_ME_XXXXXXX.kmz) and download
it. Now I can't do this anymore because XXXXXXX is randomly generated.
I know that I can do a bit more coding to download the first file,
look inside it for the newly generated link and use that one, but
there is another problem. Before, I was able to copy/paste the link
and download the KMZ file. With this new website, whenever I try to
download the file from the location they give (the one ending in
XXXXXXX), I get an HTTP 404 error. At the same time, if I open the
original file (ge.kml) in Google Earth, it automatically downloads
that file every minute and displays the data.

This made me think that maybe there is something in the request that
Google Earth is making to the server that allows it to get the data. I
started monitoring the outgoing/incoming packets made from my computer
to see if I can 'catch' the request made by GE to get the KMZ file but
without any luck. I then realised that GE stores all the downloaded
files in a temporary folder and I have been able to track which file
inside the temp folder is the file that I am interested in. So I've
managed to find a rather ugly solution to this by simply reading that
file from the temp folder whenever I want. The problem with this is
that 1) as a programmer I like to follow good coding practices and
this is definitely not one of them, and 2) if something happens to the
NetworkLink connection that requires a refresh in GE (right-click on
the feed -> Refresh), it can only be done manually.

So I'm looking at finding a different way of being able to access that
file. I have 3 options of how to do it differently, one of which I can
probably implement myself.

1) Use the GE plug-in and a bit of JavaScript to listen to the
NetworkLink connection to the server and force a refresh whenever the
file has not been refreshed for more than the specified time (in this
case 60 seconds). This solution would help me get the file even when
I am away from the computer, but I am still not happy with it.

2) Looking at the official NetworkLink documentation (
http://code.google.com/apis/kml/documentation/kmlreference.html#link)
I came across the following paragraph:

When a file is fetched, the URL that is sent to the server is composed
of three pieces of information:
  - the href (Hypertext Reference) that specifies the file to load.
  - an ARBITRARY format string that is created from (a) parameters
    that you specify in the <viewFormat> element or (b) bounding box
    parameters (this is the default and is used if no <viewFormat>
    element is included in the file).
  - a second format string that is specified in the <httpQuery>
    element.

From this I can only conclude that the parameters GE adds to the end
of the link provided by the <href> node
(http://www.mywebsite.com:80/earth/user_ME_XXXXXXX.kmz) ARE NOT
optional, as otherwise I can't explain why the server says the
resource is not found when I try to access it without any parameters.
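
To make the paragraph quoted from the reference more concrete, here is a minimal JavaScript sketch of how the three pieces might be joined into one URL. The function name buildNetworkLinkUrl and the shape of the values object are invented for this example. The values that GE really substitutes for [clientVersion], [kmlVersion] and [language] are not known from this thread, and the choice between '?' and '&' as the joining character is an assumption.

```javascript
// Hypothetical helper: joins href, a view-format string and an httpQuery
// template into one URL. Bracketed tokens such as [clientVersion] are
// replaced from `values`; tokens with no matching value are left untouched.
function buildNetworkLinkUrl(href, viewFormat, httpQuery, values) {
  var query = [viewFormat, httpQuery]
    .filter(function (part) { return part; })
    .map(function (part) {
      return part.replace(/\[(\w+)\]/g, function (token, name) {
        return Object.prototype.hasOwnProperty.call(values, name)
          ? encodeURIComponent(values[name])
          : token;
      });
    })
    .join('&');
  if (!query) {
    return href;
  }
  // Assumption: use '&' when the href already has a query string, '?' otherwise.
  return href + (href.indexOf('?') === -1 ? '?' : '&') + query;
}
```

Called with the <href> and <httpQuery> from ge.kml and an empty view-format string, this yields a URL of the form .../user_ME_XXXXXXX.kmz?gv=...&kv=...&l=..., which can be compared against whatever request GE is actually observed to send.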

My question is: do you know how I could complete the link in the same
manner that GE does (effectively pretending to be GE while actually
coding it in Java or similar) so that I can access the file? Also,
I've been reading about CGI scripting and dynamic KMZ file
generation. Is that what is happening here? If so, what do I need to
do in order to retrieve the file?


3) Use the GE plug-in and try to access the kmlObject every time it is
downloaded by GE. So far I have found some sample code (below) that
lets me capture the original KML object that I download (ge.kml), but
not the subsequent KMZ files that GE downloads using the information
in the NetworkLink.

google.earth.fetchKml(ge, kmlUrl, finishFetchKml);

function finishFetchKml(kmlObject) {
  // check if the KML was fetched properly
  if (kmlObject) {
    // add the fetched KML to Earth
    currentKmlObject = kmlObject;
    ge.getFeatures().appendChild(currentKmlObject);
  } else {
    // wrap alerts in API callbacks and event handlers
    // in a setTimeout to prevent deadlock in some browsers
    setTimeout(function() {
      alert('Bad or null KML.');
    }, 0);
  }
}

The finishFetchKml will be called after the KML file is downloaded and
I can then use the kmlObject to construct my own KML file. Is there a
similar way of doing this for the files downloaded from the link
inside the NetworkLink?


I hope I have provided enough background on the issue (and not too
much either) and I am looking forward to any suggestions!

Thank you,

Rob




Kriston

Feb 1, 2011, 12:38:19 PM
to kml-suppor...@googlegroups.com
I had encountered a similar problem and while I do not have a complete solution I can recommend some more ways to get this information out of some unknown remote host's NetworkLinks.  Try to keep in mind that the people running that service are, in all likelihood, intentionally restricting your access so please don't use my advice to do anything improper.

We have lots of internal people making all sorts of good and bad KML hosted all over the place so I frequently encounter problems with them.  I use a program called Fiddler2 as well as Wireshark to analyze the packets.  I also do extensive work on the server end so I can give you an idea where those arbitrary data come from in the KML/KMZs.  Note that KMZ files are merely ZIP archives with a .kmz extension that contain KML, so you can use Fiddler2 to transform them if you're so inclined.
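
As a small illustration of the ZIP point, the JavaScript sketch below checks whether a captured response body begins with the standard ZIP local-file signature, which is one quick way to tell a KMZ from plain KML. The function name looksLikeKmz is invented for this example, and the check is only a heuristic under the assumption that the archive is non-empty.

```javascript
// Hypothetical helper: true if the bytes start with the ZIP local-file
// signature "PK\x03\x04", which is how a non-empty KMZ archive begins.
// `bytes` can be a Uint8Array, a Node Buffer or a plain array of numbers.
function looksLikeKmz(bytes) {
  return bytes.length >= 4 &&
    bytes[0] === 0x50 && bytes[1] === 0x4b &&
    bytes[2] === 0x03 && bytes[3] === 0x04;
}
```

A body that fails this check but starts with "<?xml" or "<kml" is most likely uncompressed KML and can be read directly.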

When you retrieve the remote Network Link on the first visit, look at "viewFormat" and "httpQuery."  These two contain part of the data you're seeking but not all of it.  Using Fiddler2 or Wireshark, use the Google Earth client to visit this Network Link and see the KML/KMZ file that the Network Link retrieves.  If it's another NetworkLink, repeat the process until you get an actual KML or KMZ file, inside of which you'll probably find a KML element called "NetworkLinkControl."

NetworkLinkControl is the element that overrides the original NetworkLink with some new items.  There is one named "cookie" which is a sort of unique identifier for the client that the remote host has probably set up as a session variable for all subsequent KML requests from that particular client.
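
A rough JavaScript sketch of that cookie mechanism follows: read the <cookie> text out of a fetched KML response and append it to the query string of the next request. Both function names are invented for this example, the regular expression is a stand-in for proper XML parsing, and whether this particular server relies on the cookie at all is an assumption.

```javascript
// Hypothetical helper: returns the text of <NetworkLinkControl><cookie>,
// or null when the response does not set one.
// Regex-based sketch only; it does not decode XML entities such as &amp;.
function extractCookie(kmlText) {
  var match = /<NetworkLinkControl[\s>][\s\S]*?<cookie>([\s\S]*?)<\/cookie>/.exec(kmlText);
  return match ? match[1].trim() : null;
}

// Hypothetical helper: appends the cookie string to the URL query
// that will be used for the next refresh.
function withCookie(url, cookie) {
  if (!cookie) {
    return url;
  }
  return url + (url.indexOf('?') === -1 ? '?' : '&') + cookie;
}
```

Comparing the URLs produced this way with the ones Fiddler2 shows for consecutive refreshes should indicate whether the cookie is part of what the server expects.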

Furthermore, if you don't already have it, get the slightly outdated and hard-to-find book, "The KML Handbook," by J. Wernecke, ISBN 0-321-52559-0.  Just keep watching here because while it's authoritative it's far from perfect.

I know this isn't a complete answer but I hope this helps you find your solution and I look forward to hearing what you find out.

Kris


LenChaney

Feb 1, 2011, 9:52:56 PM
to KML Developer Support - Advanced Support for KML
There is also the user-agent string which could tell the server that
is generating the KML that GoogleEarth is making the request.

Check this out for the user-agent that is used by GoogleEarth.
http://user-agent-string.info/list-of-ua/browser-detail?browser=Google%20Earth

I did not follow this entire conversation but I did get the idea that
this company is purposely trying to make it so you cannot automate
retrieval. If they don't want you to do it, don't do it.

Robert Cazaciuc

Feb 3, 2011, 10:57:44 AM
to KML Developer Support - Advanced Support for KML
Thank you both for your suggestions.

I managed to solve the problem by simply using Fiddler2 to monitor the
exact HTTP request that Google Earth was making. I was able to copy
that link and manually download the file from a web browser, which was
exactly what I wanted. I had tried Wireshark before, but it seemed to
display packet-level data rather than separate requests as Fiddler2
does.

It was way easier than I expected it to be. Nevertheless, thank you
both again for your support.

Rob

LenChaney

Feb 3, 2011, 9:44:44 PM
to KML Developer Support - Advanced Support for KML
I know it sounds simple, but couldn't you just use a proxy component
to retrieve the KML, parse it as XML and get the network link? You
could then either open GE with the KML file or use GE's COM component
to open GE and have it retrieve the KML file.

Len

Rossko

Feb 6, 2011, 6:19:42 AM
to KML Developer Support - Advanced Support for KML
> Recently, I subscribed to a website which has a wider coverage of the
> data that I am interested in,

You could always ask the owner of the data if they would provide you
with a feed.