There has been a lot of discussion on a large amount of overlays on a single Google map. Different hacks have approached this problem in different ways. Many sites do not even deal with it because they can't, don't know how, or don't want to.
The capability to display a huge amount of overlays I believe should be handled somehow by the Google Maps tool itself. However, since most, if not all of Google Maps is written in JavaScript this may be a problem. The JavaScript would have to load a large file and then query it. For JavaScript, this would take a long time if the file was large.
I'm going to explain a solution which is derived from what I've seen on a few different sites, and what I think would be the best. It uses PHP and MySQL, but I'm sure you could substitute in other technologies like ASP and Access, or JSP and Oracle for example. What would be an even better thing to do would be to figure out how to get a real GIS map server (Google it) to interact with Google's native overlays, but for those of you who just need a simple solution, here it goes.
You would use PHP along with a MySQL table full of Overlay information. Then, the JavaScript would call a PHP function with the current bounds of the map which is represented by two points. It doesn't matter how you call the function, just find a way. Then, PHP would use these two points to make a MySQL query on your table of points and only return those in the specified bounds back to the JavaScript. Then, the JavaScript would clear the map of markers and then redraw only the ones which are returned by your server (the ones that are currently in bounds).
I would appreciate questions and comments as I go ahead and implement this solution. Maybe someone has already done exactly this and open sourced their code. If so, I would love to hear about it to save me time.
I still don't see how that would speed things up. If you've got 500 markers in a display area, you've got 500 markers... and JavaScript will have to render those points somehow. Perhaps there is something here that I'm missing, but in my experience the data acquisition/parsing is only about 1-5% of the time it takes to display a map; the rest is rendering.
Don't get me wrong, I would love for some solution to be found and I am interested in seeing what the ultimate solution will be (it's going to have to happen at some point). But at first glance this solution seems to me to be just shuffling stuff around into a slightly different arrangement, but not actually correcting the core render lag.
Anyone at Google feel like addressing this issue, which to me seems to be the API's greatest weakness at its current state?
> There has been a lot of discussion on a large amount of overlays on a > single Google map. Different hacks have approached this problem in > different ways. Many sites do not even deal with it because they can't, > don't know how, or don't want to.
> The capability to display a huge amount of overlays I believe should be > handled somehow by the Google Maps tool itself. However, since most, if > not all of Google Maps is written in JavaScript this may be a problem. > The JavaScript would have to load a large file and then query it. For > JavaScript, this would take a long time if the file was large.
> I'm going to explain a solution which is derived from what I've seen on > a few different sites, and what I think would be the best. It uses PHP > and MySQL, but I'm sure you could substitute in other technologies like > ASP and Access, or JSP and Oracle for example. What would be an even > better thing to do would be to figure out how to get a real GIS map > server (Google it) to interact with Google's native overlays, but for > those of you who just need a simple solution, here it goes.
> You would use PHP along with a MySQL table full of Overlay information. > Then, the JavaScript would call a PHP function with the current bounds > of the map which is represented by two points. It doesn't matter how > you call the function, just find a way. Then, PHP would use these two > points to make a MySQL query on your table of points and only return > those in the specified bounds back to the JavaScript. Then, the > JavaScript would clear the map of markers and then redraw only the ones > which are returned by your server (the ones that are currently in > bounds).
> I would appreciate questions and comments as I go ahead and implement > this solution. Maybe someone has already done exactly this and open > sourced their code. If so, I would love to hear about it to save me > time.
I'm sorry, I forgot to address the whole too many markers in bounds thing. You need to have some sort of symbol to represent a group of markers so that you can combine a group of markers into one marker with your PHP.
Here's a response to both Andrew and Justin McConnell above:
I forgot to discuss the problem of having too many overlays within a given bounds. That is kind of important. Oops. For example, if you are zoomed out all the way, your JavaScript will be handed all of the overlays in the world, and this will not help performance AT ALL for higher zoom levels. So, I'll outline a general solution to this problem.
You need to organize overlays into groups. Visually, on a map, without knowing anything about the underlying data, I see only one general solution. If anyone has any others, I'd love to listen.
You would have an algorithm which reduces the number of overlays to display by automatically grouping overlays based on location. There are tons of ways to do this. I think the easiest way would be to split up the given bounds into four squares, then check each square to see if there are still to many markers in that square. If a square still has too many markers, simply divide that square into another four squares and count again. All of this information could be calculated ahead of time and stored in a database for painless and fast retrieval when queried. This is almost exactly what Google does with its map tiles. Each zoom level simply divides each tile (a square) into four smaller squares and displays the new four tiles. This is how Google Maps itself is so fast. They can simply save all of these tiles in a database or as image files for absolutely painless retrieval. There is almost no processing required on the server. It is extremely scalable.
Another improvement to this system may be to use the same system that Google uses to keep track of tiles. The tiles are cached on your computer, and when you go one way on a map, then go back, it doesn't have to query the server for those tiles. It would be nice if you could some how cache the marker information in the browser as well.
You know, your solution is great and all except for one thing - it is
not a solution at all, it is just a statement of the obvious.
Of course we need some way of "grouping" points, but coming up with how
to do that has been the subject of *many* research papers by people
whos whole lives consist of GIS. While it may seem simple to group all
points with X distance of each other, it gets really complex when you
actually try to decide what the value of X is, and it is almost always
application specific.
The best way, if you fully control the source of your data, is to tag
each lat/lng point with some sort of zoom level indicator, as has boon
done by some in this group. However, when you actually start talking
about hundereds of thousands of points (as in my case) that have just
been recorded by a GPS device, there is no logical grouping or zoom
level mechinism.
At this point, the next solution is what is known as VIP algorythms.
The VIP in this case stands for Very Important Point, and is
essentially a method of deciding which points are neccissary to display
- for instance, if you map all cities and towns in the US, then at a
specific zoom level, you may only want to show state capitals, then
county seats, then population over X, etc. Again, as mentioned above,
the method of determining the VIPs is extremely application specific.
If you have any further ideas, that actually are possible solutions,
please do share.
True, it may be obvious for some people. Mostly the computer science, programming, or logical thinking folk. I guess that's the type of people who are going to care anyways, so it was my mistake for even writing that long winded explaination. I misjudged my audience if I even thought about my audience at all. Sorry guys. Maybe some people will benefit.
I do have some more thoughts about exactly how to group points, but I'm going to save that until at least tomorrow.
http://www.oodle.com/maps is yet another site that does this, however they seem to limit the number results returned by their ajax call, and then load more points when you click more...
Not at all, Kyle. I misunderstood your first post and your clarification actually opened my eyes a bit toward new approaches. Granted, I fear my skills are lacking for execution of this strategy (as is probably the case for most).
I wish there were a way to accomplish this without the need to write a custom algorithmic solution for each application, like an extension to gmap' function set that would separate the need to crunch the numbers yourself and filter your raw data. But that sounds like a bit of a holy grail...
I've done something similar on http://www.nearby.org.uk/google/ there is about about 20 layers, each averaging 7000 (but upto 30000) points.
I use the mechanism of storing all the points (195k!) in a MySQL db and when the map is moved or zoomed simply request a list of points from the server (via a php script), however there still usually too many points to plot on one map, so the server jsut sends a random selection (currently 80, was 150 but few people had problems).
The datasets are fairly well spread out so, so havent really tried implementing grouping, as this random approch seems to work quite well. The major disadvantage, is that if you move the map a bit, the markers on screen all change ;-( - but i think i have a concept for getting round that. (the javascript is visible, the php is very simple but will post it)
I have coded the "grab points in a certain area" code that works for when you are zoomed in some to avoid grabbing points that you will not see, and I have a basic method in place for handling when you are zoomed way out. My method grabs points within a "cache zone" which is the viewport plus a one screen "buffer" all around it. When you drag the map, if you drag more than half a screen, it re-grabs points - otherwise it does nothing. This stops excessive grabbing of points. A zoom will ALWAYS cause points to be re-grabbed. My "grouping" method is thus: I categorise my points by Country, then State, then County For each Country, state or county I will have a designated "centre point" and a designated "max zoom" Any point, for example, in a certain county will not show if the zoom level is above the max zoom, it will only show the "designated centre point" for the region it is in, and this will only be done once no matter how many of these unshown points there are for that region. IMHO it is not the smartest idea to try to solve both issues (zoomed way out and zoomed in) with one algo - you want one for each. I don't have the region filtering by zoom code in yet, but each point does already have a region, a maxzoom and a minzoom.
> I have coded the "grab points in a certain area" code that works for
> when you are zoomed in some to avoid grabbing points that you will not
> see, and I have a basic method in place for handling when you are
> zoomed way out. My method grabs points within a "cache zone" which is
> the viewport plus a one screen "buffer" all around it.
> When you drag the map, if you drag more than half a screen, it re-grabs
> points - otherwise it does nothing. This stops excessive grabbing of
> points. A zoom will ALWAYS cause points to be re-grabbed.
> My "grouping" method is thus:
> I categorise my points by Country, then State, then County
> For each Country, state or county I will have a designated "centre
> point" and a designated "max zoom"
> Any point, for example, in a certain county will not show if the zoom
> level is above the max zoom, it will only show the "designated centre
> point" for the region it is in, and this will only be done once no
> matter how many of these unshown points there are for that region.
> IMHO it is not the smartest idea to try to solve both issues (zoomed
> way out and zoomed in) with one algo - you want one for each.
> I don't have the region filtering by zoom code in yet, but each point
> does already have a region, a maxzoom and a minzoom.
The random method does sound like it will work, but again, you say it changes for every zoom level. If you have semi-static data, you could cache the locations that your randomizer picks at every zoom level and work that way. If you cache what it picks at the closest, then for the next level, just use the points it picked for the previous level as a base point for your next zoom level instead of using all of them. That way, when you zoom in and out, points won't change. Points that were there at closer zoom levels might disapear, but points that were there at farther ones will always stay on your zoomed in map.
Now that I think about this, I really like this method. It's like sampling. But, then you have those outliers that you want to always show. Those one or two points that are away from the bunch. Hmm...
I do see how the Country, State, County categorisation will work, but I have a few questions for you guys.
What if the data is not on land? What if the data is just a bunch of ships at sea? What if it is planes in the air? I'm sure people have solved this problem, and I would be interested in reading more about it.
How do you decide where points are located (country, state, county)? Is each data point tagged with this information beforehand? Or, do you have polygons that you are able to use to check to see if points are in them? If the later, I would be interested in how that works.
you got me on the planes and boats thing man, I have no Idea. Regarding how you decide whether points are in what country / state etc it's a matter of what's available. To start off with I am going to have to ask people to correctly designate where it is via a menu (see add point in my demo at www.evilc.com/xmb/gmaps.php ) but ultimately I would like to be able to do it by polygon However, here in the UK, we do not have easy access to geospatial data like the US does.
I think I've figured out a way to reduce points on a map without knowing anything about the data or the map. The only thing this algorithm knows is the x and y coordinates of each point, and the minimum distance you want points to be spaced out. For each zoom level, it will return the location of all points which should be displayed and a number representing the number of points near that point. So, if you have static data, this information could be cached. That way you would only have to run the algorithm when you changed the data being represented.
Each point will represent more than one point if the original points are within a certain distance away from each other. That way, outliers will always be represented as themselves, and hevely populated areas will be represented as a point associated with a number. The number (or shade of color for better visibility) will be the number of points that single point represents. So, to see the points that this special point represents, you will have to zoom in at least once if not more.
Does that make sense? If not, I'll try to implement it in PHP using MySQL for storage.
OnOneMap (http://ononemap.com) utilises a combination of several of the techniques mentioned in this thread. We have just under 200,000 markers at the moment, so naturally we only load those that are within the map bounds, subject to a cap. If the number actually falling within the viewable map area is realistically plottable, we plot them selectively to ensure that where two points are close enough to overlap, only one is displayed. When you zoom in, they separate and more can be displayed. When the user is fully zoomed in, however, we need to ensure that we can display all the markers, so some are actually shifted slightly from their designated spot so that we can be sure of displaying all of them.
Kyle, that is EXACTLY the system I envisaged, but I did not know how
the hell I would impliment it.
Dynamic grouping of close entities whilst maintaining "outliers" and a
number or shade to represent "bunched" markers. (I prefer a standard
shade to denote "bunched" points, plus a number to denote how many
personally)
One idea I did have on the subject (to increase speed) was to have the
back end db pre-calculate these values, and then set the max, min zoom
of points itself and create the "bunch" points in the DB. Then only
when a point was added would the calculations need to be made, which
would mean your algorithm wouldn't need to be run every time someone
moved the viewport. It may mean that you would need extra points in the
DB (as the "bunch" points would be in the DB, along with the number of
points it represents and it's min and max zoom) but you could add an
extra bool field that flagged it as a "bunch" point and not a real
point so the program knows it can delete it if it wants to.
Does that make sense ?
Oh, and one more thing we need to think about:
PolyLines
My system is going to handle paths as well as points, and I need to
decide how the hell I am going to do this. One idea was to treat a path
as only it's start point with regards to "bunching" and such. This
would probably work for the scale of paths I want to do (Think mountain
bike trails in the woods) but if people wanted to do long trails (say
from one country to another) may well not work
I'm sure you can find some algorithms that simply reduce the number of points in a polyline. Those are easy to find, I just can't remember the name of them right now.
Yup, that makes sense. We should be storing our calculations in the database and flag the special markers as so. Here are the fields I think we would need to represent the result of the algorithm:
x y zoom numOfPoints
The zoom would be the exact zoom. I'm not sure that we even need the min and max zoom. That would just be extra calculation at run time. You can spare an extra few megs in your database, right?
I think the flag here as to which are bunch points is simply the numOfPoints. If it is 1, it certainly is not a "bunch" point. Anything greater is.
IMHO we would want a min zoom and a max zoom. Imagine this scenario for the US: At zoom levels 16 & 17, the US is still too small to differentiate between states, so just one bunch point would be used to represent all US points It's not until zoomlevel 14 that individual states are marked on the map.
So for the top level bunch points, the maxzoom would be 17 and the minzoom would be 16. Then the next level of bunch points (for states) would have a maxzoom of 15 and the minzoom would vary depending on the size of the state. The Texas bunchpoint would have a minzoom of 12 probably, but the conneticut bunchpoint would probably need a minzoom of something like 9. Thinking about it this way, this mechanism could also be used as a "menu" of sorts. with things nicely aggregated into thes "bunch" points, it should be easy to see how much data there is by region. Plus, if you select a bunchpoint, it should have a link in the infowindow which does a centerandzoom to show you the points hidden by that bunchpoint.
I've been talking to a couple of my CS profs about this. It's a fairly common problem in the computer science world (as you would imagine) and has quite a few people working on it. I was doing a couple searches around the web and noticed a few pages with algorithms that might help:
I'll continue looking around, but it would appear that with the relatively heavy-duty math involved, a backend would probably be the best place for this work to occur.
What I was saying is that if you repeat points for each zoom level
between max and min, you wouldn't need the max and min. Your sql query
or whatever it is would only need to query on the zoom level instead of
having to do a range for each point. I guess either way works, its just
a question of how you want to do it.
In your last message it sounds like your algorithm knows where state
boundarys are. That's all well and good, but in my case, I wanted to
generalize even further. No biggy.
I was actually thinking the same thing. Clicking on a point would
actually center and zoom 1 level automatically. Or, it could zoom all
the way in to the point where all the points represented by that point
would still be in bounds.