Federation Use Cases and Methodologies


Jeffrey Johnson

Sep 26, 2011, 2:35:35 PM
to geonode-dev
Hi Folks,

<beware> long mail ahead </beware>

In the context of a project that OpenGeo is working on for an
Australia / New Zealand SDI, I've been investigating the topic of
federation among GeoNodes and between GeoNodes and other web
services. After an initial round of looking into things, I thought I
would bring my findings to this list for some feedback and discussion.

As I see it, there are basically 3 'methodologies' for doing the
federation. I can envision use cases for each of the 3 and don't
see them as necessarily mutually exclusive, but perhaps complementary.
So, I would like to hear what others think. Below are discussions of
each, and I'm sure I missed some details, so please speak up if so.

1) Using the cascading WMS and WFS features of GeoServer, and building
out a cascading WCS module in GeoServer.

There already exists functionality to 'cascade' WMS services through
GeoServer http://docs.geoserver.org/latest/en/user/data/wms.html ... I
prepared some screenshots on how to do this if you want to take a
look. There is a step that is _not_ shown here in the screenshots
which is to run updatelayers ... indicated inline.

https://skitch.com/ortelius/f498i/geoserver-new-data-source
https://skitch.com/ortelius/f498a/dock-17
https://skitch.com/ortelius/f4984/geoserver-new-layer
https://skitch.com/ortelius/f4986/dock-18
https://skitch.com/ortelius/f49er/geoserver-import-cascading-wms-layer
--- django-admin.py updatelayers --settings=geonode.settings ----
https://skitch.com/ortelius/f49ji/search-data-geonode
https://skitch.com/ortelius/f49jh/dock
https://skitch.com/ortelius/f49j6/flood-plain-geonode

GeoServer also supports cascading WFS services
http://docs.geoserver.org/latest/en/user/data/wfs.html although it is
not explicitly called cascading as the WMS feature is. I've
experimented less with the WFS feature than the WMS, but did run into
several problems which will be brought up with the GeoServer folks.

There is not currently a way to cascade WCS services, but based on a
discussion with Jody Garnett at FOSS4G, it does not seem that this
would be too difficult to implement either.

Once the Stores and Layers (Resources) are configured in GeoServer,
GeoNode treats them like any other layer once they are synced via
updatelayers or some other means.

Several points to note about this methodology:

a) The Cascading WMS functionality does _not_ currently support
authentication against the remote service. Gabriel Roldan from OpenGeo
is currently working on implementing this, and he doesn't envision it
being too difficult. Once this is implemented, it will be possible for
a GeoServer/GeoNode to use a specific set of credentials to
communicate with the remote WMS and then apply Security on top of
those layers when serving to GeoNode clients ... this is something
required for the ANZ SDI. The cascading WFS _does_ currently support
authenticated requests to external servers, and is apparently setup to
handle read and write ... but I've not personally tested this.

b) There is currently no concept of 'pairing' the cascaded services,
so assuming that a WMS has a related WFS for vector data and/or a
related WCS for raster data, there is currently no way to link the WMS
Resource and its paired WFS Resource, resulting in 'duplicate'
layers/resources in GeoServer. This will make it difficult to provide
the download functionality in GeoNode without some effort to link them
within GeoNode itself.

c) It seems entirely possible that for this ANZ SDI project and
various others, using this methodology could result in 1000s or 10s of
1000s of resources referenced in GeoServer's configuration, and I've
already run into problems with this kind of scaling. It's been
suggested that moving to the Database Backed GeoServer configuration
(as opposed to the default, which is to store the config on the
filesystem) may be preferable in cases like this. My initial
investigation of the dbconfig module
(http://geoserver.org/display/GEOS/DBConfig+Module), and discussion
with the GeoServer folks, leads to the conclusion that this module is
highly experimental and not at all ready for production use ... but I'm
told it's not a great deal of work to bring it up to that standard.
That said, switching to the dbconfig module should not be seen as a
panacea for the scaling problems encountered with a HUGE number of
cascaded layers in GeoServer. It makes sense to me that we at OpenGeo
can/should spend some time to address this kind of problem.

d) Using a database backed GeoServer configuration would allow us to
create Django Model Classes that represent GeoServer's database
configuration and address that database (presumably in a read-only
way) directly in Django via the ORM. This may provide advantages
insofar as requests for basic configuration details would no longer
all have to be routed through GeoServer, making it less of a
bottleneck. There may be other advantages to doing things this way,
including making it easier to keep the various databases that
GeoNode uses in sync, but I've not thought those through carefully
yet.

e) Using this cascading methodology allows us to use GWC to cache the
cascaded layers such that the GeoNode could still serve tile
representations of the external services even if they are not
currently responding. This of course only works for WMS tiles, but
seems to be an important enough feature that it merits consideration
when comparing these methodologies.
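
To make point b) a bit more concrete, here is a minimal sketch of how
the pairing could be done on the GeoNode side. It assumes (and this is
only an assumption, remote servers are free to name things
differently) that a WMS layer and its WFS feature type or WCS coverage
share the same name. The function name pair_layers is made up for
illustration.

```python
def pair_layers(wms_layers, wfs_types, wcs_coverages):
    """Map each WMS layer name to the service that could back a download.

    Assumption: related resources share the exact same name across
    services. Layers without a match are mapped to None.
    """
    wfs = set(wfs_types)
    wcs = set(wcs_coverages)
    paired = {}
    for name in wms_layers:
        if name in wfs:
            paired[name] = "WFS"   # vector data available for download
        elif name in wcs:
            paired[name] = "WCS"   # raster data available for download
        else:
            paired[name] = None    # view-only, nothing to download
    return paired


pairs = pair_layers(
    ["hazard:flood_plain", "base:elevation", "base:labels"],
    wfs_types=["hazard:flood_plain"],
    wcs_coverages=["base:elevation"],
)
```

Layers that come back as None would simply not get a download link.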

In the interest of exploring this methodology, I've begun working on a
patch to gsconfig.py to support the Cascading WMS Store in GeoServer
https://github.com/jj0hns0n/gsconfig.py/commits/wms_store and found it
to work reasonably well under simple circumstances. Here is a simple
script that makes use of this. https://gist.github.com/1235483 ... It
should be noted that I'm using owslib to interrogate the WMS separately
from gsconfig.py to find the list of available layers and then add
them. It appears that GeoServer's REST API may provide an alternate way
to do this, but that is not yet implemented in gsconfig.py; I will be
working on it.
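
For anyone who has not looked at a GetCapabilities response, here is a
rough sketch of the 'interrogate the WMS for its layers' step. This is
not the script from the gist: it uses only the Python standard library
on a made-up WMS 1.1.1 document so that it runs without a network
connection, and list_named_layers is a hypothetical helper. With
owslib, the same information should be available from
WebMapService(url).contents. Note that WMS 1.3.0 documents are
namespaced, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed WMS 1.1.1 capabilities document.
SAMPLE_CAPABILITIES = """<WMT_MS_Capabilities version="1.1.1">
  <Capability>
    <Layer>
      <Title>Example remote WMS</Title>
      <Layer queryable="1">
        <Name>hazard:flood_plain</Name>
        <Title>Flood plain</Title>
      </Layer>
      <Layer>
        <Name>base:roads</Name>
        <Title>Roads</Title>
      </Layer>
    </Layer>
  </Capability>
</WMT_MS_Capabilities>"""


def list_named_layers(capabilities_xml):
    """Return {layer name: title} for every named layer in the document."""
    root = ET.fromstring(capabilities_xml)
    layers = {}
    for layer in root.iter("Layer"):
        name = layer.findtext("Name")
        if name:  # container layers have no Name and cannot be requested
            layers[name] = layer.findtext("Title", default=name)
    return layers


available = list_named_layers(SAMPLE_CAPABILITIES)
```

Each name in the result is a candidate for a cascaded layer in the
WMS store.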

2) Maintaining an index of external Web Services (OGC and otherwise)
in Django's database.

It seems entirely logical that a new 'Service' model class could be
added to the geonode.maps.models module which stored the connection
and other service metadata about an external service ... and the
geonode.maps.models.Layer class could be extended to support external
layers that were _not_ cascaded through GeoNode's GeoServer. I spent a
bit of time exploring this concept last year
https://github.com/ortelius/django_wms_browser ... This was only a < 1
day crack at this, but addresses the main ideas ... Essentially, a
GeoNode user would be able to provide a WMS endpoint on the add layer
page ... using owslib, the endpoint would be interrogated and the
GetCapabilities fetched ... the list of layers would be presented to
the user and they would be able to select which ones to add to the
GeoNode. From that point, they would be treated as normal GeoNode
layers and be available for adding to Maps etc. This kind of
functionality is 'sort of' provided already in the Map Composer, but
it is a one-time thing and the metadata about the service is not
stored in GeoNode or anywhere else except in the Map Configuration.

Furthermore, there is nothing that prevents us from taking a wider view
of this 'Service' model class to include things like ArcGIS Services
and other things that are not strictly OGC, but are supported in
OpenLayers and therefore in the Map Composer in GeoNode.
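
To give a feel for what such a 'Service' record might hold, here is a
sketch using plain dataclasses rather than real Django models, so that
it runs on its own. All field and method names are hypothetical and
only meant to start the discussion; a real implementation would be a
models.Model in geonode.maps.models.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RemoteLayer:
    name: str          # layer name as advertised by the remote service
    title: str = ""


@dataclass
class Service:
    base_url: str                 # endpoint the user typed in
    service_type: str = "WMS"     # could also be "WFS", "WCS", "ArcGIS", ...
    title: str = ""
    layers: List[RemoteLayer] = field(default_factory=list)

    def register_layers(self, available: Dict[str, str], selected: List[str]):
        """Store only the layers the user picked from {name: title}."""
        for name in selected:
            if name not in available:
                raise KeyError("layer not advertised by service: %s" % name)
            self.layers.append(RemoteLayer(name, available[name]))
        return self.layers


service = Service("http://example.com/wms", title="Example remote WMS")
service.register_layers(
    {"hazard:flood_plain": "Flood plain", "base:roads": "Roads"},
    selected=["hazard:flood_plain"],
)
```

The point is that the connection details live on the Service, and each
stored layer only needs to know which Service it came from.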

A few things to consider with this methodology are:

a) Services becoming stale ... i.e. no longer accessible. It makes
sense to have some periodic task that checks the services to see if
they are up and after some configurable number of retries marks them
as no longer accessible and therefore hidden from search results.

b) Keeping the basic metadata about a layer up-to-date with the
external service if/when things change on the other end. Again, a
periodic task may do the trick here.

c) Authentication/Security against external services ... If the
external services require authentication, and the GeoNode end-user
provides their *personal* credentials when adding the services, what
metadata obtained with those credentials is appropriate to store in
GeoNode's database?

d) Pairing of WFS/WCS and WMS ... same set of issues discussed above
with the previous methodology.

I'm sure there are more here, but these should be sufficient for now.
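
The periodic check in point a) could look roughly like the sketch
below. The names (ServiceStatus, record_check, max_retries) are
invented for illustration, and the actual 'is it up?' request is left
out so the bookkeeping can be shown on its own.

```python
from dataclasses import dataclass


@dataclass
class ServiceStatus:
    url: str
    consecutive_failures: int = 0
    accessible: bool = True


def record_check(status, is_up, max_retries=3):
    """Update a service after one availability check.

    A service is marked inaccessible after max_retries consecutive
    failures, and becomes accessible again on the first success.
    """
    if is_up:
        status.consecutive_failures = 0
        status.accessible = True
    else:
        status.consecutive_failures += 1
        if status.consecutive_failures >= max_retries:
            status.accessible = False
    return status


def visible_in_search(statuses):
    """Only accessible services should show up in search results."""
    return [s.url for s in statuses if s.accessible]


flaky = ServiceStatus("http://example.com/wms")
stable = ServiceStatus("http://example.org/wms")
for _ in range(3):
    record_check(flaky, is_up=False)   # three failed checks in a row
    record_check(stable, is_up=True)
```

After this run only the stable service remains visible; a later
successful check would bring the flaky one back.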

3) Searching against other services at 'run-time'

While the previous two methodologies involve 'storing' information
about external services in either GeoServer or in GeoNode's Django
database ... it is also conceivable to simply store just the endpoints
of these external services and query them at the time the user makes
an actual query in the UI (or via the API when it exists). Standards
like OpenSearch Geo are specifically designed for this kind of thing,
and it makes sense that we could support querying remote services via
CSW and/or other standards and getting back the WMS/WFS/WCS endpoints
needed to get at the data ... as well as querying services that are not
really 'standard' in any way but do exist in the real world like
ArcGIS.com, Data.gov or services like Google Earth Builder ... this
methodology is probably the most flexible, but would also likely be
slow and perhaps clunky.
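
One way to keep run-time search from being too slow is to query all
endpoints in parallel and give up on the ones that do not answer in
time. A minimal sketch, assuming each backend is just a Python
callable that takes the query and returns a list of results (the
actual CSW or OpenSearch client code is left out, and all names here
are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor, wait


def federated_search(query, backends, timeout=5.0):
    """Query every backend in parallel.

    backends maps an endpoint label to a callable(query) -> list.
    Returns (results, failed): results per endpoint that answered in
    time, and the endpoints that raised or timed out.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(backends)))
    futures = {
        pool.submit(search, query): endpoint
        for endpoint, search in backends.items()
    }
    done, _pending = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on slow endpoints

    results, failed = {}, []
    for future, endpoint in futures.items():
        if future in done and future.exception() is None:
            results[endpoint] = future.result()
        else:
            failed.append(endpoint)
    return results, failed


def fake_csw(query):
    # Stand-in for a remote catalogue that answers.
    return [query + "-hit"]


def fake_down(query):
    # Stand-in for a remote service that is unavailable.
    raise RuntimeError("service unavailable")


results, failed = federated_search(
    "flood", {"csw": fake_csw, "down": fake_down}
)
```

The UI could then show whatever came back and note which services did
not respond.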

My apologies for the rambling mail, but I think that provides enough
detail to get the conversation started. Very much looking forward to
what all of you have to say.

Jeff

tva...@gmail.com

Mar 12, 2013, 9:22:10 AM
to geono...@opengeo.org
Hi Jeff and Others

We are trying to tackle a number of the same challenges as you; however, our issues relate to other service types, including OPeNDAP and SOS.
I see this post was from 2011, so it would be really great to hear from you what progress has been made, what lessons have been learnt, and how we can get involved.

Terence