A seed to start from...

Brett Antonides

Feb 23, 2012, 11:17:27 AM
to geospatial-mobile-da...@googlegroups.com
My thoughts to start discussion:

  • Support for both Android and iOS
    • Android may have the larger pool of early adopters, but iOS is needed for broad acceptance
    • This is not intended as a data format for desktops, workstations, and servers (that's what things like PostGIS and SDE are for)
    • Desktops should be able to create and read these files in order to pull data out of larger data stores and other formats
  • This spec should be simple and quick to read (closer to 5 pages than 100)
  • This spec is intended to define ONLY the format of how the data is stored
    • Any APIs developed on top of this format are outside of this initial scope.  That is not to say they should not be developed and could not be extremely valuable for developers.
  • Geometry Types
    • Point, Line, and Polygon at a minimum
    • 3D polygons and volumes (in the future?)
  • Spatial References
    • Only support a handful to keep things relatively quick
    • By supporting a small subset do we gain speed or battery life?
    • Most common? WGS84, UTM, Web Mercator (what did I miss that's essential?)
  • Spatial Indexing
    • Definitely need spatial index support
    • SpatiaLite? Maybe, but for platforms that don't support spatial indexes, can we degrade gracefully?
    • What do you mean by "Degrade Gracefully"?
      • Not all platforms might have all of the libraries needed for a good spatial index to work.  So, in these cases the file format should still be readable, but you will have to fall back to something like a full table scan when working with the data.
      • Pros: Data is still usable on older or bleeding edge platforms that haven't had the geospatial libraries ported to them
      • Cons: In degraded mode you may have slower access, increased processing requirements, lower battery life, probably won't be able to handle large datasets
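To make "degrade gracefully" concrete, here is a minimal Python sketch, not a proposed design. It assumes SQLite as the store and uses SQLite's optional R*Tree module as the spatial index. The table names features and features_idx and both function names are hypothetical. If the module is not compiled in, the same query falls back to a full table scan.

```python
import sqlite3


def open_store(points):
    """Load (id, x, y) points and build a spatial index if the platform allows it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, x REAL, y REAL)")
    conn.executemany("INSERT INTO features VALUES (?, ?, ?)", points)
    try:
        # R*Tree is an optional SQLite module; this raises if it is missing.
        conn.execute(
            "CREATE VIRTUAL TABLE features_idx "
            "USING rtree(id, xmin, xmax, ymin, ymax)"
        )
        conn.execute("INSERT INTO features_idx SELECT id, x, x, y, y FROM features")
        indexed = True
    except sqlite3.OperationalError:
        # Degraded mode: the data is still readable, just without an index.
        indexed = False
    return conn, indexed


def query_bbox(conn, indexed, xmin, ymin, xmax, ymax):
    """Return the ids of points inside the box, sorted."""
    if indexed:
        sql = ("SELECT id FROM features_idx "
               "WHERE xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?")
    else:
        # No index on x or y, so SQLite scans every row of the table.
        sql = ("SELECT id FROM features "
               "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?")
    return sorted(row[0] for row in conn.execute(sql, (xmin, xmax, ymin, ymax)))
```

Both paths return the same ids; only the cost differs, which matches the pros and cons listed above.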
Ok, fire away.

Brett

pa...@imagemattersllc.com

Feb 24, 2012, 10:25:44 AM
to Geospatial Mobile Data Format for Vectors
Hi Brett:
Thanks for your seed to get us started.

I'd argue we shouldn't try to reinvent the wheel here, as there are
already OGC/ISO standards, for Simple Features (SF) Access, and SF/
SQL, that address the problem.
http://www.opengeospatial.org/standards/sfa
http://www.opengeospatial.org/standards/sfs
They are widely implemented in the spatial database / GIS world, and
Spatialite claims to mostly support SF/SQL. Although I've used the SF/
SQL API in Spatialite, I have not done any conformance tests (the
existing conformance test suite is for an earlier version of the SF/
SQL standard.)

From the SF/SQL intro:

"This second part of OpenGIS(R) Simple Features Access (SFA), also
called ISO 19125, is to define a standard Structured Query Language
(SQL) schema that supports storage, retrieval, query and update of
feature collections via the SQL Call-Level Interface (SQL/CLI) (ISO/
IEC 9075-3:2003)." ...

"In a SQL-implementation, a collection of features of a single type
are stored as a "feature table" usually with some geometric valued
attributes (columns). Each feature is primarily represented as a row
in this feature table, and described by that and other tables
logically linked to this base feature table using standard SQL
techniques. The non-spatial attributes of features are mapped onto
columns whose types are drawn from the set of SQL data types,
potentially including SQL3 user defined types (UDT). The spatial
attributes of features are mapped onto columns whose types are based
on the geometric data types for SQL defined in this standard and its
references. Feature-table schemas are described for two sorts of SQL-
implementations: implementations based on a more classical SQL relational
model using only the SQL predefined data types and SQL with additional
types for geometry. In any case, the geometric representations have a
set of SQL accessible routines to support geometric behavior and
query."

I'd argue that the table design we need is already an international
standard:

"In an implementation based on predefined data types, a geometry-
valued column is implemented using a "geometry ID" reference into a
geometry table. A geometry value is stored using one or more rows in a
single geometry table all of which have the geometry ID as part of
their primary key. The geometry table may be implemented using
standard SQL numeric types or SQL binary types; schemas for both are
described in this standard."

The question I'd raise is whether we want the SQL table model using
SQL data types, the one using geometry data types (Well-Known Binary
(WKB)), or both. Implementations without spatial libraries can make
do, but with terrible performance and limited functionality, since
they have no spatial functions or indexing. Note that without spatial
libraries there is also no support for coordinate transformations
between different spatial reference systems.
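For readers who have not handled WKB directly, here is a minimal Python sketch of the encoding for a single 2D point, using only the standard struct module. It covers just the Point type (type code 1) and is meant as an illustration; the function names are hypothetical. A WKB point is one byte-order byte, a 32-bit geometry type, and two 64-bit doubles.

```python
import struct

WKB_POINT = 1  # geometry type code for Point in WKB


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # 1 = little-endian byte order marker, then type code, then x and y.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(blob):
    """Decode a WKB point of either byte order into an (x, y) tuple."""
    order = "<" if blob[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", blob, 1)
    if geom_type != WKB_POINT:
        raise ValueError("only Point is handled in this sketch")
    return struct.unpack_from(order + "dd", blob, 5)
```

A reader with no spatial library at all can still decode such a blob, which is relevant to the question of how far an implementation can get without one.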

The introduction goes on to explain why the SQL API is important:

"The term 'SQL with Geometry Types' is used to refer to a SQL-
implementation that has been extended with a set of 'Geometry Types.'
In this environment, a geometry-valued column is implemented as a
column whose SQL type is drawn from this set of Geometry Types. The
mechanism for extending the type system of an SQL-implementation is
through the definition of user defined User Defined Types. Commercial
SQL-implementations with user defined type support have been available
since mid-1997 and an ISO standard is available for UDT definition.
This standard does not prescribe a particular UDT mechanism, but
specifies the behavior of the UDTs through a specification of
interfaces that must be supported. These interfaces are described for
SQL3 UDTs in ISO/IEC 13249-3."

This lets the standard specify a common SF/SQL API that can be
implemented on top of different UDTs in different spatial database
implementations - PostGIS, SDE, OracleSpatial, SpatiaLite .... This
is what needs to be done for interoperability whenever possible --
specify interfaces, not implementations. If we HAVE to fall back to
table implementation specifications, then I'd argue these are the ones
we should use:

CREATE TABLE ANNOTATION_TEXT_METADATA
(
F_TABLE_CATALOG CHARACTER VARYING NOT NULL,
F_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
F_TABLE_NAME CHARACTER VARYING NOT NULL,
F_TEXT_KEY_COLUMN CHARACTER VARYING NOT NULL,
F_TEXT_ENVELOPE_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TABLE_CATALOG CHARACTER VARYING NOT NULL,
A_ELEMENT_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
A_ELEMENT_TABLE_NAME CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_KEY_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_SEQ_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_VALUE_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_LEADERLINE_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_LOCATION_COLUMN CHARACTER VARYING NOT NULL,
A_ELEMENT_TEXT_ATTRIBUTES_COLUMN CHARACTER VARYING NOT NULL,
A_MAP_BASE_SCALE NUMBER NOT NULL,
A_TEXT_DEFAULT_EXPRESSION CHARACTER VARYING,
A_TEXT_DEFAULT_ATTRIBUTES CHARACTER VARYING
)

CREATE TABLE SPATIAL_REF_SYS
(
SRID INTEGER NOT NULL PRIMARY KEY,
AUTH_NAME CHARACTER VARYING,
AUTH_SRID INTEGER,
SRTEXT CHARACTER VARYING(2048)
)

CREATE TABLE GEOMETRY_COLUMNS (
F_TABLE_CATALOG CHARACTER VARYING NOT NULL,
F_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
F_TABLE_NAME CHARACTER VARYING NOT NULL,
F_GEOMETRY_COLUMN CHARACTER VARYING NOT NULL,
G_TABLE_CATALOG CHARACTER VARYING NOT NULL,
G_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
G_TABLE_NAME CHARACTER VARYING NOT NULL,
STORAGE_TYPE INTEGER,
GEOMETRY_TYPE INTEGER,
COORD_DIMENSION INTEGER,
MAX_PPR INTEGER,
SRID INTEGER NOT NULL REFERENCES SPATIAL_REF_SYS,
CONSTRAINT GC_PK PRIMARY KEY (F_TABLE_CATALOG, F_TABLE_SCHEMA,
F_TABLE_NAME, F_GEOMETRY_COLUMN)
)
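As a rough check that these two metadata tables work in a single-file database, here is a Python sketch that creates them in SQLite and registers one feature table. Several things are assumptions for illustration only: the helper names, the empty catalog and schema strings (SQLite has neither), the example table names, and the STORAGE_TYPE and GEOMETRY_TYPE values, which reflect my reading of the standard and should be checked against it.

```python
import sqlite3

DDL = """
CREATE TABLE SPATIAL_REF_SYS (
  SRID INTEGER NOT NULL PRIMARY KEY,
  AUTH_NAME CHARACTER VARYING,
  AUTH_SRID INTEGER,
  SRTEXT CHARACTER VARYING(2048)
);
CREATE TABLE GEOMETRY_COLUMNS (
  F_TABLE_CATALOG CHARACTER VARYING NOT NULL,
  F_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
  F_TABLE_NAME CHARACTER VARYING NOT NULL,
  F_GEOMETRY_COLUMN CHARACTER VARYING NOT NULL,
  G_TABLE_CATALOG CHARACTER VARYING NOT NULL,
  G_TABLE_SCHEMA CHARACTER VARYING NOT NULL,
  G_TABLE_NAME CHARACTER VARYING NOT NULL,
  STORAGE_TYPE INTEGER,
  GEOMETRY_TYPE INTEGER,
  COORD_DIMENSION INTEGER,
  MAX_PPR INTEGER,
  SRID INTEGER NOT NULL REFERENCES SPATIAL_REF_SYS,
  CONSTRAINT GC_PK PRIMARY KEY (F_TABLE_CATALOG, F_TABLE_SCHEMA,
                                F_TABLE_NAME, F_GEOMETRY_COLUMN)
);
"""

WGS84_WKT = ('GEOGCS["WGS 84",DATUM["WGS_1984",'
             'SPHEROID["WGS 84",6378137,298.257223563]],'
             'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]]')


def make_catalog():
    """Create the metadata tables and seed one spatial reference system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)
    conn.execute("INSERT INTO SPATIAL_REF_SYS VALUES (4326, 'EPSG', 4326, ?)",
                 (WGS84_WKT,))
    return conn


def register_layer(conn, table, column, srid, geometry_type=1):
    """Record a geometry column; 1 is assumed to mean binary storage / Point."""
    conn.execute(
        "INSERT INTO GEOMETRY_COLUMNS VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
        ("", "", table, column, "", "", table, 1, geometry_type, 2, None, srid),
    )


def srid_for(conn, table, column):
    """Look up the SRID registered for a geometry column, or None."""
    row = conn.execute(
        "SELECT SRID FROM GEOMETRY_COLUMNS "
        "WHERE F_TABLE_NAME = ? AND F_GEOMETRY_COLUMN = ?",
        (table, column),
    ).fetchone()
    return row[0] if row else None
```

With foreign keys switched on, registering a layer against an SRID that is missing from SPATIAL_REF_SYS is rejected, so the REFERENCES clause does real work even in SQLite.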

The general format of a feature table shall be as follows:
CREATE TABLE <feature table name> (
<primary key column name> <primary key column type>, ...
(other attributes for this feature table)
<geometry column name> <geometry column type>, ...
(other geometry columns for this feature table)
PRIMARY KEY <primary key column name>,
FOREIGN KEY <geometry column name> REFERENCES <geometry table name>,
...
(other geometry column constraints for this feature table) )

The following CREATE TABLE statement creates an appropriately
structured table for Geometry stored as individual ordinate values
using SQL numeric types. Implementations shall either use this table
format or provide stored procedures to create, to populate and to
maintain this table.

CREATE TABLE <table name> (
GID INTEGER NOT NULL,
ESEQ INTEGER NOT NULL,
ETYPE INTEGER NOT NULL,
SEQ INTEGER NOT NULL,
X1 <ordinate type>,
Y1 <ordinate type>,
Z1 <ordinate type>, !Optional if Z-value is included
M1 <ordinate type>, !Optional if M-value is included
... <repeated for each ordinate, repeated for each point>
X<MAX_PPR> <ordinate type>,
Y<MAX_PPR> <ordinate type>,
Z<MAX_PPR> <ordinate type>, !Optional if Z-value is included
M<MAX_PPR> <ordinate type>, !Optional if M-value is included
...,
<attribute> <attribute type>
CONSTRAINT GID_PK PRIMARY KEY (GID, ESEQ, SEQ)
)

>AND / OR<

The following CREATE TABLE statement creates an appropriately defined
table for Geometry stored using the Well-known Binary Representation
for Geometry. The size of the WKB_GEOMETRY column is defined by the
implementation. Implementations shall either use this table format or
provide stored procedures to create, populate and maintain this table.

CREATE TABLE <table name> (
GID NUMERIC NOT NULL PRIMARY KEY,
XMIN <ordinate type>,
YMIN <ordinate type>,
ZMIN <ordinate type>,
MMIN <ordinate type>,
XMAX <ordinate type>,
YMAX <ordinate type>,
ZMAX <ordinate type>,
MMAX <ordinate type>,
WKB_GEOMETRY BIT VARYING(implementation size limit),
{<attribute> <attribute type>}*
)

>AND<

The general format of a feature table in the SQL with Geometry Types
implementation shall be as follows:

CREATE TABLE <feature table name> (
<primary key column name> <primary key column type>, ...
(other attributes for this feature table)
<geometry column name> <geometry type>, ...
(other geometry columns for this feature table)
PRIMARY KEY <primary key column name>,
CONSTRAINT SRS_1 CHECK (SRID(<geometry column name>) in
( SELECT SRID from GEOMETRY_COLUMNS
where F_TABLE_CATALOG = <catalog> and F_TABLE_SCHEMA = <schema> and
F_TABLE_NAME = <feature table name>
and F_GEOMETRY_COLUMN = <geometry column> ) ...
( spatial reference constraints for other geometry columns in this
feature table)
)

There. That was less than 5 pages. But developers should read the
other 96 to use the standard geometry type codes etc. to get the
implementation right ;-}

Cheers,

Paul



Paul Ramsey

Feb 24, 2012, 1:00:43 PM
to Geospatial Mobile Data Format for Vectors
So:

- Core agreement that SFSQL is the starting point for the database
layout
- Do we agree that SQLite should be the storage engine? We don't
strictly speaking have to, but for the purposes of a reference
implementation we might want to go there.
- Regarding the flavor of SFSQL, I strongly disagree with using the
SFSQL feature table/geometry table separation (which strikes me as a
hack put in moons ago to allow them to optionally define the (oft
investigated, never used) basic-SQL-types implementation). A simple
"geometry-as-blob-column" implementation where our specification
defines the kind of blob to expect will be a far simpler
implementation, and has the benefit of mapping directly to the
existing sqlite, oracle, postgis, sqlserver implementations of
geometry in a database.
- I don't think we have to punt on indexing, but it will require a
good deal of care to describe a sidecar-table based indexing scheme
(mytable_spidx is the spatial index table for mytable) clearly enough
to get multiple interoperable implementations. Here a reference
implementation would be good for providing copyable clarity to
implementors coming behind.
- Do we want to look at SQL/MM types as additional supported types?
CIRCULARSTRING, etc. On the one hand, the documentation is there, we
just need to add it to our blob spec and spec for type name strings,
but on the other, YAGNI, I don't know of any mobile folks using those
types.
- The irregularity of SRS WKT in the wild may or may not be an issue
for implementors. To the extent that everyone uses OGR SRS objects
(which paper over the differences), maybe we don't care? Specifying a
particular flavor of SRS WKT might just be spitting into the wind.
- SFSQL seems to give us everything we need for storage and management
(with the addition of a sidecar index scheme) but it says nothing at
all about rendering or human-readable presentation. Should we be
talking about the layout of a "Layers" table that a client could use
to provide a legend and a "maps" table that combines those layers into
different cartographic products?
- The previous question actually ties into a meta-thought, which is
that dividing the raster and vector discussions is not actually
something we should do. There should be one format, sharing as much
infrastructure (spatial_ref_sys table, layers/maps information) as
possible.
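The "geometry-as-blob-column" layout with a sidecar index table could look roughly like the Python sketch below. This is one possible reading, not a proposal from the thread. It assumes SQLite, WKB as the blob encoding, points only, and a plain bounding-box table named mytable_spidx that the writer keeps in sync by hand. The column names and helper functions are hypothetical.

```python
import sqlite3
import struct


def make_db():
    """Create a feature table with a geometry blob and its sidecar index table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, geom BLOB);
        CREATE TABLE mytable_spidx (
            id INTEGER PRIMARY KEY REFERENCES mytable(id),
            xmin REAL, ymin REAL, xmax REAL, ymax REAL
        );
        CREATE INDEX mytable_spidx_x ON mytable_spidx (xmin, xmax);
    """)
    return conn


def add_point(conn, fid, name, x, y):
    """Store a point as little-endian WKB and record its bounding box."""
    wkb = struct.pack("<BIdd", 1, 1, x, y)
    conn.execute("INSERT INTO mytable VALUES (?, ?, ?)", (fid, name, wkb))
    # For a point the bounding box collapses to the point itself.
    conn.execute("INSERT INTO mytable_spidx VALUES (?, ?, ?, ?, ?)",
                 (fid, x, y, x, y))


def names_in_bbox(conn, xmin, ymin, xmax, ymax):
    """Return names of features whose bounding box intersects the query box."""
    sql = """
        SELECT t.name
        FROM mytable_spidx AS i JOIN mytable AS t ON t.id = i.id
        WHERE i.xmax >= ? AND i.xmin <= ? AND i.ymax >= ? AND i.ymin <= ?
        ORDER BY t.name
    """
    return [row[0] for row in conn.execute(sql, (xmin, xmax, ymin, ymax))]
```

Because the sidecar table uses only ordinary columns, a reader with no spatial extension can still run the bounding-box query in plain SQL. A real specification would also have to say who keeps the two tables in sync, which is the part that needs the care mentioned above.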

That's my brain dump, how many more folks are coming? Once we get some
general ideas on the table I'd like to draft.

P.


pa...@imagemattersllc.com

Feb 24, 2012, 2:03:56 PM
to Geospatial Mobile Data Format for Vectors
Paul R:

See comments inline.

v/r

Paul D.

On 2/24/2012 1:00 PM, Paul Ramsey wrote:
> So:
>
> - Core agreement that SFSQL is the starting point for the database
> layout
Good!
> - Do we agree that SQLite should be the storage engine? We don't
> strictly speaking have to, but for the purposes of a reference
> implementation we might want to go there.
I think so. The single-file db is a major attraction.
> - Regarding the flavor of SFSQL, I strongly disagree with using the
> SFSQL feature table/geometry table separation (which strikes me as a
> hack put in moons ago to allow them to optionally define the (oft
> investigated, never used) basic-SQL-types implementation). A simple
> "geometry-as-blob-column" implementation where our specification
> defines the kind of blob to expect will be a far simpler
> implementation, and has the benefit of mapping directly to the
> existing sqlite, oracle, postgis, sqlserver implementations of
> geometry in a database.
I agree, but there is a clear choice, and I want us to make it
explicitly. Thanks!
> - I don't think we have to punt on indexing, but it will require a
> good deal of care to describe a sidecar-table based indexing scheme
> (mytable_spidx is the spatial index table for mytable) clearly enough
> to get multiple interoperable implementations. Here a reference
> implementation would be good for providing copyable clarity to
> implementors coming behind.
That's the table naming convention used by SpatiaLite / RasterLite.
What do you think of its
mytable_metadata_geometry[_node | _parent | _rowid] design?
> - Do we want to look at SQL/MM types as additional supported types?
> CIRCULARSTRING, etc. On the one hand, the documentation is there, we
> just need to add it to our blob spec and spec for type name strings,
> but on the other, YAGNI, I don't know of any mobile folks using those
> types.
(multi)[point|curve|surface] are probably enough for v1
> - The irregularity of SRS WKT in the wild may or may not be an issue
> for implementors. To the extent that everyone uses OGR SRS objects
> (which paper over the differences), maybe we don't care? Specifying a
> particular flavor of SRS WKT might just be spitting into the wind.
probably so
> - SFSQL seems to give us everything we need for storage and management
> (with the addition of a sidecar index scheme) but it says nothing at
> all about rendering or human-readable presentation. Should we be
> talking about the layout of a "Layers" table that a client could use
> to provide a legend and a "maps" table that combines those layers into
> different cartographic products?
Yes, but that might be for v2. Do you have a model in mind?
> - The previous question actually ties into a meta-thought, which is
> that dividing the raster and vector discussions is not actually
> something we should do. There should be one format, sharing as much
> infrastructure (spatial_ref_sys table, layers/maps information) as
> possible.
I agree, especially because there needs to be one manifest format
(based on ows:Manifest from OGC Common v2?) that can handle both.