Custom handling of errors from spatial functions...

71 views
Skip to first unread message

Pieter Roggemans

unread,
Nov 7, 2023, 3:51:04 PM11/7/23
to SpatiaLite Users
Hello!

Regularly spatial functions return some kind of error due to topologic problems. Sometimes because the input is invalid, sometimes because someting "illegal" is created by the spatial function.

At the moment all spatialite functions return NULL in case of a failure + a warning is written to stdout.

Because processing large files can sometimes take a long time it is a pity to have to restart the processing because e.g. a few rows gave issues, so some extra fault-tolerance in some cases could be very interesting.

E.g. 
- for most spatial functions it would be useful to - when an error occurs - apply makevalid to the input(s) and retry and possibly the operations then succeeds. If a warning would still be outputted to stdout the user still knows what happened and has the option to validate if the result is OK. 
- for ST_ReducePrecision I just had a case where even though the input was valid, ann error occured during execution of it. In this case, for ST_ReducePrecision, I think it would be better if the original geometry would be returned instead of NULL + warning to stdout in this case.

I wanted to try implementing something like that myself, but because the NULL being returned can both be a failure or a correct result I don't think this is possible?

Any thoughts?

a.fu...@lqt.it

unread,
Nov 9, 2023, 2:26:45 AM11/9/23
to spatiali...@googlegroups.com
Hi Pietr,

I'm sorry but I don't agree with this type of approach at all.

An SQL function should do a single, clearly identifiable thing,
avoiding any form of automagic or silver bullets as much as
possible.
SQL functions (and SQL queries too) should be seen as successive
stages of a pipeline, keeping each step isolated and clearly
identifiable.
Even more so when processing large datasets that require long
processing times.

In these cases the correct approach is to automate the processes
using more or less complex SQL scripts so to always allow the
identification of all possible anomalies which will then have
to be remedied subsequently (probably by applying some manual
correction).

At least, this is the lesson I learned when in past years I
worked side by side with the cartographers of the Tuscany
Region with their huge 20 million polygons topological
datasets.
The best solution was always to develop a SQL script
(possibly of hundreds of queries) that worked exactly
like a pipeline, and which at the end presented a list
of (few) problematic features to check and possibly fix.

Let's go back to the use case you hypothesized:

1) following a professional approach ST_IsValid() and
eventually MakeValid() should always be executed in
the very early stages of the pipeline, because it's
well known that very often the input datasets can
be dirty.
it makes no sense to automagically delegate this
absolutely critical preliminary step to the SQL
functions that will be called later.

2) even when ST_ReducePrecision() returns invalid
geometries (NULL), it's not at all difficult for
the Query of the next step of the pipeline to
recover the original values.
but at least this way a clear trace will be kept
of all the critical issues encountered, which can
then be verified case by case trying to identify
(and possibly correct) the offending causes.

If instead we start introducing silver bullets everywhere
we'll eventually lose any form of atomic control over
operations.
Perhaps such an approach would be appreciated by less savvy
users who are in a hurry, but would deeply disappoint
more professional users.

bye Sandro


Pieter Roggemans

unread,
Nov 9, 2023, 3:16:04 AM11/9/23
to spatiali...@googlegroups.com
Hey Sandro,

I don't think I make clear enough what I meant. The last thing I want is more magic happening, I mainly would like some unwanted magic to be removed, or at least have an option not to use it.

The most important thing I'd like to be possible is to be able to detect myself whether an NULL value being returned is a real NULL or a failure in the function. To me, both situations being mapped to the same result is some kind of unwanted "magic" that makes it impossible to handle some cases correctly, hides important information and leads to inefficiency.

In addition to that, I think it would be interesting (but as a secondary thought) to implement some optional "shortcuts" that can be controlled using optional parameters to get some specific additional behaviour. E.g. have an optional parameter "retry_if_input_invalid" so there is a more friendly way to get some more advanced behaviours than putting IIF 's around many functions. I don't quite know what you mean with this kind of approach being less traceable or professional. I clearly indicated that there should be logging when such a case was triggered so the user can verify what happened. I've been working with GIS data for a very long time in a heavily audited, ISO security certified IT environment so I know perfectly well that tracability is very important.

Regards,
Pieter


Op do 9 nov 2023 om 08:26 schreef <a.fu...@lqt.it>:
--
You received this message because you are subscribed to the Google Groups "SpatiaLite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spatialite-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spatialite-users/ebf227201e9f55994f53e0795f20cc8f%40lqt.it.

Pieter Roggemans

unread,
Nov 9, 2023, 3:22:49 AM11/9/23
to SpatiaLite Users
PS: I'm obviously not talking about source data of an analysis containing invalid geometries. If you've done complex GIS analysis on large datasets you'll know that there are regularly cases that popup where the geos spatial functions either produce invalid output or crash on valid input. It is those cases that are now difficult to deal with properly.

Op donderdag 9 november 2023 om 09:16:04 UTC+1 schreef Pieter Roggemans:

Pieter Roggemans

unread,
Nov 9, 2023, 4:21:10 AM11/9/23
to SpatiaLite Users
PPS: to make it a bit less abstract, a more practical way to present what I was thinking about:

Some possible ways to implement what I mean, possibly there are better options...:
- add a parameter "error_handling" to all relevant functions that can be used to specify the behaviour wanted (just some examples of options, I don't suggest to support them all):
    - "RETURN_NULL" (default): keeps current behaviour
    - "RETURN_1": return the integer 0 (or blobzero, or ???) on an error instead of NULL, so you can handle it yourself in the way you like. The question is what happens it this gets inserted.
    - "RETURN ERROR": return the error as a string, so you can handle it yourself in the way you like
    - "RAISE_ERROR": just raise the error and stop further execution
    - "RETRY_AFTER_MAKEVALID": apply makevalid on the inputs and retry
    - "RETURN_INPUT": return the input if an error occurs
   - ...
- add a global setting/PRAGMA/... to choose the (default) behaviour

Op donderdag 9 november 2023 om 09:22:49 UTC+1 schreef Pieter Roggemans:

a.fu...@lqt.it

unread,
Nov 9, 2023, 4:31:14 AM11/9/23
to spatiali...@googlegroups.com
On Thu, 9 Nov 2023 00:22:49 -0800 (PST), Pieter Roggemans wrote:
> PS: I'm obviously not talking about source data of an analysis
> containing invalid geometries.
> It is those cases that are now difficult to deal with properly.
>

Hi Pieter,

take it as a simple suggestion
have you tried using the functions that report the errors raised by
GEOS?

GEOS_GetLastWarningMsg()
GEOS_GetLastErrorMsg()
GEOS_GetLastAuxErrorMsg()
GEOS_GetCriticalPointFromMsg()

they could probably help you in identifying the critical issues more
precisely.


> If you've done complex GIS analysis on
> large datasets you'll know that there are regularly cases that popup
> where the geos spatial functions either produce invalid output or
> crash on valid input.
>

we can't do anything about crashes: fortunately it happens very rarely,
unless there are bugs in functions recently introduced and not yet well
tested.

but in case GEOS produces invalid results there is a very simple
solution:

SELECT ..... MakeValid(<some GEOS function returning a Geometry>) ...

bye Sandro

a.fu...@lqt.it

unread,
Nov 9, 2023, 4:37:50 AM11/9/23
to spatiali...@googlegroups.com
On Thu, 9 Nov 2023 01:21:10 -0800 (PST), Pieter Roggemans wrote:
> PPS: to make it a bit less abstract, a more practical way to present
> what I was thinking about:
>
> Some possible ways to implement what I mean, possibly there are
> better
> options...:
> - add a parameter "error_handling" to all relevant functions that can
> be used to specify the behaviour wanted (just some examples of
> options, I don't suggest to support them all):
>
>     - "RETURN_NULL" (default): keeps current behaviour
>     - "RETURN_1": return the integer 0 (or blobzero, or ???) on an
> error instead of NULL, so you can handle it yourself in the way you
> like. The question is what happens it this gets inserted.
>     - "RETURN ERROR": return the error as a string, so you can
> handle it yourself in the way you like
>     - "RAISE_ERROR": just raise the error and stop further execution
>     - "RETRY_AFTER_MAKEVALID": apply makevalid on the inputs and
> retry
>     - "RETURN_INPUT": return the input if an error occurs
>    - ...
> - add a global setting/PRAGMA/... to choose the (default) behaviour
>

Hi Pieter,

what you are proposing violates all the provisions of the international
standard ISO/IEC 13249-3 (Information technology -- Database languages
​​-- SQL multimedia and application packages -- Part 3: Spatial aka
SQL/MM)

but this is not my main objection.
the real problem is that in order to implement what you ask in the code
will take months of work.
time that at the moment (and for a reasonable future) I absolutely will
not have at my disposal.

bye Sandro

Pieter Roggemans

unread,
Nov 10, 2023, 4:01:32 AM11/10/23
to SpatiaLite Users
I wasn't aware of those functions. I had a quick look in the documentation and in code on what they do and they seem like a possible solution to my issue: if a GEOS function returns NULL I should be able to use these functions to find out if it is a genuine NULL or an NULL caused by an error.

Based on some firsts its not yet working out to get one of the function returning something when I use ST_ReducePrecision on a geometry that leads to an error in GEOS (NULL is returned but no error returned) but I'll get back to it when I dit some more testing.

Thanks!

Op donderdag 9 november 2023 om 10:31:14 UTC+1 schreef a.fu...@lqt.it:

Pieter Roggemans

unread,
Nov 10, 2023, 4:03:16 AM11/10/23
to SpatiaLite Users
Thats quite possible, I'm not familiar with those standards :-).

I'm aware that such a change would touch a lot of code paths... so indeed not a trivial change. Nonetheless thanks for thinking along.

Op donderdag 9 november 2023 om 10:37:50 UTC+1 schreef a.fu...@lqt.it:
Reply all
Reply to author
Forward
0 new messages