[postgis-users] ERROR: array size exceeds the maximum allowed (134217727)

288 views
Skip to first unread message

Mike Leahy

unread,
Mar 20, 2010, 5:17:50 AM3/20/10
to postgi...@postgis.refractions.net
Hello list,

I have an interesting error that's coming up on an Ubuntu 64-bit system,
running PostgreSQL 8.4.2, and PostGIS 1.5.1 (packaged by me) as well as 1.4.0
(from Ubuntu repos). The query itself is somewhat complex (I'm sure it could
be optimized better - but it's been working well for quite some time). I've
attached two files - one is a dummed-down version of the query I am running
(crashing.sql). After discussing this with some folks in the postgresql IRC
channel, it seems that the st_geom(), or something PostGIS-related is causing
the error.

Running this query on various data will produce one of two results. One is
the error mentioned in the subject (ERROR: array size exceeds the maximum
allowed (134217727)). I can find very little information on this error. The
other outcome is that it often causes the PostgreSQL backend to segfault (see
gdboutput.txt).

I can attest that this query works fine on many installations of PostgreSQL
8.2+, and PostGIS 1.3+. The only main difference I can see in this case where
I'm getting errors and segfaults is that I'm running this on Ubuntu, whereas
all of my successful cases are on Fedora (7 through 12).

If I replace the st_union() function in the query with accum(), it still
produces the error/segfaults.

Also, strangely, if I remove one of the where conditions near the end of the
query (e.g., " AND (cl.depth = 0)"), it suddenly works fine without any issues
(except that I actually need that condition).

Does this sort of problem look familiar to anyone? I can try to put together
an example if that would be of any use, but since the same data/query works on
other Fedora systems that I am running, I'm not sure how easy it is to
replicate.

Regards,
Mike

crashing.sql
gdboutput.txt

Mike Leahy

unread,
Mar 20, 2010, 5:49:42 AM3/20/10
to postgi...@postgis.refractions.net
Hello again,

It might be of interest to point out that substituting st_union() with
st_memunion() seems to have worked around this. I'm curious though, because
there is not a great deal of data being processed, and I am running this on a
fairly sturdy system that that has more capacity than some of the Fedora
systems I'm running.

Mike

_______________________________________________
postgis-users mailing list
postgi...@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

strk

unread,
Mar 20, 2010, 10:41:41 AM3/20/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
On Sat, Mar 20, 2010 at 05:49:42AM -0400, Mike Leahy wrote:
> Hello again,
>
> It might be of interest to point out that substituting st_union() with
> st_memunion() seems to have worked around this. I'm curious though, because
> there is not a great deal of data being processed, and I am running this on a
> fairly sturdy system that that has more capacity than some of the Fedora
> systems I'm running.

st_memunion builds a big array with all geometries in it..
you were hitting a limit of the array type.
st_union should behave better.

--strk;

() Free GIS & Flash consultant/developer
/\ http://strk.keybit.net/services.html

Paul Ramsey

unread,
Mar 20, 2010, 10:56:14 AM3/20/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
Actually, memunion does the opposite, it passes the resultant and
preserves mem. The default behavior is fast-but-memory-hungry. And has
been for some time, though in different forms. There were some bugs in
the array handling code, but Mark CA killed most of them, so the
latest 1.5 and 1.4 streams should be good. If it's possible that the
issue is one of array size, maybe Mike could find the dial that
controls that maximum, and turn it up and down and see if it makes his
problem go away/happen sooner.

P.

Paragon Corporation

unread,
Mar 20, 2010, 11:32:35 AM3/20/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
Paul,
I doubt array size limit is the issue. He said when he left the where
condition out it worked. I would think it would definitely blow up in that
case.

Mike,
The issue from before was that array aggregate functions did not handle
NULLs correctly. 64-bit systems were more likely to segfault or give
strange
Errros in this case.

To rule out that we still have some of these issues in the code base, can
you add a

geom IS NOT NULL

Condition to your WHERE filter. If that works, then the NULL issue is
probably still lurking somewhere.

Leo and Regina
http://www.postgis.us


-----Original Message-----
From: postgis-us...@postgis.refractions.net
[mailto:postgis-us...@postgis.refractions.net] On Behalf Of Paul
Ramsey
Sent: Saturday, March 20, 2010 10:56 AM
To: PostGIS Users Discussion; mgl...@alumni.uwaterloo.ca
Subject: Re: [postgis-users] ERROR: array size exceeds the maximum
allowed(134217727)

strk

unread,
Mar 20, 2010, 12:19:52 PM3/20/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
On Sat, Mar 20, 2010 at 07:56:14AM -0700, Paul Ramsey wrote:
> Actually, memunion does the opposite, it passes the resultant and
> preserves mem.

Oops, sorry for the confusion. Not-so-intuitive name :P

Chris Hermansen

unread,
Mar 20, 2010, 1:03:13 PM3/20/10
to PostGIS Users Discussion
Very recently there was a kernel patch in Ubuntu regarding - if memory
serves - memory copying. How up to date is the system in question?


--
Regards,

Chris Hermansen · mailto:chris.h...@timberline.ca
tel+1.604.714.2878 · fax+1.604.733.0631 · mob+1.778.840.4625
Timberline Natural Resource Group · http://www.timberline.ca
401 · 958 West 8th Avenue · Vancouver BC · Canada · V5Z 1E5

Mike Leahy

unread,
Mar 20, 2010, 2:00:59 PM3/20/10
to Paragon Corporation, PostGIS Users Discussion
Hi Leo/Regina,

Thanks for the suggestion. I tried adding the clause 'not geom is null' to
the where statement in each of the two sub-queries that have the
st_union(geom) functions are used, but it still segfaults.

I also tried this on a fresh database with very little data, and it doesn't
seem to cause problems. But I have two databases with live data where I can
cause this. I have been able to pare one of these down to remove personal
information and reduce unnecessary data, while still generating the crash with
that query. Would someone be interested in a dump of this db? Of course,
that someone would ideally be able to test this on a 64-bit (K)ubutnu system,
in the hopes that the problem can be replicated.

Mike

Mike Leahy

unread,
Mar 20, 2010, 3:03:47 PM3/20/10
to strk, PostGIS Users Discussion
Chris,

Sorry for not catching this one sooner (I'm subscribed to the digest)...but
thanks for pointing this out. The system is completely up to date. I could
reboot into an older kernel to see if that solve the problem. I'll report
results a bit later today.

Mike

------------------------------

Message: 7
Date: Sat, 20 Mar 2010 10:03:13 -0700
From: Chris Hermansen <chris.h...@timberline.ca>
Subject: Re: [postgis-users] ERROR: array size exceeds the maximum
allowed (134217727)
To: PostGIS Users Discussion <postgi...@postgis.refractions.net>
Message-ID: <4BA4FFD1...@timberline.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Very recently there was a kernel patch in Ubuntu regarding - if memory
serves - memory copying. How up to date is the system in question?

Mike Leahy

unread,
Mar 20, 2010, 4:34:49 PM3/20/10
to PostGIS Users Discussion
Chris/list:

I have tried booting into the following kernels, all exhibiting the same
issue:

2.6.31-20-generic
2.6.31-19-generic
2.6.31-14-generic

I believe 2.6.31-14-generic is the original kernel installed with this version
of Ubuntu, so unless the change to the kernel you're thinking occurred
previous to the 9.10 Karmic release, it's not the culprit here.

Mike

Chris Hermansen

unread,
Mar 20, 2010, 5:08:14 PM3/20/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
No, the patch was very recent or I would not have remembered it. FWIW
I'm also running 2.6.31-20-generic.

Too bad, it would have been an easy solution!


--
Regards,

Chris Hermansen · mailto:chris.h...@timberline.ca
tel+1.604.714.2878 · fax+1.604.733.0631 · mob+1.778.840.4625
Timberline Natural Resource Group · http://www.timberline.ca
401 · 958 West 8th Avenue · Vancouver BC · Canada · V5Z 1E5

_______________________________________________

Paragon Corporation

unread,
Mar 20, 2010, 5:53:24 PM3/20/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
Mike,

We apologize, didn't notice this was in a subquery and that you have a limit
statement in your query. So we presume regardless of your WHERE only 26
records are being selected.

So Paul could be right that you do have data that is hitting some compiled
or variable limit.

Does running the subquery alone work or you didn't try because it takes a
long time?

Thanks,

Mike Leahy

unread,
Mar 20, 2010, 6:10:38 PM3/20/10
to PostGIS Users Discussion
Hey,

The query (when it works) is virtually instantaneous when I use st_memunion()
(or when I took out part of the where clause - although now I can't even get
that to work, so maybe it was just lucky). Each of the subqueries also work
fine without any obvious problem.

To me, it seems to be a combination of things that somehow add up to this
limitation...I don't see what settings could affect the array size limit (in
postgresql.conf). Can anyone point to a config option that might make a
difference, or maybe point to another place with settings I can tinker with?

Mike

Mike Leahy

unread,
Mar 20, 2010, 6:30:42 PM3/20/10
to PostGIS Users Discussion
FWIW, if I remove the entire where clause at the end of the statement, it
works again (with st_union()) and is virtually instantaneous.

Paul Ramsey

unread,
Mar 20, 2010, 8:23:11 PM3/20/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
Mike

Well, if you can reduce it to a small db dump and query that exercises
it, I can install a 64bit ubuntu VM and see if it happens here too.

P

Paragon Corporation

unread,
Mar 20, 2010, 11:28:41 PM3/20/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
Mike,
Actually scanning thru the archives that 13... is a common number so suspect
that is an OS limit.

This could be very well a bug in PostgreSQL 8.4.

Have you tried this on other PostgreSQL 8.4 installs on other your Fedoras -
you mentioned trying on older installs?

Your problem sounds exactly like this guy's and he wasn't using PostGIS at
all but array_agg (which is logic that PostGIS borrowed from to implement
the faster collecting ST_Union behavior)i
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01171.php

If you haven't already probably good to check the plan difference between
the two queries (one with addtional WHERE and one without). It could be
doing something nuts like running the sub select for each record you have in
the outer only when your extra WHERE condition is added and doing more or
less the sane thing when you take it out.

We've had some odd behavior in 8.4 with the planner doing strange things
that we haven't been able to pin down and had to implement workaround for.
As we described in this article.. Its probably unrelated though but just a
thought to throw out at the wind.

http://www.postgresonline.com/journal/index.php?/archives/149-Forcing-the-pl
anners-hand-with-set-enable_seqscan-off-WTF.html

Leo and Regina,

Paragon Corporation

unread,
Mar 20, 2010, 11:35:32 PM3/20/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
Mike,
Another thought. I suppose it could be the more or less two identical
subselects both with array_aggs
Just to give Paul something hopefully shorter to work with. Does this fail
too?

SELECT csls.chat_id, csls.ugeom
FROM (

SELECT '[stuff]' AS selection, st_union(geom) AS ugeom, csl1.chat_id
FROM testdb.user_selections AS us1
INNER JOIN testdb.chat_selection_links AS csl1 ON us1.id =
csl1.selection_id
WHERE (not us1.user_drawing and not csl1.deleted)
GROUP BY chat_id

) AS csls

LEFT JOIN (

SELECT '[stuff]' AS drawing, st_union(geom) AS ugeom, csl2.chat_id
FROM testdb.user_selections AS us2
INNER JOIN testdb.chat_selection_links AS csl2 ON us2.id =
csl2.selection_id
WHERE (us2.user_drawing and us2.deleted and not csl2.deleted)
GROUP BY chat_id

) AS csld ON csls.chat_id = csld.chat_id

Mike Leahy

unread,
Mar 21, 2010, 12:04:00 AM3/21/10
to Paragon Corporation, PostGIS Users Discussion
Hi,

That query there does not crash. I'll try to follow up on your suggestions in
the prior email.

Mike

> l anners-hand-with-set-enable_seqscan-off-WTF.html

Mike Leahy

unread,
Mar 21, 2010, 12:29:40 AM3/21/10
to Paragon Corporation, PostGIS Users Discussion
Leo/Regina,

In response to your earlier message, I did find that report of the identical
error message. However, as I'm sure you noticed, there is no apparent
investigation that followed after it. That was why I went to the PostgreSQL
IRC channel first...they walked me through generating a core dump, which is
what then pointed to it being a problem related to PostGIS. So while it's
certainly the same error being reported, the source of the problem in this
case appears PostGIS-specific - likely due to the logic that you note was
borrowed from the array_agg that caused the problem in the previous report.

As for looking at the different query plans for working/non-working versions of
the statement with different where conditions, I've attached several results in
text files. Two examples that worked, one that segfaults, and one that
produces the 'array size exceeds...' error message. All of these are the same
query, just with variations in the where condition. The query plan is a but
much for me to interpret though - what does this look like to the experts?

Mike

> l anners-hand-with-set-enable_seqscan-off-WTF.html

works.txt
segfaults.txt
works2.txt
limit_134217727.txt

Paul Ramsey

unread,
Mar 21, 2010, 12:57:03 PM3/21/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
As I was installing KUbuntu, I realized I should try the query on my
own system, and under OS/X 10.6 I see the array error, though it seems
to be cleanly trapped, there is no segfault.

crashdb=# \i ./crashing.sql
psql:./crashing.sql:36: ERROR: array size exceeds the maximum allowed
(134217727)
crashdb=#


P.

Mike Leahy

unread,
Mar 21, 2010, 3:58:03 PM3/21/10
to Paul Ramsey, PostGIS Users Discussion
Paul,

Thanks for looking into it - at least I know it's not just me (for the array
size limit at least).

What does a cleanly trapped error like this suggest to you? If you try
different variations on the where condition at the end of the statement, are
you able to get a segfault?

Mike

> >>e-p l anners-hand-with-set-enable_seqscan-off-WTF.html

Paul Ramsey

unread,
Mar 21, 2010, 6:28:59 PM3/21/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
Suggests nothing in particular, I'm afraid, because this isn't my area
of expertise ;) It's probably an 8-hour bug, so it could be a while
before I can devote an entire day to tracking back into it. It would
be ideal to have a smaller query that does the same thing.

On Sun, Mar 21, 2010 at 12:58 PM, Mike Leahy

Mike Leahy

unread,
Mar 21, 2010, 7:43:02 PM3/21/10
to Paul Ramsey, PostGIS Users Discussion
Ok,

I've attached the smallest version of this query I can make before the error
goes away. It seems to be very particular, but at least I have been able to
take out some of the subqueries. Does this still produce the error in your
environments as well?

Mike

> >> >>-th e-p l anners-hand-with-set-enable_seqscan-off-WTF.html

crashing_smaller.sql

Paragon Corporation

unread,
Mar 21, 2010, 11:01:34 PM3/21/10
to mgl...@alumni.uwaterloo.ca, Paul Ramsey, PostGIS Users Discussion
Mike and Paul,
We took a quick look at the query plans. Admittedly we suck at reading text
plans; always rely on the graphical explain.

Anyrate -- one striking thing it seems between the ones that don't work and
the ones that work are

The ones that work never employ a nested loop scan

Both the segfault one and the exceed max allowed employ nested loops. So
maybe its the interaction between nested loops and hash agg causing the
problem.


Could you two try disabling nested loop scan and run the crashing ones. Also
verify that the plan has indeed changed to not use nested loops.

set enable_nestloop = off;
set enable_hashjoin = on;

run query


--- If you can't disable nested loop, try disabling the hash_join to see if
you can make the working queries crash

set enable_hashjoin = off;
set enable_nestloop = on;
run query

Ok,

Mike

> >> >>cing -th e-p l anners-hand-with-set-enable_seqscan-off-WTF.html

Mike Leahy

unread,
Mar 21, 2010, 11:54:44 PM3/21/10
to Paragon Corporation, PostGIS Users Discussion
Leo/Regina,

Yes, I can confirm that the non-working queries work when enable_nestloop is
off, and enable_hashjoin is on.

The opposite is not the case - setting enable_nestloop on and enable_hashjoin
off does not make the 'working' variations of my query crash.

Nice catch - is there anything further I can do from my end to troubleshoot
this?

Mike

Paul Ramsey

unread,
Mar 22, 2010, 12:45:33 PM3/22/10
to mgl...@alumni.uwaterloo.ca, PostGIS Users Discussion
Did you already try replacing your postgis functions with array_agg
calls to see if we can push the problem back over the fence to pgsql
land?

http://www.postgresonline.com/journal/index.php?/archives/126-PostgreSQL-8.4-Faster-array-building-with-array_agg.html

P.

Greg Stark

unread,
Mar 22, 2010, 4:29:21 PM3/22/10
to PostGIS Users Discussion, <pgsql-hackers@postgresql.org>, mgl...@alumni.uwaterloo.ca
On Mon, Mar 22, 2010 at 4:45 PM, Paul Ramsey <pra...@cleverelephant.ca> wrote:
> Did you already try replacing your postgis functions with array_agg
> calls to see if we can push the problem back over the fence to pgsql
> land?

On Sat, Mar 20, 2010 at 9:17 AM, Mike Leahy <mgl...@alumni.uwaterloo.ca> wrote:
> Running this query on various data will produce one of two results.  One is
> the error mentioned in the subject (ERROR:  array size exceeds the maximum
> allowed (134217727)).  I can find very little information on this error.  The
> other outcome is that it often causes the PostgreSQL backend to segfault (see
> gdboutput.txt).

gdboutput.txt:

Program terminated with signal 11, Segmentation fault.
#0 0x00007fa4be23615b in pfree () from
/usr/lib/postgresql/8.4/bin/postgres
#1 0x00007fa4be18091b in makeMdArrayResult () from
/usr/lib/postgresql/8.4/bin/postgres
#2 0x00007fa4b7f038bc in pgis_accum_finalfn () from
/usr/lib/postgresql/8.4/lib/postgis-1.5.so
#3 0x00007fa4b7f039ee in pgis_geometry_union_finalfn () from
/usr/lib/postgresql/8.4/lib/postgis-1.5.so

Any chance you can generate one of these seg faults from a build with
symbols and with assertions enabled? It might catch the problem
earlier and provide more info.

IIRC there were some memory management changes which required changes
to array_agg() and which had some risk of causing problems for other
sites with similar coding. Is it possible your'e missing some of these
changes? I'm having trouble tracking them all down but at least
there's these:


commit 3d332de2eab8a01c0ef3f58ea69de2010fe8a1e1
Author: Tom Lane <t...@sss.pgh.pa.us>
Date: Thu Jul 23 20:45:27 2009 +0000

In a non-hashed Agg node, reset the "aggcontext" at group
boundaries, instead
of individually pfree'ing pass-by-reference transition values. This should
be at least as fast as the prior coding, and it has the major advantage of
clearing out any working data an aggregate function may have stored in or
underneath the aggcontext. This avoids memory leakage when an aggregate
such as array_agg() is used in GROUP BY mode. Per report from Chris Spotts.

Back-patch to 8.4. In principle the problem could arise in prior versions,
but since they didn't have array_agg the issue seems not critical.

commit 7f83b61cc26eeac0c5a09add49f6cf899f87fc0b
Author: Tom Lane <t...@sss.pgh.pa.us>
Date: Sat Jun 20 18:45:28 2009 +0000

Fix things so that array_agg_finalfn does not modify or free its input
ArrayBuildState, per trouble report from Merlin Moncure. By adopting
this fix, we are essentially deciding that aggregate final-functions
should not modify their inputs ever. Adjust documentation and comments
to match that conclusion.

--
greg

Paul Ramsey

unread,
Mar 22, 2010, 5:15:58 PM3/22/10
to PostGIS Users Discussion, mgl...@alumni.uwaterloo.ca
Mike,
See if this patch makes anything better. It matches tlane's changes to
the 8.4 agg finalfunc.
P.
finalfunc.patch

Paul Ramsey

unread,
Mar 22, 2010, 5:27:18 PM3/22/10
to PostGIS Users Discussion
This looks like a winner, at least insofar as it removes the max array
error condition for me in OS/X. Huge thanks to Greg Stark for digging
up the references to Tom Lane's changes in the array agg stuff...
Reply all
Reply to author
Forward
0 new messages