If it did that, you (or someone) would complain about the enormous
overhead imposed on trivial updates of the function. Since determining
whether the function actually did change behavior is Turing-complete,
we can't realistically try to determine that in software. So we leave
it up to the user to reindex if he makes a behavioral change in an
indexed function.
(Changing the behavior of an allegedly IMMUTABLE function has a number
of other pitfalls besides that one, btw.)
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
overhead imposed on trivial updates of the function. Since determiningIf it did that, you (or someone) would complain about the enormous
whether the function actually did change behavior is Turing-complete,
we can't realistically try to determine that in software. So we leave
it up to the user to reindex if he makes a behavioral change in an
indexed function.
(Changing the behavior of an allegedly IMMUTABLE function has a number
of other pitfalls besides that one, btw.)
Determining if changing any function will cause an index to break is
not a straight-forward problem. I don't believe that PG right now
keeps a graph of which functions call which, so replacing a function
deep in the logical calling hierarchy could break the index as easily
as one mentioned at the top.
The main one I can think of offhand is that a call of the function might
have been folded to a constant in some cached plan somewhere, and
there's no mechanism to cause that plan to get redone. (This might or
might not get fixed in 8.4 --- since the plan no longer contains any
reference at all to the function, it's not exactly trivial to fix.)
Another thing that's sort of related to the OP's complaint is something
like a table CHECK constraint that calls a user-defined function.
If you alter the function, is the system supposed to run around and
re-verify that constraint on every row? (And if so, what's supposed to
happen on a failure?) We don't enforce any such thing at the moment.
(In fact, putting the two concepts together, it's possible that
redefining a user function that's used in a UNIQUE index might mean that
the UNIQUE condition now fails ... what should happen then?)
On Aug 28, 2008, at 3:21 PM, Matthew Dennis wrote:Determining if changing any function will cause an index to break is not a straight-forward problem. I don't believe that PG right now keeps a graph of which functions call which, so replacing a function deep in the logical calling hierarchy could break the index as easily as one mentioned at the top.
I have no doubt that someone would complain about it, but I think it's better than the alternative.
I believe the EXECUTE statement would thwart such plans.
There's no way at all in the general case (a function name could be
passed as a parameter, for example). I think Matthew is suggesting to
track dependencies at run time, but that seems a recipe for burnt
fingers and an overall performance loss, for what seems a dubious gain.
Also, you have to keep in mind that we support pluggable languages. The
function's source code is just an opaque string. It could be anything.
I doubt anyone uses PL/LOLCODE functions in production, but PL/Java
functions are more likely, and for these we only have a class and method
name, and little else.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Oh, ouch, right.
I think that this is one of those cases where it's better that we
simply advertise: BE AWARE OF THIS, rather than try to provide a half-
baked solution that gives the illusion of safety without the reality.
EXECUTE in PL/pgSQL accepts a string and executes it as a statement. (That's different from the SQL-level EXECUTE.) I'm not sure how one would track dependencies there.
On Aug 28, 2008, at 6:10 PM, Matthew Dennis wrote:
I'm not sure I follow. Couldn't you track which statements were prepared that called a function and either reprepare (just like reindex, recheck, etc) or in the case of dropping a function, refuse to drop it because something depends on it?
Of course, the string could come from anywhere. There's no inherent
reason that I can think of (except good taste) that you could not
write a function that retrieved a string out of a field in a table,
executed it, and returned that as a value for use in an index. The
client didn't send the string along, but it still creates dependencies.
Anyway, as was correctly pointed out, tracking dependencies in PL/
pgSQL is bad enough, but PL/Java, PL/Perl...
Oh, ouch, right.
On Aug 28, 2008, at 6:51 PM, Alvaro Herrera wrote:
Also, you have to keep in mind that we support pluggable languages. The
function's source code is just an opaque string.
I think that this is one of those cases where it's better that we simply advertise: BE AWARE OF THIS, rather than try to provide a half-baked solution that gives the illusion of safety without the reality.
Of course, the string could come from anywhere. There's no inherent reason that I can think of (except good taste) that you could not write a function that retrieved a string out of a field in a table, executed it, and returned that as a value for use in an index. The client didn't send the string along, but it still creates dependencies.
On Aug 28, 2008, at 7:04 PM, Matthew Dennis wrote:
The plpgsql execute statement, as I understand it, means "take this string and execute like a client sent it to you".
Anyway, as was correctly pointed out, tracking dependencies in PL/pgSQL is bad enough, but PL/Java, PL/Perl...
This would mean that, for example, if you changed any single function
(no matter how distant and irrelevant to the one used to create a
functional index), the jar would change, so we would have to
invalidate all functional indexes written using functions contained
in that jar. I'd certainly rebel at that: It would make deploying a
new version of the jar very expensive, and unavoidably so.
I'd have to say that the current situation falls into the category of
"slightly annoying," but it has the benefit that whether or not to
rebuild the index is left up to me.
There's no way at all in the general case (a function name could be
passed as a parameter, for example). I think Matthew is suggesting to
track dependencies at run time, but that seems a recipe for burnt
fingers and an overall performance loss, for what seems a dubious gain.
> The cases about taking a string and sending it via execute don't seem to fit
> here for 1) cases where it is impossible to track the dependencies can be
> trivially constructed and 2) the very nature of the execute statement makes
> it obvious that it I shouldn't expect it to be tracked. Poor Analogy: In C,
> if foo calls bar and I remove bar I expect the compiler to tell me. If
> elsewhere in my code, I construct a memory address of where I believe bar
> should be and execute it I have no expectations for the compiler to tell me
> bar was removed.
The analogy is poor, yes. A better analogy is the use of dlopen() on a
library. If the library is not present at run time, the compiler will
not tell you anything.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
--
Another question though. Since I could potentially start transaction, drop indexes/checks, replace function, create indexes/checks, commit tranasaction could I deal with the case of the constant folding into the cached plan by flushing the entire cache in the same transaction? Is cache flushing transactional? The cases I have for this are infrequent in time and the overhead of reindexing things, rechecking checks/unique indexes already dwarf the performance lost to flushing the cache.
On a related note, if I had a maintenence window where I can shutdown all DB access, make the referenced changes to the functions/indexes/caches/checks and restart PG - in your opinion, are there other likely problems to changing an immutable function under those circumstances, or should that be pretty safe? In other words, I have a function that has indexes on it that does the wrong thing - what do I do to replace it?
I have added a Todo item about this. But as you see, the problem is
quite large and involved.