escaping identifiers in sql generation functions

2 views
Skip to first unread message

Rick Gigger

unread,
Apr 19, 2008, 10:16:37 PM4/19/08
to zoop-...@googlegroups.com
Richard,

I decided to post this to the list.

> ooh... minor problem. there isnt' any way to detect what type of
> class I'm in when I'm in a static function, is there?
> that's the late static binding stuff, right?

Yes. You need lsb to do that.

> hmm. well, for when you get back, I'll explain my problem and my
> solution, but it's kinda a hack, so maybe you'll have a better idea
> since it's a static method (generateInsertInfo) I couldn't override
> anything that it calls with any success, so I made it an optionally
> static method and if it's called statically, it works the same as
> normal, but if it's called from inside a class it escapes the field
> names as I planned
> then I changed the insertArray method to call it inside the class
> and that solved my problem
> but it's kinda ugly
> I won't commit anything for now... let me know what you think

As a policy we don't have optionally static methods anymore. There
might still be a few places where we still do it but intend to remove
them all. Luckily there is a better way to do it in this case. Do a
svn diff to see the changes made in revision 228 (or just go here http://code.google.com/p/zoop/source/detail?r=228
and click on the little plus). I added support for quoting
identifiers and changed one place in the sql generation functions
where it needs to be done. Go ahead and change any others that need it.

Does that do what you need?

Rick

Richard Bateman

unread,
Apr 20, 2008, 2:07:18 AM4/20/08
to zoop-...@googlegroups.com

> As a policy we don't have optionally static methods anymore. There
> might still be a few places where we still do it but intend to remove
> them all. Luckily there is a better way to do it in this case. Do a
> svn diff to see the changes made in revision 228 (or just go here http://code.google.com/p/zoop/source/detail?r=228
> and click on the little plus). I added support for quoting
> identifiers and changed one place in the sql generation functions
> where it needs to be done. Go ahead and change any others that need it.
>
> Does that do what you need?
>
Unfortunately, no. it does solve the problem if it's a table name,
which is great, but it doesn't help at all in my case where I need to
escape one of the fields. I could, technically, rename the field... it
wouldn't be a big deal at all. I'd much rather find a way to fix it,
though, for obvious reasons.

my insert statement gets built something like this:

insert into people (first, middle, last, call, addr, phone) values (....);

nothing particularly frightening there. The problem is, for some reason
"call" is a reserved word in mysql. So, in order for that query to
succeed, it's got to look like this:

insert into people (first, middle, last, `call`, addr, phone) values (....);

ideally, the ` should be always around all fields and table names (and a
" in most other databases)

So, brainstorming, I see a few possible ways of fixing this:

we could add to our syntax some way of escaping field names
"FIELD::call, FIELD::id, " or some such and then later escape the names
properly. This would add overhead to the queries and a bunch of code,
but might not be too bad.

we could always escape with a " and force all databases to support the
ansi standard (which can be turned on in mysql with the right config
switch, but this could potentially cause problems if someone is using it
on a database server that they don't want to or can't change the
configuration of to support that behavior

we could change those methods to no longer be static. I'm not sure
exactly why they are, except that they don't directly involve a database
connection. However, since they are generating sql statements, it's
conceivable that there may be other optimizations that could be made on
a per-database basis, and it would be handy to be able to make them
database specific if need be. Also, there probably isn't any time that
we'll need to do that when we don't have an instance of the object. The
disadvantage to this idea, of course, is that we would then break any
code that still accesses it as a static method.

My personal preference leans towards the last option (as is probably
noticable) because it seems like the cleanest, "best" way of doing
things. I wasn't around for the creation of those methods, though, so
there may be other ways of doing things that I don't understand.

Richard

Rick Gigger

unread,
Apr 21, 2008, 3:15:08 AM4/21/08
to zoop-...@googlegroups.com

On Apr 20, 2008, at 12:07 AM, Richard Bateman wrote:

>> As a policy we don't have optionally static methods anymore. There
>> might still be a few places where we still do it but intend to remove
>> them all. Luckily there is a better way to do it in this case. Do a
>> svn diff to see the changes made in revision 228 (or just go here http://code.google.com/p/zoop/source/detail?r=228
>> and click on the little plus). I added support for quoting
>> identifiers and changed one place in the sql generation functions
>> where it needs to be done. Go ahead and change any others that
>> need it.
>>
>> Does that do what you need?
>>
> Unfortunately, no. it does solve the problem if it's a table name,
> which is great, but it doesn't help at all in my case where I need to
> escape one of the fields. I could, technically, rename the field...
> it
> wouldn't be a big deal at all. I'd much rather find a way to fix it,
> though, for obvious reasons.

No, you shouldn't need to change the name of the field. There should
be no need to do that.

> we could add to our syntax some way of escaping field names
> "FIELD::call, FIELD::id, " or some such and then later escape the
> names
> properly. This would add overhead to the queries and a bunch of code,
> but might not be too bad.

Why would you do that? I created the identifier escape type because
all identifiers are escaped in the same way, fields as well as table
names. Why not adapt the sql create functions to just escape all
fields names along with the table name? As far as adding overhead to
the queries it can't be adding much because it's already escaping the
the same number of data items as their would be field names and I
don't think that is adding much overhead. I could do some tests on
this if you are concerned.

> we could always escape with a " and force all databases to support the
> ansi standard (which can be turned on in mysql with the right config
> switch, but this could potentially cause problems if someone is
> using it
> on a database server that they don't want to or can't change the
> configuration of to support that behavior

No, I don't want to do that, we should try to support the database
with it's default settings.

> we could change those methods to no longer be static. I'm not sure
> exactly why they are, except that they don't directly involve a
> database
> connection.

Well, yes that would be the reason. The idea was to make a set of
generalized sql generation functions.

> However, since they are generating sql statements, it's
> conceivable that there may be other optimizations that could be made
> on
> a per-database basis, and it would be handy to be able to make them
> database specific if need be.

That is a better argument for it. As they are now they are so basic
that I doubt we would do much of that if any though. Also they are
used in so many places now that there is a least a slight backwards
compatibility concern.

> Also, there probably isn't any time that
> we'll need to do that when we don't have an instance of the object.
> The
> disadvantage to this idea, of course, is that we would then break any
> code that still accesses it as a static method.

That is true. The idea was to handle sql generation that has nothing
to do with the database connection. I'm not yet convinced that I need
to go away from that, especially given that the case you are proposing
can actually be handled quite easily with the current methods. (see
above)

> My personal preference leans towards the last option (as is probably
> noticable) because it seems like the cleanest, "best" way of doing
> things. I wasn't around for the creation of those methods, though, so
> there may be other ways of doing things that I don't understand.

It seems you missed the point of the code I committed above. It
wasn't meant to solve it completely for you, just to show you how it
could be solved. The method I was demonstrating was a method that was
developed specifically to handle the problem you are proposing:
escaping things to be inserted into sql queries. So even if I made
the generation functions non-static I would probably still escape them
in the same way as everything else. It doesn't make any sense to
escape some things through the designed escape mechanisms and put
other ones right into the sql. I suppose you could argue that
escaping schema identifiers is fundamentally different from escaping
data but I'm not completely convinced that it should be handled in a
completely different way.

The functions in question were never intended to be the end all be all
of sql generation. They are just some little utilities that I whipped
up to perform some basic functions. I envision a much more robust API
with two components to it. 1. A DbQuery class and subclasses with an
object oriented API for generation all sorts of queries. 2. A bunch
of SQL generation functions similar to what we have now for single
table queries but with significantly more options.

What you wan to do though doesn't go outside the scope of what's
already there. If you need me too I can make the changes to the
insert function to work as you want and always escape field names.

Rick

Richard Bateman

unread,
Apr 21, 2008, 3:25:53 AM4/21/08
to zoop-...@googlegroups.com
>
> Why would you do that? I created the identifier escape type because
> all identifiers are escaped in the same way, fields as well as table
> names. Why not adapt the sql create functions to just escape all
> fields names along with the table name? As far as adding overhead to
> the queries it can't be adding much because it's already escaping the
> the same number of data items as their would be field names and I
> don't think that is adding much overhead. I could do some tests on
> this if you are concerned.
>
ok, I think I didn't totally understand the code I was looking at. I
think I see what you're getting at. Let me play with it a bit; now that
I understand I think I can get it working fairly easy.

> It seems you missed the point of the code I committed above. It
> wasn't meant to solve it completely for you, just to show you how it
> could be solved. The method I was demonstrating was a method that was
> developed specifically to handle the problem you are proposing:
> escaping things to be inserted into sql queries. So even if I made
> the generation functions non-static I would probably still escape them
> in the same way as everything else. It doesn't make any sense to
> escape some things through the designed escape mechanisms and put
> other ones right into the sql. I suppose you could argue that
> escaping schema identifiers is fundamentally different from escaping
> data but I'm not completely convinced that it should be handled in a
> completely different way.
>
> The functions in question were never intended to be the end all be all
> of sql generation. They are just some little utilities that I whipped
> up to perform some basic functions. I envision a much more robust API
> with two components to it. 1. A DbQuery class and subclasses with an
> object oriented API for generation all sorts of queries. 2. A bunch
> of SQL generation functions similar to what we have now for single
> table queries but with significantly more options.
>
> What you wan to do though doesn't go outside the scope of what's
> already there. If you need me too I can make the changes to the
> insert function to work as you want and always escape field names.
Yeah, I missed the point. =] I'm slowly getting to where I understand
what's going on again. those classes have some complex weird-ish stuff
in them at first glance.

thanks for the clarification. Hopefully I can find time to try it out soon.

Richard

Rick Gigger

unread,
Apr 21, 2008, 3:31:46 AM4/21/08
to zoop-...@googlegroups.com

On Apr 21, 2008, at 1:25 AM, Richard Bateman wrote:
> Yeah, I missed the point. =] I'm slowly getting to where I understand
> what's going on again. those classes have some complex weird-ish
> stuff
> in them at first glance.
>
> thanks for the clarification. Hopefully I can find time to try it
> out soon.

Yeah, they are a bit complicated. I really appreciate having someone
combing through the code though. It is good to know that I'm not the
only one who has ever looked at or understands it. It will be a huge
asset having more than one person who knows how everything works.
Also I'm sure you will catch a ton of things that I never thought of.

Thanks!

Rick

Richard Bateman

unread,
Apr 21, 2008, 8:30:38 PM4/21/08
to zoop-...@googlegroups.com

> What you wan to do though doesn't go outside the scope of what's
> already there. If you need me too I can make the changes to the
> insert function to work as you want and always escape field names.
>

Okay, here's what I've done:

:<something>:keyword will escape like an identifier but still use the
passed in variable of name "something", and :<something>:identifier will
just escape <something> with the database's defaults

I'm pretty sure this is what you intended. I'll wait for your "ok"
before I commit it and do some testing, but I think it should work
correctly this way. We may want the default DbConnection object to not
escape it with anything by default (just return the string passed in
w/out changes) and then in the individual classes override it. That way
when we add a new database it won't make assumptions.

On the other hand, it'll be really obvious really fast if we forget to
override it on a database that doesn't escape using the standard, and
we'll have to fix it, thus preventing random surprises later on =]

I'll let you worry about that. I'm just the mysql maintainer, right?
not my problem. :-P

These new data access tools are growing on me, btw. I think they'll
work out nicely.

Richard

Reply all
Reply to author
Forward
0 new messages