cfqueryparam faster or slower?

458 views
Skip to first unread message

Mattijs Spierings

unread,
Jan 6, 2013, 9:12:07 AM1/6/13
to ra...@googlegroups.com
Hi,

after reading this article:
http://www.coldfusionmuse.com/index.cfm/2008/11/18/performance.and.cfqueryparam

I was wondering if Railo acts the same as Adobe ColdFusion towards queryparams that it should improve the query execution plan on the DB.
I have been using cfqueryparams for all variables, in the insert statements, in the setter for an update and also in static booleans in the where clause.

Am I overdoing it or is it fine and dandy?

Randy Johnson

unread,
Jan 6, 2013, 9:33:23 AM1/6/13
to ra...@googlegroups.com

My general rule of thumb is to use it for everything because I am worried that another developer will get into the code and change my static variable to dynamic and will not use a cfqueryparam.  If I use cfqueryparam’s everywhere and the developer changes a  static variable to dynamic the cfqueryparam will already be in place.

 

Randy

Matt Quackenbush

unread,
Jan 6, 2013, 9:48:53 AM1/6/13
to ra...@googlegroups.com

To expand slightly on Randy's comment, the rule of thumb is to *always* use cfqueryparam unless and until you've proven that the query in question is causing a performance problem. Trying to optimize before there's a problem is about as efficient as the US Congress.

Adam Cameron

unread,
Jan 6, 2013, 3:45:39 PM1/6/13
to ra...@googlegroups.com
Agreed.

My "rule of thumb" is that there are two components to an SQL operation... the SQL, and the values the SQL "uses". The values don't belong in the SQL string, they belong as parameters.

In general parameterised queries will run faster, or at least be no different than hard-coding values in the SQL string. Very occasionally they might perform slower, and to a degree that is a problem: then one might need to look at how the query is constructed.

--
Adam




On Sunday, 6 January 2013 14:48:53 UTC, QuackFuzed wrote:

To expand slightly on Randy's comment, the rule of thumb is to *always* use cfqueryparam unless and until you've proven that the query in question is causing a performance problem. Trying to optimize before there's a problem is about as efficient as the US Congress.

GB

unread,
Jan 6, 2013, 7:07:39 PM1/6/13
to ra...@googlegroups.com
It's a good rule of thumb, but I have found that a lot of cfqueryparam can really slow down the cfml processing. I don't like that.

I never use cfqueryparam on numeric values, dates, or other pre-validated values. But I always, always use cfqueryparam on strings, or anything not pre-validated as absolutely safe.

There are must, much faster ways to accomplish what cqueryparam does without using that tag. For instance, #VAL(variables.incoming_int)# is equally as safe as <cfqueryparam cfsqltype="cf_sql_integer" value="#variables.incoming_int#" />, and VAL() is much, much faster. If you need to make sure there's no decimal then do INT().

I can pre-validate dates or values that qualify a predetermined list, say, 'A,B, or C', or 'Y, N' - that's a string value that I i've pre-checked in the code so I know I can skip the cfqueryparam.

That's my 2 cents on the matter.

Bruce Kirkpatrick

unread,
Jan 7, 2013, 1:48:14 AM1/7/13
to ra...@googlegroups.com
I've always handled escaping SQL variables like this:
<cfscript>
function zescape(v){
return replace(replace(arguments.v,"\","\\","all"),"'","\'","all"));
}
v="anything unsafe";
</cfscript>
<cfquery name="q" datasource="database">
select * from table where column ='#zescape(v)#'
</cfquery>
I just treat all variables as strings because any invalid queries will just come back as recordcount zero.  No need to write validation or use strict data types unless there is a real security concern.  It has always been confusing how coldfusion something automatically escapes when there is no function call, but then with a function result, it doesn't escape.  By writing all my queries with cfscript, I get more consistency so I know I always have to escape values myself.

I didn't fully understand how cfqueryparam creates a preparedstatement in java in the past and I like to write most of my queries as cfscript strings, which wasn't possible with cfqueryparam before.

However, I'm converting to cfqueryparam in the future because I think the code will be cleaner - Adam said he thinks of queries as a statement and data being separate and I haven't really built my app like that so far, but I can see how that makes sense.   I'm not sure why you are seeing less performance with cfqueryparam.  I did tests earlier today and after running 1000 queries with 8 parameters, the cfqueryparam version was only a few milliseconds slower then the plain query.   653ms vs 638ms.  For just 10 queries, the difference is under 1 millisecond.  

Today, I wrote code to make cfqueryparam work with simple java-like question mark syntax in cfscript.  Just 10% slower then a plain query.

The regular prepared statement syntax just makes way more sense to me:
select * from table where id= ?

I really don't want to have complex code in the query statement anymore.

And Igal said Railo will have a executeQuery function later that will be similar to what I just did on my UDF.

The zescape function was tested to be the same performance as using cfqueryparam today as well.

The problem with CFML is that it doesn't expose the preparedstatement java object for reuse.  It keeps building a new one.  If we could execute it again with different arguments, it should be faster.   But that is only useful for a batch script or persistent connection.   Even when you don't use <cfqueryparam> Railo is still making a preparedstatement for that query.   It will just result in the database needing to keep track of more execution plans.   If your whole app is only use a few hundred prepared statements, then potentially the database will be able to keep them all in memory a little easier.  It would take a tremendous amount of random traffic to really see the difference probably.

It seems that you can't gain any performance by commiting yourself to cfqueryparam, but if you use the new lazy="true", it will be much faster returning the data to your code, but I'm finding that you have to handle null values everywhere if you use it.

Adam Cameron

unread,
Jan 7, 2013, 3:51:39 AM1/7/13
to ra...@googlegroups.com
Basing your <cfqueryparam> usage on the basis that it's intended purpose is for data validation is missing the point somewhat.

That parameterising one's SQL provides SQL-injection protection is a side-effect of using params, not the chief reason for. The chief reason for it is to allow the query's execution plan to be compiled and kept, which is difficult if the values the query is using are hard-coded into the SQL statement.

I've got a blog article that discusses this (albeit tangentially, I admit):
http://adamcameroncoldfusion.blogspot.co.uk/2012/07/what-one-can-and-cannot-do-with.html

Reading that might help clarify things.

--
Adam

Michael Offner

unread,
Jan 7, 2013, 3:55:39 AM1/7/13
to ra...@googlegroups.com
let me explain what happens with queryparam behind the curtain.
if you use queryparam in your sql statement, railo creates a PreparedSatement (or reuse a existing -> cache).

there are 2 benefits with preparedstatement:
- the value is not passed as part of the SQL (see example code in the javadoc of PreparedSatement), so SQL injection is not possible.
- like you write, if you have the same SQL again, the db has no to parse again the SQL String.

What is faster?
you cannot answer this question in general, because this depends on the jdbc driver (JDTS-MSQL,MS-MSSQL,MySQL,Postgre ...) in use. the PreparedStatement or regular Statement (for none cfqueryparam queries) coming from the jdbc driver and how they work internally is up to the driver.
I think the overhead (caching on driver and Railo/ACF level) you have with PreparedStatement are minimal (in Railo), so i would say the chance are good that PreparedStatement are faster than an regular Statement. i'm not sure about the overhead in ACF to cache PreparedStatements

Micha






2013/1/6 Mattijs Spierings <mattijss...@gmail.com>



--
/micha

Michael Offner CTO Railo Technologies GmbH

Bruce Kirkpatrick

unread,
Jan 7, 2013, 1:06:59 PM1/7/13
to ra...@googlegroups.com
I'm sorry, I didn't mention I'm using mysql 5.5 here.  I'm seeing no performance gain with preparedstatement vs no preparedstatement under heavy load.   It is possible another database or driver would act differently.   And maybe it's possible to perform different when the data access is more random / not cached.  Perhaps a query that has 5 joins, subqueries, sorting, having, limit, group by and 10 different indexes to choose from will have more of a performance hit on deciding the query plan.   But if you need that query to be faster, you'd probably optimize it so that query doesn't need joins, subqueries, sorting, etc.   It seems like preparedstatement is more likely to help queries that are too complex.  I had tried to have a simple query with 8 parameters to determine if more parameters effected performance, but it only slows down "new query()" CFC approach - The java versions are all fast. If I did another test with more elaborate SQL, I may find different results.   I do want to use cfqueryparam exclusively in the future regardless.  Adam, I didn't understand what preparedstatement was when I built most of my queries year ago, and I've yet to convert them to cfqueryparam, but I'm looking at doing that now.   I'm sure I'll squeeze a little performance out of this, but I thought others might be interested in knowing that a larger application that doesn't use it at all can still be incredibly fast.

Mattijs Spierings

unread,
Jan 7, 2013, 3:11:38 PM1/7/13
to ra...@googlegroups.com
Micha,

The article that I posted mentioned a danger that the first time the SQL is parsed and cached might set the standard for any future execution. So if the first time would have values which demands certain complex things to happen (not sure if this is possible if you design your database well), then every statement after that will be treated the same way.

Is there any thruth in this?

Op maandag 7 januari 2013 09:55:39 UTC+1 schreef Michael Offner het volgende:

Bruce Kirkpatrick

unread,
Jan 7, 2013, 3:23:41 PM1/7/13
to ra...@googlegroups.com
Don't forget that you CAN restart your database server.  After doing so, any "bad" query plan, would be re-created using the latest statistics.  If you have a massive amount of changes in the size of your tables in that short of a time-frame, then it may be an issue.  If your application is mostly read, low write, then it's unlikely you'd ever get a bad query plan from this.   Maybe some databases even let you flush this data without restarting if you think it is an issue.  Most likely you are restarting your server when updating the kernel every few weeks, so it would only be lacking optimization for that period of time at the worst case.

You can get around this by forcing an index for specific queries in MySQL.   
SELECT * FROM user FORCE INDEX (myindex) WHERE user_id=<cfqueryparam value="#userid#">

If you have an area you are concerned with, you'd want to use any trick in the book.

I research a little and it was known that at least Mysql <= 5.1 didn't support caching of query execution plans.  This might be true for 5.5 - I can't find anything saying that IT DOES cache this.

Thus, you may find it important to research what your database supports and use that specific optimization instead.

For example, mysql relies mostly on query cache and forcing indexes.  You can run a EXPLAIN version of your statement to determine how the query optimizer is choosing things for different queries.

I've only needed to use force index a handful of times out of the 1500 queries I have.

Bruce Kirkpatrick

unread,
Jan 7, 2013, 3:31:54 PM1/7/13
to ra...@googlegroups.com
Mysql does seem to cache prepared statements according to this documentation:

There is a variable that lets you adjust how many prepared statements.
 max_prepared_stmt_count which has a default of 16382 in mysql 5.5

Perhaps the caching of the sql parsing occurs, but not the optimizer plan.

Further reducing the benefit of this cache, the docs state that mysql caches this information per session and then drops the prepared statement cache.  Thus the cache is only useful for the duration of a session.

This makes prepared statement really only useful for batch queries.   And for that, you could probably just use INSERT with multiple VALUES instead.   I'm not seeing that this does anything that useful in mysql.  You'd have to be running more scientific types of applications.  A basic CMS type of web site, is going to barely show any improvement.

Bruce Kirkpatrick

unread,
Jan 7, 2013, 3:49:06 PM1/7/13
to ra...@googlegroups.com
If you had a persistent connection per user that survived for the duration of their use of your application, then prepared statement caching would boost the performance as described.

I looked at sql server docs and it seems they have a global in memory cache for execution plans.  Because Microsoft doesn't allow your to publish performance benchmarks, you don't really know if this is faster without testing it yourself.  But in my case, my most complex demanding queries performed better on mysql when compared to sql server or postgresql.   Mainly because of the unique memory engine format in mysql.  Disk based tables are up to 10 times slower even when cached.  My main app is a complex real estate search with a few hundred thousand records in several tables.

Adam Cameron

unread,
Jan 7, 2013, 5:04:29 PM1/7/13
to ra...@googlegroups.com
Cheers Bruce.
Yeah, I don't doubt that queries can still run fine if they're not parameterised. It'd perhaps not even be the first place I'd look if I had to shave time from a situation in which I needed to shave time. Well, OK, it would be (just as a matter of course), but I would not expect it to be where the biggest wins were found.

My experience was a few years ago on Oracle, and there was was a (vaguely, but real) measurable difference between running an already-compiled vs needs-to-be-compiled query. Also the DB has a finite amount of resource available for compiled queries to be cached before they need to be cycled out of memory, so cluttering up that memory with every single different possible value that could have been parameterised was just daft.

I don't do DB stuff any more (we have dedicated DB people at my current gig), so my experiences are getting old and tired (and outside where my focus lies), so new investigations are going to be more relevant than my old findings / experiences.

My comment before was mostly that people basing their code on the notion that parameterising queries is mostly / chiefly / significantly about preventing SQL injection are only seeing a side-effect of the reason for doing it, not the main reason.

--
Adam

Adam Cameron

unread,
Jan 7, 2013, 5:13:42 PM1/7/13
to ra...@googlegroups.com
I'd read the same thing. I'm sure it's a real possibility, but I think it matters how much emphasis one puts on the word "might" (might set the standard), and it also relies on on a fairly idiosyncratic situation to be the case, I'd think? Where the specific parameter being used when the plan is first compiled coincidentally causes the compiler to compile the execution plan in such a way that works against the way the data comes together when being queried using other parameter values?

That'd definitely be something I'd tuck in the back of my mind, but it would not be something that I would let influence whether I'd use parameters or not.

THAT SAID, everything I write above is based on intuition and my take on "common sense", and not backed up by any hard investigation.

--
Adam

Bruce Kirkpatrick

unread,
Jan 7, 2013, 8:21:15 PM1/7/13
to ra...@googlegroups.com
Avoiding waste of system memory like you describe also means not caching data for queries that are known to have far too many parameter permutations.  A lot more memory is spent on storing query results then SQL and execution plans.  An execution plan might only be a few bytes per query internally.   For example, I was going to mention in mysql, you can selectively enable or disable query caching per query with sql_cache or sql_no_cache like this:

select sql_no_cache * from user where user_id='1'

When the table is updated frequently, or when the database connection is closed, most or all of the cache is dropped, so you really shouldn't be caching queries like this.

The real expense of a query is the inter-process communication.  Your code is faster when it does everything in 1 process.  A lot of the hibernate ORM benchmarks are showing embedded solutions (hsqldb and objectdb) to be up to 20 times faster then commonly used database servers (mysql, postgresql, etc) and that is because embedded database eliminates the inter-process communication.  If you have to use JDBC to connect to another process, that is where 99% of the performance is lost.  The CPU related code on both sides finishes almost instantly.  So the cost of the query execution plan and parsing is unnoticeable on a fast CPU (far below 1 millisecond).  Even memcache is ridiculously slow compared to using CFML application struct keys for the same reason.   The only reason memcache has any popularity is facebook.  A bunch of people who use PHP can't handle caching very well because the language itself has no centralized server process.   I theorize that if PHP had a server which could handle shared memory in-process using pointers (references), it would be way faster and it would also probably seem more unstable like CFML too since PHP processes die without affecting the others and CFML isn't able to do that in all cases.  CFML apps have to be tightly controlled in comparison, so we probably write better quality apps because of that.  I say this having tested all PHP's shared memory features thoroughly against Railo.   A complex Railo app running from memory can be way faster then the most optimized PHP app because shared memory in php doesn't function like pointers, it uses very slow streams, serialization and inter-process communication.

That brings me to my second point.   You should deploy your application to the fastest available CPU instead of a system with many slower cores.  There are xeon servers with 16 cores, but they are only 2ghz.  It is much better to get the box with the newest 3+ghz cpu.  The faster the cpu, the less time it takes for everything.   You also want the database to be as close to the processor as possible (local disk - preferably with the data stored on a SSD).

With a fast CPU, requests finish faster.  The differences between 2 operations becomes less noticeable.   I think you'll find that most performance / stability problems with a web app come from having too much concurrency and too much disk I/O.   The fastest query is the one you didn't run.  You don't want anything to leave the Railo Java process if you need it to be fast.   You'll eventually need to reduce hits to Railo with a static proxy cache in the web server layer to avoid getting more expensive hardware.  Allowing the masses to hit Railo directly will surely seem like a constant DDoS attack once you reach mass popularity with your app.  I assume most of us here aren't getting millions of hits per day yet.   No matter how good your app is, once you add enough concurrency, it will slow to a crawl in garbage collection constantly and probably crash.

It's possible that modern hardware has reduced the CPU time for these database operations to be nearly invisible compared to the time spent doing inter-process communications.

My server is Intel e3-1230 3.2ghz sandy bridge Xeon and a 2.8ghz core i7 for development.  This was one of the fastest server chips available at hosting companies.  Ivy bridge is around 10% faster.  I use an Intel 520 series SSD for the database.  It is a dedicated server.   If you are running slower hardware, then maybe cfqueryparam shows itself more.



Reply all
Reply to author
Forward
0 new messages