Reusing PreparedStatements

591 views
Skip to first unread message

timowest

unread,
Sep 26, 2011, 12:45:20 PM9/26/11
to quer...@googlegroups.com
Hi,

QueryDSL-SQL looks nice but it has a pretty glaring flaw: it constructs a new PreparedStatement every time the user retrieves results.

I'd like to use QueryDSL-SQL to construct a PreparedStatement once, and issue the query multiple times without having to construct a new PreparedStatement every time.

See http://www.theserverside.com/news/1365244/Why...-how-to-use-them-properly for a rough idea of why this is important.

Thanks,
Gili

timowest

In my understanding constructing PreparedStatements is cheap. The point is to use a query template like "select * from table where column = ?".

Also the query state can change between invocations, so caching PreparedStatements on the Querydsl level is difficult and doesn't make much sense.

Could you post a quote from the article you mentioned where it explicitly suggests to reuse PreparedStatements on the client side for performance reasons?

The performance benefits for this


PreparedStatement ps = conn.prepareStatement("select a,b from t where c = ?");
For(int I = 0; I < 1000; ++I)
{
        ps
.setInt(1, I);
       
ResultSet rs = ps.executeQuery();
       
Rs.close();
}
ps
.close();


compared to this one


For(int I = 0; I < 1000; ++I)
{
       
PreparedStatement ps = conn.prepareStatement("select a,b from t where c = ?");
        ps
.setInt(1, I);
       
ResultSet rs = ps.executeQuery();
       
Rs.close();
        ps
.close();
}


are probably not significant. The importance is in using a prepared template, which is what Querydsl does already.

PreparedStatement caches can also be set via Connection pools.

cowwoc

It takes CPU power to do the access plan generation. Ideally, if we send the same statement to the database twice, then we'd like the database to reuse the access plan for the first statement. This uses less CPU than if it regenerated the plan a second time. [...] This allows the database to reuse the access plans for the statement and makes the program execute more efficiently inside the database


The database execution plan remains the same, regardless of what values are plugged in at runtime. That being said, I've posted this question on Stackoverflow: http://stackoverflow.com/questions/4737717/ca...connection-pool-handle-it

We'll see what they come up with :)

Gili

timowest

There is another problem with reusing the PreparedStatement. When do you close it? Somehow explicitly via the Query instance (query.close()) or do you trust the Connection to close it?

cowwoc

I put together the following class to improve usability in my own code:

/**
* Binds a database connection and dialect.
*
* @author Gili Tzabari
*/
@RequestScoped
public class Session
{
private final Connection connection;
private final SQLTemplates dialect;

/**
* Creates a new Session.
*
* @param connection the database connection
* @param dialect the database dialect
*/
@Inject
public Session(Connection connection, SQLTemplates dialect)
{
this.connection = connection;
this.dialect = dialect;
}

/**
* Creates a new query.
*
* @return a new query
*/
public SQLQuery query()
{
return new SQLQueryImpl(connection, dialect);
}

/**
* Creates a new insert command.
*
* @param table the table to insert into
* @return the insert command
*/
public SQLInsertClause insert(RelationalPath<?> table)
{
return new SQLInsertClause(connection, dialect, table);
}

/**
* Creates a new update command.
*
* @param table the table to update
* @return the update command
*/
public SQLUpdateClause update(RelationalPath<?> table)
{
return new SQLUpdateClause(connection, dialect, table);
}
}

I construct one of these per request (notice @RequestScoped on top) and close them at the end of the request. We could register PreparedStatements with a Session and close them when it is closed.

To be honest, I am also hoping that connection pools will make the performance difference negligible (it will result in cleaner code for everyone). Let's wait and see what the benchmarks say.

timowest

Please wrap the example code into code blocks. That makes it more readable.


We could register PreparedStatements with a Session and close them when it is closed.


That would need Querydsl to provide listeners for created PreparedStatements. Additional complexity with little benefits.

Also do you really have use cases for it? Executing the same query in the same session with the exact same query state multiple times. Take transactional isolation into the game, and you will mostly see changes in results you have caused inside the session scope.

Try a benchmark yourself if you don't get proper answers via SO. This is really JDBC implementation testing and not so much testing of the RDBMS backend.

McKinley

For what it is worth, SQL Server 2005 becomes horribly unoptimized when reusing its pre compilation of stored procedures with different parameters. It makes certain query plan assumptions based on the parameters used the first time the stored procedure it run. It is always faster (in non trivial cases) to run straight SQL text than to use a stored procedure multiple times with different parameters in SQL 2005.

Now, this is one database vendor and one particular version of their database and it is stored procedures and not prepared statements. However, I expect that the JDBC driver would use temporary stored procedures to cache the prepared statement so it is probably the same. Regardless, the point is that there is at least one case where you are likely to hit a severe performance penalty by reusing prepared statements.

Prepared statements, along with static type checking are a good defense against SQL injection and I think that should be the extent of their use. Time would be better spent seeking better statement caching at the RDBMS and JDBC driver level. At the application level you can beat any optimization by just caching the output of popular queries for a minute or five and asynchronously updating the cache for the very next user request after time is up.

If you want to reuse the prepared statement because it might be faster to get invoices for user x and then invoices for user y I just haven't found that to be faster than using a new statement in practical cases. The records for use x and user y are different and they are going to compete for cache and other resources in the RDBMS and that is where the biggest slow down is in my experience. Requests to the same records repeatedly run fast and requests to different records run slower because the RDBMS is going different places for them and clearing caches. But, some databases are better than others and some JDBC drivers don't do the right thing by default.

I would search for a JDBC, RDBMS combination that gets it right with intuitively caching statement usage and submit patches to them if you see chances for improvement. But, if there are speed problems and you see an easy victory then I don't want to dissuade you from recommending it here. I just wanted to share my experience.
Reply all
Reply to author
Forward
0 new messages