Hi,
QueryDSL-SQL looks nice but it has a pretty glaring flaw: it
constructs a new PreparedStatement every time the user retrieves
results.
I'd like to use QueryDSL-SQL to construct a
PreparedStatement once, and issue the query multiple times without
having to construct a new PreparedStatement every time.
See
http://www.theserverside.com/news/1365244/Why...-how-to-use-them-properly for a rough idea of why this is important.
Thanks,
Gili
timowestIn my understanding constructing PreparedStatements is cheap. The point
is to use a query template like "select * from table where column = ?".
Also
the query state can change between invocations, so caching
PreparedStatements on the Querydsl level is difficult and doesn't make
much sense.
Could you post a quote from the article you mentioned
where it explicitly suggests to reuse PreparedStatements on the client
side for performance reasons?
The performance benefits for this
PreparedStatement ps = conn.prepareStatement("select a,b from t where c = ?");
For(int I = 0; I < 1000; ++I)
{
ps.setInt(1, I);
ResultSet rs = ps.executeQuery();
Rs.close();
}
ps.close();
compared to this one
For(int I = 0; I < 1000; ++I)
{
PreparedStatement ps = conn.prepareStatement("select a,b from t where c = ?");
ps.setInt(1, I);
ResultSet rs = ps.executeQuery();
Rs.close();
ps.close();
}
are probably not significant. The importance is in using a prepared template, which is what Querydsl does already.
PreparedStatement caches can also be set via Connection pools.
cowwocIt takes CPU power to do the access plan generation.
Ideally, if we send the same statement to the database twice, then we'd
like the database to reuse the access plan for the first statement. This
uses less CPU than if it regenerated the plan a second time. [...] This
allows the database to reuse the access plans for the statement and
makes the program execute more efficiently inside the database
The
database execution plan remains the same, regardless of what values are
plugged in at runtime. That being said, I've posted this question on
Stackoverflow:
http://stackoverflow.com/questions/4737717/ca...connection-pool-handle-itWe'll see what they come up with :)
Gili
timowestThere is another problem with reusing the PreparedStatement. When do you
close it? Somehow explicitly via the Query instance (query.close()) or
do you trust the Connection to close it?
cowwocI put together the following class to improve usability in my own code:
/**
* Binds a database connection and dialect.
*
* @author Gili Tzabari
*/
@RequestScoped
public class Session
{
private final Connection connection;
private final SQLTemplates dialect;
/**
* Creates a new Session.
*
* @param connection the database connection
* @param dialect the database dialect
*/
@Inject
public Session(Connection connection, SQLTemplates dialect)
{
this.connection = connection;
this.dialect = dialect;
}
/**
* Creates a new query.
*
* @return a new query
*/
public SQLQuery query()
{
return new SQLQueryImpl(connection, dialect);
}
/**
* Creates a new insert command.
*
* @param table the table to insert into
* @return the insert command
*/
public SQLInsertClause insert(RelationalPath<?> table)
{
return new SQLInsertClause(connection, dialect, table);
}
/**
* Creates a new update command.
*
* @param table the table to update
* @return the update command
*/
public SQLUpdateClause update(RelationalPath<?> table)
{
return new SQLUpdateClause(connection, dialect, table);
}
}
I
construct one of these per request (notice @RequestScoped on top) and
close them at the end of the request. We could register
PreparedStatements with a Session and close them when it is closed.
To
be honest, I am also hoping that connection pools will make the
performance difference negligible (it will result in cleaner code for
everyone). Let's wait and see what the benchmarks say.
timowestPlease wrap the example code into code blocks. That makes it more readable.
We could register PreparedStatements with a Session and close them when it is closed.
That would need Querydsl to provide listeners for created PreparedStatements. Additional complexity with little benefits.
Also
do you really have use cases for it? Executing the same query in the
same session with the exact same query state multiple times. Take
transactional isolation into the game, and you will mostly see changes
in results you have caused inside the session scope.
Try a
benchmark yourself if you don't get proper answers via SO. This is
really JDBC implementation testing and not so much testing of the RDBMS
backend.
McKinleyFor what it is worth, SQL Server 2005 becomes horribly unoptimized when
reusing its pre compilation of stored procedures with different
parameters. It makes certain query plan assumptions based on the
parameters used the first time the stored procedure it run. It is always
faster (in non trivial cases) to run straight SQL text than to use a
stored procedure multiple times with different parameters in SQL 2005.
Now,
this is one database vendor and one particular version of their
database and it is stored procedures and not prepared statements.
However, I expect that the JDBC driver would use temporary stored
procedures to cache the prepared statement so it is probably the same.
Regardless, the point is that there is at least one case where you are
likely to hit a severe performance penalty by reusing prepared
statements.
Prepared statements, along with static type checking
are a good defense against SQL injection and I think that should be the
extent of their use. Time would be better spent seeking better statement
caching at the RDBMS and JDBC driver level. At the application level
you can beat any optimization by just caching the output of popular
queries for a minute or five and asynchronously updating the cache for
the very next user request after time is up.
If you want to reuse
the prepared statement because it might be faster to get invoices for
user x and then invoices for user y I just haven't found that to be
faster than using a new statement in practical cases. The records for
use x and user y are different and they are going to compete for cache
and other resources in the RDBMS and that is where the biggest slow down
is in my experience. Requests to the same records repeatedly run fast
and requests to different records run slower because the RDBMS is going
different places for them and clearing caches. But, some databases are
better than others and some JDBC drivers don't do the right thing by
default.
I would search for a JDBC, RDBMS combination that gets
it right with intuitively caching statement usage and submit patches to
them if you see chances for improvement. But, if there are speed
problems and you see an easy victory then I don't want to dissuade you
from recommending it here. I just wanted to share my experience.