Batch performance improvement (mybatis-3.2.2-SNAPSHOT)


Iwao AVE!

Mar 23, 2013, 1:22:56 PM
to mybatis-user
Hi all,

A new snapshot of mybatis-3 is now available on the google code download page.
https://code.google.com/p/mybatis/downloads/detail?name=mybatis-3.2.2-SNAPSHOT.jar

Among other fixes and enhancements, it improves performance of
non-dynamic batch operations.
If you are experiencing performance issues with batch operations and
your statement can be written in the 'raw' language, please try the
snapshot and let us know the result.

If you are not familiar with the raw language, please see the
following section of the online manual.
http://mybatis.github.com/mybatis-3/dynamic-sql.html#Pluggable_Scripting_Languages_For_Dynamic_SQL

Questions and feedback are welcome.

Thank you,
Iwao

Eduardo

Apr 3, 2013, 4:46:36 PM
to mybati...@googlegroups.com
Hi all,

I put together a small, simple sample to show the results of this enhancement.

It consists of inserting one million rows into HSQLDB using a batch session:

  @Test
  public void shouldInsertAUser() {
    SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH);
    try {
      Mapper mapper = sqlSession.getMapper(Mapper.class);
      User user = new User();
      user.setId(1);
      user.setName("User");
      for (int i = 0; i < 1000000; i++) {
        mapper.insertUser(user);
      }
    } finally {
      sqlSession.close();
    }
  }
 
The statement is a simple insert with two fields:

<insert id="insertUser">
insert into users values(#{id}, #{name})
</insert>

These are the results on my laptop:
- Using MyBatis 3.1.1. Runs in 14 seconds.
- Using MyBatis 3.2.2. Runs in 9 seconds.
- Using MyBatis 3.2.2 and lang=raw [1]. Runs in 3.5 seconds.

[1]
<insert id="insertUser" lang="raw">
insert into users values(#{id}, #{name})
</insert>

Hope you find it useful!

Paul Krause

Apr 19, 2013, 8:47:11 AM
to mybati...@googlegroups.com
The problem I've had when I try to use raw is that it does not support the <include /> tag. I neither need nor want dynamic XML, but I do find the ability to reuse bite-size chunks extremely useful when building up complex queries. Can raw mode be enhanced to support inclusion? I'm desperately in need of some speedups.

Iwao AVE!

Apr 19, 2013, 10:20:15 AM
to mybatis-user
Thank you for the feedback, Paul.

I considered it when I was working on this, but the problem is
that <include /> can contain dynamic tags like <if /> which depend on
the parameter passed at execution time.

It might be a good idea to provide another point of interception which
is for 'static' statement parsing (i.e. it does not depend on the
runtime factors like parameters and is executed only once).
The recent request ( https://github.com/mybatis/mybatis-3/issues/32 )
reminds me of this idea as well.
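
To make the idea of a one-time, parameter-independent step concrete, here is a minimal sketch in plain Java. It is a toy, not MyBatis code: the class `StaticIncludeDemo`, the method `expandIncludes`, and the fragment name `userColumns` are all made up for illustration, and it assumes the included fragments contain no dynamic tags and no nested includes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only; none of these names exist in MyBatis.
class StaticIncludeDemo {

    // Matches a self-closing include such as <include refid="userColumns"/>.
    private static final Pattern INCLUDE =
            Pattern.compile("<include\\s+refid=\"([^\"]+)\"\\s*/>");

    // Replaces every include with its fragment text. Because the result does
    // not depend on runtime parameters, it only needs to run once per statement.
    static String expandIncludes(String statement, Map<String, String> fragments) {
        Matcher matcher = INCLUDE.matcher(statement);
        StringBuffer expanded = new StringBuffer();
        while (matcher.find()) {
            String fragment = fragments.get(matcher.group(1));
            if (fragment == null) {
                throw new IllegalArgumentException("Unknown fragment: " + matcher.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the fragment literal.
            matcher.appendReplacement(expanded, Matcher.quoteReplacement(fragment));
        }
        matcher.appendTail(expanded);
        return expanded.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fragments = new HashMap<>();
        fragments.put("userColumns", "id, name");

        String statement =
                "select <include refid=\"userColumns\"/> from users where id = #{id}";

        // prints: select id, name from users where id = #{id}
        System.out.println(expandIncludes(statement, fragments));
    }
}
```

The `#{id}` placeholder is left untouched, so the expanded text could still be handed to whatever parses placeholders afterwards.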

Would such an interceptor satisfy your requirement?
Or do you have any other suggestions?

p.s.
By the way, have you actually tried a 'raw' statement to see if it
improves the performance?
I had ruined Eduardo's weekend for this and could use some positive feedback ;D

Regards,
Iwao




Paul Krause

Apr 19, 2013, 12:40:56 PM
to mybati...@googlegroups.com
My use case is one big complicated query that is used to repeatedly read records from a long table (a feed) joined with several side tables. I don't really care how long it takes to set up that query, as long as it is fast to execute, ideally as a prepared statement, and that it is fast and efficient to fetch results into a bean. In particular, I don't want to repeat any work that was done determining how to map the first row when processing the second row. If I understand your proposal correctly, it seems like it should satisfy these requirements at least in part.

However, I didn't see any speedup in my test case.  It takes 130 seconds to read my test-bed of 38000 records with raw, which is the same as the default. However, there is an unrelated hot-spot which seems to be slowing everything down.  I will post updated results after I isolate and fix it.

One issue I noted with raw is that CDATA blocks get treated as part of the statement.  This is going to be a problem for any statement that uses inequalities.

Eduardo Macarron

Apr 19, 2013, 2:10:29 PM
to mybatis-user
Hi guys...

I thought include already worked with raw! Sorry about that, I
should have written a test for it. It is now fixed in trunk. I will
upload a snapshot to gcode in a moment.

Paul, note that raw saves parsing time, not processing time. Raw will
not make any difference with a query that gets 38000 records but will
make a big difference in 38000 queries with 1 record (that can only
happen with batches).
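
A toy model may help show why the saving scales with the number of executions rather than the number of rows. The sketch below is plain Java and is not MyBatis's implementation: `PlaceholderParseDemo`, `ParsedSql`, and `parse` are hypothetical names, and it assumes the per-execution cost being saved is the translation of `#{...}` placeholders into JDBC `?` markers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these names exist in MyBatis.
class PlaceholderParseDemo {

    // Parsed form of a statement: JDBC-style SQL plus ordered parameter names.
    static final class ParsedSql {
        final String sql;
        final List<String> parameterNames;

        ParsedSql(String sql, List<String> parameterNames) {
            this.sql = sql;
            this.parameterNames = parameterNames;
        }
    }

    // Counts how often the statement text is parsed.
    static int parseCount = 0;

    // Replaces each #{name} with '?' and records the names in order.
    static ParsedSql parse(String text) {
        parseCount++;
        StringBuilder sql = new StringBuilder();
        List<String> names = new ArrayList<>();
        int pos = 0;
        while (true) {
            int open = text.indexOf("#{", pos);
            if (open < 0) {
                sql.append(text.substring(pos));
                break;
            }
            int close = text.indexOf('}', open);
            if (close < 0) {
                throw new IllegalArgumentException("Unclosed placeholder in: " + text);
            }
            sql.append(text, pos, open).append('?');
            names.add(text.substring(open + 2, close).trim());
            pos = close + 1;
        }
        return new ParsedSql(sql.toString(), names);
    }

    public static void main(String[] args) {
        String statement = "insert into users values(#{id}, #{name})";
        int executions = 1000;

        // Parse on every execution: the cost grows with the number of executions.
        parseCount = 0;
        for (int i = 0; i < executions; i++) {
            parse(statement);
        }
        System.out.println("parse per execution: " + parseCount + " parses");

        // Parse once up front and reuse the result for every execution.
        parseCount = 0;
        ParsedSql cached = parse(statement);
        for (int i = 0; i < executions; i++) {
            ParsedSql reused = cached; // no parsing work here
        }
        System.out.println("parse once: " + parseCount + " parse");

        System.out.println(cached.sql);            // insert into users values(?, ?)
        System.out.println(cached.parameterNames); // [id, name]
    }
}
```

In this model a single query returning 38000 rows is parsed once either way, while 38000 executions of a one-row statement pay the parsing cost 38000 times unless the parsed form is reused.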



Paul Krause

Apr 19, 2013, 2:32:47 PM
to mybati...@googlegroups.com
Excellent!   Thank you, Eduardo.

I assume that whatever you did for include must have automatically fixed CDATA as well, but you should add a test for that too.

I actually will be running this query in batches, but my test case did not account for that. I guess what I really need to compare is the read time of batch versus non-batch. I'll get started.

Iwao AVE!

Apr 20, 2013, 2:11:27 AM
to mybatis-user
Eduardo,
Sorry about my misunderstanding and thank you for the correction!

Paul,
If your result map contains <collection /> or <association />, make
sure to use <id /> wherever possible.

Regards,
Iwao

