Multi threaded insert failing

308 views
Skip to first unread message

__jm__

unread,
Sep 18, 2007, 9:16:32 PM9/18/07
to H2 Database
Hi Thomas,

My initial goal was to test a connection pool on top of h2 in order to
scale better (I have a variable number of tables in each I insert
millons of lines). I have a number of tasks which insert rows in the
different tables. When I switch MULTI_THREAD=1 and if I insert in
several tables in parallel I got this exception:
org.h2.jdbc.JdbcSQLException: Unique index or primary key violation:
PRIMARY_KEY_0 ON PUBLIC.SYS(ID) [23001-57]
at org.h2.message.Message.getSQLException(Message.java:84)
at org.h2.message.Message.getSQLException(Message.java:88)
at org.h2.message.Message.getSQLException(Message.java:66)
at org.h2.index.BaseIndex.getDuplicateKeyException(BaseIndex.java:61)
at org.h2.index.TreeIndex.add(TreeIndex.java:53)
at org.h2.table.TableData.addRow(TableData.java:87)
at org.h2.engine.Database.addMeta(Database.java:635)
at org.h2.engine.Database.addSchemaObject(Database.java:688)
at org.h2.table.Column.convertAutoIncrementToSequence(Column.java:
254)
at org.h2.command.ddl.CreateTable.update(CreateTable.java:115)
at org.h2.command.CommandContainer.update(CommandContainer.java:64)
at org.h2.command.Command.executeUpdate(Command.java:122)
at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:156)
at com.orcsoftware.cameronfix.H2MultithreadTest
$RowInsertor.run(H2MultithreadTest.java:26)
at java.lang.Thread.run(Thread.java:534)

I also got something similar while creating tables in parallel.


Below is a snippet of code to reproduce it, you only need h2 in your
classpath:
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
* Test class that demonstrates failures in H2 when used in a multi-
threaded context.
*/
public class H2MultithreadTest {
private static final File DATABASE_LOCATION = new
File(System.getProperty("java.io.tmpdir"),
"database");

private static class RowInsertor implements Runnable {
/**
* @see java.lang.Runnable#run()
*/
public void run() {
Connection connection = getConnection();

try {
// Create test table
String tableName = "TABLE_" +
Thread.currentThread().getName();
connection.createStatement().execute(
"CREATE TABLE " + tableName
+ "(COL1 BIGINT AUTO_INCREMENT PRIMARY
KEY, COL2 BIGINT)");
System.out.println("Table created: " + tableName);

PreparedStatement statement = connection
.prepareStatement("insert into " + tableName +
" (col2) values (?)");
int count = 0;

while (true) {
statement.setLong(1, count);
System.out.println("Table " + tableName +
"\tInserting row:" + count + "...");
statement.execute();
System.out.println("Table " + tableName
+ "\tInserting row:"
+ count
+ "...done");
count++;
}
} catch (SQLException e) {
e.printStackTrace();
}
}

}

/**
* Main method: no argument
*
* @param args
* @throws Exception
*/
public static void main(String[] args) throws Exception {
// Clean DB
if (DATABASE_LOCATION.exists() && !
deleteDirectory(DATABASE_LOCATION)) {
throw new Error("Unable to clean database folder");
}

// Create runnable and start them
int numberOfRunnable = 5;
Runnable[] runnable = new Runnable[numberOfRunnable];
for (int i = 0; i < runnable.length; i++) {
runnable[i] = new RowInsertor();
Thread thread = new Thread(runnable[i]);
thread.setName("Runnable_" + i);
thread.start();
}
}

private static Connection getConnection() {
try {
Class.forName("org.h2.Driver");
return DriverManager.getConnection("jdbc:h2:file:" +
DATABASE_LOCATION
.getAbsolutePath()
+ "/testdb"
+
";MULTI_THREADED=1;LOCK_MODE=3;AUTOCOMMIT=ON;WRITE_DELAY=0");
} catch (Exception e) {
throw new Error("Unable to get connection:" +
e.toString());
}
}

/**
* Recursively delete a directory.
*
* @param path
* Path to delete
* @return True if it succeeded
*/
private static boolean deleteDirectory(File path) {
if (path.exists()) {
File[] files = path.listFiles();

for (int i = 0; i < files.length; i++) {
if (files[i].isDirectory()) {
if (deleteDirectory(files[i]) == false) {
throw new Error("Unable to delete:" +
files[i].getAbsolutePath());
}
} else {
if (files[i].delete() == false) {
throw new Error("Unable to delete:" +
files[i].getAbsolutePath());
}
}
}
}
return (path.delete());
}
}


Let me know if you need other information.

Thanks.

Thomas Mueller

unread,
Sep 20, 2007, 2:40:06 PM9/20/07
to h2-da...@googlegroups.com
Hi,

Thanks for your help! This is a bug, it looks like related to
MULTI_THREADED. Is it OK for you if I integrate your code into the
test suite?

Thanks for your help!
Thomas

__jm__

unread,
Sep 28, 2007, 12:43:56 AM9/28/07
to H2 Database
No problem Thomas, feel free to use any code provided below.

Cheers,

On Sep 21, 4:40 am, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:


> Hi,
>
> Thanks for your help! This is a bug, it looks like related to
> MULTI_THREADED. Is it OK for you if I integrate your code into the
> test suite?
>
> Thanks for your help!
> Thomas
>

__jm__

unread,
Oct 9, 2007, 8:59:54 PM10/9/07
to H2 Database
Hi Thomas,

I've tested it in the latest release and it works fine. However.. my
goal is to insert a lot of rows in various tables with multiple
threads in order to increase the number of rows I could insert in H2
per second. As I have a dual core PC I was expecting H2 to take
advantage of my two CPU (my customers may have 8 or more cores)... but
the contrary happens, H2 is much slower under contention... and the
MULTI_THREADED option slows it even more.

500 000 rows inserted in one table by one thread: 11 seconds
500 000 rows inserted in four tables by four threads: 36 seconds with
MULTI_THREADED=1
500 000 rows inserted in four tables by four threads: 16 seconds with
MULTI_THREADED=0

BTW, I tried to have a look in the source code but I got lost... where
should I start looking if I want to see how request are parsed and
executed? Thanks.


I modified slightly the example I provided to report the concurrency
issue:

import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import edu.emory.mathcs.backport.java.util.concurrent.CountDownLatch;

/**
* Test class that demonstrates performance issue in H2 when used in a
multi-threaded context.


*/
public class H2MultithreadTest {
private static final File DATABASE_LOCATION = new
File(System.getProperty("java.io.tmpdir"),
"database");

private static class RowInsertor implements Runnable {

private int rowNumber;

private RowInsertor(int count) {
rowNumber = count;
}

/**
* @see java.lang.Runnable#run()
*/
public void run() {
Connection connection = getConnection();

try {
// Create test table
String tableName = "TABLE_" +
Thread.currentThread().getName();
connection.createStatement().execute(
"CREATE TABLE " + tableName
+ "(COL1 BIGINT AUTO_INCREMENT PRIMARY
KEY, COL2 BIGINT)");
System.out.println("Table created: " + tableName);

PreparedStatement statement = connection
.prepareStatement("insert into " + tableName +
" (col2) values (?)");
int count = 0;

while (count < rowNumber) {
statement.setLong(1, count);
statement.execute();
count++;
}

latch.countDown();
System.out.println( rowNumber + " rows inserted in " +
tableName);
} catch (SQLException e) {
e.printStackTrace();
}
}

}

private static CountDownLatch latch;

/**
* Main method: no argument
*
* @param args
* @throws Exception
*/
public static void main(String[] args) throws Exception {
// Clean DB
if (DATABASE_LOCATION.exists() && !
deleteDirectory(DATABASE_LOCATION)) {
throw new Error("Unable to clean database folder");
}

// Create runnable and start them

int totalNumberOfRows = 500000;
int numberOfRunnable = 4;


Runnable[] runnable = new Runnable[numberOfRunnable];

latch = new CountDownLatch(numberOfRunnable);

for (int i = 0; i < runnable.length; i++) {

runnable[i] = new RowInsertor((totalNumberOfRows/
numberOfRunnable));


Thread thread = new Thread(runnable[i]);
thread.setName("Runnable_" + i);
thread.start();
}

long startTime = System.currentTimeMillis();

latch.await();
System.out.println(totalNumberOfRows + " inserted in " +
(System.currentTimeMillis() - startTime) + " ms");
}

private static Connection getConnection() {
try {
Class.forName("org.h2.Driver");
return DriverManager.getConnection("jdbc:h2:file:" +
DATABASE_LOCATION
.getAbsolutePath()
+ "/testdb"

+ ";MULTI_THREADED=1");


} catch (Exception e) {
throw new Error("Unable to get connection:" +
e.toString());
}
}

/**
* Recursively delete a directory.
*
* @param path
* Path to delete
* @return True if it succeeded
*/
private static boolean deleteDirectory(File path) {
if (path.exists()) {
File[] files = path.listFiles();

for (int i = 0; i < files.length; i++) {
if (files[i].isDirectory()) {
if (deleteDirectory(files[i]) == false) {
throw new Error("Unable to delete:" +
files[i].getAbsolutePath());
}
} else {
if (files[i].delete() == false) {
throw new Error("Unable to delete:" +
files[i].getAbsolutePath());
}
}
}
}
return (path.delete());
}
}


Cheers,


Jean

Thomas Mueller

unread,
Oct 12, 2007, 11:55:49 AM10/12/07
to h2-da...@googlegroups.com
Hi,

> I've tested it in the latest release and it works fine. However.. my
> goal is to insert a lot of rows in various tables with multiple
> threads in order to increase the number of rows I could insert in H2
> per second.

Maybe a solution is to use multiple databases?

> As I have a dual core PC I was expecting H2 to take
> advantage of my two CPU (my customers may have 8 or more cores)... but
> the contrary happens, H2 is much slower under contention...

Usually the problem is IO. Just using more threads will slow things
down. Example: the disk write head will go back and forth a lot. Not
sure if this happens in your case.

> MULTI_THREADED option slows it even more.

Yes, because finer grained synchronization is used in this case.

> where should I start looking if I want to see how request are parsed and executed?

Do you use PreparedStatements? If yes parsing is not an issue. If not,
why can it not be used? Parsing is done in Parser.java

I suggest to use profiling tools to find performance problems, for
example the YourKit tool.

Thanks,
Thomas

__jm__

unread,
Oct 15, 2007, 2:14:16 AM10/15/07
to H2 Database
Hi Thomas,

I don't agree on your analysis.

1. one thread, one table, H2 inserts 500 000 rows in 11 seconds
2. four threads, each of them inserts in a dedicated table ,2 CPUs, H2
inserts 500 000 rows in 36 seconds (MULTI_THREADED=1)
3. four threads, each of them inserts in a dedicated table, 2 CPUs, H2
inserts 500 000 rows in 16 seconds (MULTI_THREADED=0)

My analysis is:
a) I'm inserting very small data (2 BIGINT per row, 16 bytes), the
disk head should not have to move a lot as a sector is 512 bytes (for
NTFS) and I'm writing in the same database (same file and if I'm lucky
sectors will be consecutive)
b) On example 2 I am not IO bound (example 1 demonstrates it).
c) On example 2, most of time is spent on synchronization (example 3
demontrates it)

Examples where more than one thread is involved insert in different
tables. There should no interference between inserts in different
tables as you are using table lock. Thus my conclusion is there is an
issue with the way things are synchronized in H2 (this is my feeling
after these tests, I'll be happy if you could demonstrate the
opposite ;-) )

Don't you agree that two threads each inserting data in a different
table on a machine with two CPUs should insert data almost twice
faster than a single thread inserting data on one table? If not, what
do you reckon should be the behaviour of the "perfect" database?

I'll try to run these tests on Derby and HSQL to check the behaviour,
maybe I'm all wrong.

Let me know what you think.

Jean

On Oct 13, 1:55 am, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:

Thomas Mueller

unread,
Oct 18, 2007, 1:32:23 PM10/18/07
to h2-da...@googlegroups.com
Hi,

Without having the actual source code it is very hard to understand the reason.

> Don't you agree that two threads each inserting data in a different
> table on a machine with two CPUs should insert data almost twice
> faster than a single thread inserting data on one table?

Not if only one hard drive is used. The performance of H2 is IO bound
up to some point.

> If not, what do you reckon should be the behaviour of the "perfect" database?

It should be faster than one thread, but not double as fast.

> I'll try to run these tests on Derby and HSQL to check the behaviour,
> maybe I'm all wrong.

Could you also post the test code?

Thanks,
Thomas

Reply all
Reply to author
Forward
0 new messages