mass-inserts becoming slower and slower

Marcus Wolschon

Dec 22, 2009, 3:18:40 AM

to h2-da...@googlegroups.com

The following code works incredibly fast for the first 100'000 inserts, but
as I'm reaching 500'000 it gets so slow that it would be faster to note down
the inserted rows with pen and paper.

Does anyone see something obvious I missed?
I'm using 1.2.126 and there are 4 such databases open
(one containing all ways and 3 containing only the simplified
versions of the most important ways).
It calculates that the last 250'000 entries will take another 6 hours.
The 4 databases I am importing into are currently
1GB,
270MB,
28MB and
20MB in size
and slowly growing.

/**
 * {@inheritDoc}
 */
@Override
public void addWay(final Way aW) {
    try {
        IConnection connection = getConnection();
        try {
            PreparedStatement addWayStmt = connection.getAddWayStmt();
            int i = 1;
            addWayStmt.setLong(i++, aW.getId());
            addWayStmt.setInt(i++, aW.getVersion());
            addWayStmt.setString(i++, serializeTags(aW.getTags()));
            addWayStmt.execute();

            PreparedStatement deleteWaysNodesStmt = connection.getDeleteWaysNodesStmt();
            deleteWaysNodesStmt.setLong(1, aW.getId());
            deleteWaysNodesStmt.execute();

            PreparedStatement addWayNodeStmt = connection.getAddWayNodeStmt();
            List<WayNode> wayNodes = aW.getWayNodes();
            int index = 0;
            for (WayNode wayNode : wayNodes) {
                i = 1;
                addWayNodeStmt.setLong(i++, aW.getId());
                addWayNodeStmt.setLong(i++, wayNode.getNodeId());
                addWayNodeStmt.setInt(i++, index++);
                addWayNodeStmt.execute();
                Node cached = myNodeCache.get(wayNode.getNodeId());
                if (cached != null && cached instanceof ExtendedNode) {
                    ((ExtendedNode) cached).addReferencedWay(aW.getId());
                }
            }
            connection.getConnection().commit();
        } finally {
            returnConnection(connection);
        }
    } catch (SQLException e) {
        LOG.log(Level.SEVERE, "Cannot add way", e);
    }
}

=================================
referenced methods:

/**
 * If no database connection is open, a new connection is opened. The
 * database connection is then returned.<br/>
 * <b>When you are done with it, use {@link #returnConnection(MyConnection)}.</b>
 * @return The database connection.
 * @throws SQLException if we cannot provide a connection
 * @see #returnConnection(MyConnection)
 */
protected IConnection getConnection() throws SQLException {
    if (myConnectionPool == null) {

        loadDatabaseDriver();

        try {
            myConnectionPool = JdbcConnectionPool.create(
                    getDatabaseURL(), getDatabaseUser(), getDatabasePassword());
            Connection connection = myConnectionPool.getConnection();
            if (connection == null) {
                throw new SQLException("could not get database-connection from pool!");
            }
            checkSchema(connection);
            myConnectionPool.setMaxConnections(Short.MAX_VALUE);

        } catch (SQLException e) {
            throw new OsmosisRuntimeException(
                    "Unable to establish a database connection to '" + myDatabaseURL + "'.", e);
        }
    }
    IConnection connection = myConnection.poll();
    if (connection == null || connection.isClosed()) {
        try {
            LOG.log(Level.INFO, "Opening new connection. "
                    + (connection == null ? "because pool is empty" : "because pooled-connection is closed")
                    + " (#connections so far: " + myTotalConnectionsCount
                    + ", #closed=" + totalClosedConnections + ") DB=" + myDatabaseURL
                    /*, new Exception("DEBUG")*/);
            myTotalConnectionsCount++;
            connection = new MyConnection(myConnectionPool.getConnection());
        } catch (NullPointerException e) {
            //connection = new MyConnection(myConnectionPool.getConnection());
            myConnectionPool = null;
            return getConnection();
        }
    }
    return connection;
}

/**
 * Return a used connection to the pool.
 * @param aConnection the connection to return
 * @see #getConnection()
 */
protected void returnConnection(final IConnection aConnection) {
    if (!myConnection.offer(aConnection)) {
        LOG.info("Closing connection. as the pool is full");
        aConnection.close();
        totalClosedConnections++;
    }
}

/**
 * {@inheritDoc}
 * @see org.openstreetmap.osm.data.h2.IConnection#getAddWayStmt()
 */
public PreparedStatement getAddWayStmt() throws SQLException {
    if (myAddWayStmt == null) {
        myAddWayStmt = myConnection.prepareStatement(
                "MERGE INTO ways (wayid, version, tags) VALUES (?,?,?)");
    }
    return myAddWayStmt;
}

/**
 * {@inheritDoc}
 * @see org.openstreetmap.osm.data.h2.IConnection#getDeleteWaysNodesStmt()
 */
public PreparedStatement getDeleteWaysNodesStmt() throws SQLException {
    if (myDeleteWaysNodesStmt == null) {
        myDeleteWaysNodesStmt = myConnection.prepareStatement(
                "DELETE FROM waynodes WHERE wayid = ?");
    }
    return myDeleteWaysNodesStmt;
}

/**
 * {@inheritDoc}
 * @see org.openstreetmap.osm.data.h2.IConnection#getAddWayNodeStmt()
 */
public PreparedStatement getAddWayNodeStmt() throws SQLException {
    if (myAddWayNodeStmt == null) {
        myAddWayNodeStmt = myConnection.prepareStatement(
                "MERGE INTO waynodes (wayid, nodeid, index) VALUES (?,?,?)");
    }
    return myAddWayNodeStmt;
}

Mikkel Kamstrup Erlandsen

Dec 22, 2009, 7:21:49 AM

to h2-da...@googlegroups.com

2009/12/22 Marcus Wolschon <Mar...@wolschon.biz>:

> The following code works incredibly fast the first 100´000 times but
> as I´m reaching
> 500´000 it gets so slow, I would be faster to note down the inserted
> rows with pen and paper.
>
> Does anyone see something obvious I missed?
> I´m using 1.2.126 and there are 4 such databases open
> (one containing all ways and 3 containing only the simplified
> versions of the most important ways)
> It caculates the last 250`000 entries to take another 6 hours.
> The 4 databases I am importing into are currently:
> 1GB
> 270MB
> 28MB and
> 20MB in size
> and slowly growing

I have a 22GB database with 10M records and insertion is fine, so I
must assume that there is something wrong in your setup. The primary
difference is that I use "try INSERT else UPDATE" instead of MERGE (I
don't think that is the culprit though).

You need to specify how your database looks and what indexes,
constraints, or triggers you might have for people to analyze this
further.

--
Cheers,
Mikkel

Marcus Wolschon

Dec 22, 2009, 9:20:05 AM

to h2-da...@googlegroups.com

On 2009-12-22, Mikkel Kamstrup Erlandsen <mikkel....@gmail.com> wrote:
>
> You need to specify how your database looks and what indexes,
> constraints, or triggers you might have for people to analyze this
> further.

It's quite simple.
I did not include it yet because I was searching for a memory leak or
an unclosed transaction growing out of bounds or the like. Something that
accumulates.

You can find everything else, including unit tests, in the SVN
of travelingsales.sourceforge.net in the project at trunk/libosm.
This class is the H2DataSet in the package org.openstreetmap.osm.data.h2 .

/**
 * Make sure all tables we need exist.
 * Convert old versions of the database-schema if needed.
 * @param aConnection our connections. Not yet used.
 * @throws SQLException if we cannot create/alter the tables or
 *         the current schema is incompatible.
 */
protected void checkSchema(final Connection aConnection) throws SQLException {
    try {
        Statement stmt = aConnection.createStatement();
        try {
            stmt.executeUpdate("CREATE CACHED TABLE IF NOT EXISTS nodes ("
                    + "nodeid BIGINT PRIMARY KEY,"
                    + "version INT,"
                    + "lat INT ,"
                    + "lon INT ,"
                    + "location BIGINT ,"
                    + "tags LONGVARCHAR(32767)"
                    + ");");
            stmt.executeUpdate("CREATE INDEX IF NOT EXISTS nodesLocation ON nodes (location);");
            stmt.executeUpdate("CREATE CACHED TABLE IF NOT EXISTS waynodes ("
                    + "wayid BIGINT NOT NULL,"
                    + "nodeid BIGINT NOT NULL,"
                    + "index INT NOT NULL,"
                    + "PRIMARY KEY (wayid, index)"
                    + ");");
            stmt.executeUpdate("CREATE INDEX IF NOT EXISTS waynodes ON waynodes (wayid);");
            stmt.executeUpdate("CREATE INDEX IF NOT EXISTS nodeways ON waynodes (nodeid);");
            stmt.executeUpdate("CREATE CACHED TABLE IF NOT EXISTS ways ("
                    + "wayid BIGINT PRIMARY KEY,"
                    + "version INT,"
                    + "tags LONGVARCHAR(32767)"
                    + ");");
            stmt.executeUpdate("CREATE CACHED TABLE IF NOT EXISTS relations ("
                    + "relid BIGINT PRIMARY KEY,"
                    + "version INT,"
                    + "tags LONGVARCHAR(32767)"
                    + ");");
            stmt.executeUpdate("CREATE CACHED TABLE IF NOT EXISTS relmembers ("
                    + "relid BIGINT NOT NULL,"
                    + "entityid BIGINT NOT NULL,"
                    + "entitytype SMALLINT NOT NULL,"
                    + "index INT NOT NULL,"
                    + "role VARCHAR(64),"
                    + "PRIMARY KEY (relid, index)"
                    + ");");
            stmt.executeUpdate("CREATE INDEX IF NOT EXISTS relmembers ON relmembers (relid);");
            stmt.executeUpdate("CREATE INDEX IF NOT EXISTS memberrels ON relmembers (entityid, entitytype);");

        } finally {
            stmt.close();
        }
    } finally {
        aConnection.close();
    }
}

Marcus Wolschon

Dec 22, 2009, 9:24:12 AM

to h2-da...@googlegroups.com

PS:
The import has been running since yesterday evening and has reached 408000
of 545949 ways.
When it started it was counting multiple hundreds per second; now it takes
minutes before the counter goes up by another 50 ways.

I don't understand what could cause this. :/

Marcus

Chuck Remes

Dec 22, 2009, 9:48:42 AM

to h2-da...@googlegroups.com

Marcus,

give us some more information on your setup.

1. RAM

2. RAM allocation to the JVM (-Xmx option)

3. What happens if you downgrade to an older version of H2?

4. What version of the JVM and all of its options

5. Give us your connection URL.

Don't hold back on sharing. You already shot a bunch of code at us but
that only tells half the story.

cr


Alexander Hartner

Dec 22, 2009, 10:03:45 AM

to h2-da...@googlegroups.com

It would be good to see which part takes the most time. Maybe it's one of the queries which doesn't have the right indexes set and with the amount of data increasing a full table scan becomes more and more expensive ?
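Before reaching for a profiler, one low-tech way to see which part takes the most time is to accumulate wall-clock time per step. The following is only a sketch: `PhaseTimer` and the phase labels are hypothetical names, not part of H2 or of the posted code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: accumulates elapsed wall-clock time per named phase. */
final class PhaseTimer {
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();
    private final Map<String, Long> calls = new LinkedHashMap<>();

    /** Runs the given step and adds its duration to the running total for that label. */
    void time(String label, Runnable step) {
        long start = System.nanoTime();
        try {
            step.run();
        } finally {
            totalNanos.merge(label, System.nanoTime() - start, Long::sum);
            calls.merge(label, 1L, Long::sum);
        }
    }

    long calls(String label) {
        return calls.getOrDefault(label, 0L);
    }

    long totalNanos(String label) {
        return totalNanos.getOrDefault(label, 0L);
    }

    /** One line per phase: total milliseconds and number of calls. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : totalNanos.entrySet()) {
            sb.append(e.getKey()).append(": ")
              .append(e.getValue() / 1_000_000).append(" ms over ")
              .append(calls.get(e.getKey())).append(" calls\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // The labels stand in for the three statements of an insert routine.
        timer.time("merge way", () -> { /* execute the first statement here */ });
        timer.time("delete waynodes", () -> { /* execute the second statement here */ });
        timer.time("merge waynodes", () -> { /* execute the third statement here */ });
        System.out.print(timer.report());
    }
}
```

Printing the report every few thousand ways would show whether one phase grows with table size while the others stay flat. JDBC calls throw the checked `SQLException`, so inside the lambdas they would need their own try/catch.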

Marcus Wolschon

Dec 22, 2009, 10:12:17 AM

to h2-da...@googlegroups.com

On 2009-12-22, Chuck Remes <cremes....@mac.com> wrote:
> Marcus,
>
> give us some more information on your setup.
>
> 1. RAM

2GB

> 2. RAM allocation to the JVM (-Xmx option)

1.5GB and 512MB for memory-mapped IO
-Xmx1500M -XX:MaxDirectMemorySize=512M

> 3. What happens if you downgrade to an older version of H2?

I noticed the same with 121 and 125 before. Doing a test-run with
this data-set will probably require a few days as it´s still running.

> 4. What version of the JVM and all of its options

Java 1.6.0_10
options given above

> 5. Give us your connection URL.

"jdbc:h2:" + aDatabaseFile.getAbsolutePath();
then when an import starts
stmt.execute("SET LOCK_MODE 0");
stmt.execute("SET LOG 1");
stmt.execute("SET UNDO_LOG 1");

Directory with the databases:

D:\.osm>dir
Volume in drive D is SDA3
Volume Serial Number is 52F7-C3C7

Directory of D:\.osm

21.12.2009 22:02 <DIR> .
21.12.2009 22:02 <DIR> ..
21.12.2009 21:32 <DIR> LOD0
22.12.2009 15:56 1.208.758.272 LOD0.h2.db
21.12.2009 21:32 103 LOD0.lock.db
21.12.2009 22:03 796 LOD0.trace.db
21.12.2009 22:02 <DIR> LOD1
22.12.2009 15:59 355.741.696 LOD1.h2.db
21.12.2009 22:02 103 LOD1.lock.db
21.12.2009 22:03 906 LOD1.trace.db
21.12.2009 22:02 <DIR> LOD2
22.12.2009 15:50 42.479.616 LOD2.h2.db
21.12.2009 22:02 103 LOD2.lock.db
21.12.2009 22:03 906 LOD2.trace.db
21.12.2009 22:02 <DIR> LOD3
22.12.2009 15:37 28.352.512 LOD3.h2.db
21.12.2009 22:02 103 LOD3.lock.db
21.12.2009 22:02 906 LOD3.trace.db
22.12.2009 13:43 17.051.648 streets.h2.db
21.12.2009 22:02 103 streets.lock.db
21.12.2009 22:03 196 streets.trace.db
21.12.2009 23:53 <DIR> tiles
21.12.2009 21:32 1.323.008 trafficmessages.h2.db
21.12.2009 21:32 103 trafficmessages.lock.db
21.12.2009 21:32 0 trafficmessages.trace.db
18 File(s) 1.653.711.080 bytes
7 Dir(s) 40.852.856.832 bytes free

> Don't hold back on sharing. You already shot a bunch of code at us but
> that only tells half the story.

I tried to limit myself to the important segments so as not to discourage anyone.
There's a lot of code around it that has nothing to do with this.

I can tell you how to reproduce this complete setup in a few steps
if you want to. It's nothing more than a checkout, running ant, starting an
executable jar (or running a class in Eclipse) and selecting an easily
downloadable file (geodata of South America) to import.
When this is done I can do any test you would like using the debugger.
However, I need this data imported for now to reproduce an error in my
program that was reported to me. So I can't abort now to run tests.

I'll try to look very hard at the part that reads the data to be imported, but
that part ("Osmosis") is production-quality code that reads not just
100MB of South America but tens of gigabytes (compressed) or a few
terabytes (uncompressed) of the whole planet every day. (This program is one
of the navigation systems of OpenStreetMap.)

Marcus

Marcus Wolschon

Dec 22, 2009, 10:22:20 AM

to h2-da...@googlegroups.com

On 2009-12-22, Alexander Hartner <lostins...@googlemail.com> wrote:
> It would be good to see which part takes the most time. Maybe it's one of
> the queries which doesn't have the right indexes set and with the amount of
> data increasing a full table scan becomes more and more expensive ?

I'm not sure how to do that.
Any profiler you can recommend?

I do have a test data-set of Hamburg that takes about 0min
to import.
I can run that when South America is finally done.

Marcus

Chuck Remes

Dec 22, 2009, 10:24:38 AM

to h2-da...@googlegroups.com

On Dec 22, 2009, at 9:12 AM, Marcus Wolschon wrote:

> On 2009-12-22, Chuck Remes <cremes....@mac.com> wrote:

[snip some response]

>> 4. What version of the JVM and all of its options
>
> Java 1.6.0_10
> options given above

That release is over a year old. I definitely recommend upgrading. There have been substantial improvements in the "minor" updates.

>> 5. Give us your connection URL.
>
> "jdbc:h2:" + aDatabaseFile.getAbsolutePath();
> then when an import starts
> stmt.execute("SET LOCK_MODE 0");
> stmt.execute("SET LOG 1");
> stmt.execute("SET UNDO_LOG 1");

Try adding:

stmt.execute("SET CACHE_SIZE 393216");

That would boost your cache size from 16MB (default) to 384MB.
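The number in `SET CACHE_SIZE` is given in kilobytes, which is where 393216 comes from (384 × 1024). As a small illustration of that arithmetic, here is a sketch with a hypothetical helper class `CacheSize`. It only builds the statement text, and running it against a database is left out.

```java
/** Hypothetical helper: builds the SET CACHE_SIZE statement from a size in megabytes. */
final class CacheSize {

    /** Converts megabytes to the kilobyte value used by SET CACHE_SIZE. */
    static int megabytesToKilobytes(int megabytes) {
        // multiplyExact throws instead of silently overflowing on absurd inputs
        return Math.multiplyExact(megabytes, 1024);
    }

    static String setCacheSizeSql(int megabytes) {
        return "SET CACHE_SIZE " + megabytesToKilobytes(megabytes);
    }

    public static void main(String[] args) {
        System.out.println(setCacheSizeSql(16));  // the 16MB default mentioned above
        System.out.println(setCacheSizeSql(384)); // the suggested 384MB
    }
}
```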

>> Don't hold back on sharing. You already shot a bunch of code at us but
>> that only tells half the story.
>
> I tried to limit myself to the important segments to not discourage anyone.
> There´s a lot of code around it that has nothing to do with this.
>
> I can tell you how to reproduce this complete setup in a few steps.
> if you want to. It´s nothing more then checkout. ant, starting an executable
> jar (or running a class in eclipse) and selecting an easily downloadable file
> (geodata of south america) for import into.
> When this is done I can do any test you would like using the debugger.
> However I need this data imported for now to reproduce an error in my
> program that was reported to me. So I can´t abort now to run tests.

Feel free to email me the details on getting the program & data, setting it up and running it. I have a bit of time since things are slow at work.

cr

Tsvetozar

Dec 22, 2009, 10:27:07 AM

to h2-da...@googlegroups.com

Alexander Hartner <lostins...@googlemail.com> wrote:
> Maybe it's one of the queries which doesn't have the right indexes set and with the amount of data increasing a full table scan becomes more and more expensive ?

Here's how you define one of your tables:

> CREATE CACHED TABLE IF NOT EXISTS waynodes (

> wayid BIGINT NOT NULL,
> nodeid BIGINT NOT NULL,
> index INT NOT NULL,"
> PRIMARY KEY (wayid, index)
> );

> CREATE INDEX IF NOT EXISTS waynodes ON waynodes (wayid);

> CREATE INDEX IF NOT EXISTS nodeways ON waynodes (nodeid);

Correct me if I'm wrong, but I think the index waynodes does not help here, because you already have an index (the primary key) whose first field is wayid.

I've stumbled upon a thread on this mailing list about similar problems in a very similar case, where indexes were duplicated like that and the optimizer did not choose the right one.

If the indexes could be the problem, maybe you should try removing the waynodes index, because it's of no use (and could have confused the optimizer into using it instead of the primary key when merging/inserting new data). You already have that index in your primary key (as wayid is the first field).

Marcus Wolschon

Dec 22, 2009, 10:40:34 AM

to h2-da...@googlegroups.com

On 2009-12-22, Tsvetozar <tsve...@email.bg> wrote:

> Here's how you define one of your tables:
>> CREATE CACHED TABLE IF NOT EXISTS waynodes (
>> wayid BIGINT NOT NULL,
>> nodeid BIGINT NOT NULL,
>> index INT NOT NULL,"
>> PRIMARY KEY (wayid, index)
>> );
>> CREATE INDEX IF NOT EXISTS waynodes ON waynodes (wayid);
>> CREATE INDEX IF NOT EXISTS nodeways ON waynodes (nodeid);
>
> Correct me if I'm wrong but I think index waynodes does not help here,
> because you have already an index (primary key) with first field wayid.
>
> I've stumbled upon a thread in this mailing list that there was some similar
> problems in a very similar case with duplicating indexes like that and the
> optimizer did not choose the right index.
>
> If it could be a problem with the indexes may you should try removing the
> waynodes index, because it's of no use (and probably could have confused the
> optimizer to use it instead of primary key when merging/insertig new data).
> You've got already that index in your primary key (as wayid is the first
> field).

I can try that.
Does a combined index on (wayid, index) speed up queries with
"WHERE wayid = ?" ?
I'm not sure about combined indices and their effect if only one part
of them is given in the where-condition.

Here we do MERGE INTO, so any query is on the complete primary key.
If it chose any index other than the one on the primary key, that
could be a possible reason.

Marcus

Thomas Mueller

Dec 22, 2009, 2:06:28 PM

to h2-da...@googlegroups.com

Hi,

> 2GB

> 1.5GB and 512MB for memory-mapped IO
> -Xmx1500M -XX:MaxDirectMemorySize=512M

With 2 GB memory, those two settings could cause the operating system
to 'thrash' because you are very close to the physical memory. Try
using -Xmx1024m or -Xmx512m instead.
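To make the "very close to the physical memory" concern concrete, here is the worst-case arithmetic as a tiny sketch. It assumes 2 GB means 2048 MB, treats both flags as fully used (they are upper limits, not actual usage), and ignores what the operating system and the JVM itself need. `MemoryBudget` is a hypothetical name.

```java
/** Hypothetical helper: worst-case headroom if heap and direct memory limits are both reached. */
final class MemoryBudget {

    static long headroomMb(long physicalMb, long maxHeapMb, long maxDirectMb) {
        return physicalMb - (maxHeapMb + maxDirectMb);
    }

    public static void main(String[] args) {
        // -Xmx1500M -XX:MaxDirectMemorySize=512M on a machine with 2 GB of RAM
        System.out.println(headroomMb(2048, 1500, 512) + " MB left");  // 36 MB left
        // the suggested -Xmx1024m with the same direct memory limit
        System.out.println(headroomMb(2048, 1024, 512) + " MB left");  // 512 MB left
    }
}
```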

Why exactly do you use -XX:MaxDirectMemorySize=512M? Do you use the nio:
file system? If yes, please try without it, as it could be the problem. Or
do you use memory mapped files in another area of your code? If yes,
are you sure this does not cause (performance) problems? In my
experience, memory mapped files can have the reverse effect of what one
would expect.

> Java 1.6.0_10

> That release is over a year old. I definitely recommend upgrading. There have been substantial improvements in the "minor" updates.

While upgrading is a good idea (for security reasons) it will have
little or no effect to performance.

> stmt.execute("SET CACHE_SIZE 393216");
> That would boost your cache size from 16MB (default) to 384MB.

That's a good idea. It should improve performance (maybe 15% or so).

> Correct me if I'm wrong but I think index waynodes does not help here, because you have already an index (primary key) with first field wayid.

That's true, index waynodes doesn't help. It will only slow down
performance (maybe 10% or so).

> Does a combined index on (wayid, index) speed up queries with "WHERE wayid = ?" ?

Yes.
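The reason is that an index ordered by (wayid, index) keeps all entries of one way next to each other, so a lookup by wayid alone becomes a short range scan. The sketch below illustrates this with a sorted map from the Java standard library. It is an analogy only, not how H2 implements its B-tree, and `CompositeIndexSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Analogy only: a sorted map keyed on {wayid, index}, standing in for the primary-key index. */
final class CompositeIndexSketch {
    // Keys are ordered by wayid first and by index second, like PRIMARY KEY (wayid, index).
    private static final Comparator<long[]> BY_WAY_THEN_INDEX =
            Comparator.<long[]>comparingLong(k -> k[0]).thenComparingLong(k -> k[1]);

    private final NavigableMap<long[], Long> entries = new TreeMap<>(BY_WAY_THEN_INDEX);

    void put(long wayId, long position, long nodeId) {
        entries.put(new long[] {wayId, position}, nodeId);
    }

    /** "WHERE wayid = ?": a range scan over the contiguous run of keys sharing that wayid. */
    List<Long> nodesOfWay(long wayId) {
        long[] from = {wayId, Long.MIN_VALUE};
        long[] to = {wayId, Long.MAX_VALUE};
        return new ArrayList<>(entries.subMap(from, true, to, true).values());
    }

    public static void main(String[] args) {
        CompositeIndexSketch index = new CompositeIndexSketch();
        index.put(1, 0, 10);
        index.put(2, 0, 20);
        index.put(1, 1, 11);
        index.put(1, 2, 12);
        System.out.println(index.nodesOfWay(1)); // [10, 11, 12]
    }
}
```

The same does not hold for a lookup by the second column alone: entries with the same index value are scattered across all ways, so the ordering does not help.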

> Any profiler you can recommend?

A very simple Java profiler is built in (experimental). To use it, use
the following template:

import org.h2.util.Profiler;
Profiler prof = new Profiler();
prof.startCollecting();
// .... some long running process, should be at least 10 seconds
prof.stopCollecting();
System.out.println(prof.getTop(3));

Regards,
Thomas

Marcus Wolschon

Dec 22, 2009, 2:29:49 PM

to h2-da...@googlegroups.com

On 2009-12-22, Thomas Mueller <thomas.to...@gmail.com> wrote:
>> 2GB
>> 1.5GB and 512MB for memory-mapped IO
>> -Xmx1500M -XX:MaxDirectMemorySize=512M
>
> With 2 GB memory, those two settings could cause the operating system
> to 'trash' because you are very close to the physical memory. Try
> using -Xmx1024m or -Xmx512m instead.

You DO know that memory-space for memory-mapped IO is not physical memory
but address-space?
The largest java-process here uses just 370MB of memory.space and 360KB of
swap-space. with 6-22% of cpu. (Mostly below 10%)
That process has constant disk-IO, reading and writing but not terribly much.
So strangely it is neither CPU-bound nor IO-bound.
The SDD is mostly still idling as is the CPU.

> Why exactly do use -XX:MaxDirectMemorySize=512M? Do you use the nio:
> file system? If yes please try without - it could be the problem. Or
> do you use memory mapped files in another area of your code?

I do often but not in this configuration. This is for the OsmBinDataSet
that is replaced by the H2DataSet. Again, it is a maximum and hardly used
at all.

> If yes,
> are you sure this does not cause (performance) problems? In my
> experience, memory mapped files can have the reverse effect than one
> would expect.

Where it was used, heavy profiling showed a speedup by a factor
of 5-10 against conventional block-IO with RandomAccessFiles
and a consistent drop in performance the moment the file grew too large
to be mapped. What was very expensive was creating such a mapping,
and thus size changes.

>> stmt.execute("SET CACHE_SIZE 393216");
>> That would boost your cache size from 16MB (default) to 384MB.
>
> That's a good idea. It should improve performance (maybe 15% or so).

As I said, I'll try larger cache-sizes. Maybe I'll implement a way to
make the cache-sizes of the different databases that are used at
the same time configurable.

Still, I need to know if that cache-setting is global for H2 or
individual and accumulating for each database opened.

>> Does a combined index on (wayid, index) speed up queries with "WHERE
>> wayid = ?" ?
>
> Yes.

Thanks.
Then I´ll remove that and compare performance one of these days. :)

>> Any profiler you can recommend?
>
> A very simple Java profiler is built in (experimental). To use it, use
> the following template:
>
> import org.h2.util.Profiler;
> Profiler prof = new Profiler();
> prof.startCollecting();
> // .... some long running process, should be at least 10 seconds
> prof.stopCollecting();
> System.out.println(prof.getTop(3));

Will this be interfered with if other H2 databases
are accessed at the same time?
Do I have to prevent multiple such profilers
from being instantiated?
I guess with a simple singleton this will be easy.

Marcus

Marcus Wolschon

Dec 23, 2009, 12:12:38 PM

to h2-da...@googlegroups.com

On 2009-12-22, Thomas Mueller <thomas.to...@gmail.com> wrote:

> A very simple Java profiler is built in (experimental). To use it, use
> the following template:
>
> import org.h2.util.Profiler;
> Profiler prof = new Profiler();
> prof.startCollecting();
> // .... some long running process, should be at least 10 seconds
> prof.stopCollecting();
> System.out.println(prof.getTop(3));

The import was done after 3 days and now I was able to run a smaller
test-case with this profiler (only 67'000 ways instead of 545'000).
The test removes the 3 smaller databases with lower-resolution maps
from the equation but does everything else like the full program does.
I have not touched the cache-settings yet, as I first wanted repeatable
times for the unmodified code.
Chuck Remes seems to also be trying to do some tests to find out
the reasons for this poor performance.

I'm not sure how to interpret the output.

Marcus

====================

Benchmark starting. Testdata=D:\Data\downloads\osm\hamburg.osm.bz2
...
23.12.2009 18:09:15
org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink
complete
INFO: Imported new map-data in:
337601 nodes in 543609ms = 0.6210364434731581 nodes/ms
67354 ways in 740141ms = 0.09100157942878452 ways/ms
835 relations in 23406ms = 0.035674613347005044 relations/ms
sum 1307156ms

Saving 831/835 relations to the database...
Profiler: top stack trace(s) [build-126]
1256/21861
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.h2.store.FileStore.write(FileStore.java:335)
at org.h2.store.PageStore.writePage(PageStore.java:1007)
at org.h2.store.PageStreamData.write(PageStreamData.java:110)
at org.h2.store.PageOutputStream.storePage(PageOutputStream.java:153)
at org.h2.store.PageOutputStream.write(PageOutputStream.java:132)
at org.h2.store.PageLog.write(PageLog.java:492)
at org.h2.store.PageLog.addUndo(PageLog.java:468)
at org.h2.store.PageStore.logUndo(PageStore.java:745)
at org.h2.index.PageBtreeLeaf.addRow(PageBtreeLeaf.java:128)
at org.h2.index.PageBtreeLeaf.addRowTry(PageBtreeLeaf.java:94)
at org.h2.index.PageBtreeNode.addRowTry(PageBtreeNode.java:202)
at org.h2.index.PageBtreeIndex.addRow(PageBtreeIndex.java:89)
at org.h2.index.PageBtreeIndex.add(PageBtreeIndex.java:80)
at org.h2.table.TableData.addRow(TableData.java:130)
1161/21861
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.h2.store.FileStore.write(FileStore.java:335)
at org.h2.store.PageStore.writePage(PageStore.java:1007)
at org.h2.store.PageStreamData.write(PageStreamData.java:110)
at org.h2.store.PageOutputStream.storePage(PageOutputStream.java:153)
at org.h2.store.PageOutputStream.write(PageOutputStream.java:132)
at org.h2.store.PageLog.write(PageLog.java:492)
at org.h2.store.PageLog.logAddOrRemoveRow(PageLog.java:596)
at org.h2.store.PageStore.logAddOrRemoveRow(PageStore.java:1096)
at org.h2.index.PageDataIndex.addTry(PageDataIndex.java:188)
at org.h2.index.PageDataIndex.add(PageDataIndex.java:126)
at org.h2.table.TableData.addRow(TableData.java:130)
at org.h2.command.dml.Merge.merge(Merge.java:180)
at org.h2.command.dml.Merge.update(Merge.java:126)
at org.h2.command.CommandContainer.update(CommandContainer.java:71)
730/21861
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.PrintStream.write(PrintStream.java:432)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:85)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:168)
at java.io.PrintStream.write(PrintStream.java:477)
at java.io.PrintStream.print(PrintStream.java:619)
at java.io.PrintStream.println(PrintStream.java:756)
at org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink.process(LoadMapFileActionListener.java:391)
at org.openstreetmap.osmosis.core.xml.v0_6.impl.NodeElementProcessor.end(NodeElementProcessor.java:117)
at org.openstreetmap.osmosis.core.xml.v0_6.impl.OsmHandler.endElement(OsmHandler.java:107)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)

The import took us 1471735 milliseconds

Marcus Wolschon

Dec 23, 2009, 12:56:41 PM

to h2-da...@googlegroups.com

...and here are the results with 384MB cache and the redundant index removed.
I'll now test with a large data-set to see if the slowdown still happens.
I do get the impression that it is still there.

23.12.2009 18:54:27

org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink
complete
INFO: Imported new map-data in:

337601 nodes in 528469ms = 0.6388283891770378 nodes/ms
67354 ways in 631250ms = 0.10669940594059406 ways/ms
835 relations in 33843ms = 0.024672753597494313 relations/ms
sum 1193562ms

Profiler: top stack trace(s) [build-126]
1054/21873
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.h2.store.FileStore.write(FileStore.java:335)
at org.h2.store.PageStore.writePage(PageStore.java:1007)
at org.h2.index.PageBtreeLeaf.write(PageBtreeLeaf.java:246)
at org.h2.store.PageStore.writeBack(PageStore.java:720)
at org.h2.store.PageStore.writeBack(PageStore.java:325)
at org.h2.store.PageStore.checkpoint(PageStore.java:341)
at org.h2.store.PageStore.commit(PageStore.java:1111)
at org.h2.log.LogSystem.commit(LogSystem.java:473)
at org.h2.engine.Session.commit(Session.java:453)
at org.h2.command.Command.stop(Command.java:168)
at org.h2.command.Command.executeUpdate(Command.java:245)
at org.h2.jdbc.JdbcPreparedStatement.execute(JdbcPreparedStatement.java:181)
at org.openstreetmap.osm.data.h2.H2DataSet.addNode(H2DataSet.java:947)
at org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink.process(LoadMapFileActionListener.java:405)
907/21873
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.h2.store.FileStore.write(FileStore.java:335)
at org.h2.store.PageStore.writePage(PageStore.java:1007)
at org.h2.store.PageStreamData.write(PageStreamData.java:110)
at org.h2.store.PageOutputStream.storePage(PageOutputStream.java:153)
at org.h2.store.PageOutputStream.write(PageOutputStream.java:132)
at org.h2.store.PageLog.write(PageLog.java:492)
at org.h2.store.PageLog.logAddOrRemoveRow(PageLog.java:596)
at org.h2.store.PageStore.logAddOrRemoveRow(PageStore.java:1096)
at org.h2.index.PageDataIndex.addTry(PageDataIndex.java:188)
at org.h2.index.PageDataIndex.add(PageDataIndex.java:126)
at org.h2.table.TableData.addRow(TableData.java:130)
at org.h2.command.dml.Merge.merge(Merge.java:180)
at org.h2.command.dml.Merge.update(Merge.java:126)
at org.h2.command.CommandContainer.update(CommandContainer.java:71)
812/21873
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.h2.store.FileStore.write(FileStore.java:335)
at org.h2.store.PageStore.checkpoint(PageStore.java:356)
at org.h2.store.PageStore.commit(PageStore.java:1111)
at org.h2.log.LogSystem.commit(LogSystem.java:473)
at org.h2.engine.Session.commit(Session.java:453)
at org.h2.command.Command.stop(Command.java:168)
at org.h2.command.Command.executeUpdate(Command.java:245)
at org.h2.jdbc.JdbcPreparedStatement.execute(JdbcPreparedStatement.java:181)
at org.openstreetmap.osm.data.h2.H2DataSet.addWay(H2DataSet.java:1064)
at org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink.process(LoadMapFileActionListener.java:449)
at org.openstreetmap.osmosis.core.xml.v0_6.impl.WayElementProcessor.end(WayElementProcessor.java:116)
at org.openstreetmap.osmosis.core.xml.v0_6.impl.OsmHandler.endElement(OsmHandler.java:107)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1774)

The import took us 1350422 milliseconds
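
For orientation, the timings of this run can be set against those of the earlier, unmodified run (740141 ms for the ways, 543609 ms for the nodes, 1471735 ms in total). The sketch below just does that arithmetic. `RunComparison` is a hypothetical name, and since each configuration was run only once, the percentages are indicative at best.

```java
import java.util.Locale;

/** Hypothetical helper: relative time saved between two benchmark runs. */
final class RunComparison {

    /** Percentage of time saved; negative if the second run was slower. */
    static double percentFaster(long beforeMs, long afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    public static void main(String[] args) {
        // before = unmodified code, after = 384MB cache and redundant index removed
        System.out.printf(Locale.ROOT, "ways:  %.1f%%%n", percentFaster(740141, 631250));
        System.out.printf(Locale.ROOT, "nodes: %.1f%%%n", percentFaster(543609, 528469));
        System.out.printf(Locale.ROOT, "total: %.1f%%%n", percentFaster(1471735, 1350422));
    }
}
```

With these inputs the ways phase comes out roughly 15% faster and the whole import roughly 8% faster, which is in the range of a constant-factor gain and says nothing yet about whether the progressive slowdown on large data-sets is gone.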

Thomas Mueller

Dec 27, 2009, 10:48:22 AM

to h2-da...@googlegroups.com

Hi,

>>> 2GB
>>> 1.5GB and 512MB for memory-mapped IO
>>> -Xmx1500M -XX:MaxDirectMemorySize=512M
>> With 2 GB memory, those two settings could cause the operating system
>> to 'trash' because you are very close to the physical memory. Try
>> using -Xmx1024m or -Xmx512m instead.
> You DO know that memory-space for memory-mapped IO is not physical memory
> but address-space?

No, I didn't find that anywhere. Could you provide a link where this
is documented? You sound like you already know that this is not the
problem, but I wouldn't be so sure yet. Do you start multiple JVMs
with -Xmx1500M, or just one? In any case, I wouldn't use -Xmx1500m if
you only have 2 GB of physical memory.

> The largest java-process here uses just 370MB of memory.space and 360KB of
> swap-space.

How did you measure that?

> The SDD is mostly still idling as is the CPU.

I guess you mean SSD? Good to know. I didn't do any tests with SSDs so
far, so I don't know how this affects performance.

> Where it was used heavy profiling showed a speedup on a factor
> of 5-10 against conventional block-IO with RandoAccessFiles
> and a consistent drop in performance the moment the file grew too large
> to be mapped.

OK if you are not using any memory mapped files then I guess it's not
a problem. But why do you use -XX:MaxDirectMemorySize=512M then?

> Still, I need to know if that cache-setting is globaly for H2 or
> individual and accumulating for each database opened.

It's for one database. See
http://www.h2database.com/html/grammar.html#set_cache_size - I will
improve the documentation.

> Will this get interfered with if other H2-databases
> are accessed at the same time?

Yes, it will profile the current JVM (all threads).

> Do I have to prevent multiple such profilers
> from being instanciated?

Yes.

> I guess with a simple singleton this will be easy.

I will think about that. Currently it's not a singleton.

The stack traces you provided mainly show the expected 'writing to the
database files' traces. A bit strange is the 730/21861 at
org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink.process(LoadMapFileActionListener.java:391)
- it seems to be a System.out that uses a lot of time.
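
To put such a header into perspective, it can be read as a share of all samples. This assumes that "730/21861" means "this stack trace was seen in 730 of 21861 samples", which is an interpretation of the output format and is not confirmed anywhere in this thread. `SampleShare` is a hypothetical name.

```java
/** Hypothetical helper: turns a "hits/total" header from the profiler output into a percentage. */
final class SampleShare {

    static double percent(String header) {
        String[] parts = header.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected hits/total but got: " + header);
        }
        long hits = Long.parseLong(parts[0]);
        long total = Long.parseLong(parts[1]);
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive: " + header);
        }
        return 100.0 * hits / total;
    }

    public static void main(String[] args) {
        // The three headers of the first profiler output in this thread.
        for (String header : new String[] {"1256/21861", "1161/21861", "730/21861"}) {
            System.out.println(header + " = " + percent(header) + " %");
        }
    }
}
```

Under that reading the System.out trace accounts for a bit over 3% of the samples and the three printed traces together for about 14%, so most of the run time lies in stack traces that `getTop(3)` does not show.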

Regards,
Thomas

Marcus Wolschon

Dec 27, 2009, 3:33:52 PM

to h2-da...@googlegroups.com

On Sun, Dec 27, 2009 at 4:48 PM, Thomas Mueller
<thomas.to...@gmail.com> wrote:
> Hi,
>
>>>> 2GB
>>>> 1.5GB and 512MB for memory-mapped IO
>>>> -Xmx1500M -XX:MaxDirectMemorySize=512M
>>> With 2 GB memory, those two settings could cause the operating system
>>> to 'trash' because you are very close to the physical memory. Try
>>> using -Xmx1024m or -Xmx512m instead.
>> You DO know that memory-space for memory-mapped IO is not physical memory
>> but address-space?
>
> No, I didn't find that anywhere. Could you provide a link where this
> is documented? You sound like you already know that this is not the
> problem, but I wouldn't be so sure yet. Do you start multiple JVMs
> with -Xmx1500M, or just one? In any case, I wouldn't use -Xmx1500m if
> you only have 2 GB of physical memory.

I tried with and without. Since not much memory-mapping is done
in that code-path at all, it made no difference.

I was able to reduce the slowdown by repairing a broken
caching function I came across in the application code.

I'll do more profiling and testing when I'm back from vacation.
I'm currently at the 26C3 conference and am busy with talks
and workshops.

>> The largest java-process here uses just 370MB of memory.space and 360KB of
>> swap-space.
>
> How did you measure that?

Windows just tells you. ;)

>> The SDD is mostly still idling as is the CPU.
>
> I guess you mean SSD? Good to know. I didn't do any tests with SSDs so
> far, so I don't know how this affects performance.

I'll do more tests with hard disks after the vacation to compare.

>> Do I have to prevent multiple such profilers
>> from being instanciated?
>
> Yes.
>
>> I guess with a simple singleton this will be easy.
>
> I will think about that. Currently it's not a singleton.

I meant a singleton on my side to prevent multiple
profilers from being instantiated.
Anyway, I'm using real profilers now to have a deeper look.
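
Such a guard on the application side could look roughly like the sketch below. `ProfilerGuard` is a hypothetical class, and the actual profiler calls are left as comments because they are the ones from the template quoted earlier in this thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard: lets at most one profiling run be active in this JVM at a time. */
final class ProfilerGuard {
    private static final AtomicBoolean RUNNING = new AtomicBoolean(false);

    private ProfilerGuard() {
    }

    /** Returns true if the caller may start profiling, false if another run is already active. */
    static boolean tryStart() {
        return RUNNING.compareAndSet(false, true);
    }

    /** Marks the current run as finished so that a later caller can start one. */
    static void stop() {
        RUNNING.set(false);
    }

    public static void main(String[] args) {
        if (ProfilerGuard.tryStart()) {
            try {
                // prof.startCollecting(); ... run the import ... prof.stopCollecting();
                System.out.println("profiling this run");
            } finally {
                ProfilerGuard.stop();
            }
        } else {
            System.out.println("another profiling run is active, skipping");
        }
    }
}
```

An atomic flag is used rather than a plain boolean so that two imports starting on different threads cannot both pass the check.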

> The stack traces you provided mainly show the expected 'writing to the
> database files' traces. A bit strange is the 730/21861 at
> org.openstreetmap.travelingsalesman.actions.LoadMapFileActionListener$AddToMapSink.process(LoadMapFileActionListener.java:391)
> - it seems to be a System.out that uses a lot of time.

Yes, I'm looking into that.
It's the logging of the test-case and it's in another thread.

Marcus
