Let me preface this by saying that I am on my second day of evaluating jOOQ, so user error is a distinct possibility.

Environment:
PostgreSQL 9.4.4
PostgreSQL JDBC driver 9.4-1202-jdbc42
jOOQ 3.6.2
Java 1.8.31
Spring 4.1.7

I am implementing a bulk upload of records by parsing a CSV file and loading the rows into a table. I have an existing script, run through the psql command, that can load the 80K sample records in about a second using psql's "\copy" command.

I have written code using jOOQ to load these same records in a Spring service method. The code parses the CSV and loads the data into Record objects that were generated by the jOOQ code generator from a database designed in Vertabelo (very nice integration, by the way!).

My service method looks like this:

@Autowired
private DefaultDSLContext jooq;

@Transactional
public void loadActuals(Collection<JRawActualsRecord> records) {
    jooq.deleteFrom(RAW_ACTUALS).execute();
    jooq.batchInsert(records).execute();
}

The delete runs in about 0.5 seconds, but the insert of the 80k records takes almost 15 minutes! Here is the debug log:

17:48:15.353 [http-apr-8084-exec-15] DEBUG org.jooq.tools.LoggerListener - Executing query : delete from "raw_actuals"
17:48:15.366 [http-apr-8084-exec-15] DEBUG o.s.jdbc.datasource.DataSourceUtils - Fetching JDBC Connection from DataSource
17:48:15.367 [http-apr-8084-exec-15] DEBUG o.s.jdbc.datasource.DataSourceUtils - Registering transaction synchronization for JDBC Connection
17:48:15.369 [http-apr-8084-exec-15] DEBUG o.s.j.d.LazyConnectionDataSourceProxy - Connecting to database for operation 'prepareStatement'
17:48:15.773 [http-apr-8084-exec-15] DEBUG org.jooq.tools.LoggerListener - Affected row(s) : 80223
17:48:15.773 [http-apr-8084-exec-15] DEBUG org.jooq.tools.StopWatch - Query executed : Total: 465.477ms
17:48:15.774 [http-apr-8084-exec-15] DEBUG org.jooq.tools.StopWatch - Finishing : Total: 466.433ms, +0.956ms
17:48:23.317 [http-apr-8084-exec-15] DEBUG org.jooq.tools.LoggerListener - Executing batch query : insert into "raw_actuals" ("year", "month", "lr3", "lr4", "lr5", "customer_channel", "regional_product_brand", "regional_product_sub_brand", "regional_product_material_group", "trial_indicator", "net_direct_sales_amt", "net_direct_units", "team", "national_net_combined_sales_amt", "national_net_combined_revenue_units", "strategic_net_combined_sales_amt", "strategic_net_combined_revenue_units", "associate", "associate_number", "associate_desc", "sales_manager_number", "sales_manager_desc") values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
17:48:23.317 [http-apr-8084-exec-15] DEBUG o.s.j.d.LazyConnectionDataSourceProxy - Using existing database connection for operation 'prepareStatement'
18:03:08.812 [http-apr-8084-exec-15] DEBUG org.jooq.tools.StopWatch - Query executed : Total: 14:45
18:03:08.812 [http-apr-8084-exec-15] DEBUG org.jooq.tools.StopWatch - Finishing : Total: 14:45, +0.207ms
18:03:08.840 [http-apr-8084-exec-15] DEBUG o.s.jdbc.datasource.DataSourceUtils - Returning JDBC Connection to DataSource
18:03:08.906 [http-apr-8084-exec-15] DEBUG o.s.j.d.LazyConnectionDataSourceProxy - Using existing database connection for operation 'close'
18:03:08.906 [http-apr-8084-exec-15] DEBUG o.s.orm.jpa.JpaTransactionManager - Initiating transaction commit
18:03:08.906 [http-apr-8084-exec-15] DEBUG o.s.orm.jpa.JpaTransactionManager - Committing JPA transaction on EntityManager [org.hibernate.jpa.internal.EntityManagerImpl@2343b67e]
18:03:08.906 [http-apr-8084-exec-15] DEBUG o.h.e.t.spi.AbstractTransactionImpl - committing
18:03:08.906 [http-apr-8084-exec-15] DEBUG o.h.e.t.i.jdbc.JdbcTransaction - committed JDBC Connection
That's bad indeed!
First off, my immediate recommendation is to use jOOQ's loader API instead of batchInsert(). There's native support for loading CSV data.
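A minimal sketch, assuming the generated RAW_ACTUALS table from above; the file name is a placeholder, the field list is abbreviated, and execute() declares java.io.IOException:

jooq.loadInto(RAW_ACTUALS)
    // The first CSV row is skipped as a header by default;
    // adjust with ignoreRows() if your file has none.
    .loadCSV(new File("actuals.csv"))
    .fields(RAW_ACTUALS.YEAR, RAW_ACTUALS.MONTH /* , ...the remaining 20 fields */)
    .execute();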
You can play around with (see the sketch below):
1. bulk sizes (how many rows per multi-row INSERT)
2. batch sizes (how many INSERT statements per JDBC batch)
3. commit sizes (how many batches per commit)
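For example (a sketch; the numbers are arbitrary starting points, not recommendations):

jooq.loadInto(RAW_ACTUALS)
    .bulkAfter(500)      // rows per multi-row INSERT
    .batchAfter(10)      // INSERT statements per JDBC batch
    .commitAfter(100)    // commit every 100 batches
    .loadCSV(new File("actuals.csv"))
    .fields(RAW_ACTUALS.YEAR, RAW_ACTUALS.MONTH /* , ... */)
    .execute();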
Unlike the loader API, batchInsert() generates an individual SQL statement for each record and batches only those statements that are equal. If 80k records produce equal SQL statements, that's a lot of useless SQL rendering. We'll look into optimising this. However, I'd still recommend using the loader API.
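For reference, one way to guarantee equal statements with the batch API is to render the INSERT once with dummy bind values and then bind each record to it. A sketch, with the column list abbreviated and year/month assumed to be integers:

jooq.batch(
        jooq.insertInto(RAW_ACTUALS,
                RAW_ACTUALS.YEAR, RAW_ACTUALS.MONTH /* , ... */)
            // Dummy values create a single reusable statement.
            .values((Integer) null, (Integer) null /* , ... */))
    .bind(2015, 7 /* , ...one bind() per record */)
    .bind(2015, 8 /* , ... */)
    .execute();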
I hope this helps,
Lukas
I had dismissed your recommendation for bulk and batch sizes because I was confusing them with the commit size. I tried using bulkAll() but it did not make any difference.
> COPY: consider using multi-valued INSERTs if practical. You seem to be doing this already. Don't try to list too many values in a single VALUES though; those values have to fit in memory a couple of times over, so keep it to a few hundred per statement.

Then I turned debug logging back on and noticed that with bulkAll() it was still logging 80k individual debug inserts:

15:19:57.691 [http-apr-8084-exec-73] DEBUG org.jooq.tools.StopWatch - Query executed : Total: 16.3ms
15:19:57.691 [http-apr-8084-exec-73] DEBUG org.jooq.tools.StopWatch - Finishing : Total: 16.312ms, +0.011ms
15:19:57.691 [http-apr-8084-exec-73] DEBUG org.jooq.tools.LoggerListener - Executing query : insert into "raw_actuals" ("year", "month", "sales_manager_number", "sales_manager_desc", "associate", "associate_number", "associate_desc", "customer_channel", "team", "lr3", "lr4", "lr5", "regional_product_material_group", "trial_indicator", "regional_product_sub_brand", "regional_product_brand", "net_direct_sales_amt", "net_direct_units", "national_net_combined_sales_amt", "national_net_combined_revenue_units", "strategic_net_combined_sales_amt", "strategic_net_combined_revenue_units") values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
15:19:57.691 [http-apr-8084-exec-73] DEBUG org.jooq.tools.LoggerListener - -> with bind values : insert into "raw_actuals" ("year", "month", "sales_manager_number", "sales_manager_desc", "associate", "associate_number", "associate_desc", "customer_channel", "team", "lr3", "lr4", "lr5", "regional_product_material_group", "trial_indicator", "regional_product_sub_brand", "regional_product_brand", "net_direct_sales_amt", "net_direct_units", "national_net_combined_sales_amt", "national_net_combined_revenue_units", "strategic_net_combined_sales_amt", "strategic_net_combined_revenue_units") values (/snip DATA REMOVED /snip)
15:19:57.691 [http-apr-8084-exec-73] DEBUG o.s.j.d.LazyConnectionDataSourceProxy - Using existing database connection for operation 'prepareStatement'
15:19:57.701 [http-apr-8084-exec-73] DEBUG org.jooq.tools.LoggerListener - Affected row(s) : 1
I changed the call to use bulkAfter(1000), which took the load time down to 18 seconds! That's still about 35 times longer than the psql \copy command, but a perfectly usable number for our use case!
Any chance that jOOQ will support the PostgreSQL CopyManager class (see http://stackoverflow.com/questions/6958965/how-to-copy-a-data-from-file-to-postgresql-using-jdbc) in some future release? It sounds like it could bring some worthwhile speed improvements for people who need every last drop.
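In the meantime, CopyManager can be driven directly over JDBC. A sketch, assuming the underlying PgJDBC connection is reachable; the helper name, table, and CSV path are illustrative:

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.sql.Connection;
import java.sql.SQLException;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

// Stream a CSV file into raw_actuals via the COPY protocol.
static long copyActuals(Connection conn, String csvPath)
        throws SQLException, IOException {
    // Unwrap past any pool/proxy (e.g. LazyConnectionDataSourceProxy)
    // to the driver's PGConnection, which exposes the COPY API.
    CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
    try (Reader in = new FileReader(csvPath)) {
        return copy.copyIn("COPY raw_actuals FROM STDIN WITH (FORMAT csv)", in);
    }
}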
> Then I turned debug logging back on and noticed that with bulkAll() it was still logging 80k individual debug inserts.

Are you sure? bulkAll() should just generate one huge INSERT statement. Do you still have the source code to reproduce this?
You are right, I must have been running the wrong code when I ran my original test. I just retested with bulkAll(), and now I get an error, which is not surprising since it would be a single insert statement with 80,000 rows:

Caused by: java.io.IOException: Tried to send an out-of-range integer as a 2-byte value: 1764906

Seems like the JDBC request has to be under 64K :-) (80,223 rows × 22 columns = 1,764,906 bind values, far more than the wire protocol's 2-byte parameter count can carry.)
bulkAfter() is what we needed and it is working great, thanks!