Mix two different tables

71 views
Skip to first unread message

vicenrico

unread,
Jun 12, 2012, 10:39:55 AM6/12/12
to h2-da...@googlegroups.com
Hello!.
I'm a software developer who has a program to catalog movies.
I have two questions that I would like you to ask me, please.
My program is currently version 1.0.
I have a table, for example:

Table  (long id, long key, String title, String category, Date year )

I would like to import data from older versions, as could be, for instance:

TableOld ((long id, long key, String title, String year)

Of course, my tables have a lot of columns, but I think with these you can undertand what my problem is.

My questions:

I want to import old data (tableold) to my current database which is already open (with ist own tables already created).

1) What is the best way to import data from tableold( a *data.db file in my computer) to Table (that is currently opened). Should I use merge, or create linked tables?
2) In the "year" case, in my old db was a String , but I want to convert to Date when doing the import. Is there a good way to convert "on the fly", I mean, while importing?

Thanks

Simon Melcher

unread,
Jun 12, 2012, 10:58:25 AM6/12/12
to h2-da...@googlegroups.com
hi,
 
1) use a tool such as squirrel SQL and create a data script of the old table. execute it by replacing the table name.
 
2)
don't know how to handle it with the date field. but if you use a timestamp field instead of a date field you could do something like that:
INSERT INTO tablename (id,key,title,category,year) VALUES (0, 0, 'movietitle', 'moviecategory', PARSEDATETIME('2012','yyyy', 'en', 'GMT'));
look at the parsedatetime function for more information:
 


 

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To view this discussion on the web visit https://groups.google.com/d/msg/h2-database/-/4YIJwRCXW74J.
To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.

Vicente Rico

unread,
Jun 12, 2012, 12:59:47 PM6/12/12
to h2-da...@googlegroups.com

Thanks, but i would like to do it into my  programm, i mean in java code.
all this is because I want to update database to this new version and I don't know how to do it..
I suppose I can do it by using linked tables or merging them, but I don't know the entire process.

Rami Ojares

unread,
Jun 12, 2012, 1:09:56 PM6/12/12
to h2-da...@googlegroups.com
Read the data row by row from the old db and write it with appropriate
transformations to the new db.
All you need to know is jdbc and sql.

- Rami

Vicente Rico

unread,
Jun 12, 2012, 1:51:47 PM6/12/12
to h2-da...@googlegroups.com

Yes. I did this way, but I though it could have a new way because db is thousands of rows end is slow importing row by. I' ve though I could use "create table as select..." And after drop table...or use linked tables.

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database+unsubscribe@googlegroups.com.

Rami Ojares

unread,
Jun 12, 2012, 5:11:17 PM6/12/12
to h2-da...@googlegroups.com
To speed up transactions you can do all the basic tricks like disable
transaction log etc.
But usually an import is done only once when migrating from old to new.
Do you need to import on a constant basis?
Also, if you have thought of trying create table as select...
then why don't you just try it and see if it is faster?
But if you need to access 2 different databases then I don't think you
can apply "create table as select..." to that scenario.

vicenrico

unread,
Jun 12, 2012, 5:17:57 PM6/12/12
to h2-da...@googlegroups.com
Well. In fact , I would like to know if I could do the next things:

1)Open old database
2)Select tables/rows of this database and create linked tables.
3)open new database
4)Pass the data from linked tables to new tables with "create table as select"...

However, I have the new tables created, so I ask if "create table IF EXISTS as select" would be a right sentence... and this sentence gets the data from linked tables...
 

Thirion

unread,
Jun 13, 2012, 4:20:03 AM6/13/12
to h2-da...@googlegroups.com
Hi  vicenrico 
I agree with Rami that jdbc and sql is probably the best way to go.
Remember:  sometimes the long way round isn't so long.

Some suggestions:
Forget about using linked tables.  They were way too slow when I tested it. 

Use prepared statements for the read and insert.
Read the data in chunks.  Select the min/max values for the primary key/unique field of a table, then have a loop where you set those values in the select statement to get the resultset, incrementing by chunk size every loop.  Play around to get the best chunk size.  Possibly even set it differently for large tables.

You can use DatabaseMetaData to make all this generic.  Get the source connection metadata, get all tables, for each table get the columns and build your prepared statements dynamically.  
You can do the same for the insert statements.

Then use getObject and setObject of the prepared statement to get the values without caring what data type they are.  This also means you won't have to change your code if you add/remove columns.

When setting values for the insert, you can check for exceptions and handle those tables/columns differently.  eg. if table name is 'movies' and column name is 'release_date' create a Date object from the string, else use getObject.

Remember to close your resultsets and clear any temporary collections if you want to keep the memory consumption down. 

Increase the cache size of the db before you start, and reset it back to normal when you're done.

I used a list of table names to extract in a particular order so I didn't have to worry about relationship issues.  You can of course disable constraint checking, do the import, and then add the constraint checking back at the end. 

I did something similar before, and a 2.5 gig database with thousands of items in a table linked to some tables with millions of records and about 100 000 images takes about 10 minutes.  Without the images it took about 2.5 minutes.  If you only want to select a subset of the data (like only movies available for a particular store, if each store wanted a separate db) the selects would probably make it a bit slower if they're quite complex.

Hope it helps

vicenrico

unread,
Jun 13, 2012, 5:31:32 AM6/13/12
to h2-da...@googlegroups.com
Thansks for your great reply. I did this code so far. Could you have a look into it , and tell me if I'm doing something wrong?. I tell you so because for importing 4000 movies, my program needs 6 minutes, and I think is a lot...

CODE
---------------------------
Class.forName("org.h2.Driver");
Connection conn = DriverManager.getConnection("jdbc:h2:"
+ FilenameUtils.concat(GlobalSettings.TEMP_DIR, "Portable")
+ ";LOG=0;CACHE_SIZE=65536;LOCK_MODE=0;UNDO_LOG=0"
+ ";IFEXISTS=TRUE", "sa", "");
DatabaseFactory.connection.setAutoCommit(false);
String sqlCount = "SELECT * FROM Movies";
Statement statement = conn.createStatement();
ResultSet result = statement.executeQuery(sqlCount);
result.next();
while (result.next()) {
OldMovie085 om = null;
Movie movie = null;
om = new OldMovie085(conn, result);
movie = new Movie();
movie.setKey(om.getKey());
movie.setTitle(om.getTitle());
movie.setOriginalTitle(om.getOriginalTitle());
movie.setCategory(om.getCategory());
movie.setDirector(om.getDirector());
  movie.setProducer(om.getProducer());
movie.setWriter(om.getWriter());
movie.setMusic(om.getMusic());
movie.setPhotographer("");

movie.setBoxSet("");
movie.setReasonMarked("");
movie.setLanguage("");
movie.setSubtitle("");
movie.setUrlInfo1("");
movie.setUrlInfo2("");
movie.setUrlInfo3("");
movie.setMoviePath("");
movie.setTrailerPath("");
movie.setType(om.getType());
movie.setFormat(om.getFormat());
movie.setSeen(om.isSeen());
MOVIE.INSERT()

 }

conn.close();
result.close();
statement.close();


So, what I'm doing is open 2 connections (old database and new database). I keep open 2 Connection objects.
I load ALL the objects (4000 or 5000 movies), and into old movie object with getters, I do the parsing to correct format (String , Date, etc...).
I fill the new object with the getter results and when I'm done with all the fields( I have a lot more, but I've only written a couple) I call Movie.Insert() to insert movie object into new database.
I do the same with all the 4000 or 5000 movies.

But it's slow, so I would be glad to have a different opinion to optimize this code. Thanks!!

Sandeep Tamhankar

unread,
Jun 13, 2012, 12:55:48 PM6/13/12
to h2-da...@googlegroups.com
Here's another direction to look in: we use the liquibase library to perform schema updates for our application (for MySQL, SqlServer, and Oracle in our case, but they do support h2 as well). In our case, we have one database, update the schema, and there are hooks for doing data transformations along the way as well. It's not quite your use case since you have an old db and want to create a new one rather than upgrading your old one, but it may help you anyway. Not sure how its speed would compare to what you're doing.

-Sandeep

Vicente Rico

unread,
Jun 13, 2012, 5:59:25 PM6/13/12
to h2-da...@googlegroups.com
Thanks. I will try so I can decide which is the best way to accomplish what I want!!!

Thirion

unread,
Jun 14, 2012, 6:04:08 AM6/14/12
to h2-da...@googlegroups.com

Ok, I hacked apart an old example of mine and removed (hopefully) all the project specific stuff.

NOTE:  I didn't test all of it, so please forgive me if there are errors in it!

1. Open the zip file and extract where you want.
2. Open the project with NetBeans
3. Fix the reference problem by pointing it to your h2 jar file.

Its the source code for a GUI app that takes the source and destination db urls, usernames and passwords and copies the one to the other.

You can change the default location of the databases in the source, so that you only have to enter the folder name for the source/destination db if you want.  Check my comments in the source.

If you just open and run it, it will copy the source to the destination db as is.  All tables, all columns, all data, no conversions.  The destination db and tables must exist.  

Look at the source for how you can customize the copy process.
I have some variables with hard coded values, so the IF statements might not make much sense, but it's just to show you how you could import data differently depending on user selection/table names etc.

There are a few overloaded methods that do the copy per table.  
You can include a whereClause, a list of columns not to copy, the chunksize for selecting from the source table, and a Map of column names and conversion methods to convert them.

NOTE:  I did not test the column conversion, but I think it should work ;-)

So, while looping through the tables to import (we get the names dynamically) you could treat certain tables differently by adding a where clause/chunk size/conversion methods to use or whatever when calling insertData.
Look at my comments in the code.

I wrote this code, so feel free to use it as a basis for your own import app.
You'll probably want to get rid of the GUI part and handle all exceptions properly.

Let me know if it helped.
MovieCopy.zip

Vicente Rico

unread,
Jun 14, 2012, 7:01:09 PM6/14/12
to h2-da...@googlegroups.com
Thanks for the code. I think is more or less what I needed. Tomorrow I will begin with the test with your code and I expect they will be fine. I've had a look into the it, and I'm glad. There are a lot of things that I didn't know (I'm a software developer but not very good with db's   :-(  ), so I will do a deep scanning of this code, haha....
Thanks again.

The only part I can't use is , as you said, GUI, because I'm using SWT , but it's very easy to do the change.

I only have a doubt: when you say to "do the import by using chunks", what does it mean? I'm planning to have as maximum 10000 movies or so, so I think I can import complete database (select * from movies) in one pass. Do you recommend me that? or should I import chunks of 1000 movies to not spend memory?

And , of course, the million bucks question: To have the images saved (100-200 kbs per image) , do you recommend me to save path into database, or it's better to save this small images into database?. Currently I only save the paths, and do the loading on demand.

Sorry for my bad english, I'm spanish....I will try to improve it  :-D

Thirion

unread,
Jun 18, 2012, 4:44:07 AM6/18/12
to h2-da...@googlegroups.com
You're welcome.

Try to do the import without chunks first to see if you're happy with the performance and memory consumption.  I guess it's a little simpler.

If you don't care about people being able to copy all the images from the directory, I guess it's fine to store them outside of the db and just store the image names.

If you have a web app it also works better with the images outside the db so the browser can detect when the same image is used more than once (based on the URL) and just reuse the same instance.  

My client is slightly paranoid, so all images were watermarked and stored in the db, with the application's clipboard disabled for certain users.

When in doubt, create a small sample, benchmark it and then you'll know ;-)
Reply all
Reply to author
Forward
0 new messages