plugin legacy mysql: generates web2py code to access your mysql legacy db

132 views
Skip to first unread message

selecta

unread,
Aug 26, 2010, 6:36:20 PM8/26/10
to web2py-users
Create the web2py code needed to access your mysql legacy db.

To make this work all the legacy tables you want to access need to
have an "id" field.

This plugin needs:
mysql
mysqldump
installed and globally available.

Under Windows you will probably need to add the mysql executable
directory to the PATH variable,
you will also need to modify mysql to mysql.exe and mysqldump to
mysqldump.exe.
Just guessing here :)

Access your tables with:
legacy_db(legacy_db.mytable.id>0).select()

If the script crashes this is might be due to that fact that the
data_type_map dictionary is incomplete.
Please complete it, improve it and continue.

download http://jaguar.biologie.hu-berlin.de/~fkrause/web2py.plugin.legacymysql.w2p
screenshot http://jaguar.biologie.hu-berlin.de/~fkrause/screenshot_legacymysql.png

Dalen Kruse

unread,
Aug 27, 2010, 11:12:21 AM8/27/10
to web2py-users
Excellent! I was just coming up on a personal project that requires
me to use a legacy database. I wasn't looking forward to writing the
models myself. Thanks for this plugin.

Dalen


On Aug 26, 5:36 pm, selecta <gr...@delarue-berlin.de> wrote:
> Create the web2py code needed to access your mysql legacy db.
>
> To make this work all the legacy tables you want to access need to
> have an "id" field.
>
> This plugin needs:
> mysql
> mysqldump
> installed and globally available.
>
> Under Windows you will probably need to add the mysql executable
> directory to the PATH variable,
> you will also need to modify mysql to mysql.exe and mysqldump to
> mysqldump.exe.
> Just guessing here :)
>
> Access your tables with:
> legacy_db(legacy_db.mytable.id>0).select()
>
> If the script crashes this is might be due to that fact that the
> data_type_map dictionary is incomplete.
> Please complete it, improve it and continue.
>
> downloadhttp://jaguar.biologie.hu-berlin.de/~fkrause/web2py.plugin.legacymysq...
> screenshothttp://jaguar.biologie.hu-berlin.de/~fkrause/screenshot_legacymysql.png

mr.freeze

unread,
Aug 27, 2010, 3:54:27 PM8/27/10
to web2py-users
Nice!

On Aug 26, 5:36 pm, selecta <gr...@delarue-berlin.de> wrote:
> Create the web2py code needed to access your mysql legacy db.
>
> To make this work all the legacy tables you want to access need to
> have an "id" field.
>
> This plugin needs:
> mysql
> mysqldump
> installed and globally available.
>
> Under Windows you will probably need to add the mysql executable
> directory to the PATH variable,
> you will also need to modify mysql to mysql.exe and mysqldump to
> mysqldump.exe.
> Just guessing here :)
>
> Access your tables with:
> legacy_db(legacy_db.mytable.id>0).select()
>
> If the script crashes this is might be due to that fact that the
> data_type_map dictionary is incomplete.
> Please complete it, improve it and continue.
>
> downloadhttp://jaguar.biologie.hu-berlin.de/~fkrause/web2py.plugin.legacymysq...
> screenshothttp://jaguar.biologie.hu-berlin.de/~fkrause/screenshot_legacymysql.png

mdipierro

unread,
Sep 7, 2010, 6:47:58 PM9/7/10
to web2py-users
I turned this into a script in web2py/scripts in trunk. What is the
license? I assume it is GPL or BSD but it should be stated.

Massimo

On Aug 26, 5:36 pm, selecta <gr...@delarue-berlin.de> wrote:
> Create the web2py code needed to access yourmysqllegacydb.
>
> To make this work all thelegacytables you want to access need to
> have an "id" field.
>
> Thispluginneeds:mysql
> mysqldump
> installed and globally available.
>
> Under Windows you will probably need to add themysqlexecutable
> directory to the PATH variable,
> you will also need to modifymysqltomysql.exe and mysqldump to
> mysqldump.exe.
> Just guessing here :)
>
> Access your tables with:
> legacy_db(legacy_db.mytable.id>0).select()
>
> If the script crashes this is might be due to that fact that the
> data_type_map dictionary is incomplete.
> Please complete it, improve it and continue.
>
> downloadhttp://jaguar.biologie.hu-berlin.de/~fkrause/web2py.plugin.legacymysq...
> screenshothttp://jaguar.biologie.hu-berlin.de/~fkrause/screenshot_legacymysql.png

mdipierro

unread,
Sep 7, 2010, 8:10:35 PM9/7/10
to web2py-users
If you have a mysql database running locally, please help me test it.

python script/extract_mysql_models.py username:password@database_name
> applications/myapp/models/db_from_mysql.py

Check applications/myapp/models/db_from_mysql.py. Did it work?

ron_m

unread,
Sep 8, 2010, 4:03:11 AM9/8/10
to web2py-users


On Sep 7, 5:10 pm, mdipierro <mdipie...@cs.depaul.edu> wrote:
> If you have a mysql database running locally, please help me test it.
>

First problem I ran into
Last line parameters are out of order on mysql() call
so it outputs incorrect database URL string
$ python scripts/extract_mysql_models.py user:pass@db_name
user
pass
db_name
legacy_db = DAL('mysql://pass:db_name@localhost/user')

Changed last line
print mysql(m.group(1),m.group(2),m.group(3))
to be
print mysql(m.group(3),m.group(1),m.group(2))

Next run I get a key error on the data_type_map
$ python scripts/extract_mysql_models.py user:pass@db_name
user
pass
db_name
Traceback (most recent call last):
File "scripts/extract_mysql_models.py", line 100, in <module>
print mysql(m.group(3),m.group(1),m.group(2))
File "scripts/extract_mysql_models.py", line 82, in mysql
web2py_table_code += "\n Field('%s','%s'),"%
(name,data_type_map[d_type])
KeyError: 'datetime'

Thinking I could fix this I added a line to the data_type_map last 2
lines so you get a reference point are
timestamp = 'datetime',
datetime = 'datetime',
)


Next run produces
$ python scripts/extract_mysql_models.py user:pass@db_name
user
pass
db_name
Traceback (most recent call last):
File "scripts/extract_mysql_models.py", line 101, in <module>
print mysql(m.group(3),m.group(1),m.group(2))
File "scripts/extract_mysql_models.py", line 80, in mysql
name, d_type = hit.group(1), hit.group(2)
AttributeError: 'NoneType' object has no attribute 'group'

Added a print line statement just above line 80 and it stops on the
mysqldump of the auth_event table at line
`description` longtext,

So it looks like hit is empty because there is no match on this line
(print line and flush commented out
hit = re.search('(\S+) (\S+) .*', line)
# print line
# if hit != None:
# print hit.group(1), hit.group(2)
# sys.stdout.flush()
name, d_type = hit.group(1), hit.group(2)
Added a test for hit != None and see what hit.group(1) and
hit.group(2) are they are the name and type except for this line.
Looks like longtext needs to be added to the type dictionary but not
getting there because of this failure.

Added a test for hit == None with continue to skip around this code

Next failure is in the CODE() call, global name CODE is not defined.
Looks like a missing import - I am just running from the bash shell.

I don't really understand why the regular expression filling hit fails
but then I am weak in that area with Python.

Here is the complete mysqldump CREATE TABLE stanza for auth_event
I am using mysql 5.1.41 on Ubuntu 10.04 with all patches.

CREATE TABLE `auth_event` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time_stamp` datetime DEFAULT NULL,
`client_ip` varchar(512) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`origin` varchar(512) DEFAULT NULL,
`description` longtext,
PRIMARY KEY (`id`),
KEY `user_id__idx` (`user_id`),
CONSTRAINT `auth_event_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES
`auth_user` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8;

Time to catch some ZZZs it is 1 am

Ron

mdipierro

unread,
Sep 8, 2010, 8:40:43 AM9/8/10
to web2py-users
I think I fixed some of this in trunk. Could you print form me the
line offending hit? If i see it I can fix the regular expression too.
Thanks.

ron_m

unread,
Sep 8, 2010, 11:30:40 AM9/8/10
to web2py-users

On Sep 8, 5:40 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> I think I fixed some of this in trunk. Could you print form me the
> line offending hit? If i see it I can fix the regular expression too.
> Thanks.
>

I ran the new version from trunk. The if hit!=None: line added at line
74 covers up the problem so now I get a key error on double. Added a
new key to data_type_map at line 38

datetime = 'datetime',
double = 'double',
)

Now it runs to completion and prints out a model.

The line that is causing the problem with hit is line 75 before I
added above key

name, d_type = hit.group(1), hit.group(2)

because hit is None

The listing of the auth_event table is missing the field for

`description` longtext,

because of the if hit!-None: test from this mysqldump output for the
auth_event table

CREATE TABLE `auth_event` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time_stamp` datetime DEFAULT NULL,
`client_ip` varchar(512) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`origin` varchar(512) DEFAULT NULL,
`description` longtext,
PRIMARY KEY (`id`),
KEY `user_id__idx` (`user_id`),
CONSTRAINT `auth_event_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES
`auth_user` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8;

It appears the hit = re.search('(\S+) (\S+) .*', line)
fails the test because there is nothing after the field type on the
description line but I am guessing. There is also no longtext type in
the data_type_map.

It is very close. Of course there could be other key errors because my
database might not have every data type but I could check the MySQL
manual for that.

mdipierro

unread,
Sep 8, 2010, 2:56:51 PM9/8/10
to web2py-users
I think I fixed these. One more try?

ron_m

unread,
Sep 8, 2010, 3:38:14 PM9/8/10
to web2py-users


On Sep 8, 11:56 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> I think I fixed these. One more try?
>
> On Sep 8, 10:30 am, ron_m <ron.mco...@gmail.com> wrote:

I think the re match still has a problem but it is closer.

The fault is on a key error for "longtext," which should not be trying
to match on the field type including the comma. This the same
description field of type longtext that appears in table auth_event.

`description` longtext,

$ python scripts/extract_mysql_models.py user:pass@db_name
Traceback (most recent call last):
File "scripts/extract_mysql_models.py", line 94, in <module>
print mysql(m.group(3),m.group(1),m.group(2))
File "scripts/extract_mysql_models.py", line 80, in mysql
web2py_table_code += "\n Field('%s','%s'),"%
(name,data_type_map[d_type])
KeyError: 'longtext,'

Looks like the second (\S+) also pulled in the comma after longtext

hit = re.search('(\S+)\s+(\S+)( .*)?', line)

I added a print name, d_type just before the failure line and got this
output
`id` int(11)
`time_stamp` datetime
`client_ip` varchar(512)
`user_id` int(11)
`origin` varchar(512)
`description` longtext,

ron_m

unread,
Sep 8, 2010, 4:01:42 PM9/8/10
to web2py-users


On Sep 8, 11:56 am, mdipierro <mdipie...@cs.depaul.edu> wrote:
> I think I fixed these. One more try?
>

I did a temporary add of the key "longtext," to get it to run further
and found a key error mediumtext which also is at end of line with a
comma so the regular expression parsing the line pulls out
"mediumtext," as the key. I did the same to add a key "mediumtext,"
and it ran to completion.

The MySQL data types chapter is here

http://dev.mysql.com/doc/refman/5.1/en/data-type-overview.html

So maybe tonight do you want me to go through the manual and find all
the missing datatypes and try to add them to the map? There are
possibly some types that are not supported by web2py so not sure what
to do there. The other possible problem I see is fields like boolean
that are mapped to char but in the legacy database they are still
boolean so the DB would have to be migrated to work I think if the
user plans to put the web2py server over the legacy database as is.

Ron

Ron

mdipierro

unread,
Sep 8, 2010, 4:30:20 PM9/8/10
to web2py-users

> So maybe tonight do you want me to go through the manual and find all
> the missing datatypes and try to add them to the map?

I would not stop you. ;-)

> There are
> possibly some types that are not supported by web2py so not sure what
> to do there. The other possible problem I see is fields like boolean
> that are mapped to char but in the legacy database they are still
> boolean so the DB would have to be migrated to work I think if the
> user plans to put the web2py server over the legacy database as is.

I think boolean can be mapped to CHAR and/or INTEGER. One of them may
just work.
Worth a try.

>
> Ron
>
> Ron

ron_m

unread,
Sep 9, 2010, 1:01:13 AM9/9/10
to web2py-users


On Sep 8, 1:30 pm, mdipierro <mdipie...@cs.depaul.edu> wrote:
> > So maybe tonight do you want me to go through the manual and find all
> > the missing datatypes and try to add them to the map?
>
> I would not stop you. ;-)
>

Here is a replacement data_type_map, it was shuffled a bit to put like
types together for easier maintenance. I didn't know what to do with
the YEAR type so it is commented out. I am not sure the decimal type
should be mapped to integer since the DAL accepts decimal(n,M) as a
field type.

data_type_map = dict(
varchar = 'string',
int = 'integer',
integer = 'integer',
tinyint = 'integer',
smallint = 'integer',
mediumint = 'integer',
bigint = 'integer',
float = 'double',
double = 'double',
char = 'string',
decimal = 'integer',
date = 'date',
#year = 'date',
time = 'time',
timestamp = 'datetime',
datetime = 'datetime',
binary = 'blob',
blob = 'blob',
tinyblob = 'blob',
mediumblob = 'blob',
longblob = 'blob',
text = 'text',
tinytext = 'text',
mediumtext = 'text',
longtext = 'text',
)

I also fixed the remaining problem in the line match re.search so line
75 or line 87 after above dict is changed

75c87
< hit = re.search('(\S+)\s+(\S+)( .*)?', line)
---
> hit = re.search('(\S+)\s+(\S+)(,| )( .*)?', line)

this fixes the matching on lines like

`description` longtext,

The comma immediately after the type got included as part of the type
string match in error.

Here is the output for the auth_event table

legacy_db.define_table('auth_event',
Field('id','integer'),
Field('time_stamp','datetime'),
Field('client_ip','string'),
Field('user_id','integer'),
Field('origin','string'),
Field('description','text'),
migrate=False)

The id field should not be printed since it will be added
automatically.

The varchar fields have a length in () e.g. varchar(64) which could be
used to add on a length=N value for the Field constructor. The same
holds true for int fields but I don't think there is a real use for
that.

There is no recognition of foreign keys, fixing this would probably be
a significant effort. Some human intervention required as it
stands. ;-)

It certainly is a great start to getting a model file for an existing
MySQL database.

Ron

selecta

unread,
Sep 10, 2010, 9:11:44 AM9/10/10
to web2py-users
> I turned this into a script in web2py/scripts in trunk. What is the
> license? I assume it is GPL or BSD but it should be stated.
nice
license is GPL3 or above ... actually I just use GPL all the time
because I do not have a lawyer here to tell me the differences between
the licences :)
it should also be mentioned that this was developed for
the BMBF SysMO Project Translucent [contract number 0313982A]

selecta

unread,
Sep 10, 2010, 9:15:47 AM9/10/10
to web2py-users
wow I am impressed with the work you put into that little script that
was just a byproduct for moving my legacy mysqldb to web2py :)
thank you

mdipierro

unread,
Sep 10, 2010, 10:19:34 AM9/10/10
to web2py-users
If the license is GPL3 I will not include it web2py. Can you make it
GPL2 or BSD?

GPL3 conflicts with GPL2 becuase would not allow the use of the code
as a service and that is what web2py is about.

Massimo

selecta

unread,
Sep 10, 2010, 7:02:23 PM9/10/10
to web2py-users
then GPL2 is fine with me

Johann Spies

unread,
Sep 16, 2010, 7:15:45 AM9/16/10
to web...@googlegroups.com
How do I use this script?

I gave a fairly complex and relatively large database (few hunderd
thousand records) to try and use with web2py.

When I run this script (the version that came with 1.85.2) I get:

$ python extract_mysql_models.py user:passwd@kbase


Traceback (most recent call last):

File "extract_mysql_models.py", line 106, in <module>
print mysql(m.group(3),m.group(1),m.group(2))
File "extract_mysql_models.py", line 92, in mysql


web2py_table_code += "\n Field('%s','%s'),"%(name,data_type_map[d_type])

KeyError: '1'

Regards
Johann
--
    "The fool hath said in his heart, There is no God..."
                               Psalm 14:1

ron_m

unread,
Sep 17, 2010, 8:27:16 AM9/17/10
to web2py-users
The size of the data in the database shouldn't matter, it runs mysql
to get the table list and then mysqldump on each table using the first
output.

This script needs some testing but the only way to test is pass more
databases through it. It works well on my particular DB but yours has
a new case that is a problem.

The key error comes when the regular expression at line 87 fails
resulting in the key lookup for the data type in data_type_map also
failing. Obviously '1' isn't a data type.

It would help if you added a

print line

statement after line 87 to capture the line that breaks and then a
mysqldump of the table that caused the break.

Ron

mdipierro

unread,
Sep 17, 2010, 9:24:38 AM9/17/10
to web2py-users
I see a problem here:

Field('id','integer'),

should be

Field('id','id'),

ron_m

unread,
Sep 17, 2010, 10:25:01 AM9/17/10
to web2py-users
Program mysqldump produces a line like this for an id field

`id` int(11) NOT NULL AUTO_INCREMENT,

and because there is no special case processing int becomes integer.

It looks like a possible solution would be to recognise AUTO_INCREMENT
and for that case over ride the int translation to 'integer' replacing
with 'id' instead.

I am running the database created by a set of model files through the
script to test the result, not exactly a legacy database but a round-
trip test is a good test too. In my case the id field should be
dropped because it is auto generated. That brings the question if the
field is named id and is AUTO_INCREMENT should it be dropped? If the
source is a legacy database then that might be okay because the legacy
use is consistent with the way web2py uses the id field.

Another one that would be nice is the varchar(len) could be translated
to 'string', length=len in the emitted Field constructor.

Another problem I see is a boolean in the model becomes a char(1) in
the db which becomes a string in the extract but maybe I am pushing
the envelope since my source is a model to start with. Would it be
better to recognisu a char(1) as a boolean? Possibly not.

At this point the script provides an initial cut at a model.py file
and then "some human intelligence" needs to be applied to finish the
result. Of course it would be nice to have nirvana which would be a
100% accurate and complete translation with all possible Field()
parameters filled in for a given database which requires a lot more
work. I am willing to help out but my workload is high right now. Of
course it should not crash as it does for Johann right now.

If you would like to provide comments on what you think is most
important I can help out.

Ron

DJ

unread,
Oct 6, 2010, 7:56:09 PM10/6/10
to web2py-users
The legacy database to Web2py conversion would be a great add-on. I
get the following errors when I tried this script on a database with
tables having primary key 'id' set to autoincrement.

C:\Program Files (x86)\web2py\scripts>extract_mysql_models.py
bio:bio@nrcf
Traceback (most recent call last):
File "C:\Program Files (x86)\web2py\scripts
\extract_mysql_models.py", line 106
, in <module>
print mysql(m.group(3),m.group(1),m.group(2))
File "C:\Program Files (x86)\web2py\scripts
\extract_mysql_models.py", line 82,
in mysql
table_name = re.search('CREATE TABLE .(\S+). \(',
sql_lines[0]).group(1)
AttributeError: 'NoneType' object has no attribute 'group'

Has anyone figured out how to get this script working?

Thanks,
S

Ivan Matveev

unread,
Oct 7, 2010, 11:43:30 AM10/7/10
to web...@googlegroups.com
>2010/10/7 DJ <sebastia...@gmail.com>:

> The legacy database to Web2py conversion would be a great add-on. I
> get the following errors when I tried this script on a database with
> tables having primary key 'id' set to autoincrement.
>
> C:\Program Files (x86)\web2py\scripts>extract_mysql_models.py
> bio:bio@nrcf
> Traceback (most recent call last):
>  File "C:\Program Files (x86)\web2py\scripts
> \extract_mysql_models.py", line 106
> , in <module>
>    print mysql(m.group(3),m.group(1),m.group(2))
>  File "C:\Program Files (x86)\web2py\scripts
> \extract_mysql_models.py", line 82,
>  in mysql
>    table_name = re.search('CREATE TABLE .(\S+). \(',
> sql_lines[0]).group(1)
> AttributeError: 'NoneType' object has no attribute 'group'

Looks like sql_lines[0] is empty, the script has found no lines in
output from mysqldump/mysql(they are statnalone programs that are part
of MySQL server).
Is MySQL server installed?
Can you run mysqldump and mysql programs from the directory where the
script is from command line?
Did you check if the script can access mysqldump and mysql commands?

ron_m

unread,
Oct 7, 2010, 4:59:59 PM10/7/10
to web2py-users
I have it working here and worked on it to improve the dictionary. It
needs more work in the id field area because they are converted to an
integer, also boolean fields which get represented as char(1) would
become string. My testing has been to take the database produced by my
model files and run it through this script to see what I get. This
script came from another person as a project for their own use and
Massimo asked if anyone would help test because he uses postgresql so
it has not had a large selection of databases pushed though it to
verify it works for all schemas. Basically a regular expression
pattern matcher pulls fields out of a line from mysqldump and for the
correct lines and for one of the fields does a dictionary lookup to
see what the correct web2py field type should be in order to emit the
model code.

I usually put a print name, d_type after line 89 so I can see when it
fails and go from there looking into the mysqldump file for the table
being processed. Any patches you come up with are welcome for
consideration.

Ron
Reply all
Reply to author
Forward
0 new messages