Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

MySQL to Oracle schema conversion utility

27 views

Skip to first unread message

Tim Bunce

unread,

Aug 6, 2001, 9:50:25 AM8/6/01

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=us-ascii

Here's a MySQL to Oracle schema conversion utility that I've written.
It's rather more functional than the m2o one posted recently.

Tim.

p.s. I could have implemented the SQL 'create table' parser using
Parse::YAPP and the yacc syntax in the mysql source code but I wanted
to use this as an excuse to play with Parse::RecDescent.

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=mysql2ora

#!/home/value/apps/perl -w

=head1 NAME

mysql2ora - Convert MySQL schema into Oracle schema (including triggers etc)

=head1 SYNOPSIS

mysql2ora [-options] mysql_schema.sql > oracle_schema.sql

=head1 DESCRIPTION

Converts MySQL schema into Oracle schema, including emulation of some
MySQL features...

Handles the emulation of an auto_increment field by defining a
sequence and a trigger that uses the sequence to automatically set the
auto_increment field. (Has an option to specify the initial value of
the sequence.)

Handles the emulation of a timestamp field by defining a
and a trigger that automatically sets the timestamp field.

Automatically detects all Oracle reserved words for both table and
field names, and supports multiple methods to deal with them.

Automatically detects long names (>30) characters and truncates them
intelligently to ensure that they remain unique. (It applies the same
techniques to the names of automatically generated sequences and triggers.)

Automatically generates CHECK clauses for enumerated types (and sets
the length of the field to the length of the longest valid value).

For SET types it automatically sets the length of the field to the sum
of the length of the valid values allowing for a comma delimiter
between each. (It doesn't generate a CHECK clause for SET types. That'll
be a little task for someone.)

For partial-field indices it has some huristics to do reasonable things
in some cases. (But typically you'll need to reconsider those for Oracle anyway.)

=head2 OPTIONS

=over 4

=item B<--only>

Ignore any tables with names that don't match the specified regular expression.

=item B<--drop>

Add DROP statements into the generated SQL before each new schema
object is created.

=item B<--autoinc>

Specify the start value of all generated auto_increment sequences. Defaults to 1.

If you are migrating existing data then this should be greater than the
highest value in any auto_increment field in your database.

=item B<--unreserve>

Specify an sprintf() format string to use to I<unreserve> a reserved word.

The default is 'C<"%s">', in other words, don't change the name just
put double quotes around it to make it acceptable to Oracle. If you
use this approach then you'll need to quote all references to the field
in your SQL statements. If you run your mysqld in ANSI mode then you
can use double quotes for both MySQL and Oracle SQL and thus use the
same SQL statements for both.

Another common alternative is 'C<%s_>', which just appends an
underscore character to the name.

=back

=head2 LIMITATIONS

The parser has been tested on the output of C<mysqldump -d> and not on
hand-written SQL. So it's possible that some legal SQL will not parse
correctly.

Also, it has only been tested on the schemas that I have available to
me. It's quite probable that for some less common types it doesn't do
the right thing.

The parser error handling isn't good. I know that Parse::RecDescent can
generate good errors but I don't seem to be using it the right way to
get them.

=head2 TO DO (by someone)

Add optional support for the auto_increment trigger storing the last value
into a session global to simplify emulation of last_insert_id.

CHECK clause for SET types.

Handle embedded quotes etc in default values etc.

Query mysql db to set initial start value of auto_increment sequence

=head2 AUTHOR

=cut

use strict;

use Carp;
use Parse::RecDescent;
use Data::Dumper;
use Getopt::Long;

use vars qw(%VARIABLE %opt);

%opt = (
maxname => 30, # oracle names must be <=30 chars
d => 0,
quiet => 0,
autoinc => 1,
unreserve => '"%s"',
);
GetOptions(\%opt,
'd!',
'quiet!',
'drop!',
'prefix!',
'only=s',
'unreserve=s',
'autoinc=s',
'maxname=i',
) or die;

# Enable warnings within the Parse::RecDescent module.
$::RD_ERRORS = 1; # Make sure the parser dies when it encounters an error
$::RD_WARN = 1; # Enable warnings. This will warn on unused rules &c.
$::RD_HINT = 1; # Give out hints to help fix problems.
$::RD_TRACE = $opt{d}?90:undef; # Trace behaviour to help fix problems.

my $grammar = <<'_EOGRAMMAR_';

{ my @col_defn;
my @key_defn;
my $schema_obj_name = '(initial)';
}

VALUE : /[-+]?\.?\d+(?:[eE]\d+)?/
{ $item[1] }
| /'.*?'/ # XXX doesn't handle embedded quotes
{ $item[1] }
| /NULL/
{ 'NULL' }

NAME : "`" /\w+/ "`"
{ $item[2] }
| /\w+/
{ $item[1] }

parens_value_list : '(' VALUE(s /,/) ')'
{ $item[2] }

parens_name_list : '(' NAME(s /,/) ')'
{ $item[2] }

name_with_opt_paren : NAME parens_value_list(s?)
{ $item[2][0] ? "$item[1]($item[2][0][0])" : $item[1] }

index_field_list : '(' name_with_opt_paren(s /,/) ')'
{ $item[2] }

# --- specific productions

table_options : /[^;]*/
{ $item[1] }

type_qualifier : /(BINARY|UNSIGNED|ZEROFILL)/i
{ uc $item[1] }

type_spec : /\w+/ parens_value_list(?) type_qualifier(s?)
{ { type_name=>$item[1], list=>$item[2][0], type_qual=>$item[3] } }

default_value : /DEFAULT/i VALUE
{ $item[2] }

auto_increment : /AUTO_INCREMENT/i
{ 1 }

primary_key : /PRIMARY KEY/i
{ 1 }

primary_key_definition : /PRIMARY KEY/i <commit> parens_name_list
{
push @key_defn, {
type => uc $item[1],
name=>'PRIMARY KEY',
fields=>$item[-1],
}
}

index_definition : /(KEY|INDEX)|(UNIQUE|FULLTEXT)(\s+(KEY|INDEX))?/i <commit> NAME(?) index_field_list
{
push @key_defn, {
type => uc $item[1],
name => $item[3][0],
name_ur => main::unreserve($item[3][0]||'', $schema_obj_name),
fields => $item[-1],
}
}

column_definition : NAME <commit> type_spec /((NOT\s+)?NULL)?/i default_value(?) /((NOT\s+)?NULL)?/i auto_increment(?) primary_key(?)
{
push @col_defn, {
name => $item[1],
name_ur => main::unreserve($item[1]||'', $schema_obj_name),
type => $item[3],
nullable=> $item[4] || $item[6],
default => $item[5][0],
is_pk => $item[8][0],
is_auto_increment => $item[7][0],
}
}

create_definition : primary_key_definition | index_definition | column_definition

schema_obj_name : NAME
{
$schema_obj_name = $item[1];
warn "\u$arg[0] $schema_obj_name...\n" unless $::opt{quiet};
@col_defn = ();
@key_defn = ();
$item[1];
}

create_table : /CREATE TABLE/i <commit> schema_obj_name['table'] '(' create_definition(s /,/) ')' table_options ';'
{ main::create_table($item[3], \@col_defn, \@key_defn, $item[-2]) }

comment : /#[^\n]*/
{
($item[1] =~ m/^#\s*$/) ? print "/* # */\n" : print "/* $item[1]\t*/\n";
}

statement: comment(s?) create_table
| <error:near "[$text]">

eofile: /^\Z/

startrule: statement(s?) eofile

_EOGRAMMAR_

# Oracle reserved words from:
# http://technet.oracle.com/docs/products/oracle8i/doc_library/817_doc/server.817/a85397/ap_keywd.htm
my @ora_reserved = qw(
ACCESS ADD ALL ALTER AND ANY AS ASC AUDIT BETWEEN BY
CHAR CHECK CLUSTER COLUMN COMMENT COMPRESS CONNECT CREATE CURRENT
DATE DECIMAL DEFAULT DELETE DESC DISTINCT DROP
ELSE EXCLUSIVE EXISTS
FILE FLOAT FOR FROM
GRANT GROUP HAVING
IDENTIFIED IMMEDIATE IN INCREMENT INDEX INITIAL INSERT
INTEGER INTERSECT INTO IS
LEVEL LIKE LOCK LONG
MAXEXTENTS MINUS MLSLABEL MODE MODIFY
NOAUDIT NOCOMPRESS NOT NOWAIT NULL NUMBER
OF OFFLINE ON ONLINE OPTION OR ORDER
PCTFREE PRIOR PRIVILEGES PUBLIC
RAW RENAME RESOURCE REVOKE ROW ROWID ROWNUM ROWS
SELECT SESSION SET SHARE SIZE SMALLINT START
SUCCESSFUL SYNONYM SYSDATE
TABLE THEN TO TRIGGER
UID UNION UNIQUE UPDATE USER
VALIDATE VALUES VARCHAR VARCHAR2 VIEW
WHENEVER WHERE WITH
);
my %ora_reserved = map { $_ => 1 } @ora_reserved;

my %global_names;
my %unreserve;
my %truncated;

my $parser = Parse::RecDescent->new($grammar);

my $example = q{
CREATE TABLE check (
id int(7) unsigned zerofill NOT NULL default '0000000' auto_increment primary key,
successful date NOT NULL default '0000-00-00',
unsuccessful date default '0000-00-00',
i1 int(11) default '0' not null,
s1 set('a','b','c') default 'b',
e1 enum('a','b','c') default 'c',
name varchar(30) default NULL,
foo_type enum('vk','ck') NOT NULL default 'vk',
date timestamp,
time_stamp2 timestamp,
KEY (i1),
UNIQUE (date, i1),
KEY date_idx (date),
KEY name_idx (name(10))
) TYPE=MyISAM PACK_KEYS=1;
};

my $mysql_schema = $example;
my $mysql_schema_name = 'example';
if (@ARGV) {
$mysql_schema_name = $ARGV[0];
warn "Reading mysql schema from $mysql_schema_name\n";
local $/;
open FH, "<$ARGV[0]" or die "Can't open $mysql_schema_name: $!\n";
$mysql_schema = <FH>;
}

print "/* Oracle schema generated by $0 on ".localtime()." */\n";
print "\n";
print "/* We assume that default NLS_DATE_FORMAT has been changed */\n";
print "/* but we set it here anyway to be self-consistent. */\n";
print "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';\n";
print "\n";

my $result = $parser->startrule($mysql_schema);

print Dumper($result) if $opt{d};

if (%truncated) {
warn "Truncated ".keys(%truncated)." names:\n";
warn "\t".join("\n\t", sort keys %truncated)."\n";
}
if (%unreserve) {
warn "Encounted ".keys(%unreserve)." unsafe names in $mysql_schema_name (reserved or invalid names):\n";
warn "\t".join("\n\t", sort keys %unreserve)."\n";
}

die "Parse failed.\n" unless defined $result;
exit 0;

# ------

sub unreserve {
my ($name, $schema_obj_name) = @_;
my ($suffix) = ($name =~ s/(\W.*)$//) ? $1 : '';
return $_[0] if !$ora_reserved{uc $name}
&& $name =~ /^[a-z]/i; # also trap fields that don't beging with a letter
$schema_obj_name ? ++$unreserve{"$schema_obj_name.$name"} : ++$unreserve{"$name (table name)"};
my $unreserve = sprintf $opt{unreserve}, $name; # eg '%s_' or '"%s"' etc
return $unreserve.$suffix;
}

sub mk_name {
my ($basename, $type, $scope, $critical) = @_;
my $basename_orig = $basename;
my $max_name = $opt{maxname} - (length($type)+1);
$basename = substr($basename, 0, $max_name) if length($basename) > $max_name;
my $name = ($type) ? ($opt{prefix} ? "${type}_$basename" : "${basename}_$type") : $basename;
if ($basename ne $basename_orig and $critical) {
my $show_type = $type ? "+'$type'" : "";
warn "Truncating '$basename_orig'$show_type to $opt{maxname} character limit to make '$name'\n";
$truncated{$basename_orig} = $name;
}
$scope ||= \%global_names;
return $name unless $scope->{$name}++;
my $name_orig = $name;
$name .= "02";
substr($name, $opt{maxname}-3) = "00" if length($name) > $opt{maxname};
++$name while $scope->{$name};
warn "The name '$name_orig' has been changed to '$name' to make it unique\n";
return $name;
}

sub create_table {
my ($tablename, $rows, $keys, $options) = @_;

if ($opt{only} && $tablename !~ /$opt{only}/o) {
warn "Skipped $tablename\n";
return 1;
}

$tablename = mk_name($tablename,'',undef,1);
my $tablename_ur = unreserve($tablename);
print "create_table:", Dumper( [ ($tablename, $rows, $keys, $options) ] ) if $opt{d};

my %field_name_scope;
my @ora_col_defn;
foreach (@$rows) {
my $field_name = mk_name($_->{name}, '', \%field_name_scope, 1);
my $field_name_ur = (unreserve($field_name,$tablename));
my @defn = ($field_name_ur);
push @defn, mysql2ora_type($field_name_ur, $_, $tablename); # also does default clause
if ($_->{is_pk}) {
push @defn, "CONSTRAINT ".mk_name($tablename,'pk')." PRIMARY KEY";
warn "@defn can't be nullable under Oracle"
if $_->{nullable} && $_->{nullable} !~ /not/i;
}
push @ora_col_defn, join " ", @defn;
}

my $idx_name_default = "idx00";
my @ora_key_defn;
foreach my $key (@$keys) {
my $idx_fields = join ",", map { unreserve($_, $tablename) } @{ $key->{fields} };
if ($idx_fields =~ m/$\d+$/) {
my $orig = $idx_fields;
if (@{ $key->{fields} } == 1) {
# if this is the only field in the index then drop it entirely
warn "Warning: Removed partial-field index on $idx_fields of $tablename (not supported by Oracle)\n";
next;
}
if ($idx_fields =~ s/$\d+$*$//) { # make LAST field non-partial
warn "Warning: Changed index on $tablename from '$orig' to '$idx_fields' (will be more expensive)\n";
}
# if more (or other) than the last field are partial then handle it manually
warn "Warning: Partial-field index on $tablename of '$orig' will need manual fixup for Oracle\n"
if $idx_fields =~ m/\(/;
}
my $type = $key->{type}; # KEY, PRIMARY KEY, UNIQUE, UNIQUE INDEX
if ($type eq 'PRIMARY KEY') {
push @ora_col_defn, "CONSTRAINT ".mk_name($tablename,'pk')." PRIMARY KEY ($idx_fields)";
next;
}
$type =~ s/\s*(INDEX|KEY)\s*//;
my $idx_name = mk_name($tablename, $key->{name} || ++$idx_name_default);
push @ora_key_defn, "CREATE $type INDEX $idx_name ON $tablename_ur ($idx_fields);";
}

print "DROP TABLE $tablename_ur;\n" if $opt{drop};
print "CREATE TABLE $tablename_ur (\n ";
print join ",\n ", @ora_col_defn;
print "\n);\n";
print join "\n", @ora_key_defn;
print "\n\n";

foreach (@$rows) {
next unless $_->{is_auto_increment};
my $base_name = $tablename . "_". $_->{name};
my $sq_name = mk_name($base_name,'sq');
my $trig_name = mk_name($base_name,'ai');
print "DROP SEQUENCE $sq_name;\n" if $opt{drop};
print <<"EOS";
CREATE SEQUENCE $sq_name START WITH $opt{autoinc};
CREATE OR REPLACE TRIGGER $trig_name
BEFORE INSERT ON $tablename_ur
FOR EACH ROW WHEN (new.$_->{name} IS NULL OR new.$_->{name} = 0)
BEGIN
SELECT $sq_name.nextval INTO :new.$_->{name_ur} FROM dual;
END;
/
EOS
print "\n";
}

foreach (@$rows) {
next unless lc $_->{type}->{type_name} eq 'timestamp';
my $base_name = $tablename . "_". $_->{name};
my $trig_name = mk_name($base_name,'ts');
print <<"EOS";
CREATE OR REPLACE TRIGGER $trig_name
BEFORE INSERT OR UPDATE ON $tablename_ur
FOR EACH ROW WHEN (new.$_->{name_ur} IS NULL)
BEGIN
SELECT sysdate INTO :new.$_->{name_ur} FROM dual;
END;
/
EOS
print "\n";
last; # only first timestamp column is auto-updated
}

return 1;
}

sub mysql2ora_type {
my ($field_name, $col_defn, $tablename) = @_;
my $spec = $col_defn->{type} || die;
my $qual = $spec->{type_qual}; # binary/unsigned/zerofill
my $type = uc $spec->{type_name};
my $list = $spec->{list} || [];
my $commalist = join ",", @$list;
my $default = ($col_defn->{default}) ? "DEFAULT $col_defn->{default}" : "";
my $not_null = ($col_defn->{nullable} =~ /not/i) ? "NOT NULL" : "";

warn "Warning: $tablename.$col_defn->{name}: zerofill attribute ignored.\n"
if $qual =~ /zerofill/i;

if ($type eq 'ENUM') {
my $len = 0;
$len = ($len < length($_)) ? length($_) : $len for (@$list);
my $check = "CHECK ($field_name IN ($commalist))";
return "VARCHAR2($len) $default $not_null $check";
}
elsif ($type eq 'SET') {
my $len = length $commalist;
# XXX add a CHECK constraint maybe (more tricky, and slower, than enum :)
return "VARCHAR2($len) /* set $commalist */ $default $not_null";
}
elsif ($type =~ /^INT/ or $type =~ /INT$/) {
return "INTEGER $default $not_null";
}
elsif ($type =~ /^(REAL|DOUBLE|FLOAT)$/) {
return "NUMBER $default $not_null";
}
elsif ($type =~ /^(DECIMAL|NUMERIC)$/) {
return "$type($commalist) $default $not_null" if @$list;
return "$type $default $not_null";
}
elsif ($type =~ /^(DATE|TIME|DATETIME|TIMESTAMP)$/) {
if ($not_null) { $default =~ s/'0000-00-00/'0001-01-01/; }
else { $default =~ s/'0000-00-00.*?'/NULL/; }
return "DATE /* $type */ $default $not_null";
}
elsif ($type =~ /^TINY(BLOB|TEXT)$/) {
return "VARCHAR2(255) /* $type */ $default $not_null";
}
elsif ($type =~ /BLOB$/) {
return "BLOB /* $type */ $default $not_null";
}
elsif ($type =~ /TEXT$/) {
return "CLOB /* $type */ $default $not_null";
}
elsif ($type =~ /CHAR$/) { # CHAR and VARCHAR
warn "Note: $tablename.$col_defn->{name} \L$type doesn't have BINARY qualifier for case-sensitivity.\n"
unless $type =~ /binary/i;
return "$type($commalist) $default $not_null";
}
warn "Unhandled MySQL type '$type' (on $col_defn->{name} column)\n";
return "$type $default $not_null";
}

__END__

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=iso-8859-1

---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)

To request this thread, e-mail <mysql-th...@lists.mysql.com>
To unsubscribe, e-mail <mysql-unsubscribe-mysql=freebsd.csie...@lists.mysql.com>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
--/9DWx/yDrRhgMJTb--

0 new messages