Exporting from Oracle to Mongo

2,706 views
Skip to first unread message

Rajat Hubli

unread,
Aug 9, 2011, 10:33:44 AM8/9/11
to mongodb-user
Hello Everyone,

Finally after some hours of discussion on the schema and getting it
right now i finally want to move my data from Oracle to Mongo. So i
would like to know whats the procedure and the tool to do the same? I
was looking at Toad for cloud database as a tool for the same but it
seems a little complicated (may be to me as i am using it for the
first time). So according to me these would be the steps as follows to
move the data:
1] Defining the mongo schema on paper.
2] How to create this schema in Mongo? Do i have to enter some dummy
values and create the schema or actually take on of my rows from the
Oracle table and enter it manually in Mongo and then later export data
from Oracle?
3] Export data from Oracle

In the above steps i know a lot of steps would actually be missing so
i would like to have all kind of suggestions for the same.

Thanks and Regards,
Rajat Hubli

Andreas Jung

unread,
Aug 9, 2011, 10:48:23 AM8/9/11
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rajat Hubli wrote:
> Hello Everyone,
>
> Finally after some hours of discussion on the schema and getting it
> right now i finally want to move my data from Oracle to Mongo. So i
> would like to know whats the procedure and the tool to do the same?
> I was looking at Toad for cloud database as a tool for the same but
> it seems a little complicated (may be to me as i am using it for the
> first time). So according to me these would be the steps as follows
> to move the data: 1] Defining the mongo schema on paper.

MongoDB is +schemaless+

> 2] How to create this schema in Mongo?

MongoDB is +schemaless+

> Do i have to enter some dummy values and create the schema or
> actually take on of my rows from the Oracle table and enter it
> manually in Mongo and then later export data from Oracle?

You export your data somehow from Oracle and turn it into JSON with
a structure according to your needs and then save the data into MongoDB.

Once again: MongoDB is +schemaless+ - every document can have its *own*
schema.

What is the real question?

- -aj
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQGTBAEBAgAGBQJOQUi2AAoJEADcfz7u4AZjcpQLtA1aWh3D0pCHL9WnYcqqGeEr
yq0wU7dvoFI2jPIVQfsiIiLy28BxJdrNjR4eCAIcsxZL/zooeR/4uH0MaEuF9u4Z
+eAwY8dIFGTu8WlTRkszk3OOR1iAAi+CQKYuyCWIJX0sRlr7cOtykvBKgdex+12g
F4D30Ij+I1V51Y9gGMdRHrLRvaVHwrzTbWm5jHXpJTMvPi79GuNI9ATLLXFqlyrA
rD+94ygX0R3izXZc9RugeMZVH/rYGMHgOsufhktRUUTpmCa5QP0NVuBQqeEcWCa+
C6syFjAWqk9UQXKaj11dkGB95kgLDcmGHFEpncPUhOwLFUCxCV1R7n9dgBVQVyXi
8Z+OTm8+cksNvyBEhM2mYcGnQwzAWuke0+l+3NIeleRJRJOz717kbbYzed3IoORE
s0YxHv0jtB99/PfYZbfC1CS9fgnO7gIVh6X0ac9xX5u7caFCgKcF1GOAagdYmhMp
QMG112Kib0SFoTu84qPlZgzAVDllBw==
=qhYi
-----END PGP SIGNATURE-----

lists.vcf

Nathan Ehresman

unread,
Aug 9, 2011, 12:52:56 PM8/9/11
to mongod...@googlegroups.com
We do this routinely by using Ruby. Basic steps go like this:

set Ruby-OCI prefetch to a few thousand rows at a time
select out of Oracle
for each record retrieved
populate a Ruby hash as desired for your Mongo schema
use YAJL to convert to JSON
insert into Mongo

Sometimes we'll break it up into several smaller Ruby applications. For example,
today I'm in the middle of exporting a few billion records and have one Ruby app that
extracts from Oracle and writes to gzipped files that are rotated every 4-5 million
records. Then there is a second app that we run that reads from gzipped file and
inserts into Mongo.

Total is about 400 lines of Ruby code.

Nathan Ehresman

Rajat Hubli

unread,
Aug 10, 2011, 3:42:48 AM8/10/11
to mongodb-user
Thanks Nathan for your inputs.
I would like to know if its necessary to convert to .json. Mongo says
it also accepts .csv files. And i have been hearing that its easier to
export oracle data into .csv files. I know nothing about RUBY and
thats bothering me a bit. However i would like to learn it for sure if
i really need it. Could you help me out with some sample code which
atleast explains me the algorithm of what i am exactly supposed to do
and then i could write my own program.
Thanks

Andreas Jung

unread,
Aug 10, 2011, 3:47:55 AM8/10/11
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rajat Hubli wrote:
> Thanks Nathan for your inputs. I would like to know if its necessary
> to convert to .json. Mongo says it also accepts .csv files. And i
> have been hearing that its easier to export oracle data into .csv
> files. I know nothing about RUBY and thats bothering me a bit.
> However i would like to learn it for sure if i really need it. Could
> you help me out with some sample code which atleast explains me the
> algorithm of what i am exactly supposed to do and then i could write
> my own program.

If you are developer and familar with Oracle then you should be able
to write some script or program in your favorite language fetching the
related rows/data from Oracle and converting it into either JSON
or CSV. No idea how one would imported "embedded" documents through
CSV...so JSON is the favorite destination format to be used
through mongoimport. Where is your particular problem? It should obvious
that such a migration is specific to your own needs and requirements. So
what exactly do you need?

- -aj
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQGUBAEBAgAGBQJOQjerAAoJEADcfz7u4AZjnS0Lv0XaptC+j5RHVFtfZiED0yau
nHWy4g+eabk5+Xrak2HyplCzGIdYSaronDFWrugmANlOLb7WrNm6v/dIX5xh79eO
atsjN0QLPHX8Jdrk+UPxxXJ8ioWlWl2MUMkkzCoqk8ehI1f0QwXH6CQAzPg4j2tp
0NZFZRFiDGpbyjarFBQkZdSAcyoy/2VWYVNOoZImLpcKQdTW3hqVMKc7CO65YHv5
Pkwd8QYqnO76rpcp7kYWTC8i9WtyglSP5C/ZdFeJLGdVaRJYeh6h5F3mjhNvrDvi
+C+enet5s+wESa6hBNYRLcwSWA3SFQlVY/b2eVOVCBTgq1luU4dqBzXJjJFGbilu
UTUWjWWYEN3RPc+VxnXPFZPVM3OZuo7Q9EIOUQvtViGR5fYOrucIAYwM0nkM+GWD
TeEGVKPuPkzXW0tHW5qYz42w0yOgbAx+gLKgxpiNh8DtORPC+hUfV9Z6tJzbPo86
vmh0R5Kt1mmki5dHLkuI4Q8DdsQ+WTs=
=JzDF
-----END PGP SIGNATURE-----

lists.vcf

Rajat Hubli

unread,
Aug 10, 2011, 4:42:05 AM8/10/11
to mongodb-user
Hi,
I have designed a schema in Mongo which looks like this.
Payment= {
payment_id:
name:
creditcard_details: {
last_digit:
vendor:
}
banktransfer_details: {
transfercode:
bankname:
}cashpayment_details: {
agentname:
cashcode:
}
}
So basically it looks like three documents namely creditcard,
banktransfer and cashpayment embeded inside a single document named
payment. Now for this i have 4 separate tables in Oracle and i need to
move all of this data into Mongo because i dont wanna use Oracle
anymore. I dint know a thing about Oracle export so i dont know
if .csv would be better for my scenario or .json would be better. I
just have heard that oracle could directly export it to .csv and have
also heard that mongo could import a .json. I have also seen the tool
Toad for cloud but do not exactly understand if i could use that for
my case. So i hope what i need is much clear now. Any of your thoughts
on this and ideas are welcome. Thanks a lot for the same

On Aug 10, 9:47 am, Andreas Jung <li...@zopyx.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Rajat Hubli wrote:
> > Thanks Nathan for your inputs. I would like to know if its necessary
> > to convert to .json. Mongo says it also accepts .csv files. And i
> > have been hearing that its easier to export oracle data into .csv
> > files. I know nothing about RUBY and thats bothering me a bit.
> > However i would like to learn it for sure if i really need it. Could
> > you help me out with some sample code which atleast explains me the
> > algorithm of what i am exactly supposed to do and then i could write
> > my own program.
>
> If you are developer and familar with Oracle then you should be able
> to write some script or program in your favorite language fetching the
> related rows/data from Oracle and converting it into either JSON
> or CSV. No idea how one would imported "embedded" documents through
> CSV...so JSON is the favorite destination format to be used
> through mongoimport. Where is your particular problem? It should obvious
> that such a migration is specific to your own needs and requirements. So
> what exactly do you need?
>
> - -aj
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (Darwin)
> Comment: Using GnuPG with Mozilla -http://enigmail.mozdev.org/
>
> iQGUBAEBAgAGBQJOQjerAAoJEADcfz7u4AZjnS0Lv0XaptC+j5RHVFtfZiED0yau
> nHWy4g+eabk5+Xrak2HyplCzGIdYSaronDFWrugmANlOLb7WrNm6v/dIX5xh79eO
> atsjN0QLPHX8Jdrk+UPxxXJ8ioWlWl2MUMkkzCoqk8ehI1f0QwXH6CQAzPg4j2tp
> 0NZFZRFiDGpbyjarFBQkZdSAcyoy/2VWYVNOoZImLpcKQdTW3hqVMKc7CO65YHv5
> Pkwd8QYqnO76rpcp7kYWTC8i9WtyglSP5C/ZdFeJLGdVaRJYeh6h5F3mjhNvrDvi
> +C+enet5s+wESa6hBNYRLcwSWA3SFQlVY/b2eVOVCBTgq1luU4dqBzXJjJFGbilu
> UTUWjWWYEN3RPc+VxnXPFZPVM3OZuo7Q9EIOUQvtViGR5fYOrucIAYwM0nkM+GWD
> TeEGVKPuPkzXW0tHW5qYz42w0yOgbAx+gLKgxpiNh8DtORPC+hUfV9Z6tJzbPo86
> vmh0R5Kt1mmki5dHLkuI4Q8DdsQ+WTs=
> =JzDF
> -----END PGP SIGNATURE-----
>
>  lists.vcf
> < 1KViewDownload

Andreas Jung

unread,
Aug 10, 2011, 4:53:05 AM8/10/11
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rajat Hubli wrote:
> Hi, I have designed a schema in Mongo which looks like this. Payment=
> { payment_id: name: creditcard_details: { last_digit: vendor: }
> banktransfer_details: { transfercode: bankname: }cashpayment_details:
> { agentname: cashcode: } } So basically it looks like three documents
> namely creditcard, banktransfer and cashpayment embeded inside a
> single document named payment.

As said: I don#t know how to manage embedded documents on the import
side through CSV - trivial with JSON. But anyway...you know your options.

- -aj
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQGUBAEBAgAGBQJOQkbwAAoJEADcfz7u4AZjZQYLwNgFs0myKX6CxEexNT8IYzcO
7+lCpM79ycIDJEERBMbtrqKc/PkA9qcjqkOdRGkohNWUEEK7KwyMNMbYAFwwF8Eb
MjsD2G4ogBNHbYHzUKtR+2i3tzHDjq2Roa0CzRXMTrTU7L7wdUpDuQ9GjNGQQnWC
2krbZiT2JqxsOB8NgqVRThdfCHHt85ErIr6C99ZxUvc/v6QoC14zGMzVBATu6aG8
mpDkwtfCXRromt0GrG57d239JIGyY51cwjdtm+dFpOI5FTbhzz1aiLvavYicOhsq
vZvXsrTkxjddLrou+JO0KlpY4N4Xx/uI4qjWvwbUx94hkPfK6qbmf3Tj0B7BeI3J
+n0i6RW5TAhy5HThdKKXuHdX+/S01ckr+ZvKhf0CHhe6feY7zwMaJx+ODcK16jtY
XgzfOmHqUNwgK5OrW43pugoHtw3PBv6v1RXxu9gJXfq9hZJS+TnZ/LHw5MqW3xX8
zSimOrOSRycQtWwsqOwZ+ODKvpIS4II=
=zQpx
-----END PGP SIGNATURE-----

lists.vcf

Rajat Hubli

unread,
Aug 10, 2011, 9:01:01 AM8/10/11
to mongodb-user
Ok so after doing a little bit on this i am at a stage now where i get
a .csv file from oracle for one table and another .csv file for
another table. However in my JSON structure i would have these two
files embedded inside each other. How could i do this. I would be
using python. If anybody could share some code to do this kind of an
operation then it would be really nice.
Thanks a lot

On Aug 10, 10:53 am, Andreas Jung <li...@zopyx.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Rajat Hubli wrote:
> > Hi, I have designed a schema in Mongo which looks like this. Payment=
> > { payment_id: name: creditcard_details: { last_digit: vendor: }
> > banktransfer_details: { transfercode: bankname: }cashpayment_details:
> > { agentname: cashcode: } } So basically it looks like three documents
> > namely creditcard, banktransfer and cashpayment embeded inside a
> > single document named payment.
>
> As said: I don#t know how to manage embedded documents on the import
> side through CSV - trivial with JSON. But anyway...you know your options.
>
> - -aj
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (Darwin)
> Comment: Using GnuPG with Mozilla -http://enigmail.mozdev.org/
>
> iQGUBAEBAgAGBQJOQkbwAAoJEADcfz7u4AZjZQYLwNgFs0myKX6CxEexNT8IYzcO
> 7+lCpM79ycIDJEERBMbtrqKc/PkA9qcjqkOdRGkohNWUEEK7KwyMNMbYAFwwF8Eb
> MjsD2G4ogBNHbYHzUKtR+2i3tzHDjq2Roa0CzRXMTrTU7L7wdUpDuQ9GjNGQQnWC
> 2krbZiT2JqxsOB8NgqVRThdfCHHt85ErIr6C99ZxUvc/v6QoC14zGMzVBATu6aG8
> mpDkwtfCXRromt0GrG57d239JIGyY51cwjdtm+dFpOI5FTbhzz1aiLvavYicOhsq
> vZvXsrTkxjddLrou+JO0KlpY4N4Xx/uI4qjWvwbUx94hkPfK6qbmf3Tj0B7BeI3J
> +n0i6RW5TAhy5HThdKKXuHdX+/S01ckr+ZvKhf0CHhe6feY7zwMaJx+ODcK16jtY
> XgzfOmHqUNwgK5OrW43pugoHtw3PBv6v1RXxu9gJXfq9hZJS+TnZ/LHw5MqW3xX8
> zSimOrOSRycQtWwsqOwZ+ODKvpIS4II=
> =zQpx
> -----END PGP SIGNATURE-----
>
>  lists.vcf
> < 1KViewDownload

Nathan Ehresman

unread,
Aug 10, 2011, 10:20:28 AM8/10/11
to mongod...@googlegroups.com
If you want to go that route (ie, doing the joins in Mongo), then you can use
MapReduce to join your two source collections into a final collection. I wrote a
blog article a few weeks ago about how to do this with some example code:

http://tebros.com/2011/07/using-mongodb-mapreduce-to-join-2-collections/

Alternately do the join in Oracle and as Andreas has said, just write a small app in
your language of choice and after selecting your data out of Oracle, dump JSON out.
There are good JSON libraries in just about every language I can think of.

Nathan Ehresman

Steve Francia

unread,
Aug 10, 2011, 10:21:55 AM8/10/11
to mongodb-user
The problem with CSV is it's flat.
The ideal situation would be to write a python script that looped
through your primary table, joined all the related data and created a
python dictionary from it in the structure you want, then converted it
to json and wrote it to a file for mongoimport to use.

It shouldn't be more than 20 lines of code or so and should be rather
easy to write, the hardest part being the sql query.

-Steve
Message has been deleted

Steve Francia

unread,
Aug 10, 2011, 11:04:17 AM8/10/11
to mongodb-user
Sorry, I'm not sure what I was thinking with that response. Allow me
to amend it slightly. Start the same way, but rather than writing a
file to do a mongoimport, simply insert the dictionary into your
mongo
collection as part of the loop.

Here's some pseudocode to demonstrate the logic<br>

$mongocollection = getCollection('C')
while ($row = select * from X) {
$row['embed1'] = select * from Y where Y.a = X.a
$row['embed2'] = select * from Z where Z.b = X.b
$mongocollection->insert($row)

Rajat Hubli

unread,
Aug 10, 2011, 11:30:28 AM8/10/11
to mongodb-user
I am a bit slow to understand but i kinda got the idea that steve
gave. Let me try to rephrase it and see whether i actually got it.

Ok so on what data would the python script be running? Would it be
running on a .csv file generated by Oracle, taking the .csv file as
input using import csv and then run the python script to join the data
and create a dictionary? and then this dictionary is inserted into the
mongo? By this it seems the dictionary would be in JSON to be
recognised by mongo right? I would try doing this but just let me know
whether whatever i have inferred is fine or no?
thanks a lot steve

Steve Francia

unread,
Aug 10, 2011, 11:55:41 AM8/10/11
to mongod...@googlegroups.com
the script will directly be querying your oracle database and inserting it directly into mongo. No exporting or files at all.

Rajat Hubli

unread,
Aug 10, 2011, 11:59:03 AM8/10/11
to mongodb-user
Thats perfect. I would work on that now..Thanks a lot

NoSQL Guy

unread,
Mar 28, 2012, 1:58:34 PM3/28/12
to mongod...@googlegroups.com

Rajat Hubli <hublirajat@...> writes:

>
> Thats perfect. I would work on that now..Thanks a lot
>

> On Aug 10, 5:55 pm, Steve Francia <st...@...> wrote:
> > the script will directly be querying your oracle database and inserting it
> > directly into mongo. No exporting or files at all.
>

Nathan, Rajat and all,

Why are you moving from Oracle to MongoDB?
Is it for a new application that needs some function that Oracle cannot provide?
If so, what function? or is it primarily to save cost?
Rajat, sounds like you are trying to take an existing application and
rewrite it so that you could use MongoDB, and I am curious what is driving that?

I have not seen many existing apps being rewritten from using a commercial RDBMS
to NoSQL, hence my question.

Thanks!

Steve Francia

unread,
Mar 28, 2012, 3:06:23 PM3/28/12
to mongodb-user
This thread is over 6 months old and the initial question is answered.
Rather than change the topic and start it again, it would be
preferable to start a new thread.
Reply all
Reply to author
Forward
0 new messages