Hello.
When you have a new question it is best to start a new discussion, as
opposed to posting it on an old thread. This way, you will have the
best chance of people seeing and responding to your question.
I have noticed that you have asked a similar question on several
other threads, so by now you know that there is no "one size fits all"
solution to converting a database from SQL to MongoDB. Because the
data structures are so different (relational versus
document-oriented), careful thought must be put into how your
application is going to write data to and read data from the MongoDB
collection: Which fields are going to be queried most? How frequently
will your nested documents be accessed? How frequently are your nested
documents going to change? How large will they become? (Each document
has a maximum size of 16 MB, so if a value is an array of nested
documents that could potentially grow without bound, that array should
be made into its own collection.)
If your data structure is relatively simple, Mongo has a tool called
mongoimport, which can be used for importing files that contain one
JSON, CSV, or TSV record per line. The documentation on mongoimport
is here:
http://www.mongodb.org/display/DOCS/Import+Export+Tools
Because your data structure appears to contain a lot of referenced
documents, this tool is probably not the best choice for your
requirements.
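For a flat table, though, preparing a file for mongoimport could look
roughly like the Python sketch below. It only shows the "one JSON
document per line" format. The rows, the field values, and the
database, collection, and file names in the final comment are
placeholders that I made up for illustration.

```python
import json

def rows_to_json_lines(rows):
    """Serialize each row (a dict) as one JSON document per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)

# Made-up rows, shaped like a flat province lookup table.
rows = [
    {"province_id": "11", "province_name": "Sample Province A"},
    {"province_id": "12", "province_name": "Sample Province B"},
]

json_lines = rows_to_json_lines(rows)
print(json_lines)

# Saved to a file, this output could then be loaded with something like:
#   mongoimport --db mydb --collection provinces --file provinces.json
# (mydb, provinces, and provinces.json are placeholder names)
```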
Most users find it preferable to write their own program that reads
one row of their SQL table at a time, creates a Mongo document with
the appropriate schema for their application, and inserts it into
their new Mongo collection. A Google search for
"converting mysql to mongodb" returns some articles written by other
people who have done this.
http://www.google.com/search?q=converting+mysql+to+mongodb
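As a rough idea of what such a program could look like, here is a
minimal Python sketch. It is not a recommendation for your schema: the
two tables, their columns, and the sample values are assumptions of
mine, an in-memory SQLite database stands in for your SQL server, and
a plain Python list stands in for the Mongo collection. With the
PyMongo driver, the insert callable could be a collection's insert_one
method instead.

```python
import sqlite3

def row_to_document(row):
    """Map one joined SQL row to a nested document."""
    return {
        "name": row["name"],
        "address": row["address"],
        # The referenced province row becomes an embedded sub-document.
        "province": {
            "province_id": row["province_id"],
            "province_name": row["province_name"],
        },
    }

def convert(conn, insert):
    """Read one row at a time and pass each document to `insert`."""
    conn.row_factory = sqlite3.Row
    cursor = conn.execute(
        "SELECT b.name, b.address, p.province_id, p.province_name "
        "FROM business b JOIN province p "
        "ON b.province_id = p.province_id ORDER BY b.name"
    )
    count = 0
    for row in cursor:
        insert(row_to_document(row))
        count += 1
    return count

# Hypothetical tables and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (province_id TEXT PRIMARY KEY, province_name TEXT);
    CREATE TABLE business (name TEXT, address TEXT, province_id TEXT);
    INSERT INTO province VALUES ('11', 'Sample Province');
    INSERT INTO business VALUES ('Sample Shop', '1 Example Street', '11');
""")

documents = []  # stand-in for a Mongo collection
converted = convert(conn, documents.append)
print(converted, documents[0])
```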
Mongo's "SQL to Mongo Mapping Chart" may give you an idea of how to
structure your data based on the types of queries that you intend to
perform.
http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart
All that being said, I will give you an example of how your data might
be represented as a Mongo document. However, I cannot make any claims
as to whether it is the correct format (and the likelihood is that it
is not), because I don't know the details of your application.
Hopefully, though, it will provide you with a starting point:
{
  province_id : varchar(2),
  regency_id : varchar(4),
  sub-district_id : varchar(7),
  village_id : {
    sub-district_id : {
      regency_id : {
        province_id : {
          province_id : varchar(2),
          province_name : varchar(50)
        },
        regency_id : varchar(4),
        regency_name : varchar(100)
      },
      sub-district_id : varchar(7),
      sub-district_name : varchar(100)
    },
    village_id : varchar(10),
    village_name : varchar(100)
  },
  NBS : varchar(4),
  NSBS : varchar(3),
  NUS : double,
  NUP : varchar(5),
  sample_type : {
    sample_type : int(1),
    sample_name : varchar(15)
  },
  name : varchar(100),
  address : varchar(100),
  RT : varchar(3),
  RW : varchar(3),
  zip_code : varchar(5),
  phone : varchar(15),
  EXT : varchar(4),
  FAX : varchar(15),
  EMAIL : varchar(50),
  HOMEPAGE : varchar(100),
  activity : varchar(100),
  category_code : char(1),
  kbli_code : {
    category_code : {
      category_code : char(1),
      category_name : varchar(255)
    },
    kbli_code : int(5),
    label : varchar(200)
  },
  business_name : varchar(30)
}
In the above example, all of the referenced tables have been turned
into embedded documents. Embedding documents inside documents inside
documents can be tricky (possible, but tricky) to write to and query
in Mongo. As was mentioned before, if a key will have an array of
sub-documents as its value, one could potentially run into trouble
with the 16 MB document size limit if it is not initially known how
large the list could become. For example, if there will be multiple
kbli_codes, your document might look like:
{
  province_id : varchar(2),
  regency_id : varchar(4),
  ...
  kbli_code : [
    { category_code : char(1),
      category_name : varchar(255),
      kbli_code : int(5),
      label : varchar(200)
    },
    { category_code : char(1),
      category_name : varchar(255),
      kbli_code : int(5),
      label : varchar(200)
    },
    ...
  ]
}
This should be fine if the list will only contain a few sub-documents,
but there could be an issue if there will be hundreds of kbli_codes
per document.
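If the list did turn out to be large, one option would be to move the
sub-documents into their own collection, with each one pointing back
at its parent. A minimal Python sketch of that split follows. The
business_id back-reference field, the helper name split_codes, and the
sample values are all hypothetical names of mine, and plain dicts
stand in for Mongo documents.

```python
def split_codes(business, business_id):
    """Separate the embedded kbli_code list from its parent document.

    Returns the parent without the list, plus one standalone document
    per code, each carrying a business_id back-reference.
    """
    parent = {k: v for k, v in business.items() if k != "kbli_code"}
    parent["_id"] = business_id
    children = [
        dict(code, business_id=business_id)
        for code in business.get("kbli_code", [])
    ]
    return parent, children

# Made-up document with two embedded codes.
business = {
    "name": "Sample Shop",
    "kbli_code": [
        {"kbli_code": 10001, "label": "first sample label"},
        {"kbli_code": 10002, "label": "second sample label"},
    ],
}

parent, children = split_codes(business, 1)
print(parent)
print(children)
```

The parent document then stays small no matter how many codes exist,
at the cost of a second query to fetch the codes for a given business.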
In addition to embedding documents (denormalized data structure),
Mongo also supports values that are references to other documents
(normalized data structure). In a nutshell, reads will be slower with
a normalized data structure, because the referenced documents have to
be fetched with additional queries, but updates will be quicker,
because only one document will be changed. Here is a link to the Mongo
documentation on referencing documents:
http://www.mongodb.org/display/DOCS/Database+References
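To illustrate that trade-off, here is a small Python sketch in which
two dicts stand in for two collections. The category_id reference
field and the sample values are assumptions of mine, not anything from
your schema. Note how reading a business takes two lookups, while
renaming a category is a single write that every referencing business
sees.

```python
# Two dicts stand in for two collections, keyed by _id.
categories = {"A": {"_id": "A", "category_name": "old name"}}
businesses = {
    1: {"_id": 1, "name": "Shop One", "category_id": "A"},
    2: {"_id": 2, "name": "Shop Two", "category_id": "A"},
}

def load_business(business_id):
    """Read path: fetch the business, then fetch its category."""
    business = dict(businesses[business_id])  # copy, leave the store as-is
    business["category"] = categories[business.pop("category_id")]
    return business

# Update path: one write, no matter how many businesses reference it.
categories["A"]["category_name"] = "new name"

print(load_business(1)["category"]["category_name"])
print(load_business(2)["category"]["category_name"])
```

With embedded category documents, the same rename would have to touch
every business that contains a copy of the category.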
There was also a question asked on Stack Overflow on whether to embed
documents or reference them, which you may find useful:
http://stackoverflow.com/questions/5373198/a-simple-mongodb-question-embed-or-reference
Finally, I recommend that you read the Mongo Docs on Schema Design:
http://www.mongodb.org/display/DOCS/Schema+Design
In the "see also" section at the end, there are many good references
to books on the subject and presentations on schema design.
If you have any additional questions, please start a new discussion on
the mongodb-user Google group. State specifically what your
application will be, what queries it will be making, and what parts of
your data will be updated most frequently, and ask for some
recommendations. If you can also provide some example documents of
schemas that you are considering, then all the better.
The MongoDB community is here to help. Good luck!
Sincerely,
Marc