Hi,
I am evaluating and reading up on MongoDB. It seems that MongoDB advises storing all data regarding an object in one record. Coming from Switzerland, we have to support at least French, German, Italian and English in our application, but I would like to support many more languages. My general question is: what is the advisable schema design for documents that have to store and retrieve information in multiple languages?
I could have a simplified country structure such as the following:
{
  country: {
    code2: 'CH',
    nrlang: 2,
    countryname: [
      {
        name: 'Schweiz',
        lang: 'ch_de'
      },
      {
        name: 'Switzerland',
        lang: 'en_us',
        default: 1
      }
    ]
  }
}
In this case I would be able to find all documents that have language-dependent information (nrlang > 0), and all documents that have not yet been fully translated: if I previously had two languages and now have four, I could search for all documents with nrlang < 4 to find those that still need translating. If my app does not find a translation for the current language, it can fall back to the entry marked as default and use that as the display string.
The problem I see here is that the document grows with every language I add. In the case above this would not be much of a problem, but for other documents with lots of translatable information it might be an issue.
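To make the lookup and fallback idea concrete, here is a rough sketch in Python. Plain dicts stand in for MongoDB documents and nothing talks to a server. The helper names display_name and needs_translation, the second country and the language codes 'ch_fr' and 'ch_it' are made up for illustration. Against a real database I assume the second helper would become a query along the lines of {"country.nrlang": {"$lt": 4}}.

```python
# Plain dicts stand in for MongoDB documents; no server is involved.
# The 'FR' document and the codes 'ch_fr' / 'ch_it' are invented test data.
countries = [
    {"country": {"code2": "CH", "nrlang": 2, "countryname": [
        {"name": "Schweiz", "lang": "ch_de"},
        {"name": "Switzerland", "lang": "en_us", "default": 1},
    ]}},
    {"country": {"code2": "FR", "nrlang": 4, "countryname": [
        {"name": "Frankreich", "lang": "ch_de"},
        {"name": "France", "lang": "en_us", "default": 1},
        {"name": "France", "lang": "ch_fr"},
        {"name": "Francia", "lang": "ch_it"},
    ]}},
]


def display_name(doc, lang):
    """Return the name for lang, else the entry flagged default, else None."""
    names = doc["country"]["countryname"]
    for entry in names:
        if entry["lang"] == lang:
            return entry["name"]
    for entry in names:
        if entry.get("default"):
            return entry["name"]
    return None


def needs_translation(docs, total_languages):
    """Return code2 of every document with fewer translations than expected."""
    return [d["country"]["code2"] for d in docs
            if d["country"]["nrlang"] < total_languages]
```

With this data, asking for 'ch_fr' on the Swiss document falls back to the default entry, and only 'CH' is reported as incomplete when four languages are expected.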
Another option would be to store all translatable information in a separate document and reference this document in the structure above. But this would require two calls to MongoDB each time I want to display a translatable string in my app, one for the code, and one to the translation document.
Another option I can think of is to have one document per language, e.g.
{
  country: {
    code2: 'CH',
    name: 'Schweiz',
    lang: 'ch_de'
  }
}

{
  country: {
    code2: 'CH',
    name: 'Switzerland',
    lang: 'en_us',
    default: 1
  }
}
Here I can foresee problems on the client side when fetching all unique countries if not all countries have been translated yet. It also makes it difficult to create a unique constraint on code2 to prevent inadvertent duplication of country codes.
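Both problems can be sketched in Python with plain dicts in place of documents. The helper names and the 'FR' document are invented for illustration. The sketch assumes that uniqueness would have to be enforced on the pair (code2, lang) rather than on code2 alone. As far as I understand, MongoDB supports compound unique indexes, which might cover this, but here it is only simulated in memory.

```python
# One document per language; 'FR' has only the default (English) entry so far.
docs = [
    {"country": {"code2": "CH", "name": "Schweiz", "lang": "ch_de"}},
    {"country": {"code2": "CH", "name": "Switzerland", "lang": "en_us", "default": 1}},
    {"country": {"code2": "FR", "name": "France", "lang": "en_us", "default": 1}},
]


def unique_countries(docs, lang):
    """Pick one name per code2: the requested language if present, else the default."""
    chosen = {}
    for doc in docs:
        c = doc["country"]
        if c["lang"] == lang:
            chosen[c["code2"]] = c["name"]   # an exact match always wins
        elif c.get("default") and c["code2"] not in chosen:
            chosen[c["code2"]] = c["name"]   # default only fills a gap
    return chosen


def duplicate_keys(docs):
    """Return every (code2, lang) pair that occurs more than once."""
    seen, dupes = set(), set()
    for doc in docs:
        key = (doc["country"]["code2"], doc["country"]["lang"])
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes
```

The client has to merge the per-language documents itself, which is the extra work I would like to avoid if there is a better design.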
Are there any developers out there using MongoDB that have had to solve a similar issue and would be prepared to share some thoughts on this issue?
What general schema design do you use when working with multiple languages?
How do you find out which documents have not yet been translated?
How do you handle many languages on the client side?
As a side issue: how do you handle carriage returns/line feeds in strings?
Probably the answer will be "it depends" :-) but I would really appreciate a MongoDB view here, with me coming from a relational DB background.
Regards
Rudolf Bargholz