How to store dynamic map / embedded object


CoreyS

Aug 29, 2011, 3:16:53 PM
to mor...@googlegroups.com
Hello,

We are porting an application over to MongoDB using Morphia and need to store a map of variable data points that is easily searchable (using what I think is called an "embedded index").

My Java object is a simple wrapper for the map with an ObjectId, and some other attributes ("tag", "group", and some audit fields):

@Entity
public class Record {
    ObjectId objectId;

    String tag;
    Group group;
    Map<String, Object> values;

    /* getters and setters */
}

Morphia/Mongo is storing the records like so:
> db.Record.find();
{
"_id" : ObjectId("4e5bcb9161e2247bc2857283"),
"values" : { "address" : "100 Main Street", "postalCode" : "10019", "gender" : "m" },
"tag" : "Set 1",
"group" : { "$ref" : "Group", "$id" : ObjectId("4e56cf5261e291103a66c510") },
"createDate" : ISODate("2011-08-29T17:25:36.965Z"),
"lastModifiedDate" : ISODate("2011-08-29T17:25:37.043Z")
}
{
"_id" : ObjectId("4e5bcbbe61e2247bc2857284"),
"values" : { "address" : "30 Spadina", "favouriteColour" : "red", "gender" : "m", "city" : "Toronto" },
"tag" : "Set 1",
"group" : { "$ref" : "Group", "$id" : ObjectId("4e56cf5261e291103a66c510") },
"createDate" : ISODate("2011-08-29T17:25:51.497Z"),
"lastModifiedDate" : ISODate("2011-08-29T17:26:22.564Z")
}
{
"_id" : ObjectId("4e5be0c661e2247bc2857285"),
"values" : { "source" : "upstream", "favouriteFood" : "Hamburger", "gender" : "m", "city" : "New York" },
"tag" : "Set 1",
"group" : { "$ref" : "Group", "$id" : ObjectId("4e56cf5261e291103a66c510") },
"createDate" : ISODate("2011-08-29T18:56:06.210Z"),
"lastModifiedDate" : ISODate("2011-08-29T18:56:06.211Z")
}

My questions concern search performance: we need very quick searches on the "values" map over potentially 100 million records (sharding etc. aside), for example:
db.Record.find({"values.source" : "upstream"})
OR
db.Record.find({"values.favouriteFood" : "Hamburger", "values.city":"New York"})
... etc...

1) Is the above method of storing the data recommended (i.e. as an embedded map)?
2) If so, using Morphia, how can I ensure that each dynamically added key of "values" is indexed as it is added to the collection?
3) Is there another recommended way of storing dynamic data sets like the above?
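
One possible approach to question 2, sketched under assumptions: keep a set of the map keys already seen and, whenever a record introduces a new key, report the dotted path ("values.<key>") that would need an index. The class and method names below are hypothetical and not part of Morphia; actually creating the index is left to whatever call the application uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks which dynamic keys still need an index. */
final class DynamicKeyIndexTracker {
    private final String mapField;                       // e.g. "values"
    private final Set<String> indexedKeys = new LinkedHashSet<>();

    DynamicKeyIndexTracker(String mapField) {
        this.mapField = mapField;
    }

    /** Returns dotted paths (e.g. "values.source") for keys not seen before. */
    List<String> newIndexPaths(Map<String, ?> values) {
        List<String> paths = new ArrayList<>();
        for (String key : values.keySet()) {
            if (indexedKeys.add(key)) {                  // true only the first time a key appears
                paths.add(mapField + "." + key);
            }
        }
        return paths;
    }
}
```

Note that MongoDB caps the number of indexes per collection (64), so one index per key only works while the set of distinct keys stays small.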

I'm happy to contribute code to the project if no such thing exists and it makes sense to store/annotate the data as above.

I appreciate your help!
Corey

CoreyS

Aug 29, 2011, 3:45:37 PM
to mor...@googlegroups.com
I think this was partially answered here:

but let us know if you have additional suggestions

Thanks,
Corey

Scott Hernandez

Aug 29, 2011, 4:02:40 PM
to mor...@googlegroups.com
I have been thinking that maybe there should be a way to control the
output format of maps, like into the list structure I suggested in the
other thread.

The problem with things like this is that they are hard to get your
head around in the Java world, since the structures produced are much
less like their Java counterparts. Also, adding support for indexing
is a little tricky, or just very specialized.

Here is what I could imagine:

class Foo {

@PersistAsList([keyName = "key"], [valueName = "val"])
Map<String, Object> values = ...;

}
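
To make the idea concrete, here is a minimal sketch of the conversion such an annotation might perform, assuming the list form is an array of key/value sub-documents like [ { key : "source", val : "upstream" }, ... ]. The class and method names are hypothetical, and "key" / "val" simply mirror the keyName and valueName values shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical converter between a Map and a list of key/value entries. */
final class MapAsListSketch {
    /** Turns each map entry into a two-field sub-document. */
    static List<Map<String, Object>> toList(Map<String, Object> values,
                                            String keyName, String valueName) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put(keyName, e.getKey());
            entry.put(valueName, e.getValue());
            out.add(entry);
        }
        return out;
    }

    /** Rebuilds the map from the stored list of sub-documents. */
    static Map<String, Object> toMap(List<Map<String, Object>> entries,
                                     String keyName, String valueName) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map<String, Object> entry : entries) {
            out.put((String) entry.get(keyName), entry.get(valueName));
        }
        return out;
    }
}
```

With this shape, a single compound index over the key and value fields of the array (e.g. { "values.key" : 1, "values.val" : 1 }) could serve every dynamic attribute, instead of one index per attribute.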

CoreyS

Aug 31, 2011, 2:45:00 PM
to mor...@googlegroups.com
As I'm coming up against some more obstacles with indexing dynamic name/value pairs (see http://groups.google.com/group/nosql-databases/browse_thread/thread/3a0b0495766346a7/6c400eeec6f625db#6c400eeec6f625db), I'm wondering whether it would make sense to have the option to @Flatten a Map, i.e. take all the keys in the map and flatten them out to top-level fields in the stored document. Perhaps this is what you meant by your @PersistAsList suggestion?

e.g.

class Record {
@Index
String email;
@Flatten(index=true)
Map<String, Object> attributeMap = new HashMap<String,Object>();
/* g/setters */
}

for values, say,

email : 'm...@domain.com'
attributeMap : { source : "web", postalcode : "10019" }

with the @Flatten annotation would get stored in Mongo as

{
"_id" : ObjectId("4e5bcb9161e2247bc2857283"), 
"email" : "m...@domain.com", 
"source" : "web",
"postalcode" : "10019"
}

Another record with different values in the attributeMap, say

attributeMap : { city : "Iqaluit", weather : "cold" }

would get stored as

{
"_id" : ObjectId("5e5bcb9161e2247bc2857283"),
"email" : "m...@domain.com",
"city" : "Iqaluit",
"weather" : "cold"
}

and if the "index" parameter in the @Flatten annotation were set to true, an index would be ensured on each flattened field:

db.Record.ensureIndex( { "<< field name >>" : 1 } );

e.g.

db.Record.ensureIndex( { "source" : 1 } );
db.Record.ensureIndex( { "postalcode" : 1 } );
etc.
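
The storage side of this proposal could look roughly like the sketch below: the declared fields and the map entries are merged into one flat document, and the map keys double as the field names to index. This is plain Java over java.util.Map, not Morphia code; the class and method names are hypothetical, and rejecting a map key that clashes with a declared field is an assumption about how conflicts should be handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of what @Flatten might do when saving. */
final class FlattenSketch {
    /** Merges declared fields and map entries into one top-level document. */
    static Map<String, Object> flatten(Map<String, Object> declaredFields,
                                       Map<String, Object> attributeMap) {
        Map<String, Object> doc = new LinkedHashMap<>(declaredFields);
        for (Map.Entry<String, Object> e : attributeMap.entrySet()) {
            if (doc.containsKey(e.getKey())) {
                // Assumption: a clash with a declared field is treated as an error.
                throw new IllegalArgumentException("Key clashes with field: " + e.getKey());
            }
            doc.put(e.getKey(), e.getValue());
        }
        return doc;
    }

    /** Field names that would each get a single-field ascending index. */
    static List<String> fieldsToIndex(Map<String, Object> attributeMap) {
        return new ArrayList<>(attributeMap.keySet());
    }
}
```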

To load documents into Java, I suppose you would try to:
- match the field name with a field name in the Java object (which I suppose is already happening)
- if no match, do the additional check for a @Flatten(ed) map, and if it exists, put the field there
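
The two loading steps above can be sketched with plain reflection. This is only an illustration under assumptions: LoadedRecord is a hypothetical stand-in for the Record class, only String fields are matched, and "_id" is skipped rather than mapped.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the Record class above. */
final class LoadedRecord {
    String email;
    Map<String, Object> attributeMap = new HashMap<>();
}

final class UnflattenSketch {
    /** Copies matching document fields onto the object; the rest go into the map. */
    static LoadedRecord load(Map<String, Object> doc) {
        LoadedRecord record = new LoadedRecord();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (e.getKey().equals("_id")) {
                continue;                                // id handling omitted in this sketch
            }
            Field field = findField(LoadedRecord.class, e.getKey());
            if (field != null && field.getType() == String.class
                    && e.getValue() instanceof String) {
                try {
                    field.setAccessible(true);
                    field.set(record, e.getValue());     // step 1: name matches a Java field
                } catch (IllegalAccessException ex) {
                    throw new IllegalStateException(ex);
                }
            } else {
                record.attributeMap.put(e.getKey(), e.getValue()); // step 2: flattened map
            }
        }
        return record;
    }

    /** Returns the declared field with that name, or null if there is none. */
    private static Field findField(Class<?> type, String name) {
        try {
            return type.getDeclaredField(name);
        } catch (NoSuchFieldException ex) {
            return null;
        }
    }
}
```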

Can you let me know your thoughts on this? If you think this might be the way to do this, I'll work on a fix and submit it to the project.

Thanks,
Corey