Searching in subobjects

Dan Dart

Aug 23, 2011, 9:45:49 AM
to mongod...@googlegroups.com
Hi all,

So if I for example have in my db:

[
{
"_id":"someid",
"Emails" : {
"anotherId" : {
"Email" : "a...@a.com"
}
}
},
{
"_id":"someid2",
"Emails" : {
"anotherId2" : {
"Email" : "b...@b.com"
}
}
}
]

How would I search for any records having "b...@b.com" as any email
within its Emails object (i.e. return the second one :P )?

Sam Millman

Aug 23, 2011, 10:20:10 AM
to mongod...@googlegroups.com
If the _id field belongs to the top-level document, something like:

find({ "Emails.anotherId2.Email": "b...@b.com" })

should return the second of those two documents.

Hope this helps,


--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Dan Dart

Aug 23, 2011, 10:31:52 AM
to mongod...@googlegroups.com
> If the _id field belongs to the top-level document, something like:
> find({ "Emails.anotherId2.Email": "b...@b.com" })
> should return the second of those two documents.
> Hope this helps,

Hi,

Sorry, I meant that I wouldn't know the anotherId. I just want to
search for any document that has that b...@b.com in any object under
Emails, so it could look like this, and I could find the second
document only:


[
{
"_id":"someid",
"Emails" : {
"anotherIdblahblah" : {
"Email" : "a...@a.com"
},
"unknownId" : {
"Email" : "c...@c.com"
}
}
},
{
"_id":"someid2",
"Emails" : {
"WhatIsThisId" : {
"Email" : "b...@b.com"
},
"IDontKnowThisId" : {
"Email" : "d...@d.com"
}
}
}
]

Cheers

Scott Hernandez

Aug 23, 2011, 10:35:12 AM
to mongod...@googlegroups.com
There is no wildcard for field names. Store the otherid along with the
email in the array value.

Emails: [
{id: "...", email:"..."}
]
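
As a rough sketch of the reshaping suggested here: the function below turns the keyed Emails subobject into an array of {id, email} entries. It is plain JavaScript with no database calls; the name emailsToArray and the placeholder addresses are made up for illustration, and writing the result back to the collection is left out.

```javascript
// Hypothetical helper: convert { key: { Email: ... } } into
// [ { id: key, email: ... } ], the array shape suggested above.
function emailsToArray(doc) {
  const keyed = doc.Emails || {};
  const entries = Object.keys(keyed).map(function (key) {
    return { id: key, email: keyed[key].Email };
  });
  // Copy the document so the original is left untouched.
  return Object.assign({}, doc, { Emails: entries });
}

// Placeholder data; the keys and addresses are invented for this example.
const before = {
  _id: "someid2",
  Emails: {
    WhatIsThisId: { Email: "b@example.com" },
    IDontKnowThisId: { Email: "d@example.com" }
  }
};

const after = emailsToArray(before);
console.log(JSON.stringify(after));
```

With the array in place, the email sits at a fixed path (Emails.email), so it can be matched with ordinary dot notation and covered by an index.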

Sam Millman

Aug 23, 2011, 10:37:56 AM
to mongod...@googlegroups.com
That is currently unsupported, and it would be insanely difficult and slow to do, since you would have to search every element of a subdocument, and every element of each nested subdocument below it, down to the nth level.

In this case I would strongly recommend, for performance and support reasons, to remake your schema some other way.


Dan Dart

Aug 23, 2011, 10:58:15 AM
to mongod...@googlegroups.com
Not sure remaking our schemas would be very viable... there's a very
slim chance it might be an option.

So I'm guessing the slow way would be to use JS and $where, would it?

> Store the otherid along with the email in the array value.

What would be the solution then? I don't care about the otherid btw.

Sam Millman

Aug 23, 2011, 11:08:24 AM
to mongod...@googlegroups.com
"What would be the solution then? I don't care about the otherid btw."

Hmm, can you tell us a little more about this? You don't care about it but you seem to store info under it.

If it really is not possible to change the schema, you can use a map/reduce (MR) or possibly the $where clause, but yes, they would be insanely slow, depending on the size of your documents, the number of documents the query has to scan, and how often it is run. Neither method is advised, of course, but personally I would prolly go with $where and add a JS function to it.


Dan Dart

Aug 23, 2011, 11:14:51 AM
to mongod...@googlegroups.com
> Hmm, can you tell us a little more about this? You don't care about it but
> you seem to store info under it.

I meant I don't care about retrieving it along with the email. It's
just an md5 of the email string in this case.

> If it really is not possible to change the schema, you can use a
> map/reduce (MR) or possibly the $where clause, but yes, they would be
> insanely slow, depending on the size of your documents, the number of
> documents the query has to scan, and how often it is run. Neither method
> is advised, of course, but personally I would prolly go with $where and
> add a JS function to it.

Usually there'll be one or two emails per document - but potentially
ludicrous amounts of documents... erkk...

So the calculation time wouldn't be THAT much more than doing a
standard query I guess...
although it would be nm where n is number of documents and m is number
of emails - m would average perhaps around 1.
Mostly it'll be 0 and occasionally it'll be 2 - rarely will it be more.

Sam Millman

Aug 23, 2011, 11:22:02 AM
to mongod...@googlegroups.com
Yeah, so long as the subdocument count stays low, it is OK if once or twice it goes above 2; it should *hopefully* be OK to do a $where on that.

Though only one way to see for sure :)


Scott Hernandez

Aug 23, 2011, 11:24:54 AM
to mongod...@googlegroups.com
The biggest issues with using $where are that it won't use an index and
that all JavaScript executes on a single thread (the JS engine has this
limitation). This means it could dramatically reduce concurrency on an
active system.

Marc

Aug 23, 2011, 2:17:21 PM
to mongodb-user
As a rule, values that are unknown should never be assigned as keys. As
the other users have written, that makes querying data very
inefficient.

The database should be restructured such that the unknown IDs are
stored as values. Something like:

{
"_id":"someid",
"Emails" : [{
"Email" : "a...@a.com",
"Email_ID" : "anotherIdblahblah"
},
{
"Email" : "c...@c.com",
"Email_ID" : "unknownId"
}]
},
{
"_id":"someid2",
"Emails" : [{
"Email" : "b...@b.com",
"Email_ID" : "WhatIsThisId"
},
{
"Email" : "d...@d.com",
"Email_ID" : "IDontKnowThisId"
}]
}

Queries may then be performed like so:

> db.email.find({"Emails":{$elemMatch:{"Email":"a...@a.com"}}})

or with $elemMatch:
> db.email.find({"Emails":{$elemMatch:{"Email":"a...@a.com"}}})

If changing the data structure is absolutely impossible, it may be
searched using the $where query and a for loop, as other users have
been discussing:

> db.email.find({ $where : function() {
    for (var unknownID in this["Emails"]) {
      if (this["Emails"][unknownID]["Email"] == "a...@a.com") {
        return true;
      }
    }
    return false;
  }});
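
The loop inside that $where function can be tried outside the database first. A minimal sketch in plain JavaScript, assuming the function is called with `this` bound to one document at a time (which is how $where runs it); the name hasTargetEmail, the sample documents, and the placeholder addresses are invented for illustration. It also adds a guard for documents that have no Emails field.

```javascript
// Same idea as the $where function above, with a guard for documents
// that have no Emails field. The address is a placeholder.
function hasTargetEmail() {
  var emails = this.Emails || {};
  for (var unknownId in emails) {
    if (emails[unknownId] && emails[unknownId].Email === "b@example.com") {
      return true;
    }
  }
  return false;
}

// Invented sample documents in the keyed-subobject shape from the thread.
var docs = [
  { _id: "someid", Emails: { k1: { Email: "a@example.com" }, k2: { Email: "c@example.com" } } },
  { _id: "someid2", Emails: { k3: { Email: "b@example.com" } } },
  { _id: "someid3" } // no Emails at all
];

// .call() imitates $where binding `this` to each document in turn.
var matched = docs.filter(function (doc) {
  return hasTargetEmail.call(doc);
});
console.log(matched.map(function (doc) { return doc._id; }));
```

Every document is visited and every key is checked, which is the n times m cost discussed earlier in the thread.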

Marc

Aug 23, 2011, 3:57:45 PM
to mongodb-user
Whoops! For the first part of my response, when I recommended
changing the data structure, I meant to type:

Queries may then be performed like so:
db.email.find({"Emails.Email":"a...@a.com"})

Instead of putting the $elemMatch option twice. I have been told that
$elemMatch is somewhat confusing because dot-notation is not being
used.

Please forgive the typo!

Patrick Thompson

Sep 1, 2011, 12:41:20 PM
to mongodb-user
Seems like a hole in MongoDB - if I have a polymorphic property like

Person
Name: String
Address: AddressType

where AddressType can be USAddress or EUAddress. It doesn't seem like
there is a natural way to serialize the Address property value and
preserve the type of the address and have it be queryable. Seems as
though the obvious serialization of

Person {
Name: "fred",
Address: {
USAddress: { City: "Seattle", Zip: 98125 }
}
}

Is impossible to query - seems kind of ugly to have a property on the
type that represents the type of the instance. Is that the expected
approach?

What I'd like to do is

find({"Person.Address.*.City":"Seattle"})

Looks like xpath...

Scott Hernandez

Sep 1, 2011, 12:45:21 PM
to mongod...@googlegroups.com
Seems like storing a discriminator like "type" is a more natural representation.

Person = { Name: "fred", Address: { type: "us", City: "Seattle", Zip: 98125 }}
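
A small sketch of why the discriminator helps: with type stored inside Address, the city sits at the same path for every kind of address, so no wildcard is needed. This is plain JavaScript over in-memory objects, not a MongoDB query; the second person and the "eu" type value are invented for the example.

```javascript
const people = [
  { Name: "fred", Address: { type: "us", City: "Seattle", Zip: 98125 } },
  { Name: "anna", Address: { type: "eu", City: "Berlin" } } // invented record
];

// The path Address.City is the same whatever the address type is.
const inSeattle = people.filter(function (p) {
  return p.Address.City === "Seattle";
});

// The type itself is an ordinary field and can be filtered on too.
const euPeople = people.filter(function (p) {
  return p.Address.type === "eu";
});

console.log(inSeattle.map(function (p) { return p.Name; }));
console.log(euPeople.map(function (p) { return p.Name; }));
```

In the shell, the matching filters would be ordinary dot-notation queries such as find({"Address.City": "Seattle"}) or find({"Address.type": "eu"}).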
