Sort embedded array?


sdotsen

Mar 23, 2010, 9:01:04 PM
to mongodb-user
With structure given below, how can I go about sorting the values in
the "plans" array?
I decided to dump the status inside one collection to go along w/ the
user account info.
When querying in PHP and dumping the info onto the screen, is there a
way to display the latest
"status" at the top?


{ "_id" : ObjectId("4ba960dc7f8b9a7d0a000000"), "plans" : [
    {
        "status" : "cooking",
        "timestamp" : "Tue Mar 23 2010 20:46:20 GMT-0400 (EDT)"
    },
    {
        "status" : "washing clothes",
        "timestamp" : "Tue Mar 23 2010 20:48:43 GMT-0400 (EDT)"
    },
    {
        "status" : "enjoying some tea",
        "timestamp" : "Tue Mar 23 2010 21:01:07 GMT-0400 (EDT)"
    }
], "username" : "jimbob" }

sdotsen

Mar 23, 2010, 9:41:43 PM
to mongodb-user
Would it make sense to put the "status" updates in a separate
collection?
What if a user has thousands of updates?

Of course, if I separate the updates from the account collection, this
works: I can then sort the statuses by _id.

Eliot Horowitz

Mar 23, 2010, 9:43:05 PM
to mongod...@googlegroups.com
Can you just sort client side?
Might be just as fast.

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
>

sdotsen

Mar 23, 2010, 9:47:50 PM
to mongodb-user
How would I go about doing that? Sort by timestamp?


On Mar 23, 9:43 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> Can you just sort client side?
> Might be just as fast.
>

Eliot Horowitz

Mar 23, 2010, 9:48:44 PM
to mongod...@googlegroups.com
I assume PHP has some mechanism for sorting arrays.
I'm not really a PHP guy, though.
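
For anyone reading along, here is a minimal sketch of a client-side sort with PHP's usort(). It assumes each entry carries a numeric "ts" field holding Unix seconds, which is not in the original document; the string timestamps shown earlier would have to be converted first. The name newest_first and the values are made up for illustration.

```php
<?php
// Hypothetical document shape: like the one posted above, but with an
// assumed numeric "ts" field (Unix seconds) instead of the string timestamp.
$doc = array(
    "username" => "jimbob",
    "plans" => array(
        array("status" => "cooking",           "ts" => 1269391580),
        array("status" => "washing clothes",   "ts" => 1269391723),
        array("status" => "enjoying some tea", "ts" => 1269392467),
    ),
);

// Comparison callback: larger (newer) timestamps sort first.
function newest_first($a, $b) {
    if ($a["ts"] == $b["ts"]) {
        return 0;
    }
    return ($a["ts"] > $b["ts"]) ? -1 : 1;
}

// usort() reorders the embedded array in place.
usort($doc["plans"], "newest_first");

foreach ($doc["plans"] as $plan) {
    echo $plan["status"], "\n";
}
```

The sort happens entirely in PHP after the document has been fetched, so nothing changes on the server.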

sdotsen

Mar 23, 2010, 9:50:31 PM
to mongodb-user
I'll look at my options.

Any input on storing statuses in the same collection?
I thought I read another thread that said having embedded arrays would
slow things down.
But maybe that's when you embed deep arrays whereas my arrays aren't
as drilled down as others.

On Mar 23, 9:48 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> i assume php has some mechanism for sorting arrays.
> not really a php guy though
>

Eliot Horowitz

Mar 23, 2010, 9:52:02 PM
to mongod...@googlegroups.com
It all depends on the use case.
What kind of updates, etc...

sdotsen

Mar 23, 2010, 9:56:08 PM
to mongodb-user
Think Twitter ... I'm creating something similar to it but it's geared
towards something completely different.
Folks will add status updates. No replies or anything.


On Mar 23, 9:52 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> It all depends on the use case.
> What kind of updates, etc...
>

dwight_10gen

Mar 23, 2010, 10:06:29 PM
to mongodb-user
Are they stored in order, just older to newer? If so, just iterate
backwards after fetching?
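
A rough sketch of that idea in PHP, assuming the entries really were appended oldest to newest. The sample array is hypothetical and stands in for the "plans" array of a fetched document.

```php
<?php
// Assumed: entries were appended oldest to newest, as in the example document.
$plans = array(
    array("status" => "cooking"),
    array("status" => "washing clothes"),
    array("status" => "enjoying some tea"),
);

// array_reverse() returns a new array; $plans itself is left untouched.
$latestFirst = array_reverse($plans);

foreach ($latestFirst as $plan) {
    echo $plan["status"], "\n";
}
```

No comparison function is needed here, but the result is only correct if insertion order matches time order.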



Shaun Kruger

Mar 24, 2010, 1:42:56 AM
to mongod...@googlegroups.com
I ran into this earlier this week in my PHP app. It would really be
nice if there was a way to sort in mongo, but I can see that it's just
not in the cards.

I created this function for doing my sort:

// Reverse string comparison on "path", so entries sort in descending order
function vhost_path_sort($a,$b){
    return -strcmp($a["path"],$b["path"]);
}

I do my insert/update/etc., then use usort() to sort the array. I
serialize the array and take the md5 sum before and after the sort so I
can know if it changed.

// Get updated object
$vhost = MDB::$conn->lb->config->findOne(array("_id"=>$vhost_id));
$pathmd5 = md5(serialize($vhost["paths"]));
usort($vhost["paths"],"vhost_path_sort");
// md5(serialize()) to see if the array changed
if($pathmd5 != md5(serialize($vhost["paths"]))){
    MDB::$conn->lb->config->update(array("_id"=>$vhost_id),
        array('$set'=>array("paths"=>$vhost["paths"])));
}

I do the sort after insert/update in this case because my application
has a correct order these paths need to be in that will never change
when querying. If you want different orders when doing a query I would
just create different sort functions for the different sorts you want.
You can pass the name of the one you wish to use to usort() after you do
your query.

It's times like this that I really wish PHP had anonymous functions.
Custom sorts are a textbook example of what anonymous functions are good
for.
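
For readers on PHP 5.3 or later, which does support anonymous functions, the same comparison can be passed inline. This is only a sketch: the sample paths are hypothetical, and the ordering mirrors the vhost_path_sort() function above.

```php
<?php
// Requires PHP 5.3 or later (anonymous functions).
// Hypothetical sample data standing in for $vhost["paths"].
$paths = array(
    array("path" => "/"),
    array("path" => "/images"),
    array("path" => "/admin"),
);

// Same ordering as vhost_path_sort(): reverse string order on "path".
usort($paths, function ($a, $b) {
    return -strcmp($a["path"], $b["path"]);
});

foreach ($paths as $entry) {
    echo $entry["path"], "\n";
}
```

On older PHP versions the named-function form shown above remains the way to go.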

Shaun

Mardix

Mar 24, 2010, 4:42:00 AM
to mongodb-user
I would definitely have another collection for the statuses. The reason
is the 4MB cap per document in Mongo, plus some other issues
that may come along if you want to add more stuff to the statuses.

Actually I am building something similar.

I have two collections, Accounts and Statuses. Accounts has info of
the user (login,email,password,etc). Statuses contains updates posted
by the user in the Accounts collection. And I link each status to
their appropriate Account holder with the field Account_id, where
Account_id is the MongoId of the user's document. What's good
about having each status in its own document is that if one day I would
like to add a comment section for a status, that document will
contain its own comments.

Anyway...

One thing I would also do is store your timestamp as a native Mongo date
rather than a string. The MongoDate class in PHP, which is built from a
time in seconds, can help you achieve it.
http://www.php.net/manual/en/class.mongodate.php

So if you want to order by timestamp, you can do this:
db.Statuses.find().sort({timestamp:-1}) // newest first

In PHP:

$myCollections->find()->sort(array("timestamp"=>-1));

Me, I use PHP 100%, but I keep everything Mongo. lol.

- I use BOOL (true|false) instead of 1 or 0.

- For time I use MongoDate

- For a document _id, I don't store it as string, but as an Object
instead via MongoId()

- I don't use the function nl2br when saving data in mongodb, instead
when outputting it I use it.

I do all these simple things because I want to keep the integrity of
the data for all drivers: C++, Java, Javascript, Python, Ruby, etc. If
I make my data too PHP dependent, now I have to convert stuff if I'm
using another driver. That's just my $0.03.

Don't work harder, work smarter.

Peace!

sdotsen

Mar 24, 2010, 7:45:44 AM
to mongodb-user
What happens if you max out on the 4MB cap? Can I create another
collection and continue onwards?
Is this where sharding comes into play?



Eliot Horowitz

Mar 24, 2010, 8:27:27 AM
to mongod...@googlegroups.com
That cap is per object, not collection.

Kristina Chodorow

Mar 24, 2010, 9:47:22 AM
to mongod...@googlegroups.com
Going back to the sorting thing, I could make MongoDate comparable.  Then, you could store the statuses as:

{<date> : <status>, <date> : <status>, ...}

and call asort() on the associative array (http://php.net/manual/en/function.asort.php).

If this sounds good, please file a feature request at http://jira.mongodb.org/browse/PHP.
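
To illustrate the layout with what PHP offers today, here is a hedged sketch that uses plain integer keys (Unix seconds) in place of comparable MongoDate objects, which is an assumption and not the proposed feature. It uses krsort() rather than asort(), because asort() orders by value (the status text) while ksort()/krsort() order by key.

```php
<?php
// Assumed layout: Unix-second timestamps as keys, status text as values.
// The values are illustrative only.
$statuses = array(
    1269391723 => "washing clothes",
    1269392467 => "enjoying some tea",
    1269391580 => "cooking",
);

// krsort() sorts by key, descending, and keeps each key with its value.
krsort($statuses);

foreach ($statuses as $ts => $status) {
    echo $ts, "  ", $status, "\n";
}
```

One consequence of this layout is that two statuses posted in the same second would collide on the same key.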

sdotsen

Mar 24, 2010, 10:02:26 AM
to mongodb-user
That could work, but going back to the 4MB cap, I think I'm
better off storing the statuses in a separate collection.
With that, I can sort by the _id and be done with it!

I will probably still file a request, it might come in handy in the
future.

Thanks,

Sam



Colin M

Mar 24, 2010, 12:59:51 PM
to mongodb-user
Great and easy improvement, so I added an issue for this:
http://jira.mongodb.org/browse/PHP-94

Thanks,
Colin


Mardix

Mar 24, 2010, 2:34:10 PM
to mongodb-user
@Kristina Chodorow

Saving your data in the format {<date> : <status>, <date> :
<status>, ...} is not a good idea at all, because every entry will have
a different time as its key name, and you won't be able to get an entry
unless you know the time. The key should be a name, not a
value.

{time:$time, status:$status}

If I say db.Status.find().sort({time:-1}), it will know to sort in
descending order by time. The newest (most recent) time will show first.

And just so you know, data is appended in order, which means your
newest data will be inserted after the last one. So to navigate
through your data, instead of working harder, you can just loop through the
result without doing excessive work.

Sorting data by timestamp would be the easiest since we know that data
is appended in order. So here is a simple way without PHP asort():

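$Results = $Status_Collection->find();
// Will get all the statuses

// A cursor can't be indexed directly, so copy it into a plain array first
$rows = iterator_to_array($Results, false);
$countResults = count($rows);

Now we can loop in reverse

// Start at the last valid index, which is $countResults - 1
for ($i = $countResults - 1; $i >= 0; $i--) {
    $status = $rows[$i];
    // Your data will be in reverse order here
}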

but $Status_Collection->find()->sort(array("timestamp"=>-1)) should do
the same thing.


Nano.

Apr 1, 2010, 3:11:25 AM
to mongodb-user
Is there any plan to support limiting the results of embedded objects?
Following the initial example, it would be nice if the returned document
could set a limit on the "plans" array, let's say:
"Get the account with its latest 5 statuses only", which would imply in
this case sorting and limiting the embedded objects.
"Get the account with 5 statuses only" would not make much sense in
this case because they would be the same all the time (the first 5 added to
the document).
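
Until something like that exists on the server, the same effect can be approximated client-side after fetching the document. This is only a sketch: the numeric "ts" field (Unix seconds), the generated sample data, and the name by_ts_desc are assumptions made for illustration.

```php
<?php
// Hypothetical embedded array with an assumed numeric "ts" field.
$plans = array();
for ($i = 1; $i <= 8; $i++) {
    $plans[] = array("status" => "status $i", "ts" => 1269391580 + $i * 60);
}

// Comparison callback: newest (largest "ts") first.
function by_ts_desc($a, $b) {
    return $b["ts"] - $a["ts"];
}

usort($plans, "by_ts_desc");

// Keep only the five most recent entries.
$latest = array_slice($plans, 0, 5);

foreach ($latest as $plan) {
    echo $plan["status"], "\n";
}
```

The whole array still travels over the wire, so this saves display work but not bandwidth.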

Thanks,

Nano.

Michael Dirolf

Apr 1, 2010, 10:13:11 AM
to mongod...@googlegroups.com
I think you're looking for this: http://jira.mongodb.org/browse/SERVER-142

