$group, $first and $project

2,472 views
Skip to first unread message

Sam Halliday

unread,
Dec 22, 2012, 8:01:43 AM12/22/12
to mongod...@googlegroups.com
Hi all,

I would like to create an aggregation query which

1. filters (done! with $match)
2. sorts (done! with $sort)
3. groups by an identifier (done! with $group)
4. selects the first entry from each group...

and that's where I'm stuck, because $first only seems to work on an individual field basis. I actually want the full first entry for each group. I'm getting a sense that $project might be part of the solution, but I can't quite figure out the right notation.

Any help please?

Ronald Stalder

unread,
Dec 22, 2012, 1:49:19 PM12/22/12
to mongod...@googlegroups.com
Hi Sam

I'm not sure I understand what you mean by "first entry of each group". Aggregate returns you one array element per group.

Samuel Halliday

unread,
Dec 23, 2012, 10:33:48 AM12/23/12
to mongod...@googlegroups.com
Hi Ronald,

Let's say the data looks like this (expanding the example docs with a made up field):

{
"city": "NEW YORK",
"state": "NY",
"pop": 5574,
"party": "democrat"
}

and I would like to make queries such as "for a given party, return the state with the highest population", so:

{ $match: {party: "democrat"}},
{ $sort: {pop: -1} },

What I would then like to get back would be the full documents organised by state. I currently do this using the following explicit $group

{
$group:
{
_id: "$state",
result:
{
city: {$first: "$city"},
state: {$first: "$state"},
pop: {$first: "$pop"},
party: {$first: "$party"}
}
}
}

but this seems overly verbose (especially when my documents are more complicated). Is there a way to write an equivalent query which doesn't involve explicitly writing every field? Something as terse as this would be perfect:

{ $group: { _id: "$state", result: $first }}

which would also allow the query to return more details, such as

{ $group: { _id: "$state", highest: $first, lowest: $last}}


--
Sam

Ronald Stalder wrote:
>
> I'm not sure I understand what you mean by "first entry of each group". Aggregate returns you one array element per group.
>

Samuel Halliday

unread,
Dec 23, 2012, 10:35:47 AM12/23/12
to mongod...@googlegroups.com
Sorry, typo, this should read: "for a given party, return the document with the highest population in each state."

--
Sam

Ronald Stalder

unread,
Dec 23, 2012, 11:09:42 AM12/23/12
to mongod...@googlegroups.com
ok, I'll take a look. Please, gimme some time, I might be able to do it
tonight, or then tomorrow

Ronald Stalder

unread,
Dec 24, 2012, 7:34:16 AM12/24/12
to mongod...@googlegroups.com
Hi Samuel,

if I understood well, you want, for a given party, a structure like this:

{
_id: %party%
states: [ // array of states
state: %state%
cities: [ // array of documents, sorted by pop DESC
{
... // your original document here
}
]
]

}


Now, I cannot imagine how you'll get there with aggregate() or even
mapReduce(). You'll need to build this structure in a program or mongo
shell script like so (abbreviated pseudo code):

for party in [ "democrat", "republican", ... ] {
cursor = db.xxx.find( "party":party ).sort("state":1, "pop":-1 )
doc = {}
loop cursor {
doc._id = party
push ( doc[states.state.cities], cursor ) // will be sorted, as arrays keep order
}
db.yyyy.save( doc )
}

For querying you'd then do something like:

doc = db.yyyy.findOne("party":%party%)
for ( state in doc states ) {
print( doc[state].state )
printjson( doc[state].cities.slice(0,1) ) // for the first one
printjson( doc[state].cities.slice(-1) ) // for the last one
}


hope this helped,
Merry Christmas!
Ronald

Samuel Halliday

unread,
Dec 24, 2012, 12:52:46 PM12/24/12
to mongod...@googlegroups.com
Hi Ron,

Actually I can already do the query with aggregate (see my longer post below), but it is really verbose.

I only need to get one result per state (not per party) and I just want to pluck the first one. It's more a syntax question: is it possible to use $first to extract an entire document from a sort/group, rather than just per-field?

I would rather have verbose syntax than any solution which sits outside of the aggregate framework.

--
Sam
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb

Stephen Steneker

unread,
Jan 2, 2013, 11:49:16 PM1/2/13
to mongod...@googlegroups.com
 
Actually I can already do the query with aggregate (see my longer post below), but it is really verbose.

I only need to get one result per state (not per party) and I just want to pluck the first one. It's more a syntax question: is it possible to use $first to extract an entire document from a sort/group, rather than just per-field?

I would rather have verbose syntax than any solution which sits outside of the aggregate framework.

Hi Sam,

As at MongoDB 2.2 there isn't any shortcut syntax to insert a whole document.

There is a similar feature request you may wish to vote on / watch:

Cheers,
Stephen

Samuel Halliday

unread,
Jan 3, 2013, 6:27:09 AM1/3/13
to mongod...@googlegroups.com
Thanks Stephen,

That RFE looks like exactly what I am looking for.

(Oh joy, yet another username/password to create...)

--
Sam
Reply all
Reply to author
Forward
0 new messages