Find duplicates in array field

Szaniszlo Szöke

Apr 25, 2016, 10:26:11 AM
to mongodb-user
Hi,

I find duplicate values of fields and sub fields using aggregation. Searching for duplicates of sub field "k01.v" works fine:

db.table_0.aggregate([{"$group" : { _id: "$k01.v", "count": { "$sum": 1 } } },
{"$match": {"count" : {"$gt": 1} } },
{"$project": {"k01.v" : "$_id", "_id" : 0} }]);

Unfortunately, this doesn't work when searching for sub field "a1.0" (field "a1" is an array):

db.table_0.aggregate([{"$group" : { _id: "$a1.0", "count": { "$sum": 1 } } },
{"$match": {"count" : {"$gt": 1} } },
{"$project": {"a1.0" : "$_id", "_id" : 0} }]);

Any workaround?

TIA

Asya Kamsky

Apr 25, 2016, 8:39:45 PM
to mongodb-user
Are you trying to access the first element of array "a1"? The syntax
for that in aggregation is

{$arrayElemAt:["$a1", 0] }
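As a rough illustration of where that expression goes, here is a minimal sketch of a $project stage that exposes the first element of "a1". It assumes MongoDB 3.2 or later (where $arrayElemAt is available); the output field name "first" is only a placeholder chosen for this example. As far as I understand, "$a1.0" inside an aggregation expression is read as a field path (a field named "0" within the elements of a1) rather than as an array index, which is why the operator is needed.

```javascript
// Sketch only: "first" is a hypothetical output field name.
// The operator takes [ <array expression>, <index> ] and must be wrapped in { }.
const projectFirst = {
  $project: { _id: 0, first: { $arrayElemAt: ["$a1", 0] } },
};

// In the mongo shell this stage would be passed as:
//   db.table_0.aggregate([projectFirst])
```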

Asya
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user"
> group.
>
> For other MongoDB technical support options, see:
> https://docs.mongodb.org/manual/support/
> ---
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mongodb-user...@googlegroups.com.
> To post to this group, send email to mongod...@googlegroups.com.
> Visit this group at https://groups.google.com/group/mongodb-user.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/mongodb-user/b0b29f4a-921e-4744-b022-d6fb7bd8f8e7%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--
Asya Kamsky
Lead Product Manager
MongoDB
Download MongoDB - mongodb.org/downloads
Free MongoDB Monitoring - cloud.mongodb.com
Free Online Education - university.mongodb.com
Get Involved - mongodb.org/community
We're Hiring! - https://www.mongodb.com/careers

Szaniszlo Szöke

Apr 26, 2016, 3:48:37 AM
to mongodb-user
Thanks for the advice.
I tried a bunch of different syntaxes, like this one:

{"$group" : { _id: "$arrayElemAt:[$a1, 0]", "count": { "$sum": 1 } } }
{"$match": {"count" : {"$gt": 1} } }
{"$project": {"arrayElemAt:[$a1, 0]" : "$_id", "_id" : 1} }

to no avail. I can't get the number of documents that share the same value at a given position of array a1 (the first position in this sample, but it could be any other).

Any idea?

Asya Kamsky

Apr 27, 2016, 12:31:37 PM
to mongodb-user
The syntax is wrong, should be:

{"$group" : { _id: { "$arrayElemAt:["$a1", 0]", "count": { "$sum": 1 } } }

It would help if you indicated what happens when it "doesn't work" -
do you get an error?

What version of MongoDB are you using?

Asya

Szaniszlo Szöke

May 3, 2016, 7:09:15 AM
to mongodb-user
Hi,

Testing your suggestion in MongoVue simply returns a syntax error. There is at least a missing closing brace, and probably some misplaced double quotes as well.
I'll let you try it.

So I'm still trying to guess (that's the right word) the syntax to find duplicated values at location 0 of array a1.
Just something that I can paste in MongoVue as an aggregation pipeline.
Something like:

{"$group" : { _id: "$arrayElemAt:[$a1, 0]", "count": { "$sum": 1 } } }
{"$match": {"count" : {"$gt": 1} } }
{"$project": {a1 : "$_id", "_id" : 0} }

Thanks in advance

Asya Kamsky

May 19, 2016, 7:12:07 PM
to mongodb-user
Never saw this reply, sorry. Yes, my example duplicated your quotes, which were not correct. Look at the expression - all individual tokens that are strings should be quoted:

{"$group" : { "_id": {"$arrayElemAt":["$a1", 0]}, "count": { "$sum": 1 } } }

That works - in your original example (and my copy of it) the closing quote for $arrayElemAt was after the ], which of course made the whole thing nonsense. You also must have quotes around $a1 *and* the expression for $arrayElemAt must be inside { }.
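Putting the corrected $group stage together with the $match and $project stages from earlier in the thread, the whole pipeline might look like the sketch below (assuming MongoDB 3.2 or later). The helper duplicatedAt is a hypothetical plain-JavaScript approximation, not a MongoDB API; it is only there to check expectations against a few in-memory documents.

```javascript
// Pipeline: group by the first element of a1, keep values seen more than once.
const pipeline = [
  { $group: { _id: { $arrayElemAt: ["$a1", 0] }, count: { $sum: 1 } } },
  { $match: { count: { $gt: 1 } } },
  { $project: { _id: 0, a1: "$_id", count: 1 } },
];

// In the mongo shell: db.table_0.aggregate(pipeline)

// Hypothetical helper approximating the same logic on in-memory documents.
// It skips documents that have no element at that position, whereas the
// pipeline above would collect such documents in a group whose _id is null.
function duplicatedAt(docs, field, index) {
  const counts = new Map();
  for (const doc of docs) {
    const arr = doc[field];
    if (!Array.isArray(arr) || index < 0 || index >= arr.length) continue;
    counts.set(arr[index], (counts.get(arr[index]) || 0) + 1);
  }
  // Keep only values that occur in more than one document.
  return [...counts]
    .filter(([, n]) => n > 1)
    .map(([value, n]) => ({ [field]: value, count: n }));
}
```

To look at a different position in the array, change the 0 in $arrayElemAt (and the index argument of the helper) to the position of interest.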


Asya



Szaniszlo Szöke

May 25, 2016, 10:53:21 AM
to mongodb-user
Thanks a lot for your suggestion.