match and project for values in an array type


sv savage

Mar 29, 2017, 9:08:54 PM
to mongodb-user
Hi
I am just learning MongoDB. I have many documents that look like this:

{ File: [
     { type: alice, data: morealica, stuff: junk },
     { type: bob, other: diffbob },
     { type: charlie, stuff: charliestuff }
   ]
}
{ File: [
     { type: alice, data: morealica, stuff: diffjunk },
     { type: bob, other: whatbob },
     { type: charlie, stuff: nocharlie }
   ]
}
{ File: [
     { type: alice, data: morealica, stuff: morejunk },
     { type: bob, other: lessbob },
     { type: charlie, stuff: charliewhy }
   ]
}
more documents...


My goal is to get every File.stuff from the array elements that satisfy { $match: { "File.type": "alice" } }.

The output would be a single array that contains only the stuff values from the type: alice entries:
 [
 { stuff: junk}
 { stuff:diffjunk}
 { stuff: morejunk}
]


The next step would be to count how often each distinct stuff value occurs, for example:
{ junk: 5 }
{ diffjunk: 3}
{ morejunk: 7}

I am sorry, but I am still learning the terminology of MongoDB, so I hope you can understand this question.






compchap.nikhil

Mar 30, 2017, 7:31:22 PM
to mongodb-user
Hello SV,

You should be able to do it using the aggregation pipeline stages below:

unwind > match > project > group

I hope this helps. If not, let me know and I can write a quick one for you.
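
For reference, here is a minimal sketch of what those four stages might look like for the sample documents in the question. The `pipeline` array is what one would pass to `aggregate()` in the mongo shell. The rest is a plain JavaScript emulation of the same steps on in-memory data, so it can be run without a server. The string quoting of the values and the projected field name `stuff` are assumptions, not something the original poster specified.

```javascript
// Sample documents shaped like the ones in the question (values quoted as strings).
const docs = [
  { File: [{ type: "alice", data: "morealica", stuff: "junk" },
           { type: "bob", other: "diffbob" },
           { type: "charlie", stuff: "charliestuff" }] },
  { File: [{ type: "alice", data: "morealica", stuff: "diffjunk" },
           { type: "bob", other: "whatbob" },
           { type: "charlie", stuff: "nocharlie" }] },
  { File: [{ type: "alice", data: "morealica", stuff: "morejunk" },
           { type: "bob", other: "lessbob" },
           { type: "charlie", stuff: "charliewhy" }] },
];

// Pipeline for the mongo shell: db.<collection>.aggregate(pipeline)
const pipeline = [
  { $unwind: "$File" },                            // one document per array element
  { $match: { "File.type": "alice" } },            // keep only the alice elements
  { $project: { _id: 0, stuff: "$File.stuff" } },  // keep only the stuff value
  { $group: { _id: "$stuff", count: { $sum: 1 } } }, // count each distinct value
];

// Plain JavaScript emulation of the same four stages (illustration only).
const unwound = docs.flatMap((d) => d.File.map((f) => ({ File: f })));
const matched = unwound.filter((d) => d.File.type === "alice");
const projected = matched.map((d) => ({ stuff: d.File.stuff }));

const counts = {};
for (const { stuff } of projected) {
  counts[stuff] = (counts[stuff] || 0) + 1;
}
const grouped = Object.entries(counts).map(([_id, count]) => ({ _id, count }));

console.log(projected); // the stuff values of the three alice entries
console.log(grouped);   // one { _id, count } entry per distinct stuff value
```

Stopping after the `$project` stage gives the list of stuff values. Adding the `$group` stage gives the counts asked for as the next step.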

Thanks,
Nikhil

sv savage

Mar 31, 2017, 6:56:08 PM
to mongodb-user

This is my result

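db.tst3.aggregate([
    // one document per element of the File array
    { $unwind: "$File" },
    // keep only the elements whose "-type" field is "General"
    { $match: { "File.-type": "General" } },
    // count the elements per File.Format value
    { $group: { _id: "$File.Format", count: { $sum: 1 } } }
])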

compchap.nikhil

Mar 31, 2017, 9:18:23 PM
to mongodb-user
Perfect. Are you able to get the expected result?

Thanks,
Nikhil 

sv savage

Apr 1, 2017, 11:28:30 AM
to mongod...@googlegroups.com
I am using MongoDB to store output from mediainfo. Then I normalize the images for machine learning.


Akshat

Apr 6, 2017, 7:14:33 PM
to mongodb-user
Hello SV,
Although you may be able to achieve what you want with the aggregation pipeline stage $unwind, I would recommend reconsidering your data schema for better performance.
MongoDB supports a flexible schema, and you can use this to the benefit of your use case. For example, rather than storing an array of File elements that you would need to $unwind later, you could store each File element in a separate document:

    { file_type: "alice", data: "morealica", stuff: "junk", myid: 1 },
    { file_type: "bob", other: "diffbob", myid: 1 },
    { file_type: "charlie", stuff: "charliestuff", myid: 1 },
    { file_type: "alice", data: "morealica", stuff: "diffjunk", myid: 2 },
    { file_type: "bob", other: "whatbob", myid: 2 },
    { file_type: "charlie", stuff: "nocharlie", myid: 2 }


Depending on your use case, you can create an index on the field file_type.
Using the example schema above, you can then filter and count the stuff values with an aggregation pipeline such as the one below:

db.collection.aggregate([
    { $match: { "file_type": "alice" } },
    { $group: { _id: "$stuff", count: { $sum: 1 } } }
]);

For more information, I would suggest to review the following resources:
I would also recommend to enrol on a free course at MongoDB University. Especially the M101 or M102 courses which cover data modelling and aggregation pipeline.


Best,
Akshat

sv savage

Apr 7, 2017, 1:03:50 PM
to mongodb-user

Thanks for the suggestion. I understand the idea. I have tried a couple of different schemas before. In the true schema, alice is really the header for bob, charlie, doris, ...
 
{ data: "morealica", stuff: "junk", more_data:      // I do not need "file_type: alice" here
    [
      { file_type: "bob", other: "diffbob" },
      { file_type: "charlie", stuff: "charliestuff" },
      {}, ....
    ]
}

In SQL I would have one table for alice and use alice's key as a foreign key in another table that holds the data for bob, charlie, ...
# TABLE
   alice_key foreign key,
   int type,   bob, charlie, doris,...
   varchar key,
   varchar  value

where type is bob, charlie, ... I would then index on alice_key and type, with another index on key,
but I did not know if that was the correct approach in MongoDB.

Each file_type has different keys and an unknown number of key-value pairs.

Searching this schema in MongoDB is still beyond my understanding at this time.
Thanks

Akshat

Apr 18, 2017, 7:45:11 PM
to mongodb-user
Hi SV,

Data in MongoDB has a flexible schema, and you should utilise this flexibility for your application's benefit. I would suggest analysing your application's data access patterns (i.e. queries, updates, and processing of the data) to find the most suitable schema. For example, you may be able to embed frequently queried information in documents.
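
As one possible illustration of querying embedded information, the sketch below uses the "alice as header" layout described earlier in the thread, with bob, charlie, ... kept in a `more_data` array. The `filter` object is what one might pass to `find()` in the mongo shell. The helper `elemMatch` is a hypothetical, equality-only emulation of `$elemMatch`, included so the example runs on in-memory data without a server. The second sample document is an assumption built from the values in the original question.

```javascript
// Sample documents in the header + more_data layout (second document is assumed).
const docs = [
  { data: "morealica", stuff: "junk", more_data: [
      { file_type: "bob", other: "diffbob" },
      { file_type: "charlie", stuff: "charliestuff" } ] },
  { data: "morealica", stuff: "diffjunk", more_data: [
      { file_type: "bob", other: "whatbob" },
      { file_type: "charlie", stuff: "nocharlie" } ] },
];

// Filter for the mongo shell: db.<collection>.find(filter)
// $elemMatch requires a single array element to satisfy all listed conditions.
const filter = {
  more_data: { $elemMatch: { file_type: "bob", other: "whatbob" } },
};

// An index on the embedded field could support such queries, for example:
//   db.<collection>.createIndex({ "more_data.file_type": 1 })

// Hypothetical helper: equality-only emulation of $elemMatch (illustration only).
function elemMatch(array, conditions) {
  return array.some((element) =>
    Object.entries(conditions).every(([key, value]) => element[key] === value)
  );
}

const hits = docs.filter((d) => elemMatch(d.more_data, filter.more_data.$elemMatch));
console.log(hits.map((d) => d.stuff)); // stuff values of the matching header documents
```

Whether this layout or separate documents per file_type works better depends on which of these queries the application runs most often.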

See also: 
Thanks,
Akshat