db.test1.aggregate( [
{ "$unwind" : "$studGroup" },
{ "$sort" : { "studGroup" : 1 } },
{ "$group" : { "_id" : "$studGroup", "totalMarks" : { "$sum" : "$marks" } } }
] )
{ "_id" : "Varsity", "totalMarks" : NumberDecimal("90.0") }
{ "_id" : "Regular", "totalMarks" : NumberDecimal("90.0") }
{ "_id" : "Fulltime", "totalMarks" : NumberDecimal("90.0") }
...it lets you get a certain view of the academic performance of students in each $studGroup.
But then trying to group on "teacherGroup", for example, seems a little strange. If you have a lot of teacherGroup classifications and a lot of teachers, you'll have to do a lot of array unwinding, which brings up challenges in pipeline size. Does it make sense to do that? I do not know.
Bob
I want to know the total marks of all student who are ‘FULLTIME’, subject that belongs to ‘MICROSOFT’ and taught by a teacher from a group called ‘IMS’
Hi Sck,
It’s not entirely clear from your example document and expected output what you’re trying to achieve. Below are example documents, which are extrapolated from your example to simulate approximately what you’re after:
{ "_id" : "1001", "student_name" : "Mac", "studGroup" : [ "Varsity", "Regular", "Fulltime" ], "subject" : "R", "score" : 9, "marks" : 90, "subjectGroup" : [ "Artificial_Intelligence", "Microsoft", "R" ], "teacher" : "Phil", "teacherGroup" : [ "IMS", "CognitiveServices", "MachineLearning" ] }
{ "_id" : "1002", "student_name" : "John", "studGroup" : [ "Regular", "Fulltime" ], "subject" : "R", "score" : 9, "marks" : 80, "subjectGroup" : [ "Artificial_Intelligence", "Microsoft" ], "teacher" : "Phil", "teacherGroup" : [ "IMS", "CognitiveServices" ] }
{ "_id" : "1003", "student_name" : "Mac", "studGroup" : [ "Regular", "Fulltime" ], "subject" : "Math", "score" : 9, "marks" : 80, "subjectGroup" : [ "Artificial_Intelligence", "Google" ], "teacher" : "Phil", "teacherGroup" : [ "IMS", "CognitiveServices" ] }
{ "_id" : "1004", "student_name" : "Mac", "studGroup" : [ "Regular", "Fulltime" ], "subject" : "Math", "score" : 9, "marks" : 60, "subjectGroup" : [ "Artificial_Intelligence", "Microsoft" ], "teacher" : "Robert", "teacherGroup" : [ "IMS" ] }
You can try below aggregation pipeline:
db.student.aggregate([
{"$match":{
"studGroup":"Fulltime",
"subjectGroup":"Microsoft",
"teacherGroup":"IMS"}
},
{"$group":{"_id":"$student_name",
"total_marks":{"$sum":"$marks"},
"subjectGroup":{"$addToSet":"$subjectGroup"},
"studGroup":{"$addToSet":"$studGroup"},
"teacherGroup":{"$addToSet":"$teacherGroup"}
}
}
]);
As you can see from the above example, three of the filters are grouped into a single $match stage.
The aggregation pipeline grouped the documents by student_name and just combined the groups of the student (duplicates).
Is the schema so bad ? We had nested documents earlier but the performance was terrible when the data was just around 3 mil .
Data Model suitability depends on your application interaction with the data. Without knowing your application requirements it’s hard to determine whether the schema is bad or good.
In addition to MongoDB University course that Bob’s mentioned, I would also recommend to review the following resources:
Regards,
Wan.
school : { "University" , "Institue" ... }
schedule :{ "Regular" , "PartTime" ... }
course_type : {"Fulltime" , "Distance" ....}
|-school
|-schedule
|-course_type
The data of these fields is represented as one single array in the document as "studGroup" : [ "University", "Regular", "Fulltime" ] => student group hierarchy
"please ignore the names of the fields"
The need is to group the data at any level -> school or schedule or course_type
and should be able to pass the values for the level chosen [course_type : 'FullTime']
and should be able to choose the number of levels to project ; E.g.[ $slice : 2] , to retrieve the parent/child field values. The outcome is formatted to show the actual fields
subjectGroup
teachGroup
it works but how do I use $slice with $addToSet .
Hi,
You can append $project stage after the $group stage, to specify $slice operation. Note that you can specify negative number to specify the starting position from the end of the array. See the operator manual for more information on usage.
If there is any better way of representing multiple parent hierarchies in a single document please suggest .
Depending on your use requirements, you can see more example schemas on Model Tree Structures.
Regards
Wan.