We're implementing database operation tracing by adding a trace ID to the $comment or comment element of most queries right before sending them to MongoDB through the Java driver, so that this ID appears in the database log for slow queries. For most operations that support this comment element this seems to work just fine, but we have some concerns regarding aggregations. The docs at
https://docs.mongodb.com/manual/reference/operator/query/comment/#op._S_comment
and
https://docs.mongodb.com/manual/reference/operator/query/comment/#ex-comment-agg-expression
say "You can use the $comment with any expression taking a query predicate", so for aggregations that means the $match stage.
Our tracing code that handles aggregations looks for a $match stage in the pipeline; if one is found, it adds the comment to it and lets the operation proceed to the driver. If none is found, it inserts a new $match stage, with the comment as its sole attribute, as the first element of the pipeline (this is because some of our aggregations don't make use of $match).
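To make that logic concrete, here is a minimal Java sketch of it. It is an illustration only, not the real tracing code: the class name TraceComment and the method addTraceComment are hypothetical, and it models the pipeline with plain java.util maps and lists instead of the driver's document types so that it stays self-contained. It assumes "a $match stage" means the first one found in the pipeline.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names are assumptions made for this example.
final class TraceComment {

    // Returns a copy of the pipeline carrying the trace ID as a $comment.
    // The comment goes into the first $match stage if there is one;
    // otherwise a comment-only $match stage is prepended.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> addTraceComment(
            List<Map<String, Object>> pipeline, String traceId) {
        List<Map<String, Object>> traced = new ArrayList<>();
        boolean added = false;

        for (Map<String, Object> stage : pipeline) {
            Object match = stage.get("$match");
            if (!added && match instanceof Map) {
                // Copy the predicate and the stage so the caller's objects are not mutated.
                Map<String, Object> predicate =
                        new LinkedHashMap<>((Map<String, Object>) match);
                predicate.put("$comment", traceId);
                Map<String, Object> newStage = new LinkedHashMap<>(stage);
                newStage.put("$match", predicate);
                traced.add(newStage);
                added = true;
            } else {
                traced.add(stage);
            }
        }

        if (!added) {
            // No $match anywhere: insert the "dummy" stage as the first element.
            Map<String, Object> predicate = new LinkedHashMap<>();
            predicate.put("$comment", traceId);
            Map<String, Object> dummy = new LinkedHashMap<>();
            dummy.put("$match", predicate);
            traced.add(0, dummy);
        }
        return traced;
    }
}
```

Calling addTraceComment on a pipeline that contains only $unwind and $group stages yields a pipeline whose first element is a $match stage holding nothing but the $comment, which is the case the question is about.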
So, looking at the system.profile for debugging purposes, an aggregation would look like this:
Before tracing:
--------------
{
"op" : "command",
"ns" : "cloudwife.cw_task_discussion",
"command" : {
"aggregate" : "cw_task_discussion",
"pipeline" : [
{ "$unwind" : "$comments" },
{ "$group" : { "_id" : "commentsCount", "count" : { "$sum" : NumberInt(1) } } }
]
}, ...

After tracing it looks like this:
--------------------
{
"op" : "command",
"ns" : "cloudwife.cw_task_discussion",
"command" : {
"aggregate" : "cw_task_discussion",
"pipeline" : [
{ "$match" : { "$comment" : "[amzn_trace_id:-]" } },
{ "$unwind" : "$comments" },
{ "$group" : { "_id" : "commentsCount", "count" : { "$sum" : NumberInt(1) } } }
]
}, ...

This newly added "dummy" $match stage should not alter the behaviour or performance of the aggregation, should it? I'm specifically wondering whether it might cause the aggregation to use a different index. Any advice?
Thank you and kind regards,
Daniel Rodríguez