There's some extra overhead to running a find-style query through the aggregation pipeline, but performance should generally be similar for a query like
db.test.find(Q).sort(S).limit(L)
and pipelines like
db.test.aggregate([
{ "$match" : Q },
{ "$sort" : S },
{ "$limit" : L }
])
One important difference is that an aggregation pipeline cannot be covered by an index.
> db.test.drop()
> db.test.insert({ "a" : 1 })
> db.test.insert({ "a" : 2 })
> db.test.createIndex({ "a" : 1 })
> db.test.find({ "a" : 1 }, { "_id" : 0, "a" : 1 }) // covered, only needs to read index and not actual documents
> db.test.aggregate([
{ "$match" : { "a" : 1 } },
{ "$project" : { "_id" : 0, "a" : 1 } }
]) // still needs to read the matching documents, even though it looks equivalent to the find above
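A rough way to picture the difference (a toy model in plain JavaScript, not mongo shell, and not anything MongoDB actually exposes): treat the index as a sorted array of key/docId entries and the collection as a docId-to-document map. The covered find can be answered from the index entries alone, while the pipeline still has to fetch each matching document.

```javascript
// Toy model of a covered query. The "index" is an array of { key, docId }
// entries and the "collection" maps docId -> full document. All names here
// are illustrative, not MongoDB internals.
const collection = { 1: { _id: 1, a: 1 }, 2: { _id: 2, a: 2 } };
const indexOnA = [{ key: 1, docId: 1 }, { key: 2, docId: 2 }];

// Covered: the projection { "a" : 1 } is satisfied by index keys alone.
function coveredFind(aValue) {
  return indexOnA.filter(e => e.key === aValue).map(e => ({ a: e.key }));
}

// Not covered: the pipeline performs an extra lookup per matching document.
function pipelineMatchProject(aValue) {
  return indexOnA
    .filter(e => e.key === aValue)      // $match via the index
    .map(e => collection[e.docId])      // extra document fetch
    .map(doc => ({ a: doc.a }));        // $project
}

console.log(coveredFind(1));            // [ { a: 1 } ]
console.log(pipelineMatchProject(1));   // [ { a: 1 } ]  (same result, more work)
```

Both return the same result; the pipeline just does more I/O to get there.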
I'd prefer find for jobs that find can do, and reserve the aggregation pipeline for jobs find can't.
Also, pagination is usually better done with $gt/$lt and limit in combination with an (indexed) sort, rather than skip+limit. For example, to paginate the results of the following query:
db.test.find().sort({ "a" : 1 })
add the appropriate page size to the query
db.test.find().sort({ "a" : 1 }).limit(50)
then, to retrieve the next page, save the a value of the last document on the current page (for an ascending sort, the highest a value seen) and use $gt to seek directly to the next documents via the index, rather than skipping over them:
db.test.find({ "a" : { "$gt" : highest_a_value_of_previous_page } }).sort({ "a" : 1 }).limit(50)
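The pattern can be sketched end to end in plain JavaScript over an in-memory array (the helper name fetchPage is made up for illustration; in MongoDB the filter/sort/slice steps correspond to $gt, sort, and limit):

```javascript
// Toy keyset ("seek") pagination mirroring the $gt + sort + limit pattern.
function fetchPage(docs, lastA, pageSize) {
  return docs
    .filter(d => lastA === undefined || d.a > lastA) // the $gt predicate
    .sort((x, y) => x.a - y.a)                       // the ascending sort on a
    .slice(0, pageSize);                             // the limit
}

const docs = Array.from({ length: 7 }, (_, i) => ({ a: i + 1 }));

const page1 = fetchPage(docs, undefined, 3);
const lastA = page1[page1.length - 1].a;             // remember the boundary value
const page2 = fetchPage(docs, lastA, 3);

console.log(page1.map(d => d.a)); // [ 1, 2, 3 ]
console.log(page2.map(d => d.a)); // [ 4, 5, 6 ]
```

The key point is that each page is fetched by value, not by offset, so the database can seek straight to the boundary in the index instead of scanning and discarding the skipped documents.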
-Will