Better approach to deal with activity logs in MongoDB


Surinaidu Majji

Aug 22, 2016, 7:31:30 AM
to mongodb-user
We are integrating MongoDB into our application. In our case, we have an activity log for most of the operations that users perform.

So if we treat every activity as its own document, a lot of documents will be created in the collection. Basically, our document size comes to about 1 KB. So we are considering two possibilities:

1. Store 1k logs in a single document.
2. Store each activity log as its own document.
    
If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

If we follow the second approach, then while the client is reading the activity logs it has to scan a larger number of documents (every doc creates index which we don't need), so it may affect the storage and reading performance.
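
For concreteness, the two shapes would look something like this (a rough sketch with the MongoDB Java driver; field names and values are illustrative, and "collection" is an existing DBCollection):

import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;

// Option 1: one "bucket" document holding many log entries in an array
BasicDBList entries = new BasicDBList();
entries.add(new BasicDBObject("operation", "startDiscovery")
        .append("timestamp", "1470980265729"));
collection.insert(new BasicDBObject("count", entries.size())
        .append("entries", entries));

// Option 2: one document per activity log entry
collection.insert(new BasicDBObject("operation", "startDiscovery")
        .append("timestamp", "1470980265729"));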
    
Could anybody recommend one way or the other, or make other suggestions?

Rhys Campbell

Aug 22, 2016, 7:51:28 AM
to mongodb-user
We really can't say without seeing your data, example documents, and read use-cases, but I'd probably go for #2.

If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

There's a discussion here about measuring the size of a MongoDB document...


If we follow the second approach, then while the client is reading the activity logs it has to scan a larger number of documents (every doc creates index which we don't need), so it may affect the storage and reading performance.

How do you know scanning a large number of small documents will be worse than scanning a smaller number of 1K documents?

every doc creates index which we don't need) <- What does this mean?

Surinaidu Majji

Aug 23, 2016, 6:53:32 AM
to mongodb-user
every doc creates index which we don't need) <- What does this mean?
Whenever I store each activity log as its own document, it will create a default "_id", which is indexed by default. That way, we will have more index entries if we create more documents. I felt that finding information across a larger number of documents is slower than finding it across fewer documents.
Please let me know if I am wrong.

Rhys Campbell

Aug 23, 2016, 7:07:19 AM
to mongodb-user
The default _id index is not likely to be causing you any problem at all. It is a sensible default.
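
You can check this yourself (a minimal sketch with the legacy Java driver, assuming an existing MongoClient; the collection name is illustrative):

import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;

DB db = mongoClient.getDB("mint");
DBCollection logs = db.getCollection("activityLogs");

// The _id index exists once per collection, not once per document: however
// many documents you insert, the index list stays the same.
for (DBObject indexInfo : logs.getIndexes()) {
    System.out.println(indexInfo.get("name")); // prints "_id_" unless you added more
}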

Nobody can advise you until you provide detailed information about your data: sample documents, the queries you want to run, and so on.

Surinaidu Majji

Aug 23, 2016, 7:18:12 AM
to mongodb-user
Please find below my sample activity log, which is only about 1/2 KB:


{
  "log": {
    "accountId": "0",
    "info1": {
      "itemName": "-",
      "value": "-"
    },
    "info2": {
      "itemName": "-",
      "value": "-"
    },
    "errorCode": "",
    "internalInformation": "",
    "kind": "Infomation",
    "loginId": "0",
    "opeLogId": "G1_1",
    "operation": "startDiscovery",
    "result": "normal",
    "targetId": "1",
    "timestamp": "1470980265729",
    "undoFlag": "false"

Surinaidu Majji

Aug 29, 2016, 1:31:35 AM
to mongodb-user
Please find below my activity log document, together with the code we use to read it:
{
  "log": {
    "accountId": "0",
    "info1": {
      "itemName": "-",
      "value": "-"
    },
    "info2": {
      "itemName": "-",
      "value": "-"
    },
    "errorCode": "",
    "internalInformation": "",
    "kind": "Infomation",
    "loginId": "0",
    "opeLogId": "G1_1",
    "operation": "startDiscovery",
    "result": "normal",
    "targetId": "1",
    "timestamp": "1470980265729",
    "undoFlag": "false"
  }
}

import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;

// Projection: return only the fields the client needs
BasicDBObject projectFields = new BasicDBObject();
projectFields.append("log.accountId", 1).append("_id", 0)
        .append("log.info1.value", 1).append("log.operation", 1)
        .append("log.result", 1).append("log.timestamp", 1);

// Query: accountId is stored as a string in the document, so match "0", not 0
BasicDBObject queryCondition = new BasicDBObject();
queryCondition.append("log.accountId", "0");

List<String> retList = new ArrayList<String>();
DB mintDB = mongoClient.getDB("mint");
DBCollection targetCollection = mintDB.getCollection(collectionName);

// skip, limit and sortFields are supplied by the caller
DBCursor cursor = targetCollection.find(queryCondition, projectFields)
        .skip(skip).limit(limit).sort(sortFields);
while (cursor.hasNext()) {
    retList.add(cursor.next().toString());
}
return retList;

Kindly let me know if any further information is required.

Surinaidu Majji

Aug 29, 2016, 7:07:26 AM
to mongodb-user
Hello Rhys Campbell,
Please let me know your opinion on the information I have provided.

Amar

Sep 1, 2016, 9:59:16 PM
to mongodb-user

Hi,

Schema design in MongoDB is highly dependent on your use case, your goals, and the data access patterns of your application. Factors like how the data is generated and accessed should determine your schema design. If you want reads to be fast, the schema should align with the way the reads will be performed.

If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

Could you elaborate on the significance of the 1KB number? My understanding is that you want to “buffer” your logs until their total size reaches 1KB, at which point you write them as a single document (instead of writing the individual log entries). In this case, the logic must be coded in your application.
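
For example, the buffering could be as simple as this (a rough sketch only; the class and field names are illustrative, and the size estimate is approximate):

import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;

// Hypothetical helper: buffer entries in the application and flush them as a
// single document once the accumulated size crosses the threshold.
class LogBuffer {
    private static final int THRESHOLD_BYTES = 1024; // the 1 KB figure from your question

    private final DBCollection collection;
    private final List<BasicDBObject> pending = new ArrayList<BasicDBObject>();
    private int pendingBytes = 0;

    LogBuffer(DBCollection collection) {
        this.collection = collection;
    }

    void add(BasicDBObject entry) {
        pending.add(entry);
        pendingBytes += entry.toString().length(); // rough JSON-size estimate
        if (pendingBytes >= THRESHOLD_BYTES) {
            flush();
        }
    }

    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Write all buffered entries as one document
        collection.insert(new BasicDBObject("logs",
                new ArrayList<BasicDBObject>(pending)));
        pending.clear();
        pendingBytes = 0;
    }
}

One caveat with buffering in the application: entries that have not been flushed yet are lost if the process dies.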

Whenever I store each activity log as its own document, it will create a default "_id", which is indexed by default

Remember that you can choose the value of _id when inserting documents (as long as it is unique); using it in your queries, if you can, pays off, since indexes are essential for fast reads. If you don't provide an _id field, a default one will be generated for you.
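
For example (a sketch only; the values come from your sample document, the compound index is an assumption based on the query you posted earlier, and "collection" is an existing DBCollection):

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

// Supplying our own unique _id (here reusing opeLogId from the sample)
// instead of letting the driver generate an ObjectId:
BasicDBObject doc = new BasicDBObject("_id", "G1_1")
        .append("log", new BasicDBObject("accountId", "0")
                .append("operation", "startDiscovery")
                .append("timestamp", "1470980265729"));
collection.insert(doc);

// A read by _id then uses the default index:
DBObject found = collection.findOne(new BasicDBObject("_id", "G1_1"));

// For the query you posted (filter on log.accountId, sorted results), a
// secondary compound index would avoid a full collection scan:
collection.createIndex(new BasicDBObject("log.accountId", 1)
        .append("log.timestamp", 1));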

You can find a lot of useful information in MongoDB Use Cases and in particular there is a Storing Log Data Use Case.

Regards,

Amar


Surinaidu Majji

Sep 10, 2016, 5:10:56 AM
to mongodb-user
Hi Amar,
Thank you for your suggestions.

 Could you elaborate on the significance of the 1KB number? My understanding is that you want to “buffer” your logs until their total size reaches 1KB, at which point you write them as a single document (instead of writing the individual log entries). In this case, the logic must be coded in your application.


A small correction: we want to keep storing logs in one document until it reaches 8 MB. We write one log at a time: the document is created by storing the first ~0.5 KB log entry in an array, and every subsequent write appends to that array until the document reaches 8 MB; after that, new logs go into a new document. All of this logic is controlled by our application.
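
Something along these lines, perhaps (a rough sketch only; it caps by entry count rather than measuring bytes on every write, since 16,000 entries of ~0.5 KB each come to roughly 8 MB, and the field names are illustrative):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;

class LogBuckets {
    // ~16,000 entries of ~0.5 KB each come to roughly 8 MB per bucket document
    private static final int MAX_ENTRIES = 16000;

    // One write per log entry: $push onto a bucket that still has room. With
    // upsert=true, if no bucket matches count < MAX_ENTRIES, MongoDB creates a
    // fresh bucket document automatically, so rolling over to a new document
    // needs no extra application logic.
    static void appendLog(DBCollection buckets, BasicDBObject entry) {
        BasicDBObject query = new BasicDBObject("count",
                new BasicDBObject("$lt", MAX_ENTRIES));
        BasicDBObject update = new BasicDBObject(
                "$push", new BasicDBObject("logs", entry))
                .append("$inc", new BasicDBObject("count", 1));
        buckets.update(query, update, true /* upsert */, false /* multi */);
    }
}

Note the 16 MB BSON document limit: an 8 MB bucket fits, but updates to very large array documents get more expensive as the document grows, so smaller buckets may perform better in practice.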