Better approach to deal with activity logs in MongoDB


Surinaidu Majji

Aug 22, 2016, 7:31:30 AM
to mongodb-user
We are integrating MongoDB into our application. In our case, we have an activity log for most of the operations that users perform.

So if we treat every activity as its own document, a lot of documents will be created in the collection. Basically, our document size comes to about 1 KB. So we are considering two possibilities:

1. Store 1k logs in a single document.
2. Store each activity log as its own document.
    
If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

If we follow the second approach, then while the client is reading the activity logs it has to scan a larger number of documents (every doc creates index which we don't need), so it may affect the storage and reading performance.
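
For concreteness, the two shapes would look something like this (a rough sketch with the MongoDB Java driver; field names and values are illustrative, and "collection" is an existing DBCollection):

import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;

// Option 1: one "bucket" document holding many log entries in an array
BasicDBList entries = new BasicDBList();
entries.add(new BasicDBObject("operation", "startDiscovery")
        .append("timestamp", "1470980265729"));
collection.insert(new BasicDBObject("count", entries.size())
        .append("entries", entries));

// Option 2: one document per activity log entry
collection.insert(new BasicDBObject("operation", "startDiscovery")
        .append("timestamp", "1470980265729"));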
    
Could anybody recommend one way or the other, or make other suggestions?

Rhys Campbell

Aug 22, 2016, 7:51:28 AM
to mongodb-user
We really can't say without seeing your data, example documents, and read use-cases, but I'd probably go for #2.

If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

There's a discussion here about measuring the size of a MongoDB document...


If we follow the second approach, then while the client is reading the activity logs it has to scan a larger number of documents (every doc creates index which we don't need), so it may affect the storage and reading performance.

How do you know scanning a large number of small documents will be worse than scanning a smaller number of 1K documents?

every doc creates index which we don't need) <- What does this mean?

Surinaidu Majji

Aug 23, 2016, 6:53:32 AM
to mongodb-user
every doc creates index which we don't need) <- What does this mean?
Whenever I store each activity log as its own document, it will create a default "_id", which is indexed by default. That way, we will have more index entries if we create more documents. I felt that finding information across a larger number of documents is slower than finding it across fewer documents.
Please let me know if I am wrong.

Rhys Campbell

Aug 23, 2016, 7:07:19 AM
to mongodb-user
The default _id index is not likely to be causing you any problem at all. It is a sensible default.
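
You can check this yourself (a minimal sketch with the legacy Java driver, assuming an existing MongoClient; the collection name is illustrative):

import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;

DB db = mongoClient.getDB("mint");
DBCollection logs = db.getCollection("activityLogs");

// The _id index exists once per collection, not once per document: however
// many documents you insert, the index list stays the same.
for (DBObject indexInfo : logs.getIndexes()) {
    System.out.println(indexInfo.get("name")); // prints "_id_" unless you added more
}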

Nobody can advise you until you provide detailed information about your data: sample documents, the queries you want to run, and so on.

Surinaidu Majji

Aug 23, 2016, 7:18:12 AM
to mongodb-user
Please find below my sample activity log, which is only about 1/2 KB:


{
  "log": {
    "accountId": "0",
    "info1": {
      "itemName": "-",
      "value": "-"
    },
    "info2": {
      "itemName": "-",
      "value": "-"
    },
    "errorCode": "",
    "internalInformation": "",
    "kind": "Infomation",
    "loginId": "0",
    "opeLogId": "G1_1",
    "operation": "startDiscovery",
    "result": "normal",
    "targetId": "1",
    "timestamp": "1470980265729",
    "undoFlag": "false"

Surinaidu Majji

Aug 29, 2016, 1:31:35 AM
to mongodb-user
Please find below my activity log document, together with the code we use to read it:
{
  "log": {
    "accountId": "0",
    "info1": {
      "itemName": "-",
      "value": "-"
    },
    "info2": {
      "itemName": "-",
      "value": "-"
    },
    "errorCode": "",
    "internalInformation": "",
    "kind": "Infomation",
    "loginId": "0",
    "opeLogId": "G1_1",
    "operation": "startDiscovery",
    "result": "normal",
    "targetId": "1",
    "timestamp": "1470980265729",
    "undoFlag": "false"
  }
}

import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;

// Projection: return only the fields the client needs
BasicDBObject projectFields = new BasicDBObject();
projectFields.append("log.accountId", 1).append("_id", 0)
        .append("log.info1.value", 1).append("log.operation", 1)
        .append("log.result", 1).append("log.timestamp", 1);

// Query: accountId is stored as a string in the document, so match "0", not 0
BasicDBObject queryCondition = new BasicDBObject();
queryCondition.append("log.accountId", "0");

List<String> retList = new ArrayList<String>();
DB mintDB = mongoClient.getDB("mint");
DBCollection targetCollection = mintDB.getCollection(collectionName);

// skip, limit and sortFields are supplied by the caller
DBCursor cursor = targetCollection.find(queryCondition, projectFields)
        .skip(skip).limit(limit).sort(sortFields);
while (cursor.hasNext()) {
    retList.add(cursor.next().toString());
}
return retList;

Kindly let me know if any further information is required.

Surinaidu Majji

Aug 29, 2016, 7:07:26 AM
to mongodb-user
Hello Rhys Campbell,
Please let me know your opinion on the information I have provided.

Amar

Sep 1, 2016, 9:59:16 PM
to mongodb-user

Hi,

Schema design in MongoDB is highly dependent on your use case, your goals, and the data access patterns of your application. Factors like how the data is generated and accessed should determine your schema design. If you want reads to be fast, the schema should align with the way the reads will be performed.

If we follow the first approach, we don't know how to stop storing activity logs once a document is filled with 1k logs.

Could you elaborate on the significance of the 1KB number? My understanding is that you want to “buffer” your logs until their total size reaches 1KB, at which point you write them as a single document (instead of writing the individual log entries). In this case, the logic must be coded in your application.
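
For example, the buffering could be as simple as this (a rough sketch only; the class and field names are illustrative, and the size estimate is approximate):

import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;

// Hypothetical helper: buffer entries in the application and flush them as a
// single document once the accumulated size crosses the threshold.
class LogBuffer {
    private static final int THRESHOLD_BYTES = 1024; // the 1 KB figure from your question

    private final DBCollection collection;
    private final List<BasicDBObject> pending = new ArrayList<BasicDBObject>();
    private int pendingBytes = 0;

    LogBuffer(DBCollection collection) {
        this.collection = collection;
    }

    void add(BasicDBObject entry) {
        pending.add(entry);
        pendingBytes += entry.toString().length(); // rough JSON-size estimate
        if (pendingBytes >= THRESHOLD_BYTES) {
            flush();
        }
    }

    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Write all buffered entries as one document
        collection.insert(new BasicDBObject("logs",
                new ArrayList<BasicDBObject>(pending)));
        pending.clear();
        pendingBytes = 0;
    }
}

One caveat with buffering in the application: entries that have not been flushed yet are lost if the process dies.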

Whenever I store each activity log as its own document, it will create a default "_id", which is indexed by default

Remember that you can choose the value of _id when inserting documents (as long as it is unique); using it in your queries, if you can, pays off, since indexes are essential for fast reads. If you don't provide an _id field, a default one will be generated for you.
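
For example (a sketch only; the values come from your sample document, the compound index is an assumption based on the query you posted earlier, and "collection" is an existing DBCollection):

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

// Supplying our own unique _id (here reusing opeLogId from the sample)
// instead of letting the driver generate an ObjectId:
BasicDBObject doc = new BasicDBObject("_id", "G1_1")
        .append("log", new BasicDBObject("accountId", "0")
                .append("operation", "startDiscovery")
                .append("timestamp", "1470980265729"));
collection.insert(doc);

// A read by _id then uses the default index:
DBObject found = collection.findOne(new BasicDBObject("_id", "G1_1"));

// For the query you posted (filter on log.accountId, sorted results), a
// secondary compound index would avoid a full collection scan:
collection.createIndex(new BasicDBObject("log.accountId", 1)
        .append("log.timestamp", 1));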

You can find a lot of useful information in MongoDB Use Cases and in particular there is a Storing Log Data Use Case.

Regards,

Amar


Surinaidu Majji

Sep 10, 2016, 5:10:56 AM
to mongodb-user
Hi Amar,
Thank you for your suggestions.

 Could you elaborate on the significance of the 1KB number? My understanding is that you want to “buffer” your logs until their total size reaches 1KB, at which point you write them as a single document (instead of writing the individual log entries). In this case, the logic must be coded in your application.


A small correction: we want to keep storing logs in one document until it reaches 8 MB. We write one log at a time: the document is created by storing the first ~0.5 KB log entry in an array, and every subsequent write appends to that array until the document reaches 8 MB; after that, new logs go into a new document. All of this logic is controlled by our application.
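
Something along these lines, perhaps (a rough sketch only; it caps by entry count rather than measuring bytes on every write, since 16,000 entries of ~0.5 KB each come to roughly 8 MB, and the field names are illustrative):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;

class LogBuckets {
    // ~16,000 entries of ~0.5 KB each come to roughly 8 MB per bucket document
    private static final int MAX_ENTRIES = 16000;

    // One write per log entry: $push onto a bucket that still has room. With
    // upsert=true, if no bucket matches count < MAX_ENTRIES, MongoDB creates a
    // fresh bucket document automatically, so rolling over to a new document
    // needs no extra application logic.
    static void appendLog(DBCollection buckets, BasicDBObject entry) {
        BasicDBObject query = new BasicDBObject("count",
                new BasicDBObject("$lt", MAX_ENTRIES));
        BasicDBObject update = new BasicDBObject(
                "$push", new BasicDBObject("logs", entry))
                .append("$inc", new BasicDBObject("count", 1));
        buckets.update(query, update, true /* upsert */, false /* multi */);
    }
}

Note the 16 MB BSON document limit: an 8 MB bucket fits, but updates to very large array documents get more expensive as the document grows, so smaller buckets may perform better in practice.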