Additional indexes on DB improved time for running megatron.sh and therefore mega-sender.py

19 views
Skip to first unread message

Kick Megatron

unread,
Oct 8, 2013, 8:17:57 AM10/8/13
to megatron...@googlegroups.com
Hi,
Over the last weeks we have taken Megatron in production to auto parse and sent shadowserver and another source for our complete infrastructure.
For last week we processed 139075 issues. It all works well, but I saw that the complete time a megatron-python/script/mega-sender.py script would run was exceeding an hour (I have removed the manual interaction, not working on the volumes we process :-)).

An investigation showed that the DB interaction was bringing mysql to 100% for a long time.

Therefore the below indexes have been added in our setup, which improved the situation quiet a bit for us.

Maybe useful for others as well.

Regards,
Kick

===
create index mj_started on mail_job (started);
create index mj_finished on mail_job (finished);
create index mjlem_job_id on mail_job_log_entry_mapping (mail_job_id);
create index mjlem_log_id on mail_job_log_entry_mapping (log_entry_id);
===


Before index
===
2013-10-06 06:27:35.271 INFO  - Mail sending finished. Sent Mails: 12, Sent Entries: 10983, Quarantined Entries: 0, Filtered Entries: 0, ID: 106, RTIR ID: -, Duration: 0:15:01.809
2013-10-06 06:53:17.446 INFO  - Mail sending finished. Sent Mails: 11, Sent Entries: 2427, Quarantined Entries: 0, Filtered Entries: 0, ID: 107, RTIR ID: -, Duration: 0:25:38.027
2013-10-06 16:26:05.072 INFO  - Mail sending finished. Sent Mails: 11, Sent Entries: 18549, Quarantined Entries: 0, Filtered Entries: 0, ID: 108, RTIR ID: -, Duration: 0:26:43.947
2013-10-07 04:27:29.216 INFO  - Mail sending finished. Sent Mails: 11, Sent Entries: 2933, Quarantined Entries: 0, Filtered Entries: 0, ID: 109, RTIR ID: -, Duration: 0:27:20.613
2013-10-07 08:04:03.912 INFO  - Mail sending finished. Sent Mails: 12, Sent Entries: 10284, Quarantined Entries: 0, Filtered Entries: 0, ID: 111, RTIR ID: -, Duration: 0:14:53.857
2013-10-07 08:04:35.682 INFO  - Mail sending finished. Sent Mails: 11, Sent Entries: 271, Quarantined Entries: 0, Filtered Entries: 0, ID: 112, RTIR ID: -, Duration: 0:00:27.740
===


After index
===
2013-10-07 13:32:57.595 INFO  - Mail sending finished. Sent Mails: 9, Sent Entries: 1074, Quarantined Entries: 0, Filtered Entries: 0, ID: 113, RTIR ID: -, Duration: 0:00:07.166
2013-10-08 03:34:28.138 INFO  - Mail sending finished. Sent Mails: 12, Sent Entries: 10766, Quarantined Entries: 0, Filtered Entries: 0, ID: 115, RTIR ID: -, Duration: 0:00:47.731
2013-10-08 03:34:44.735 INFO  - Mail sending finished. Sent Mails: 10, Sent Entries: 1795, Quarantined Entries: 0, Filtered Entries: 0, ID: 116, RTIR ID: -, Duration: 0:00:12.623
2013-10-08 07:32:51.960 INFO  - Mail sending finished. Sent Mails: 12, Sent Entries: 273, Quarantined Entries: 0, Filtered Entries: 0, ID: 118, RTIR ID: -, Duration: 0:00:03.564
===

Megatron Developer

unread,
Oct 8, 2013, 8:52:29 AM10/8/13
to megatron...@googlegroups.com
Thanks a lot Kick!
 
The number one performance bottleneck in Megatron is quarantine SQL-queries, which is executed in DbManager.existsMailForIp. To avoid intrusive abuse emails, CERT-SE is only sending one email per IP address and week. This function is called "quarantine" in Megatron.
 
If you don't want to use quarantine, set the property to 0 (this will speed up things):
 
mail.ipQuarantinePeriod=0
 
We have added the indices and got a huge performance gain.
 
Example: megatron.sh --list-jobs 25
 
Result:
* Before indices: ~1 minute 40 seconds
* After indices: ~12 seconds
 
 
Thanks again, Kick!
 
  /Tor
 

Megatron Developer

unread,
Oct 8, 2013, 8:56:00 AM10/8/13
to megatron...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages