clean up of lmt database


lisa

Jul 16, 2012, 10:16:30 AM
to lmt-d...@googlegroups.com
Does anyone have any guidance on cleaning up the LMT database? Are there any scripts available to clean up logs or purge old records?

lisa

Andrew Uselton

Jul 16, 2012, 11:05:55 AM
to lmt-d...@googlegroups.com
I'm not sure what logs you mean. Are you talking about deleting records from the LMT MySQL database? I can give some guidance about doing that "manually". I don't believe there is any script or LMT-based utility for it. Give me a better idea of what you are looking for and I'd be happy to give you a pointer.
Cheers,
Andrew



Ryan Haasken

Aug 2, 2013, 4:19:31 PM
to lmt-d...@googlegroups.com
I also have questions about the LMT MySQL database.

1.  Does the MySQL database continue to grow as long as LMT is running, or does old data get deleted?

2.  Does the size of the MySQL database depend on the size of the filesystem (e.g. number of lustre servers) or on the amount of I/O performed on that filesystem?

3.  Is there a way within LMT to automatically purge old data in the database?  For example, keep only the most recent week of data.

4.  If there is no way to automatically purge old data in the database, how can the user manage the growing database which contains old data?  Perhaps a cron job which periodically deletes data older than a certain date?

Thanks,

Ryan

Jim Garlick

Aug 3, 2013, 11:04:43 AM
to lmt-d...@googlegroups.com
Hi Ryan,

Right, the amount of data should be proportional to the number of Lustre servers that comprise the file system being monitored.
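
To make the proportionality concrete, here is a rough back-of-the-envelope sketch. It assumes one row per server (or target) per sample in a given raw table, and it uses the 5-second sample interval mentioned later in this thread; both are assumptions to check against your own schema and configuration, and the function name is only illustrative.

```shell
#!/bin/sh
# Rough estimate of raw rows added per day to one table.
# Assumptions (not taken from the LMT docs): one row per target per sample,
# and a fixed sample interval given in seconds.
rows_per_day() {
    targets=$1
    interval_s=$2
    echo $(( 86400 / interval_s * targets ))
}

# Example: 16 targets sampled every 5 seconds.
rows_per_day 16 5
```

Under those assumptions the growth rate depends on the server count and the sample interval, not on how much I/O the file system sees.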

If you have a look at the schema:
you may note that the 'hourly' tables have a MAX_ROWS setting. I believe the "low resolution" tables are populated only if you run the hourly cron script. That script doesn't have an option to purge old data, but it would be a reasonable addition IMHO.

Regards,
Jim

Ryan Haasken

Oct 16, 2013, 2:10:10 PM
to lmt-d...@googlegroups.com
Hi Jim,

Thanks for the information.

You are correct about the low resolution tables only being populated if you run the hourly cron script.  I don't believe the MAX_ROWS table option is relevant here.  That won't actually enforce a limit on the size of the tables.  Here's what the MySQL documentation says about MAX_ROWS:

The maximum number of rows you plan to store in the table. This is not a hard limit, but rather a hint to the storage engine that the table must be able to store at least this many rows.

Source: http://dev.mysql.com/doc/refman/5.1/en/create-table.html

In fact, the MAX_ROWS value in the LMT schema is 2,000,000,000, which is about half the maximum allowed value for MAX_ROWS.  I think this means it's there to tell the storage engine that the table will potentially get very big.
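
Since MAX_ROWS does not bound the tables, one way to see how large they actually are is to ask MySQL's information_schema. The following is a minimal sketch, not part of LMT: the function name and the database name "filesystem_example" are placeholders, and TABLE_ROWS is only an estimate for some storage engines.

```shell
#!/bin/sh
# Build a query that reports the approximate row count and on-disk size (MB)
# of every table in one database, largest first.
# information_schema.TABLES is standard MySQL.
size_query() {
    db=$1
    printf "SELECT TABLE_NAME, TABLE_ROWS, ROUND((DATA_LENGTH + INDEX_LENGTH) / 1048576) AS SIZE_MB FROM information_schema.TABLES WHERE TABLE_SCHEMA = '%s' ORDER BY SIZE_MB DESC;\n" "$db"
}

# "filesystem_example" is a placeholder; substitute your LMT database name.
size_query "filesystem_example"
```

To run the query rather than just print it, the output can be piped to the client, for example: size_query "filesystem_example" | mysql --password="$(cat /etc/lmt/rwpasswd)"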

You said that an option to purge old data would be a reasonable addition to the aggregation cron job script.  I agree that the functionality should be added to LMT, but maybe it could be added as a separate shell script that could be set up as a cron job.  I'm thinking a shell script that does something like this would do the trick:

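#!/bin/sh
# Delete all MDS_OPS_DATA rows recorded before a given time.
# Note: no database is selected here; pass your LMT database name as a final
# argument to mysql, or set it in ~/.my.cnf.
TIMESTAMP="2013-10-16 00:00:00"
mysql --password="$(cat /etc/lmt/rwpasswd)" -e "DELETE MDS_OPS_DATA FROM MDS_OPS_DATA INNER JOIN TIMESTAMP_INFO ON MDS_OPS_DATA.TS_ID=TIMESTAMP_INFO.TS_ID WHERE TIMESTAMP_INFO.TIMESTAMP < '$TIMESTAMP';"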

I think this would be useful for deleting the raw data which gets added every 5 seconds.  A script could also be written that deletes aggregated data, and that could be set up to run less frequently as a cron job.  I suppose it's important to also keep old data around long enough for the lmt_agg.cron script to aggregate it into the lower resolution tables.  What do you think?
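
As a sketch of how the cron-driven version might look, the script below computes the cutoff from a retention window instead of hard-coding a date, and prints one DELETE per table so the statements can be reviewed before they are run. RETENTION_DAYS, TABLES, and the function names are illustrative assumptions, the only table named is the one from the example above, and date -d requires GNU date. It does not remove rows from TIMESTAMP_INFO itself.

```shell
#!/bin/sh
# Sketch: build purge statements for raw rows older than a retention window.
# RETENTION_DAYS and TABLES are illustrative values, not LMT settings.
RETENTION_DAYS=7
TABLES="MDS_OPS_DATA"   # add other raw tables that have a TS_ID column

# Print the cutoff as "YYYY-MM-DD HH:MM:SS", N days before now (GNU date).
cutoff_timestamp() {
    date -d "$1 days ago" "+%Y-%m-%d %H:%M:%S"
}

# Print one multi-table DELETE for the given table and cutoff.
build_purge_sql() {
    printf "DELETE %s FROM %s INNER JOIN TIMESTAMP_INFO ON %s.TS_ID = TIMESTAMP_INFO.TS_ID WHERE TIMESTAMP_INFO.TIMESTAMP < '%s';\n" "$1" "$1" "$1" "$2"
}

CUTOFF=$(cutoff_timestamp "$RETENTION_DAYS")
for t in $TABLES; do
    build_purge_sql "$t" "$CUTOFF"
done
```

Once the printed statements look right, the output could be piped to mysql from a daily cron entry. The retention window should stay comfortably longer than the interval at which lmt_agg.cron runs, so raw rows are aggregated before they are deleted.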

-Ryan