scaling projections for dashboard database?

Jo Rhett

Jan 9, 2012, 1:40:00 PM
to Puppet Users
So I got dashboard up and running on our production system on Thursday before I left. Within 48 hours it had completely filled the /var filesystem.  The ibdata1 file is currently at 8 GB in size.

1. What size should I expect for ~500 nodes reporting every 30 minutes?

2. Are there some database cleanup scripts which I have managed to overlook that need to be run?

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness

Darin Perusich

Jan 9, 2012, 2:30:52 PM
to puppet...@googlegroups.com
Hi Jo,

The ibdata1 file only grows and never shrinks, so I'd recommend
setting "innodb_file_per_table" in /etc/my.cnf. You'll need to
go through the steps to purge ibdata1 first (Google is your friend),
but you'll no longer have the ever-growing ibdata1 file. You probably
have a bunch of old mysql-bin.0* replication logs that can be nuked as
well.

I'll be happy once the dashboard supports PostgreSQL.

--
Later,
Darin

> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Users" group.
> To post to this group, send email to puppet...@googlegroups.com.
> To unsubscribe from this group, send email to
> puppet-users...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/puppet-users?hl=en.

Jo Rhett

Jan 9, 2012, 3:43:00 PM
to puppet...@googlegroups.com
On Jan 9, 2012, at 11:30 AM, Darin Perusich wrote:
> The ibdata1 file only grows and never shrinks, so I'd recommend
> setting "innodb_file_per_table" in /etc/my.cnf. You'll need to
> go through the steps to purge ibdata1 first (Google is your friend),
> but you'll no longer have the ever-growing ibdata1 file.

I'm not tracking this answer.  I'm familiar with that option, and it means that instead of one I will have eighteen ever-growing files, right?  How does this change the total space used?

I have no problem with the database size never getting smaller on disk. I'm just curious what size it is expected to grow to, and whether there are any cleanup scripts that should be run to free rows.

Darin Perusich

Jan 9, 2012, 4:06:13 PM
to puppet...@googlegroups.com
When MySQL is running with innodb_file_per_table enabled, you can use
"OPTIMIZE TABLE" to free space in the per-table files. With a single
shared ibdata file, it does not shrink the file. I'm not aware of any
cleanup scripts or what size you should expect the db to grow to.
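
To see which tables are actually eating the space before and after an OPTIMIZE TABLE, something along these lines may help. It is only a sketch that builds the SQL text: the schema name "dashboard" and the helper names are assumptions for illustration, and you would still run the statements yourself through the mysql client.

```python
from typing import List


def table_size_query(schema: str) -> str:
    """Build a query listing per-table data, index and free bytes."""
    # information_schema.TABLES is a standard MySQL view; data_free is
    # space allocated to the table but not currently in use.
    return (
        "SELECT table_name, data_length, index_length, data_free "
        "FROM information_schema.TABLES "
        f"WHERE table_schema = '{schema}' "
        "ORDER BY data_length + index_length DESC;"
    )


def optimize_statement(tables: List[str]) -> str:
    """Build one OPTIMIZE TABLE statement covering the given tables."""
    if not tables:
        raise ValueError("need at least one table name")
    # Backtick-quote identifiers so unusual names do not break the SQL.
    quoted = ", ".join(f"`{name}`" for name in tables)
    return f"OPTIMIZE TABLE {quoted};"


if __name__ == "__main__":
    # "dashboard" is an assumed schema name; substitute your own.
    print(table_size_query("dashboard"))
    print(optimize_statement(["resource_statuses"]))
```

Note that OPTIMIZE TABLE rebuilds the table, so expect it to take a while and need extra disk space on a large table.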

--
Later,
Darin

Stefan Heijmans

Jan 9, 2012, 5:16:56 PM
to puppet...@googlegroups.com
On Monday, January 9, 2012 19:40:00 UTC+1, Jo wrote:
> 2. Are there some database cleanup scripts which I have managed to overlook that need to be run?

Have you tried this?

Perhaps also give 'optimize the database' a try.

Stefan

Jo Rhett

Jan 9, 2012, 5:47:48 PM
to puppet...@googlegroups.com
Yeah I saw these. We had a whopping 3 days of collected reports.  I think we want a bit more than that available for browsing ;-)  I was wondering if there was some hourly cleanup or something which needed to be done?

Is there any reasonable estimate for what amount of space you expect one system to use?  I realize this likely varies with the report size, but the rate of growth seems high enough that I'm surprised it wasn't mentioned in the installation docs.  I mean, it's grown half a gigabyte in the last 6 hours.  With that kind of growth rate, you'd expect a warning to provide enough space for it and how to estimate your needs.

Daniel Pittman

Jan 9, 2012, 6:31:03 PM
to puppet...@googlegroups.com

That growth rate seems ... excessive. Ultimately, the size of the
stored data is pretty directly related to the size of your YAML
reports; can you capture one of those and see how big it is on disk?

Daniel
--
⎋ Puppet Labs Developer – http://puppetlabs.com
♲ Made with 100 percent post-consumer electrons

Christopher Johnston

Jan 9, 2012, 6:55:54 PM
to puppet...@googlegroups.com
How often are you running puppet?  I have 1200 nodes running a few times a week and our growth is nothing like that.




Jo Rhett

Jan 9, 2012, 7:32:01 PM
to puppet...@googlegroups.com
A little less than 500 nodes running every 30 minutes.  We do have some extensive modules though, and the reports from software deployments are quite large.

Can you share what size your database has grown to?

Jo Rhett

Jan 9, 2012, 11:06:28 PM
to puppet...@googlegroups.com
On Jan 9, 2012, at 3:31 PM, Daniel Pittman wrote:
>> Is there any reasonable estimate for what amount of space you expect one
>> system to use?  I realize this likely varies with the report size, but the
>> rate of growth seems high enough that I'm surprised it wasn't mentioned in
>> the installation docs.  I mean, it's grown half a gigabyte in the last 6
>> hours.  With that kind of growth rate, you'd expect a warning to provide
>> enough space for it and how to estimate your needs.
>
> That growth rate seems ... excessive.  Ultimately, the size of the
> stored data is pretty directly related to the size of your YAML
> reports; can you capture one of those and see how big it is on disk?

FYI, in 10 hours the database has grown slightly more than 1 GB. That's a substantial growth rate.

Looking at the YAML files, I'm seeing 410 KB per file * 400 nodes = about 160 MB per 30 minutes.

Is there really no optimization that is performed on the data stored in the database?  Coming up with a few hundred gigabytes of file storage is one thing.  Trying to make MySQL perform well with a 100 GB database is an entirely different matter.
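
For anyone who wants to redo this arithmetic with their own numbers, here is a minimal Python sketch. It uses only the figures quoted in this message (410 KB per report, 400 nodes, a 30-minute interval, roughly 1 GB of database growth in 10 hours). The 30-day retention window and the function names are assumptions for illustration, and the projection simply assumes growth stays linear.

```python
def yaml_bytes_per_day(report_kb: float, nodes: int, interval_minutes: int) -> float:
    """Raw YAML volume written per day if every node reports each interval."""
    runs_per_day = 24 * 60 / interval_minutes
    return report_kb * 1000 * nodes * runs_per_day


def projected_db_gb(observed_gb: float, observed_hours: float, retention_days: int) -> float:
    """Linear projection of database size from an observed growth sample."""
    gb_per_day = observed_gb / observed_hours * 24
    return gb_per_day * retention_days


if __name__ == "__main__":
    yaml_gb = yaml_bytes_per_day(410, 400, 30) / 1e9
    print(f"YAML written per day: {yaml_gb:.1f} GB")

    # 30 days of retention is an assumed target, not a figure from the thread.
    db_gb = projected_db_gb(observed_gb=1.0, observed_hours=10, retention_days=30)
    print(f"Projected database size after 30 days: {db_gb:.1f} GB")
```

With the figures above this works out to roughly 7.9 GB of YAML per day and about 72 GB of database after 30 days, so the two estimates are worth measuring separately rather than assuming one from the other.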

Walter Heck

Jan 10, 2012, 3:00:16 AM
to puppet...@googlegroups.com
FYI: MySQL performs fine with 100 GB files if you set it up correctly. I
haven't used the dashboard or looked at the source code, but with that
kind of storage I'd say you have a write-heavy application. You can
tune for that quite easily, although scaling beyond a single master
will be a bit more tricky, as opposed to read-heavy apps where you can
just add slaves.

innodb_file_per_table should imho be set for every mysql server in
existence for many reasons, but it will only work if you have innodb
tables and not MyISAM (duh to me, not so duh to others maybe ;)).

Don't go and delete binlogs at random if you love your data and want
to be able to do proper backups. If you see too many binlogs, just set
expire-logs-days to something sane (read: larger than the time between
your backups). If you really want to get rid of some binlogs, purge
them using mysql, don't just delete the files:
http://dev.mysql.com/doc/refman/5.0/en/purge-binary-logs.html
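
As a rough illustration of the two options above, this Python sketch builds a PURGE BINARY LOGS statement that only reaches back to before the last good backup, and picks a retention value larger than the backup interval. The one-day safety margin, the "+ 1" rule and the function names are assumptions, not anything MySQL prescribes.

```python
from datetime import datetime, timedelta


def purge_binlogs_statement(last_backup: datetime, safety_margin_days: int = 1) -> str:
    """Build a PURGE statement that keeps every binlog newer than the backup."""
    # Keep an extra margin before the backup so point-in-time recovery
    # from that backup is still possible.
    cutoff = last_backup - timedelta(days=safety_margin_days)
    return f"PURGE BINARY LOGS BEFORE '{cutoff:%Y-%m-%d %H:%M:%S}';"


def min_expire_logs_days(backup_interval_days: int) -> int:
    """Smallest retention that is larger than the time between backups."""
    return backup_interval_days + 1


if __name__ == "__main__":
    print(purge_binlogs_statement(datetime(2012, 1, 10, 3, 0, 0)))
    print("expire-logs-days =", min_expire_logs_days(7))
```

The statement is meant to be pasted into a mysql session by someone who has checked the backup really exists; the script deliberately does not connect to anything.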

Just saying: mysql is not as bad as people make it seem :)

Walter


--
Walter Heck

--
follow @walterheck on twitter to see what I'm up to!
--
Check out my new startup: Server Monitoring as a Service @ http://tribily.com
Follow @tribily on Twitter and/or 'Like' our Facebook page at
http://www.facebook.com/tribily

Bernd Adamowicz

Jan 10, 2012, 4:44:08 AM
to puppet...@googlegroups.com

Besides the answers already provided by others, there may be another reason for the fast-growing database: the ‘resource_statuses’ table in the dashboard’s database is not purged by the rake script (at least not in Puppet 2.6.6 and 2.6.12). Patching the rake script will dramatically reduce the overall database size. I’ve provided more details here: http://berndadamowicz.wordpress.com/2011/12/07/keeping-puppet-dashboards-database-small/
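
The general idea (delete old reports, then delete the status rows that pointed at them) can be sketched in Python against an in-memory SQLite database. This is not the rake patch itself: the two-table schema, the report_id link and every column name below are simplified assumptions for illustration, and the real dashboard tables have many more columns.

```python
import sqlite3

# Hypothetical, heavily simplified schema (assumption for this sketch).
SCHEMA = """
CREATE TABLE reports (id INTEGER PRIMARY KEY, host TEXT, time TEXT);
CREATE TABLE resource_statuses (
    id INTEGER PRIMARY KEY,
    report_id INTEGER,
    title TEXT
);
"""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one old and one recent report."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO reports (id, host, time) VALUES (?, ?, ?)",
        [(1, "web01", "2011-12-01 00:00:00"), (2, "web01", "2012-01-09 00:00:00")],
    )
    conn.executemany(
        "INSERT INTO resource_statuses (report_id, title) VALUES (?, ?)",
        [(1, "File[/etc/motd]"), (1, "Package[ntp]"), (2, "File[/etc/motd]")],
    )
    conn.commit()
    return conn


def prune_reports(conn: sqlite3.Connection, cutoff: str):
    """Delete reports older than cutoff, then statuses whose report is gone."""
    with conn:  # one transaction: commit on success, roll back on error
        deleted_reports = conn.execute(
            "DELETE FROM reports WHERE time < ?", (cutoff,)
        ).rowcount
        deleted_statuses = conn.execute(
            "DELETE FROM resource_statuses "
            "WHERE report_id NOT IN (SELECT id FROM reports)"
        ).rowcount
    return deleted_reports, deleted_statuses


if __name__ == "__main__":
    demo = make_demo_db()
    print(prune_reports(demo, "2012-01-01 00:00:00"))
```

The second DELETE is the part that matters here: without it the status rows for already-deleted reports stay behind and keep taking up space.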

 

Bernd

--

Daniel Pittman

Jan 10, 2012, 1:55:14 PM
to puppet...@googlegroups.com
On Mon, Jan 9, 2012 at 20:06, Jo Rhett <jrh...@netconsonance.com> wrote:
>> On Jan 9, 2012, at 3:31 PM, Daniel Pittman wrote:
>>
>>> Is there any reasonable estimate for what amount of space you expect one
>>> system to use?  I realize this likely varies with the report size, but the
>>> rate of growth seems high enough that I'm surprised it wasn't mentioned in
>>> the installation docs.  I mean, it's grown half a gigabyte in the last 6
>>> hours.  With that kind of growth rate, you'd expect a warning to provide
>>> enough space for it and how to estimate your needs.
>>
>> That growth rate seems ... excessive.  Ultimately, the size of the
>> stored data is pretty directly related to the size of your YAML
>> reports; can you capture one of those and see how big it is on disk?
>
> FYI, in 10 hours the database has grown slightly more than 1G. That's an
> extensive growth rate.
>
> Looking at the yaml files, I'm seeing 410k per file * 400 nodes = 160Mb per
> 30 minutes.
>
> Is there really no optimization that is performed on the data stored in the
> database?

Sadly, it is true that no optimization is performed on
the data stored in the database.

> Coming up with a few hundred gigabytes of file storage is one
> thing.  Trying to make mysql perform well with 100Gb database is an entirely
> different matter.

Yes. It sounds like the current storage of reports isn't going to
work well for you, at least if you want to retain history. This is
absolutely unfortunate, and is one of the serious shortfalls we are
aware of around the Dashboard and StoreConfigs databases. We are
working on improving these, but nothing is publicly available at
present.

Jo Rhett

Jan 24, 2012, 5:55:54 PM
to puppet...@googlegroups.com
Sorry for the long delay, had my head down on some other issues. Reply below.

On Jan 10, 2012, at 10:55 AM, Daniel Pittman wrote:
> Yes.  It sounds like the current storage of reports isn't going to
> work well for you, at least if you want to retain history.  This is
> absolutely unfortunate, and is one of the serious shortfalls we are
> aware of around the Dashboard and StoreConfigs databases.  We are
> working on improving these, but nothing is publicly available at
> present.

My main concern here is that we're keeping a large amount of data twice.  We have the report file, and then we have all of the same content in the database. I think that it should be documented:

1. What does keeping the reports around give you?

2. What does keeping the database reports around give you?

If I am right, we could stop storing the reports for as long if we can browse them in the dashboard interface, right?  Or is there some loss of functionality by discarding reports after just a few days?

Daniel Pittman

Jan 24, 2012, 8:34:22 PM
to puppet...@googlegroups.com

Yeah, you can ditch the files on disk in favour of the database with
no real loss of anything.
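
If you do go that way, trimming the files on disk can be sketched roughly as below in Python. It assumes the reports live as *.yaml files in one subdirectory per host under a single report directory; the path in the example, the 7-day window and the function name are all assumptions, so check your own reportdir setting before pointing it at anything real.

```python
import time
from pathlib import Path


def prune_report_files(report_dir, max_age_days, now=None):
    """Delete *.yaml files older than max_age_days; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialize the listing first so deleting does not disturb the walk.
    for path in sorted(Path(report_dir).rglob("*.yaml")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Assumed location; adjust to wherever your master stores reports.
    target = Path("/var/lib/puppet/reports")
    if target.is_dir():
        for gone in prune_report_files(target, max_age_days=7):
            print("removed", gone)
```

Running it from cron once a day would keep the YAML side bounded while the dashboard database holds the browsable history.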

--
Daniel Pittman
