MySQL or PostgreSQL


akaroot

Jun 4, 2012, 4:05:42 PM
to Puppet Users
Hi!
Which database would you recommend for Puppet: MySQL or PostgreSQL?
Estimated number of nodes: 30,000; they send reports every 30 minutes.
Thanks

Denmat

Jun 4, 2012, 10:54:09 PM
to puppet...@googlegroups.com
Hi,

You should check out PuppetDB, which currently locks you into PostgreSQL.

Docs here:
http://docs.puppetlabs.com/#puppetdbpuppetdb09

It's new, so it may be buggy - though I haven't seen any reports of that (haven't looked ;) )

Den
> --
> You received this message because you are subscribed to the Google Groups "Puppet Users" group.
> To post to this group, send email to puppet...@googlegroups.com.
> To unsubscribe from this group, send email to puppet-users...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
>

Gary Larizza

Jun 4, 2012, 11:19:41 PM
to puppet...@googlegroups.com
On Mon, Jun 4, 2012 at 7:54 PM, Denmat <tu2b...@gmail.com> wrote:
> Hi,
>
> You should checkout puppetdb which currently locks you into PostgreSQL.
>
> Docs here:
> http://docs.puppetlabs.com/#puppetdbpuppetdb09
>
> It's new so may be buggy - though I haven't seen any reports on this (haven't looked ;) )

PuppetDB will store Inventory Data as well as exported resources and
catalog data (from Puppet itself), but it will not store reports from
the Puppet Dashboard. To do that, the Puppet Dashboard requires a
MySQL database backend. It doesn't CURRENTLY support Postgres, though
I suspect that will change in the future.



>
> Den
>
> On 05/06/2012, at 4:05, akaroot <mykola...@gmail.com> wrote:
>
>> Hi!
>> Which database would you recommend to use mysql or postgresql for
>> puppet?
>> Estimated number of nodes 30000, they send reports every 30 minutes.
>> Thanks
>>
>



--

Gary Larizza
Professional Services Engineer
Puppet Labs

Brian Gupta

Jun 4, 2012, 11:48:54 PM
to puppet...@googlegroups.com
On Mon, Jun 4, 2012 at 11:19 PM, Gary Larizza <ga...@puppetlabs.com> wrote:
> On Mon, Jun 4, 2012 at 7:54 PM, Denmat <tu2b...@gmail.com> wrote:
>> Hi,
>>
>> You should checkout puppetdb which currently locks you into PostgreSQL.
>>
>> Docs here:
>> http://docs.puppetlabs.com/#puppetdbpuppetdb09
>>
>> It's new so may be buggy - though I haven't seen any reports on this (haven't looked ;) )
>
> PuppetDB will store Inventory Data as well as exported resources and
> catalog data (from Puppet itself), but it will not store reports from
> the Puppet Dashboard.  To do that, the Puppet Dashboard requires a
> MySQL database backend.  It doesn't CURRENTLY support Postgres, though
> I suspect that will change in the future.

Gary,

Assuming we are just talking about report stores, Akaroot is looking
at storing 60,000 reports per hour. Can a single storeconfigs MySQL
instance handle that? (That's 1000 reports per minute.)

Thanks,
Brian

Gary Larizza

Jun 5, 2012, 12:07:36 AM
to puppet...@googlegroups.com
On Mon, Jun 4, 2012 at 8:48 PM, Brian Gupta <brian...@brandorr.com> wrote:
> On Mon, Jun 4, 2012 at 11:19 PM, Gary Larizza <ga...@puppetlabs.com> wrote:
>> On Mon, Jun 4, 2012 at 7:54 PM, Denmat <tu2b...@gmail.com> wrote:
>>> Hi,
>>>
>>> You should checkout puppetdb which currently locks you into PostgreSQL.
>>>
>>> Docs here:
>>> http://docs.puppetlabs.com/#puppetdbpuppetdb09
>>>
>>> It's new so may be buggy - though I haven't seen any reports on this (haven't looked ;) )
>>
>> PuppetDB will store Inventory Data as well as exported resources and
>> catalog data (from Puppet itself), but it will not store reports from
>> the Puppet Dashboard.  To do that, the Puppet Dashboard requires a
>> MySQL database backend.  It doesn't CURRENTLY support Postgres, though
>> I suspect that will change in the future.
>
> Gary,
>
> Assuming we are just talking about report stores, Akaroot is looking
> at storing 60,000 reports per hour. Can a single storeconfigs MySQL
> instance handle that? (That's 1000 reports per minute.)
>

Just to correct the terminology - storeconfigs refers to the act of
storing resources that have been exported from nodes, as well as
storing your inventory data (Facter facts). There's documentation
here that should help:
http://docs.puppetlabs.com/guides/inventory_service.html
With storeconfigs, this is information that can be pulled from within
a Puppet run.

Reports, on the other hand, contain information on transactional
events FROM a Puppet run. Reports are stored in the Puppet Dashboard,
which uses a separate database (storeconfigs uses one database -
potentially PuppetDB - while the Dashboard uses another. Storeconfigs
can be MySQL or Postgres, but the database backend for the Dashboard
must be MySQL).

As for handling 1000 reports per minute, it's up to how you tune your
MySQL setup. Puppet is going to send reports to your database
'server' (credentials are provided in puppet.conf, so it could be a
VIP based on what you're doing on the backend). I'll let others with
better experience in that area comment on what they've seen in
production.
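
For illustration, with ActiveRecord-based storeconfigs those connection
settings live in puppet.conf on the master along these lines (the
hostname and credentials below are placeholders; check the configuration
reference for your Puppet version):

```ini
[master]
    storeconfigs = true
    dbadapter   = mysql
    dbserver    = db.example.com   # could be a VIP fronting several servers
    dbname      = puppet
    dbuser      = puppet
    dbpassword  = secret
```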

Brian Gupta

Jun 5, 2012, 12:10:56 AM
to puppet...@googlegroups.com
Thanks Gary for clarifying. My apologies, it's late here.

Ryan Coleman

Jun 5, 2012, 12:14:30 AM
to puppet...@googlegroups.com
On Mon, Jun 4, 2012 at 9:07 PM, Gary Larizza <ga...@puppetlabs.com> wrote:
> Reports, on the other hand, contain information on transactional
> events FROM a Puppet run.  Reports are stored in the Puppet Dashboard,
> which is a separate database

I think it's also important to point out, since you weren't specific
in your post, that you don't -need- a database in order to consume
reports from Puppet runs. The raw data from a Puppet run is consumed
by report handlers, which turn it into something you can use, keep,
and analyze.

If you want to view reports in the Puppet Dashboard, it uses the http
report handler to consume the report, process it, and store it in
MySQL.

There are many other report processors (including just yaml on disk).
Here's the doc on the built-in ones:
http://docs.puppetlabs.com/references/stable/report.html and here's a
recent blog post on many others (like IRC, pagerduty, etc):
http://puppetlabs.com/blog/when-puppet-reports-part-1/
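
To make the http report handler concrete, here is a minimal sketch of a
receiver for the YAML report Puppet POSTs - this is illustrative only,
not Dashboard's actual implementation, and the endpoint path and the
in-memory storage are made up:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # stand-in for real storage (Dashboard would write to MySQL)

class ReportHandler(BaseHTTPRequestHandler):
    """Accept a POSTed Puppet report body and stash it."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        received.append(self.rfile.read(length))
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):  # keep the demo quiet
        pass

# Spin the server up on an ephemeral port and POST a fake report at it.
server = HTTPServer(('127.0.0.1', 0), ReportHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

report = b'--- !ruby/object:Puppet::Transaction::Report ...'  # truncated YAML
req = urllib.request.Request(
    f'http://127.0.0.1:{server.server_port}/reports', data=report)
urllib.request.urlopen(req)
server.shutdown()
print(len(received))  # -> 1
```

In a real setup you'd point the agent's `reporturl` setting at such an
endpoint and parse the YAML body it receives.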

akaroot

Jun 5, 2012, 2:20:18 AM
to Puppet Users
I don't use the Puppet Dashboard; I use my own software to store
reports and generate host configs. The database is used to store
inventory data (Facter facts), and that creates most of the load on
the database (many insert requests).

akaroot

Jun 5, 2012, 4:17:21 AM
to Puppet Users
Now I have about 1,000 hosts and I'm receiving deadlock errors:

Puppet: Could not retrieve catalog from remote server: Error 400 on
SERVER: Mysql::Error: Deadlock found when trying to get lock; try
restarting transaction: INSERT INTO `fact_values` (`fact_name_id`,
`updated_at`, `host_id`, `value`, `created_at`) VALUES(76, '2012-06-05
10:44:34', 115, '993.80 MB', '2012-06-05 10:44:34')
Puppet: Using cached catalog
Puppet: Could not retrieve catalog; skipping run
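
The server's own hint ("try restarting transaction") points at the
standard mitigation: catch the deadlock and re-run the transaction with
a short backoff. A minimal sketch - `DeadlockError` and the `txn`
callable are hypothetical stand-ins for whatever your driver raises
(MySQL reports deadlocks as error 1213):

```python
import random
import time

class DeadlockError(Exception):
    """Stand-in for the DB driver's deadlock exception."""

def run_with_retry(txn, attempts=5, base_delay=0.05):
    """Run a transaction callable, retrying deadlocks with exponential backoff."""
    for attempt in range(attempts):
        try:
            return txn()
        except DeadlockError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            # Back off a little longer each time, with jitter so that
            # colliding clients don't retry in lock-step.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```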

akaroot

Jun 5, 2012, 5:57:24 AM
to Puppet Users
In the MySQL error log:
120605 12:50:50 [ERROR] /usr/sbin/mysqld: Deadlock found when trying
to get lock; try restarting transaction
120605 12:50:50 [ERROR] /usr/sbin/mysqld: Sort aborted: Deadlock

Kewin

Jun 5, 2012, 10:51:24 AM
to Puppet Users
We currently use MySQL (managing about 1000 nodes). The pressure is
not on MySQL but on the puppetmaster - nodes frequently time out
getting their information.
I'd recommend Percona if you prefer MySQL.

akaroot

Jun 5, 2012, 11:00:44 AM
to Puppet Users


On 5 Jun, 17:51, Kewin <kewin2...@gmail.com> wrote:
> now we use mysql (manage about 1000 nodes) ,the pressure is not the
> mysql , but from the puppetmaster .always occur timeout for nodes to
> get the infomation
> I recommand the percona if U prefer the mysql
>
> On Jun 5, 4:05 AM, akaroot <mykolaakar...@gmail.com> wrote:

I plan to migrate to MariaDB.
How do you store facts, and how often do your nodes send reports?

akaroot

Jun 5, 2012, 11:21:57 AM
to Puppet Users
I suspect that the deadlock is caused by an insert trigger (it keeps
a log of changes to some of the facts):

DELIMITER $$
DROP TRIGGER IF EXISTS `facts_insert`;
CREATE TRIGGER facts_insert BEFORE INSERT ON `fact_values`
FOR EACH ROW BEGIN
  IF (SELECT `name` FROM `fact_names` WHERE `id` = NEW.fact_name_id)
      IN ('ipaddress', 'dhcp', 'memory_size') THEN
    SET @fact_val = (SELECT `value` FROM `fact_values_log` WHERE
      `host_id` = NEW.host_id AND `fact_name_id` = NEW.fact_name_id
      ORDER BY `date` DESC LIMIT 1);
    IF @fact_val IS NULL OR @fact_val != NEW.value THEN
      INSERT INTO `fact_values_log` SET `host_id` = NEW.host_id,
        `fact_name_id` = NEW.fact_name_id, `value` = NEW.value,
        `date` = CURRENT_TIMESTAMP();
    END IF;
  END IF;
END$$
DELIMITER ;

But why is there a deadlock? The trigger only writes to another
table, and the deadlock occurs just once every 30-60 minutes.

Deepak Giridharagopal

Jun 5, 2012, 3:16:39 PM
to puppet...@googlegroups.com
On Mon, Jun 4, 2012 at 7:54 PM, Denmat <tu2b...@gmail.com> wrote:
> You should checkout puppetdb which currently locks you into PostgreSQL.
>
> Docs here:
> http://docs.puppetlabs.com/#puppetdbpuppetdb09
>
> It's new so may be buggy - though I haven't seen any reports on this (haven't looked ;) )


Indeed, it's targeted specifically at fact and catalog storage. You can set it up solely as an inventory service backend, if you like - in that event, it'll deal just with fact storage. If I'm following this thread correctly, akaroot is trying to store just facts for 30k hosts at a 30m interval? 30k hosts is a lot, but PuppetDB does a number of things differently from the built-in inventory service backend that may help:

* The SQL and the schema are more explicitly targeting speed
* We front DB operations with queues to prevent exhaustion of connections or I/O bandwidth (# of workers is configurable)
* We automatically handle DB failures through built-in retry mechanisms (with exponential backoff)
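
The queue-fronting idea in the second bullet can be sketched
generically - `store` below is a hypothetical callable that persists
one fact set, and this Python sketch is only an analogy for what
PuppetDB does on the JVM:

```python
import queue
import threading

def start_write_workers(store, num_workers=4, max_queue=10_000):
    """Front DB writes with a bounded queue and a fixed worker pool.

    Bursts of incoming fact sets pile up in the queue instead of opening
    more DB connections; at most `num_workers` writes run concurrently.
    """
    q = queue.Queue(maxsize=max_queue)

    def worker():
        while True:
            item = q.get()
            try:
                store(item)
            finally:
                q.task_done()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    return q

# Producers (e.g. the HTTP layer) just call q.put(factset) and return;
# q.join() blocks until every queued write has been persisted.
```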

You'll need to use PostgreSQL, though. 30k hosts at a 30m runinterval will require some tuning of both PuppetDB and the database, surely...that's around 17 sets of facts every second, which is a lot.

Roughly how many facts per node are we talking about, here? I could try running some simulations and see what happens.

deepak
  
> Den
>
> On 05/06/2012, at 4:05, akaroot <mykola...@gmail.com> wrote:
>
>> Hi!
>> Which database would you recommend to use mysql or postgresql for
>> puppet?
>> Estimated number of nodes 30000, they send reports every 30 minutes.
>> Thanks



--
Deepak Giridharagopal / Puppet Labs / [irc,twitter] grim_radical

akaroot

Jun 5, 2012, 3:51:39 PM
to Puppet Users


On 5 Jun, 22:16, Deepak Giridharagopal <dee...@puppetlabs.com> wrote:

> Roughly how many facts per node are we talking about, here? I could try
> running some simulations and see what happens.


On average 80-90 facts per host; that's about 1500 inserts per
second. Will MySQL cope with that load? I plan to migrate to
MariaDB - or should I use Postgres instead?
Right now I use thin storeconfigs (storeconfigs = true,
thin_storeconfigs = true).
Can PuppetDB store only facts?
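
For reference, the arithmetic behind those figures (taking the top of
the 80-90 fact range):

```python
hosts = 30_000
interval_s = 30 * 60          # nodes report every 30 minutes
facts_per_host = 90           # upper end of the 80-90 range

factsets_per_s = hosts / interval_s            # hosts reporting per second
inserts_per_s = factsets_per_s * facts_per_host

print(round(factsets_per_s, 2))  # -> 16.67 fact sets per second
print(round(inserts_per_s))      # -> 1500 fact-value inserts per second
```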

Deepak Giridharagopal

Jun 8, 2012, 7:30:28 PM
to puppet...@googlegroups.com
PuppetDB handles storeconfigs duties as well as inventory service duties; that is, it stores catalogs and facts. 30k nodes is a lot of data, but theoretically once we get a single catalog for each node then subsequent catalogs should be deduped (assuming your actual puppet code changes relatively infrequently, catalogs should often be the same for a node from run to run). If nothing else, you can at least try it out and if it doesn't work for you, you can just change some configuration file settings and point back at your current DB if you like.

Feel free to hit me up on #puppet on Freenode, I'm grim_radical. I'm very much interested in getting PuppetDB working for you if at all possible. For your level of scale, it may be better to chat about this in real-time as opposed to over email. :)

Cheers,
Deepak

--
Deepak Giridharagopal / Puppet Labs / grim_radical