puppet-dashboard 2.0.0 (open source) and postgresql 8.4 tuning


Pete Hartman
Mar 17, 2014, 4:29:26 PM
to puppet...@googlegroups.com
I deployed the open source puppet-dashboard 2.0.0 this past weekend for our production environment.  I did a fair amount of testing in the lab to make sure I had the deployment down, and I deployed it as a Passenger service, knowing that we have a large environment and that WEBrick wasn't likely to cut it.  Overall it appears to be working and behaving reasonably--I get the summary run-status graph and the rest of the UI.  Load average on the box is high-ish but nothing unreasonable, and I certainly appear to have headroom in memory and CPU.

However, when I click the "export nodes as CSV" link, it runs forever (it hasn't stopped yet).

I looked into what the database was doing and it appears to be looping over some unknown number of report_ids, doing

    7172 | dashboard | SELECT COUNT(*) FROM "resource_statuses"  WHERE "resource_statuses"."report_id" = 39467 AND "resource_statuses"."failed" = 'f' AND ("resource_statuses"."id" IN ( | 00:00:15.575955
                     :           SELECT resource_statuses.id FROM resource_statuses
                     :             INNER JOIN resource_events ON resource_statuses.id = resource_events.resource_status_id
                     :             WHERE resource_events.status = 'noop'
                     :         )
                     : )



I ran the inner join by hand and it takes roughly 2 - 3 minutes each time.  The overall query appears to be running 8 minutes per report ID.
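
For anyone else looking at this, the next thing I plan to try is roughly the following sketch (the index name is made up, and it assumes the stock dashboard schema--I haven't verified yet that it helps):

    -- See what plan the expensive subquery gets today.
    EXPLAIN ANALYZE
    SELECT resource_statuses.id
      FROM resource_statuses
     INNER JOIN resource_events
             ON resource_statuses.id = resource_events.resource_status_id
     WHERE resource_events.status = 'noop';

    -- A partial index covering only the 'noop' events, so the subquery
    -- doesn't have to walk all of resource_events (8.4 supports these).
    CREATE INDEX CONCURRENTLY resource_events_noop_status_idx
        ON resource_events (resource_status_id)
     WHERE status = 'noop';

    -- Refresh planner statistics afterwards.
    ANALYZE resource_events;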

I've done a few things to tweak postgresql before this, so it could have been running even longer earlier, when I first noticed the problem.

I increased checkpoint_segments to 32 from the default of 3, checkpoint_completion_target to 0.9 from the default of 0.5, and, to be able to observe what's going on, set stats_command_string to on.
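
For reference, the corresponding postgresql.conf lines look roughly like this (values as above; note that on 8.4 the old stats_command_string setting is spelled track_activities, so adjust to whatever your version actually accepts):

    checkpoint_segments = 32             # default 3
    checkpoint_completion_target = 0.9   # default 0.5
    track_activities = on                # pre-8.3 name: stats_command_string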

Some other details: we have 3400 nodes (the dashboard is only seeing 3290 or so, which is part of why I want this CSV report--to figure out why the number is smaller).  This PostgreSQL instance is also the one supporting PuppetDB, though obviously in a separate database.  The resource_statuses table has 47 million rows right now, and the inner join returns 4.3 million.
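
As a stopgap while the CSV export is unusable, a query like this against the dashboard database should show which nodes it thinks have gone quiet (column names are taken from the 1.x schema and assumed unchanged in 2.0):

    -- Nodes the dashboard has never heard from, or not in the last day.
    SELECT name, reported_at
      FROM nodes
     WHERE reported_at IS NULL
        OR reported_at < now() - interval '1 day'
     ORDER BY reported_at NULLS FIRST;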

I'm curious whether anyone else is running this version on PostgreSQL in a large environment, and whether there are places I ought to be looking to tune this so it runs faster, or whether I need to be doing something to shrink those tables without losing information, etc.  (One option I'm aware of is sketched below.)
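
On shrinking the tables: the 1.x dashboard shipped a reports:prune rake task, and assuming the 2.0 fork still carries it, something like the following would drop reports older than a month (note this does discard the pruned reports' detail, so it only helps if older history is expendable; the install path here is illustrative):

    cd /usr/share/puppet-dashboard        # wherever the dashboard is installed
    RAILS_ENV=production rake reports:prune upto=1 unit=mon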

Thanks

Pete

Pete Hartman
Mar 17, 2014, 5:31:09 PM
to puppet...@googlegroups.com
I also increased bgwriter_lru_maxpages to 500 from the default of 100.

Gav
Dec 19, 2014, 3:48:14 PM
to puppet...@googlegroups.com
Pete, what version of Passenger are you running? I have deployed puppet-dashboard 2.0.0 this week with Passenger 4.0.56 and Ruby 1.9.3, but Passenger is just eating memory.

------ Passenger processes -------
PID    VMSize     Private    Name
----------------------------------
5173   6525.1 MB  3553.0 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
5662   5352.7 MB  4900.8 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
5682   5736.8 MB  5307.1 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
8486   6525.2 MB  4469.5 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
10935  6525.0 MB  3282.3 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
11885  6380.3 MB  3905.9 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
20886  209.8 MB   0.1 MB     PassengerWatchdog
20889  2554.9 MB  7.2 MB     PassengerHelperAgent
20896  208.9 MB   0.0 MB     PassengerLoggingAgent
21245  2602.8 MB  2268.6 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
22912  500.7 MB   115.4 MB   Passenger RackApp: /local/puppet/etc/rack
24873  6505.1 MB  3592.6 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
26226  1944.3 MB  1616.6 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
29012  6525.0 MB  3460.4 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
30564  4072.7 MB  3675.4 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
31060  3526.8 MB  3181.6 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
31733  6505.5 MB  5761.4 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
31740  6525.4 MB  5812.2 MB  Passenger RackApp: /local/puppet/dashboard/dashboard
### Processes: 18
### Total private dirty RSS: 54910.21 MB

Any help would be appreciated.

Cheers,
Gavin

Pete Hartman
Dec 19, 2014, 4:30:44 PM
to puppet...@googlegroups.com

I'm no longer at that position and haven't seen it in 8 months...


Ramin K
Dec 19, 2014, 6:28:21 PM
to puppet...@googlegroups.com
I would trim the number of dashboard processes down to a max of 2-4 and a min of 1, and recycle each process every 10k requests. You can set all of that in the vhost, IIRC; the Passenger docs are pretty good in that regard.
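
Something along these lines in the Apache config, roughly (directive names are from the Passenger 4.x docs; exact values are up to you):

    # Server-wide (httpd.conf): cap the total pool of application processes.
    PassengerMaxPoolSize 4

    # Per-vhost: keep one warm instance, recycle each process after 10k requests.
    PassengerMinInstances 1
    PassengerMaxRequests 10000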

Ramin

Gav
Jan 2, 2015, 6:39:47 AM
to puppet...@googlegroups.com, ramin...@badapple.net
Thanks chaps. It turns out that an internal process was DoS'ing the dashboard with wget requests for nodes.csv.