puppetdb export - WTF?!


JonY

unread,
Oct 7, 2014, 11:30:35 AM10/7/14
to puppet...@googlegroups.com
(OK, terribly unprofessional title... I get it.)

I'm trying to 'do the right thing' and move from the embedded DB to Postgres. Following the instructions, I figured I would dump out the contents of the embedded DB and import them into Postgres.
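For reference, the round trip I have in mind is roughly this (paths are placeholders, and the step of pointing database.ini at Postgres is glossed over):

# dump everything out of the embedded DB (PuppetDB has to be running for this)
puppetdb export --outfile /tmp/puppetdb-export.tar.gz
# reconfigure database.ini for Postgres, restart, then load the dump back in
service puppetdb restart
puppetdb import --infile /tmp/puppetdb-export.tar.gz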

So I start 'puppetdb export --outfile <someplace>'. 

Pour some coffee.

Watch it dump one node.

Have some more coffee.

Wait.

One more node.

Hours pass. Glaciers melt. 

3+ hours into the process and it has dumped 10 nodes (of > 100). At this rate I'm looking at about 1.5 days to get this done. Really? 

The system has 16 cores and 32 GB of RAM, and it's barely above idle. Tell me I missed some critical parameter.


Wyatt Alt

unread,
Oct 7, 2014, 12:00:17 PM10/7/14
to puppet...@googlegroups.com
Hey JonY,

Sounds interesting.  What version of PuppetDB are you using?  Do you have reports, facts, and catalogs, or only some of those? Can you paste your config.ini?

Also can you give the output of

ps aux |grep java
top -n1
free

in a gist maybe?

Wyatt






JonY

unread,
Oct 7, 2014, 2:03:30 PM10/7/14
to puppet...@googlegroups.com
https://gist.github.com/ce60c590a0531c0b09cd.git


# rpm -qa | grep puppet
puppet-server-3.7.1-1.el6.noarch
puppetlabs-release-6-11.noarch
puppetdb-terminus-2.2.0-1.el6.noarch
puppet-3.7.1-1.el6.noarch
vim-puppet-2.7.20-1.el6.rf.noarch
puppetdb-2.2.0-1.el6.noarch

I'm storing 30 days of data. Yes, it's a fair bit more than the default.
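For reference, that retention is the report-ttl setting in database.ini; on the EL6 packages the file should live under /etc/puppetdb/conf.d, so it's easy to double-check:

# prints the retention line, e.g. "report-ttl = 30d" in my case
grep report-ttl /etc/puppetdb/conf.d/database.ini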

Wyatt Alt

unread,
Oct 7, 2014, 4:19:04 PM10/7/14
to puppet...@googlegroups.com
Thanks, JonY.

I've loaded up an embedded database to comparable capacity, and while export isn't quick, it's not nearly as slow as what you're experiencing.

From your process list, PuppetDB appears to be running with a max heap size (Xmx) of 1024m. Perhaps increasing this could make a difference?

The export process appears to be using 192m, but I think it should be using the same JAVA_ARGS as PuppetDB itself, so this could be a bug.
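If you want to try bumping the heap, a rough sketch of the change on EL6 (assuming the file currently has -Xmx1024m, as your ps output suggests; 2g is just an example value):

# the EL6 packages keep the JVM settings in /etc/sysconfig/puppetdb
sed -i 's/-Xmx1024m/-Xmx2g/' /etc/sysconfig/puppetdb
# restart so the new heap takes effect
service puppetdb restart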

Wyatt


JonY

unread,
Oct 7, 2014, 4:32:02 PM10/7/14
to puppet...@googlegroups.com
Running for 7 hours now. Has exported ~15-20% of the data.

I'm intrigued to see what I end up with.

JonY

unread,
Oct 9, 2014, 6:50:13 AM10/9/14
to puppet...@googlegroups.com
Update: 45 hours - ~75% complete

Felix Frank

unread,
Oct 10, 2014, 7:30:37 PM10/10/14
to puppet...@googlegroups.com
On 10/09/2014 12:50 PM, JonY wrote:
> Update: 45 hours - ~75% complete

Well, that's nice and all, but...

> From your process list, PuppetDB appears to be running with a max heap
> size (Xmx) of 1024m. Perhaps increasing this could make a difference?

...have you tried this then? What was the effect?

Wyatt Alt

unread,
Oct 10, 2014, 9:13:04 PM10/10/14
to puppet...@googlegroups.com, JonY
Hey Jon,

Thanks for the update and the ticket:
https://tickets.puppetlabs.com/browse/PDB-947

We've been trying to reproduce this today and have had relatively little luck. As you alluded to earlier, a possible contributor might be the number of reports per node. Would you mind giving us the sizes of your exported tarball and its contained folders, as well as the average number of reports per node?
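Something along these lines should give the breakdown (the tarball path is just a placeholder):

ls -lh /tmp/puppetdb-export.tar.gz
mkdir -p /tmp/pdb-export && tar xzf /tmp/puppetdb-export.tar.gz -C /tmp/pdb-export
# per-folder sizes one level down inside the export
du -sh /tmp/pdb-export/*/*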

I'm currently running a 200-node hsql export with 745 reports per node (149k total), which is beyond the scale we would expect to be performant on hsql, and at the current pace it will still be done in seven hours or so.

To the extent that you're still having trouble, a stack dump may be helpful in diagnosing the issue.
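If it comes to that, something like this against the PuppetDB (or export) java process should do it, with <pid> being whatever ps shows:

# preferred if the JDK is installed
jstack <pid> > /tmp/puppetdb-threads.txt
# otherwise SIGQUIT makes the JVM print a thread dump to its stdout/log
kill -3 <pid>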

Thanks,
Wyatt
JonY

unread,
Oct 12, 2014, 9:19:22 PM10/12/14
to puppet...@googlegroups.com, ethr...@gmail.com
I pulled the plug on it after > 48 hours. The file is 14 MB at this point.

I may start the Postgres server and try the export again. Perhaps if puppetdb isn't running, this will be more efficient.

Wyatt Alt

unread,
Oct 13, 2014, 11:07:28 AM10/13/14
to puppet...@googlegroups.com, Jon Yeargers
Hey Jon, you can tell us how many reports you have without rerunning the export. That would be useful information:

curl -D - -G http://localhost:8080/v4/reports --data-urlencode 'include-total=true' -o /dev/null

Obviously substitute your own host if it's different, and note the dash after -D. What does it give you under X-Records?
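If it's easier to eyeball, the same command piped through grep pulls out just that header (the count it prints is whatever your instance reports):

# -D - sends response headers to stdout, -o /dev/null discards the body
curl -s -D - -G http://localhost:8080/v4/reports --data-urlencode 'include-total=true' -o /dev/null | grep -i '^x-records'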

Wyatt

