** Release Candidate **
PuppetDB 1.6.0-rc2
Prerelease: PuppetDB 1.6.0 is not yet released
* RC1: January 7th, 2014.
* RC2: January 14th, 2014.
PuppetDB 1.6.0-rc2 Downloads
----------------------------
Available in native package format in the pre-release repositories at:
http://yum.puppetlabs.com and
http://apt.puppetlabs.com
For information on how to enable the Puppet Labs pre-release repos, see:
http://docs.puppetlabs.com/guides/puppetlabs_package_repositories.html#enabling-the-prerelease-repos
Binary tarball:
http://downloads.puppetlabs.com/puppetdb/
Source:
http://github.com/puppetlabs/puppetdb
Please report feedback via the Puppet Labs tickets site, using an
affected PuppetDB version of 1.6.0-rc2:
https://tickets.puppetlabs.com/browse/PDB
Documentation:
http://docs.puppetlabs.com/puppetdb/1.6
Puppet module:
http://forge.puppetlabs.com/puppetlabs/puppetdb
PuppetDB 1.6.0 Release Notes
----------------------------
PuppetDB 1.6.0 is a performance and bugfix release.
Things to take note of before upgrading:
* There are a number of database migrations performed upon first
startup. This will require extra free disk space on your PostgreSQL
server while we rebuild some tables, so make sure you have enough free
space _before_ you start the upgrade. Since a full table migration
takes a copy of the old table, a good rule of thumb would be to check
your current database usage (using du -sh on the database directory
for example) and ensure you have as much free space on that partition.
While all migrations are protected by an atomic database transaction,
it never hurts to back up your database beforehand just in case.
* Also due to the migrations that will take place, it might take
several minutes (depending on the size of your database) before
PuppetDB can respond to requests again. Give the migration ample time
to complete and you should be fine.
* Make sure all your PuppetDB instances are shut down and only upgrade
one at a time.
* As usual, don’t forget to upgrade your puppetdb-terminus package
also (on the host where your Puppet Master lives), and restart your
master service.
* This is primarily a performance release, so we’re interested in any
feedback (good or bad) as to how helpful these changes have been,
especially around catalog hash-miss performance and general updates on
the database. Most of the performance improvements are around facts
and catalogs, so we don’t expect any improvements for report storage
this time around.
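The free-space rule of thumb above can be checked with a short script. This is only a minimal sketch, not an official tool: the function names are made up for illustration, and the data directory path in the usage note is an assumption that varies by platform. It sums file sizes (an approximation of what du -sh reports) and compares the total against the free space on the same partition.

```python
import os
import shutil


def directory_size_bytes(path):
    """Approximate the space used under path by summing file sizes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # A file may vanish while a live database is being walked.
                continue
    return total


def has_room_for_migration(data_dir):
    """True if the partition holding data_dir has at least as much free
    space as the directory currently uses."""
    used = directory_size_bytes(data_dir)
    free = shutil.disk_usage(data_dir).free
    return free >= used
```

For example, `has_room_for_migration("/var/lib/pgsql/data")` (a hypothetical path; use your own PostgreSQL data directory) returns False when the partition cannot hold a second copy of the current data.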
Changes specific to rc2:
* (PDB-303) Utilise final version of prismatic/schema 0.2.0
* (PDB-279) Sanitize report imports
* (PDB-247) Fixes the namespace override warnings from schema
* (PDB-107) Support chained CA certificates
Notable improvements and fixes:
* (PDB-81) Deprecate JDK6. It's been EOL for quite some time.
* (#21083) Differential fact storage
Previously when facts for a node were replaced, all previous facts
for that node were deleted and all new facts were inserted. Now
existing facts are updated, old facts (no longer present) are
deleted and new facts are inserted. This results in much less I/O,
both because we have to write much less data and also because we
reduce churn in the database tables, allowing them to stay compact
and fast.
* (PDB-68) Differential edge storage
Previously when a catalog wasn't detected as a duplicate, we'd have
to reinsert all edges into the database using the new catalog's
hash. This meant that even if 99% of the edges were the same, we'd
still insert 100% of them anew and wait for our periodic GC to clean
up the old rows. We now only modify the edges that have actually
changed, and leave unchanged edges alone. This results in much less
I/O as we touch substantially fewer rows.
* (PDB-69) Differential resource storage
Previously when a catalog wasn't detected as a duplicate, we'd have
to reinsert all resource metadata into the catalog_resources table
using the catalog's new hash. Even if only 1 resource changed out of
a possible 1000, we'd still insert 1000 new rows. We now only modify
resources that have actually changed. This results in much less I/O
in the common case.
* Streaming resource and fact queries. Previously, we'd load all rows
from a resource or fact query into RAM, then do a bunch of sorting
and aggregation to transform them into a format that clients
expect. That has obvious problems involving RAM usage for large
result sets. Furthermore, this does all the work for querying
up-front...if a client disconnects, the query continues to tax the
database until it completes. And lastly, we'd have to wait until all
the query results have been paged into RAM before we could send
anything to the client. New streaming support massively reduces RAM
usage and time-to-first-result. Note that currently only resource
and fact queries employ streaming, as they're the most
frequently-used endpoints.
* Improvements to our deduplication algorithm. We've improved our
deduplication algorithms to better detect when we've already stored
the necessary data. Early reports from the field have shown users who
previously had deduplication rates in the 0-10% range jumping up to
the 60-70% range. This has a massive impact on performance, as the
fastest way to persist data is to already have it persisted!
* Eliminate joins for parameter retrieval. Much of the slowness of
resource queries comes from the format of the resultset. We get back
one row per parameter, and we need to collapse multiple rows into
single resource objects that we can emit to the client. It's much
faster and tidier to just have a separate table that contains a
json-ified version of all a resource's parameters. That way, there's
no need to collapse multiple rows at all, or do any kind of ORDER BY
to ensure that the rows are properly sequenced. In testing, this
appears to speed up queries by 2x-3x...the improvement is much
better the larger the resultset.
* (#22350) Support for dedicated, read-only databases. Postgres
supports Hot Standby (
http://wiki.postgresql.org/wiki/Hot_Standby)
which uses one database for writes and another database for
reads. PuppetDB can now point read-only queries at the hot standby,
resulting in improved IO throughput for writes.
* (#22960) Don't automatically sort fact query results
Previously, we'd sort fact query results by default, regardless of
whether or not the user has requested sorting. That incurs a
performance penalty, as the DB now has to do a costly sort
operation. This patch removes the sort, and if users want sorted
results they can use the new sorting parameters to ask for that
explicitly.
* (#22947) Remove resource tags GIN index on Postgres. These indexes
can get large and they aren't used. This should free up some
precious disk space.
* (#22977) Add a debugging option to help diagnose catalogs for a host
that hash to different values
Added a new global config parameter to allow debugging of catalogs
that hash to a different value. This makes it easier for users to
determine why their catalog duplication rates are low. More details
are in the included "Troubleshooting Low Catalog Duplication" guide.
* (PDB-56) Gzip HTTP responses
This patchset enables Jetty's gzip filter, which will automatically
compress output with compressible mime-types (text, JSON, etc). This
should reduce bandwidth requirements for clients who can handle
compressed responses.
* (PDB-70) Add index on catalog_resources.exported
This increases performance for exported resource collection
queries. For postgresql the index is only a partial index on exported =
true, since indexing on the very common 'false' case is not that
effective. This gives us a big perf boost with minimal disk usage.
* (PDB-85) Various fixes for report export and anonymization
* (PDB-119) Pin to RSA ciphers for all jdk's to work-around Centos 6
EC failures
We were seeing EC cipher failures for Centos 6 boxes, so we now pin
the ciphers to RSA only for all JDK's to work-around the
problem. The cipher suite is still customizable, so users can
override this if they wish.
* Fixes to allow use of public/private key files generated by a wider
variety of tools, such as FreeIPA.
* (#17555) Use systemd on recent Fedora and RHEL systems.
* Documentation for `store-usage` and `temp-usage` MQ configuration
options.
* Travis-CI now automatically tests all pull requests against both
PostgreSQL and HSQLDB. We also run our full acceptance test suite on
incoming pull requests.
* (PDB-102) Implement Prismatic Schema for configuration validation
In the past our configuration validation was fairly ad-hoc and imperative. By
implementing an internal schema mechanism (using Prismatic Schema) this should
provide a better and more declarative mechanism to validate users'
configuration rather than letting misconfigurations "fall through" to
internal code throwing undecipherable Java exceptions.
This implementation also handles configuration variable coercion and
defaulting, thus allowing us to remove a lot of the bespoke code we had
before that performed this activity.
* (PDB-279) Sanitize report imports
Previously we had a bug (PDB-85) that caused our exports on 1.5.x to fail. This
has been fixed, but alas people are trying to import those broken dumps into
1.6.x and finding it doesn't work.
This patch sanitizes our imports by only using select keys from the reports
model and dropping everything else.
* (PDB-107) Support chained CA certificates
This patch makes puppetdb load all of the certificates in the CA .pem
file into the in-memory truststore. This allows users to use a
certificate chain (typically represented as a sequence of certs in a
single file) for trust.
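The differential storage items above (facts, edges and resources) share one idea: compare what is already stored with what just arrived, and write only the difference. The following is a minimal illustration of that idea for facts, not PuppetDB's actual implementation (which is written in Clojure and works against database tables). The function name and the sample fact values are invented for the example.

```python
def diff_facts(old, new):
    """Split a fact replacement into the three minimal sets of writes."""
    # Facts that did not exist before must be inserted.
    to_insert = {k: v for k, v in new.items() if k not in old}
    # Facts that exist in both but changed value must be updated.
    to_update = {k: v for k, v in new.items() if k in old and old[k] != v}
    # Facts that are no longer reported must be deleted.
    to_delete = sorted(k for k in old if k not in new)
    return to_insert, to_update, to_delete


old = {"kernel": "Linux", "uptime_days": "12", "swapfree": "1.2 GB"}
new = {"kernel": "Linux", "uptime_days": "13", "virtual": "kvm"}

inserts, updates, deletes = diff_facts(old, new)
print(inserts)  # {'virtual': 'kvm'}
print(updates)  # {'uptime_days': '13'}
print(deletes)  # ['swapfree']
```

The unchanged fact (kernel) produces no write at all, which is where the I/O savings come from when most of a node's data stays the same between runs.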
PuppetDB 1.6.0 Contributors
---------------------------
Chris Price, Deepak Giridharagopal, James Sweeney, Justin Holguin, Ken
Barber, Kushal Pisavadia, Matthaus Owens, Melissa Stone, Moses
Mendoza, Nick Fagerlund, Nick Lewis, Rob Braden, Ryan Senior and Zack
Smith.
PuppetDB 1.6.0 Changelog
------------------------
Chris Price (9):
160bc70 Fix typo in event status docs
444ac73 Fix database transaction scope for resources endpoint
3e49844 Add docs for catalog and version endpoints
8160ca4 Add `/v3` prefix to version and server-time docs
de2cc54 Remove references to required 'Accept: application/json' header
7371e9b Port PuppetDB to use kitchensink
df40728 Add leinengen pedantic flag
b3b354b Update kitchensink dependency to released 0.1.0 version
4f87739 Remove unused `test:package` rake task
Deepak Giridharagopal (24):
3f8fe26 Streaming resource queries
7ca69ae Add a table containing full parameter sets
b135a10 Remove resource-query-limit
3c3d0d5 Remove unneeded transaction
5e27747 Add test cases for params cache migration
1e771c6 Fix broken test case on PG
555e492 Query compilation should happen within a db transaction
0f9cca7 Upgrade to latest core.match
b8b8e9f Removed indexes that are duplicated by table constraints
a2d0099 Handle PEM privkeys that aren't represented as KeyPair objects
069f11c Streaming fact queries
70d014b (#22960) Don't automatically sort fact query results
6c3e4c9 Fix tests that assumed ordered output
9051813 We neglected to update a few of our dashboard links for v3
5b4b343 Update remaining v2 links to v3
4a4db8d (PDB-56) Gzip HTTP responses
6dcb194 Rename "defaults" to better reflect intent
5ecac0d Replace pre/post-conditions with type hint
81401fa Remove illegal keys from fact payload
766a7a2 Change from reduce to inject for ruby 1.8.5 compat
3c034c2 Change from reduce to inject for ruby 1.8.5 compat
3864052 Update changelog for 1.6.0
b144d74 Remove changelog items included in prev releases
80a207b (PDB-107) Support chained CA certificates
Justin (3):
19d3c02 Simplified string-contains?
0f4cda4 Extended the test for dissoc-if-nil
1ef07c2 Minor changes to string-contains? and string-contains?-test
Justin Holguin (2):
b53b0f4 Docs: add note about PE and jetty.ini
7102193 Docs: fixed typo
Ken Barber (46):
fb5f57f Provide some cleanup from review
bedfcc8 Introduce command 'puppetdb' as a sub-command dispatcher
bbd22ae Move sub-commands to libexec dir out of path
ee8027a Include legacy sub-command acceptance test
ae81173 Test to ensure the puppetdb command list all existing sub-commands
d275f9e Change regex for legacy subcommand warning test
8e992ac Improvements to puppetdb and legacy handling
01ee468 Test on file presence, not execution
4eab20f Rebase from master, fixing other non-sub commands
6ee5c9f Fix packaging to include sub-command dir for debian derivatives
0e7c790 Fix other legacy uses of puppetdb-ssl-setup to use the sub-command
dfe3fbe Fix the libexec syntax in base.install and remove
backquotes from echo
12e92a0 Provide better hash reproduction
b05989f Create a namespace to host the add-common-json-decoders!
function so it can be simply required
6caa40d Make com.puppetlabs.cheshire a proxy for cheshire.core
common functions
6e86672 Extend travis testing to test against postgres
7d39a2e Adapt to the PuppetDB 3.0.0 module changes
22b9f9a Remove extraenous setup file
3070895 Fix report tests to utilise postgresql 3.0.0 syntax
1127881 testing: Improve git fetch handling to support refspec
35147db PDB-67 Converting to using a bigint for catalogs table primary key
1cf43ed PDB-67 Remove match simple definitions from create table
6082eb1 PDB-67 Switch add-resources! and add-edges! to use
catalog id instead of hash
e4dd7bb PDB-70 Add index on catalog_resources.exported
6af3053 Replace insert-values with a select query for postgresql <= 8.1
2f1bc42 PDB-84 Update to latest version of PostgreSQL JDBC driver
1d635e3 Fix cli with no argument handling
a3f73ec Disclose the protocol for import/export CLI help
e936e06 PDB-85 Fix report export format
fb65525 PDB-109 Ensure basic_collection tests work on a clean database
19d4afa Fix reference to configure-logging! function
289bd0d Run gem install of activerecord with greater verbosity
ab3e975 Set a default timeout for sleep_until_queue_empty
cc2c000 Use the latest 1.x version of beaker
6557f41 PDB-119 Pin to RSA ciphers for all jdk's to work-around
Centos 6 EC failures
41ba297 PDB-68 Differential edges
96dd109 Remove try/catch for with-transaction and use only
wire-format for testing catalog concurrency
a14c7d7 PDB-249 Fix command-processing configuration exceptions
16b2a04 PDB-249 Pass through values directly during
get-construct-fn for core types
e128094 Update documentation headers for 1.6.x
c072e53 Update version docs to include a bogus version string
990f2b2 PDB-273 Fix stubs for HTTP tests to stub a higher level
a96c53e PDB-279 Sanitize report imports
c086389 PDB-303 Utilise final version of prismatic/schema 0.2.0
857999b PDB-279 Remove extraneous brackets from functions in ->
87ad27e PDB-299 Release notes for 1.6.0-rc2 changes
Kushal Pisavadia (1):
72d8a23 Clojure code cleanups
Melissa Stone (1):
8dc39b5 (maint) Add fedora 20 to mock list
Nick Fagerlund (2):
dccae48 (Docs) Note in installation pages that PuppetDB is
installed as part of PE 3.0+
8a9370a Maint: Copyedits to low catalog duplication guide
Rob Braden (7):
6886cce (packaging) (#17555) Use systemd for Fedora >= 17 and EL >= 7
9babb41 (packaging) (#17555) Fix issues with systemctl path in
the RPM spec file
cc231e9 (packaging) (#20148) Update logrotate script when appropriate
223c157 (packaging) (maint) Clean up the Debian rules file
f379bc5 (packaging)(#21631)Update RPM to use Java 1.7 as the default
f5064c7 (packaging)(#21631)Update Debian packaging to use Java
1.7 as the default
8b19b31 (packaging)(#21631)Update JAVA_BIN for OpenSUSE
Ryan Senior (64):
6a171b1 (#22350) Adding a read database, useful for Postgres Hot
Standby clusters
304eab9 Wrap the two-db shutdown call in a with-connection to
ensure the correct connection is used when shutting down an hsql
instance
23e0011 Change the max # of db connections to 25 per pool, 25
for reads, 25 for writes
951e061 Change root logger level to ERROR when running locally,
gets rid of some noise when running unit tests locally
a31691b Adding --root-keys for beaker test runs
16d06b7 Moved the database.ini original config replacement to
after the agent queries. Fixes current acceptance test failure.
693d7f2 Added a skip for the read database acceptance test when
running the embedded db
ab18aee Fix skip_test comment typo
110727e (#21083) Adding insert/update/delete logic to replace-facts!
b5503e2 Minor docstring changes
e2c5095 Moved the in-clause function from scf/storage to jdbc
84b5707 Added a docstring to scf.storage/insert-facts!
2b40709 Pushed down the timestamp checking of the fact metadata
table. Changed the isolation level of replace-facts! to ensure there
are no silent write/write failures when there are two replace-facts!
commands running concurrently for the same node. In that scenario, a
potentially older set of facts could overwrite a newer set of facts.
Added a test with an example of that scenario to ensure it's covered.
7830be6 Fixed a bug when replacing facts has only new facts (no
existing facts changed) caused an NPE, also added a test where no
facts were changed
7089b53 Moved the concurrent fact updating test up a level to
the command tests. This is needed now that the repeatable read
isolation happens at a higher level (in the command execution).
ec68e72 Added a with-transacted-connection' that allows
specifying transaction isolation level right after getting the
connection, the existing with-transacted-connection uses the same code
and uses the JDBC default isolation level
6ca619d Removed an unused function and no longer needed select
for update logic. That functionality is not handled by setting the
trasnaction isolation level to repeatable read.
88a987f Modify an expected error message slightly for the
postgres unit tests
2a9986e Streaming facts working with master
00c5b5d Minor changes after code review: - switch v3/facts to
use the PL cheshire wrapper, to ensures encoders are loaded - Change
v3/resources.clj to use json-response* similar to v2/resources.clj -
Removed an unnecessary IllegalStateException catch in v3/facts.clj -
Added some docstrings to testutils paged-results and paged-results*
1f2207d Fix for the current unit test hanging issue.
83e616d Added functions to test/services.clj to launch PuppetDB
instances from the REPL
5acd4e0 Moved the launch-puppetdb functions into a new
testutils.repl namespace
5229587 Added preconditions to the launch-puppetdb fn
701a0a0 (#22947) Remove resource tags GIN index on Postgres
c0b60b8 (PDB-9) Improve report query syntax error handling
7c2afa4 Split config related fns out of services.clj and into
it's own namespace
11b9269 Move catalog/resource hash computation functions out of
scf/storage.clj and into scf/hash.clj
0c9914f Moved storage utility functions out of storage.clj to
prevent a cyclic dependency between storage.clj and query.clj
8579508 Changed the storage-utils require alias from sutil -> sutils
a3de4f2 (22977) Add a debugging option to help diagnose catalogs
for a host that hash to different values
c36204a Added a new Troubleshooting Low Catalog Duplication
guide and other docs for the new catalog hash debugging feature
7617392 Added warnings when catalog hash debugging is enabled
3eda703 Changed catalog debugging paths to be platform agnostic.
ec2a22e Moved hash debugging fns to the scf/hash_debug.clj
ee83a1e Added an acceptance helper method for getting the vardir
3259634 Documentation changes - added a catalog debugging
example and another performance related FAQ
4a28099 Changed the names of the catalog debugging files from
new/old-catalog.* to catalog-new/old.*
fd76c36 Updated the catalog hash debugging code to use the same
UUID for all 5 generated files for a given host
970309e Munge the edge relationships and resource parameters to
make diffing the EDN files easier
e5add8c Minor change to an acceptance test helper method name
498329c Added direct test of cheshire/spit-json and
utils/true-str?, and some minor doc changes
a3188db Moved more config defaulting code from services.clj to config.clj
e8ed2e8 Added a first cut at using Prismatic Schema for
validating, defaulting and converting user specified config
521f6c2 Added a bunch of schema tests, moved more config from
config.clj over to schema, more defaulting from jdbc.clj
153b82b Added Prismatic style pragmas and an ns level doc string
f2c5fb2 Removed an old commandproc config function
d2b957c Fixed the wire_format links on the catalogs API page,
removed some HTML markup that wasn't rendering correctly in a code
example
c9eb989 (PDB-81) - Adding a JDK 1.6 deprecation warning message
036ea89 Debugging a CI server failure
ce6f6e3 Revert "Debugging a CI server failure"
8fd1a7f Switch to puppetlabs/tools.namespace 0.2.4.1 with a fix
for directories that contain a space
32df731 Move more connection pool config out of jdbc.clj and
into config.clj (using Schema)
0193bf4 Updated docstrings to clarify args, renamed
convert-ini-map-config fn in config.clj
6214877 (PDB-69) Differential updates for catalog_resources
b7637d3 (PDB-126) Command retries should log the exception's stacktrace
28e46e7 Added schema contracts to storage.clj
ee5384e Moved diff-fn over to utils, schemafied it
8f3921b Refactored some stored tests to use the new with-wrapped-fn-args
0227631 Fixes needed due to
https://github.com/Prismatic/schema/issues/21. Switched from
schema.core/defn to schema.macros/defn and schema.core/String and
schema.core/Number to String and Number. Loosened the schema around
resource parameters and the facts map.
7cb3127 Added a warning comment about the timestamp/expiration
keys for facts
71b2403 Switched test/schema.clj to to use platform specific
Numbers/Strings
7b91bb7 Changed the replace-facts test to pass in the correct facts map
74eee2f (PDB-247) Fixes the namespace override warnings from schema
Zack Smith (1):
1477ee3 Add exports puppet face
supercow (1):
3245614 (#22380) Document store-usage and temp-usage
Regards
Ken Barber
Software Developer, Puppet Labs Inc.