In the past we have talked about our vision and goals for Jenkins 3.0
on this list. Here is one of mine.
Has anyone besides me been highly dissatisfied with the way Jenkins
does object persistence? I think we are leaving a lot of functionality
and performance on the table by using flat files rather than a
relational database. Just run the bpftrace tool syncsnoop.bt on any
Jenkins controller
and observe that a standard installation writes out dozens of tiny
files per second while running a Pipeline job and calls fsync(2) on
every single one of them (!). This architectural choice is
constraining our ability to implement new features at reasonable
performance, especially with regard to test results and static
analysis checks.
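For context, the write pattern behind all that fsync(2) traffic looks
roughly like this (an illustrative sketch of an atomic flat-file save
in the spirit of hudson.util.AtomicFileWriter, not the exact core
code):

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicSave {
    // Write the serialized XML to a temp file, fsync it, and rename it
    // over the target. Every save of every tiny XML file pays one fsync.
    static void save(Path target, String xml) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(StandardCharsets.UTF_8.encode(xml));
            ch.force(true); // the fsync(2) that syncsnoop.bt observes
        }
        Files.move(tmp, target,
                StandardCopyOption.ATOMIC_MOVE,
                StandardCopyOption.REPLACE_EXISTING);
    }
}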
I think SQLite is the ideal choice for a relational database for
Jenkins. SQLite directly competes with flat files, which is what we
are using today. Furthermore, it is serverless, so it would not
introduce any new installation or upgrade requirements. The migration
could be handled transparently on upgrade to the new version.
True, SQLite allows at most one writer to proceed concurrently. But do
we really need to support more than one concurrent writer for most
metadata, like the Configure System page? Obviously we need to support
concurrent builds of jobs. This can be handled by defining a set of
namespaces as concurrency domains, each one backed by its own SQLite
database. For example, we can have one SQLite database for global
configuration, one SQLite database for the build queue, one SQLite
database for each job (or even build), etc. In this way we can in fact
support multiple writers interacting with different parts of the
system concurrently. The point is that by grouping these into
high-level buckets we can take advantage of the economies of scale
provided by the database and OS page cache.
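To make that concrete, here is a minimal sketch of per-domain
connections, assuming the xerial sqlite-jdbc driver (the PRAGMA
choices here are mine, not settled design):

import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class ConcurrencyDomains {
    // Open the SQLite database backing one concurrency domain.
    static Connection open(Path db) throws SQLException {
        Connection c = DriverManager.getConnection("jdbc:sqlite:" + db);
        try (Statement s = c.createStatement()) {
            s.execute("PRAGMA journal_mode=WAL");   // readers do not block the writer
            s.execute("PRAGMA synchronous=NORMAL"); // see the durability mapping below
        }
        return c;
    }

    public static void main(String[] args) throws SQLException {
        Path home = Path.of(System.getenv("JENKINS_HOME"));
        try (Connection global = open(home.resolve("sqlite.db"));
             Connection job = open(home.resolve("jobs/test/sqlite.db"))) {
            // Writes through `global` and `job` never contend for the
            // same write lock, because each domain is its own database.
        }
    }
}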
I put together a quick prototype today at
https://github.com/basil/jenkins/tree/sqlite. My Jenkins home looks
like this:
${JENKINS_HOME}/sqlite.db (one primary SQLite database)
${JENKINS_HOME}/jobs/test/sqlite.db (one SQLite database per job in
this prototype)
The primary SQLite database has these tables:
$ sqlite3 sqlite.db .tables
config
hudson.model.UpdateCenter
hudson.plugins.git.GitTool
jenkins.security.QueueItemAuthenticatorConfiguration
jenkins.security.UpdateSiteWarningsConfiguration
jenkins.security.apitoken.ApiTokenPropertyConfiguration
jenkins.telemetry.Correlator
nodeMonitors
org.jenkinsci.plugins.workflow.flow.FlowExecutionList
queue
users/admin_12464527240177267930/config
users/users
Each table represents an old XML file. In this prototype I am simply
serializing each object to JSON rather than XML, using XStream with
its Jettison-backed driver, and storing the result in a single JSON
column. Why JSON, you ask? Because SQLite has a fully featured JSON
extension. So here is how config.xml looks:
$ sqlite3 sqlite.db 'select json from config'
{"hudson":{"disabledAdministrativeMonitors":[""],"version":"2.342-SNAPSHOT","numExecutors":2,"mode":"NORMAL","useSecurity":true,"authorizationStrategy":{"@class":"hudson.security.AuthorizationStrategy$Unsecured"},"securityRealm":{"@class":"hudson.security.HudsonPrivateSecurityRealm","disableSignup":true,"enableCaptcha":false},"disableRememberMe":false,"projectNamingStrategy":{"@class":"jenkins.model.ProjectNamingStrategy$DefaultProjectNamingStrategy"},"workspaceDir":"${JENKINS_HOME}\/workspace\/${ITEM_FULL_NAME}","buildsDir":"${ITEM_ROOTDIR}\/builds","markupFormatter":{"@class":"hudson.markup.EscapedMarkupFormatter"},"jdks":[""],"viewsTabBar":{"@class":"hudson.views.DefaultViewsTabBar"},"myViewsTabBar":{"@class":"hudson.views.DefaultMyViewsTabBar"},"clouds":[""],"scmCheckoutRetryCount":0,"primaryView":"all","slaveAgentPort":-1,"label":"","crumbIssuer":{"@class":"hudson.security.csrf.DefaultCrumbIssuer","excludeClientIPFromCrumb":false},"nodeProperties":[""],"globalNodeProperties":[""],"nodeRenameMigrationNeeded":false}}
The job SQLite database has these tables:
$ sqlite3 jobs/test/sqlite.db .tables
build config junitResult workflow
These correspond to the old XML files as well. So builds/1/build.xml
becomes row 1 in the build table with a JSON column for its content,
builds/1/junitResult.xml becomes row 1 in the junitResult table with a
JSON column for its content, builds/1/workflow/2.xml becomes a row in
the workflow table with a composite key of (build 1, node 2) and a
JSON column for its content, and so on. I have not yet attempted to
deal with things like SCM changelogs, permalinks, nextBuildNumber, and
the like, but these could all be moved into the SQLite database as
well.
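For the curious, here is my rough guess at DDL matching that layout
(column names are hypothetical; the branch above is authoritative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class JobSchema {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection(
                "jdbc:sqlite:jobs/test/sqlite.db");
             Statement s = c.createStatement()) {
            // builds/<num>/build.xml -> one row per build
            s.execute("CREATE TABLE IF NOT EXISTS build ("
                    + "num INTEGER PRIMARY KEY, json TEXT NOT NULL)");
            // builds/<num>/junitResult.xml -> one row per build
            s.execute("CREATE TABLE IF NOT EXISTS junitResult ("
                    + "num INTEGER PRIMARY KEY, json TEXT NOT NULL)");
            // builds/<build>/workflow/<node>.xml -> (build, node) -> json
            s.execute("CREATE TABLE IF NOT EXISTS workflow ("
                    + "build INTEGER NOT NULL, node INTEGER NOT NULL, "
                    + "json TEXT NOT NULL, PRIMARY KEY (build, node))");
        }
    }
}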
Halfway through this prototype I realized I was building an ORM from
scratch, so it might be worth exploring an existing solution like
Hibernate. But I was able to get quite far just stuffing JSON from
XStream into a primitive table layout in SQLite.
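To give a flavor of what the ORM route could look like, a build row
might map to a JPA entity along these lines (hypothetical names; I
have not wired this up):

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

@Entity
@Table(name = "build")
public class BuildRow {
    @Id
    private long num; // builds/<num>/build.xml

    @Column(name = "json")
    private String json; // the serialized build record

    // getters and setters omitted
}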
How does this all stack up? Well, Freestyle and Pipeline jobs work
just fine, and performance seems quite fast. True, multiple concurrent
builds of the same Pipeline job will contend with each other to write
new Pipeline steps to the workflow table, but there are also economies
of scale to be gained by letting the database manage the layout of the
data within a single file, rather than laying out the data ourselves
across many files and fsync(2)'ing each one. SQLite offers
"extra", "full", "normal", and "off" settings for its "synchronous"
option, which we can map to the existing Pipeline durability levels.
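Concretely, the mapping could look something like this (Pipeline's
FlowDurabilityHint values are real; the correspondence to synchronous
levels is my guess):

public class DurabilityMapping {
    // Map a Pipeline durability hint to a SQLite synchronous level.
    // PERFORMANCE_OPTIMIZED trades durability for speed, much like
    // synchronous=OFF; MAX_SURVIVABILITY wants every write on disk.
    static String pragmaFor(String hint) {
        switch (hint) {
            case "PERFORMANCE_OPTIMIZED": return "PRAGMA synchronous=OFF";
            case "SURVIVABLE_NONATOMIC":  return "PRAGMA synchronous=NORMAL";
            case "MAX_SURVIVABILITY":     return "PRAGMA synchronous=FULL";
            default: throw new IllegalArgumentException(hint);
        }
    }
}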
Obviously this code is a rough prototype, but I was surprised at how
much just worked out of the box after a few hours of hacking. I think
there could be a future for Jenkins where everything is managed by
SQLite databases and where we leave XStream behind in favor of an ORM
like Hibernate. On upgrade, we can read in all the data with XStream
and write it out to SQLite with the ORM. From then on, serialization
and deserialization would work through an ORM against the relevant
SQLite database(s). And this would be on by default for everyone on
upgrade, not some opt-in plugin.
I think the functionality and performance we could get out of such a
system would be better than what we have today. The real benefit would
come after the migration when we can optimize slow operations, like
loading builds or displaying test results and static analysis results,
with hand-rolled SQL queries. We could also allow people to do
full-text search of build console logs.
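As a taste of that last point, console log lines could go into an FTS5
virtual table (a sketch, assuming a SQLite build with FTS5 enabled,
which the stock sqlite-jdbc builds include as far as I know):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LogSearch {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection("jdbc:sqlite::memory:");
             Statement s = c.createStatement()) {
            // One row per console log line, indexed for full-text search.
            s.execute("CREATE VIRTUAL TABLE log USING fts5(build, line)");
            s.execute("INSERT INTO log VALUES ('1', 'Finished: SUCCESS')");
            try (ResultSet rs = s.executeQuery(
                    "SELECT build FROM log WHERE log MATCH 'success'")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // prints: 1
                }
            }
        }
    }
}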