[Essentials] Collecting usage telemetry from Jenkins Pipeline

R. Tyler Croy

unread,

Mar 16, 2018, 4:28:29 PM3/16/18

to jenkin...@googlegroups.com, vivek....@gmail.com, mne...@cloudbees.com

The successful adoption and iterative improvement of Jenkins Essentials [0] is
heavily contingent on a spectrum of automated feedback which I've been
referring to as "telemetry" in many of the design documents. I wanted to start
discussing the prospect of collecting anonymized Pipeline usage telemetry to
help Jenkins Essentials, Blue Ocean, and Pipeline teams understand how users
are actually using Jenkins Pipeline.

James Dumay has already prototyped some similar work[1] for collecting
behavioral telemetry in Blue Ocean, but what I'm proposing would be much more
broad in scope[2]. The metrics I am interested in, to help us understand how
Pipeline is being used, are:

* Counts:
* configured Declarative Pipelines
* configured Script Pipelines
* Pipeline executions
* distinct built-in step invocations (i.e. not counting Global Variable invocations)
* Global Shared Pipelines configured
* Folder-level Shared Pipelines configured
* Agents used per-Pipeline
* Post-directive breakdown (stable,unstable,changed,etc)

* Timers
* Runtime duration per step invocation
* Runtime duration per Pipeline

I believe this is a sufficiently useful set of metrics to send along, but the
two questions I could use help answering are:

* What are other metrics which would positively impact the development of
Jenkins?
* Are there concerns about implementation feasibility for any of these?

I am planning on using statsd (sent to the project's Datadog account), so
sampling and controlling the volume of individual metrics is not something I'm
terribly worried about.

Happy to hear any feedback y'all are willing to share!

[0] https://github.com/jenkins-infra/evergreen#evergreen
[1] https://github.com/jenkinsci/blueocean-plugin/pull/1653
[2] https://issues.jenkins-ci.org/browse/JENKINS-49852

Cheers
- R. Tyler Croy

------------------------------------------------------
Code: <https://github.com/rtyler>
Chatter: <https://twitter.com/agentdero>
xmpp: rty...@jabber.org

% gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------

signature.asc

Andrew Bayer

unread,

Mar 16, 2018, 4:34:50 PM3/16/18

to jenkin...@googlegroups.com, mne...@cloudbees.com, vivek....@gmail.com

If we’re going to be tracking step invocations anyway, it’d be interesting to count the number of Declarative Pipelines with a script block, maybe?

A.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/20180316202818.hgl5irxnuqduq24v%40blackberry.coupleofllamas.com.
For more options, visit https://groups.google.com/d/optout.

Daniel Beck

unread,

Mar 16, 2018, 4:41:07 PM3/16/18

to jenkin...@googlegroups.com

> On 16. Mar 2018, at 21:28, R. Tyler Croy <ty...@monkeypox.org> wrote:
>
> * What are other metrics which would positively impact the development of
> Jenkins?

Assuming the Essentials setup allows for whitelisting pipeline methods, knowing which methods are whitelisted (or got rejected on whitelisting attempt) would be useful to guide improvements of the default whitelist and the blacklist.

Basically, INFRA-1285, which I haven't had time to work on.

Liam Newman

unread,

Mar 16, 2018, 5:18:42 PM3/16/18

to jenkin...@googlegroups.com

I think it would be interesting to record some metrics around the complexity of Pipelines, such as cyclomatic complexity and number of lines in the Jenkinsfile. If shared libraries are used record the complexity and size of those.

Also the size of the strings passed to external scripting steps (sh, cmd, python, powershell, etc).

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/CEB8092F-4EAD-459B-BE96-FA2946E19B44%40beckweb.net.

Jesse Glick

unread,

Mar 16, 2018, 5:18:53 PM3/16/18

to Jenkins Dev

On Fri, Mar 16, 2018 at 4:28 PM, R. Tyler Croy <ty...@monkeypox.org> wrote:
> * distinct built-in step invocations (i.e. not counting Global Variable invocations)

Careful with metasteps: `step` and `wrap` should count their
`delegate`s, so we count individual plugins contributing
`SimpleBuildStep` / `SimpleBuildWrapper`. Also see JENKINS-37227 for
`SCM`.

The `workflow-cps` plugin already collects a few metrics from running
Pipeline builds that get reported to `support-core`, since they are
thought to be correlated with people trying to do way too much in
Groovy and killing their system. These should also be sent to
telemetry:

· Groovy parse time
· class loading time
· in-VM run time
· `program.dat` save time
· flow graph load/save time
· flow graph size (roughly proportional to number of steps)

For those using Pipeline-as-code, we would like `branch-api` to report
the number of branch projects (of each type: branch, PR, tag) per
repo, and the number of repos per org, and probably some other stuff
like the volume of events being received and the frequency and
duration of full reindexings.

R. Tyler Croy

unread,

Mar 16, 2018, 5:28:11 PM3/16/18

to jenkin...@googlegroups.com

(replies inline)

Good suggestion! I'll incorporate that into my design document.

signature.asc

R. Tyler Croy

unread,

Mar 16, 2018, 5:32:11 PM3/16/18

to jenkin...@googlegroups.com, mne...@cloudbees.com, vivek....@gmail.com

(replies inline)

On Fri, 16 Mar 2018, Andrew Bayer wrote:

> If we???re going to be tracking step invocations anyway, it???d be interesting

> to count the number of Declarative Pipelines with a script block, maybe?

I kind of assumed that if we were incrementing a counter on step invocations
that script{} would be collected already by the machinery, e.g. isn't it "just"
a step?

If it's a special snowflake then I'll make sure to include it in my design.

A few more which come to mind now that I'm thinking about Script:

* Count of stages per Pipeline
* Count of Pipelines with the Groovy sandbox disable
* Time spent in script{} block

Thanks for the ideas abayer!

signature.asc

Andrew Bayer

unread,

Mar 16, 2018, 6:55:58 PM3/16/18

to jenkin...@googlegroups.com

It’s a normal step - what I’m talking about is counting Pipelines containing one or more script blocks, I.e., what percentage of total Declarative Pipelines use script blocks, which I think is a more useful metric than just how many script block invocations there are.

A.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/20180316213201.ckenekkqcbgtsuzx%40blackberry.coupleofllamas.com.

Message has been deleted

R. Tyler Croy

unread,

Mar 19, 2018, 6:32:01 PM3/19/18

to jenkin...@googlegroups.com

(replies inline)

On Mon, 19 Mar 2018, Samuel Van Oort wrote:

> Late to the party because I was heads-down on Pipeline bugs a lot of
> Friday, but this is a subject near-and-dear to my heart and in the past
> I've discussed what metrics might be interesting since this was an explicit
> intent to surface from my Bismuth (Pipeline Graph Analysis APIs). Some of
> these are things I'd wanted to make a weekend project of (including
> surfacing the existing workflow-cps performance metrics).

Long reply is long! Thanks for taking the time to respond Sam. Suffice it to
say, there isn't a world in which I wouldn't use statsd for this :) My current
thinking is to incorporate the Metrics plugin
(https://plugins.jenkins.io/metrics) in order to provide the appropriate
interfaces, and if that's fine, then I would have no qualms with that becoming
a dependency of Pipeline itself. I need to do some research on what Dropwizard
baggage might be unnecessarily added into Jenkins.

To many of your inline comments, I do not think there's any problem collecting
as much telemetry as you and the other Pipeline developers see fit. My list was
mostly what *I* think I need to demonstrate success with Pipeline for Jenkins
Essentials, and to understand how Jenkins Essentials is being used in order to
guide our future roadmap.

Cheers

>
> We should aim to implement Metrics using the existing Metrics interface
> because then that can be fairly easy exported in a variety of ways -- I use
> a Graphite Metrics reporter that couples to another metric aggregator/store
> for the Pipeline Scalability Lab (some may know it as "Hydra"). Other
> *cough* proprietary systems may already consume this format of data. I
> would not be surprised if a StatsD reporter is pretty easy to hack together
> using https://github.com/ReadyTalk/metrics-statsd and you get a lot of
> goodies "for free."
>
> The one catch for implementing metrics is that we want to be cautious about
> adding too much overhead to the execution process.
>
> As far as specific metrics:

>
> > distinct built-in step invocations (i.e. not counting Global Variable
> invocations)
>

> This can't be measured easily from the flow graph due to the potential to
> create multiple block structures for one step. It COULD be added easily
> via a registered new StepListener API in workflow-api (and implemented in
> workflow-cps) though. I think it's valuable.
>
> > configured Declarative Pipelines, configured Script Pipelines
>
> We can get all Pipelines (flavor-agnostic) by iterating over WorkflowJob
> items. Not sure how we'd tell Scripted vs. Declarative -- maybe
> registering a Listener extension point of some sort? I see value here.
>
> I'd *also* like to have a breakdown of which Pipelines have been run in the
> last, say week and month, by type (easy to do by looking at the most recent
> build). That way we know not just which were created but which are in
> active use.
>
> > Pipeline executions
>
> Rates and counts can be achieved with the existing Metrics Timer time. I'd
> like to see that broken down by Scripted vs. Declarative as well.

>
> > * Global Shared Pipelines configured
> > * Folder-level Shared Pipelines configured
>

> Do you mean Shared Library use? One metric I'd be interested in is how
> many shared libraries are used *per-pipeline* -- easy to measure from the
> count of LoadedScripts I believe (correct me if there's something I'm
> missing here, Jesse).
>
> > Agents used per-Pipeline
>
> I think should be possible to do this easily via flow graph analysis,
> looking for WorkspaceActionImpl -- nodes and labels are be available. We
> might want to count total nodes *uses* (open/close of node blocks) and
> distinct nodes used.
>
> Best to triggers as a post-build analysis using the RunTrigger -- that way
> it's just a quick iteration over the Pipeline.

>
> > Runtime duration per step invocation
>

> This is one of the MOST useful metrics I think.
>
> I already have an implementation used in the Scalability Lab that does this
> on a per-flownode basis using the GraphListener (rather than per-step).
> This is part of a small utility plugin for metrics used in the scalability
> lab (not hosted currently since it's not general-use).
>
> Doing per-step is somewhat more complex - for many steps, trivial, but for
> example for a Retry step there's not a logical way to do it because you get
> multiple blocks. Blocks in general are undefined - do you count the block
> *contents*, just the start, just the end, or start+end nodes? Also
> remember that Groovy logic counts against the Step time with the
> FlowNodes. Usually that shouldn't be a huge issue unless the Groovy is
> complex.
>
> If that's too noisy there might be ways to insert Listeners for the Step
> itself (more complex though) -- I think using the FlowNodes is good enough
> for now and gives us a solid first-order approximation that is useful 99%
> of the time.
>
> I would also like to extend this by breaking it down into separate metrics
> per step type, i.e. runtime for sh, runtime for echo, for 'node', etc.
> This is easier than you'd think since you can fetch the StepDescriptor and
> call getFunctionName to get a unique metric key for the step. This is far
> more useful to us than just average step timings, because it helps spot
> performance regressions in the field.
>
> Other aggregates of interest: total time spent in each step type for the
> pipeline and counts of the FlowNode by step per pipeline. This will show
> if we're spending (for example) a LOT of time running
> readFile/writeFile/dir steps due to some sudden bottleneck in the remoting
> interaction and also reveal which step types are used most often. Knowing
> which steps are used heavily helps me know which deserve extra priority for
> bugfixes, features, and optimizations.
>
> It actually *sounds* far more complicated than it really is -- this would
> be a pretty trivial afternoon project I think.
>
> > Runtime duration per Pipeline
>
> I already have an implementation. Same plugin as above. It's exposed as a
> DropWizard histogram as well, so you get rates + aggregate times with
> median, mean, etc.
>
> *Other desired metrics: *I think we want FlowNodes created as a rate per
> unit time (I have an implementation in the same plugin above). I also
> have an impl for this already (same plugins as before).
>
> If we could find a way I'd really like to have a counter of how many
> elements of GroovyCPS logic are run and how many function calls (for
> off-master you obviously wouldn't get this data). This is something useful
> for measuring the real complexity of their Groovy -- even better than
> Liam's Cyclomatic Complexity metric because it directly tracks runtime
> operations, not just code structure. I have notions how we'd accomplish
> this.
>
> On Friday, March 16, 2018 at 6:55:58 PM UTC-4, Andrew Bayer wrote:
> >
> > It???s a normal step - what I???m talking about is counting Pipelines

> >> xmpp: rty...@jabber.org <javascript:>

> >>
> >> % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
> >> ------------------------------------------------------
> >>
> >> --
> >> You received this message because you are subscribed to the Google Groups
> >> "Jenkins Developers" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an

> >> email to jenkinsci-de...@googlegroups.com <javascript:>.

> >> To view this discussion on the web visit
> >> https://groups.google.com/d/msgid/jenkinsci-dev/20180316213201.ckenekkqcbgtsuzx%40blackberry.coupleofllamas.com
> >> .
> >> For more options, visit https://groups.google.com/d/optout.
> >>
> >
>
> --
> You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.

> To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/9842785f-49ee-43fc-ab61-d9e7b45dc3db%40googlegroups.com.

> For more options, visit https://groups.google.com/d/optout.

signature.asc

Jesse Glick

unread,

Mar 19, 2018, 6:35:25 PM3/19/18

to Jenkins Dev

Some technical notes:

On Mon, Mar 19, 2018 at 4:19 PM, Samuel Van Oort <svan...@cloudbees.com> wrote:
>> distinct built-in step invocations (i.e. not counting Global Variable
>> invocations)
>

> This can't be measured easily from the flow graph due to the potential to
> create multiple block structures for one step.

Actually you _can_ get this from the flow graph already. You just
count `StepNode`s with no `BodyInvocationAction`. (cf.
`StepStartNode.isBody`)

> Not sure how we'd tell Scripted vs. Declarative

`DeclarativeJobAction`

> how many
> shared libraries are used *per-pipeline* -- easy to measure from the count
> of LoadedScripts I believe

That would tell you how many `load` steps were used. To count
libraries, check `LibrariesAction`.

> looking for WorkspaceActionImpl -- nodes and labels are be available

Note that the labels here were of the actual node allocated, not the
label expression (if any) requested by the step, for which use
`ArgumentsAction`.

> If we could find a way I'd really like to have a counter of how many
> elements of GroovyCPS logic are run and how many function calls

Easy—patch `CpsFlowExecution.start` to create a proxy `Invoker` that
counts the different kinds of calls. This could be included in the
current `CpsFlowExecution.PipelineTimings`, actually.

Samuel Van Oort

unread,

Mar 19, 2018, 7:10:00 PM3/19/18

to Jenkins Developers

I'm glad my mega-reply did turn up after all (just belatedly), anyway, replies inline!

The existing Metrics plugin is using the Dropwizard interface, so:

say, there isn't a world in which I wouldn't use statsd for this :) My current
thinking is to incorporate the Metrics plugin
(https://plugins.jenkins.io/metrics) in order to provide the appropriate
interfaces, and if that's fine, then I would have no qualms with that becoming
a dependency of Pipeline itself. I need to do some research on what Dropwizard
baggage might be unnecessarily added into Jenkins.

Basically means "just make it a normal Metric", or at least that's my intent. I think it might make sense to make extra metrics like this not a *dependency* of Pipeline (since it would actually need to depend on parts of it), but an additional plugin in the Aggregator or part of some sort of Essentials plugin. Which I probably explained incoherently due to have my head still buried in the gnarly guts of workflow-cps.

As far as statsd goes: the Metrics interfaces are reporter-agnostic, so they don't care if you're fetching Metrics to Graphite, some sort of proprietary Analytics solution, StatsD, etc -- as long as you have a Reporter implementation. The Metrics Graphite Plugin gives some idea how simple it can be to wrap an existing Reporter Impl with configuration for Jenkins: https://github.com/jenkinsci/metrics-graphite-plugin

To Jesse:

Actually you _can_ get this from the flow graph already. You just
count `StepNode`s with no `BodyInvocationAction`. (cf.
`StepStartNode.isBody`)

Could work, I'd have to take a closer look to confirm.

Otherwise noted some good points and technical corrections (bear in mind I was trying to get a lot down quickly) about practical implementation... though I think Jesse might be the only one who calls the following "easy":

Easy—patch `CpsFlowExecution.start` to create a proxy `Invoker` that
counts the different kinds of calls. This could be included in the
current `CpsFlowExecution.PipelineTimings`, actually.

Worth noting that we would need to make sure that such an implementation is *extremely* lightweight in practice because any overhead it adds to CPS operations would be felt. As long as it's direct field access & incrementation (or AtomicXX access) that's probably fine.

R. Tyler Croy

unread,

Mar 21, 2018, 10:49:58 AM3/21/18

to jenkin...@googlegroups.com, vivek....@gmail.com, mne...@cloudbees.com

Just a heads up in case anybody on this thread is interested, I've put a
hangout on the Jenkins project calendar to discuss this topic further on
Friday:
<https://calendar.google.com/calendar/event?eid=MGoyYm5nbXJsNDAyNWtlMGgxNmoxYzdxanEgNHNzMTJmMG1xcjN0YnAxdDJmZTM2OXNsZjRAZw&ctz=GMT-07:00>

signature.asc

R. Tyler Croy

unread,

Mar 23, 2018, 12:27:35 PM3/23/18

to jenkin...@googlegroups.com, vivek....@gmail.com, mne...@cloudbees.com

(replies inline)

On Wed, 21 Mar 2018, R. Tyler Croy wrote:

> Just a heads up in case anybody on this thread is interested, I've put a
> hangout on the Jenkins project calendar to discuss this topic further on
> Friday:
> <https://calendar.google.com/calendar/event?eid=MGoyYm5nbXJsNDAyNWtlMGgxNmoxYzdxanEgNHNzMTJmMG1xcjN0YnAxdDJmZTM2OXNsZjRAZw&ctz=GMT-07:00>

Thanks Vivek, Andrew, Daniel, Sam, Jesse, and Baptiste for joining! Our
collected meeting notes have been archived here:
https://github.com/jenkins-infra/evergreen/tree/master/docs/meetings/2018-03-23-JENKINS-49852-pipeline-usage-telemetry

As per our discussion, I've filed the following ticket to at least define how
metrics will be collected in the context of a Jenkins Essentials distribution:
https://issues.jenkins-ci.org/browse/JENKINS-50373

Follow-up work will entail creating a prototype or two hooking up the Metrics
plugin together with Pipeline's GraphListener. Once that approach is validated,
I think we'll be able to move a broader discussion forward for Pipeline
instrumentation which can benefit both the development of Jenkins Essentials
but also Pipeline itself.

Cheers

signature.asc

Reply all

Reply to author

Forward