(replies inline)
On Mon, 19 Mar 2018, Samuel Van Oort wrote:
> Late to the party because I was heads-down on Pipeline bugs a lot of
> Friday, but this is a subject near-and-dear to my heart and in the past
> I've discussed what metrics might be interesting since this was an explicit
> intent to surface from my Bismuth (Pipeline Graph Analysis APIs). Some of
> these are things I'd wanted to make a weekend project of (including
> surfacing the existing workflow-cps performance metrics).
Long reply is long! Thanks for taking the time to respond, Sam. Suffice it to
say, there isn't a world in which I wouldn't use statsd for this :) My current
thinking is to incorporate the Metrics plugin
(https://plugins.jenkins.io/metrics) in order to provide the appropriate
interfaces, and if that's fine, then I would have no qualms with that becoming
a dependency of Pipeline itself. I need to do some research on what Dropwizard
baggage might be unnecessarily added into Jenkins.
To many of your inline comments, I do not think there's any problem collecting
as much telemetry as you and the other Pipeline developers see fit. My list was
mostly what *I* think I need to demonstrate success with Pipeline for Jenkins
Essentials, and to understand how Jenkins Essentials is being used in order to
guide our future roadmap.
Cheers
>
> We should aim to implement metrics using the existing Metrics interface,
> because then the data can be fairly easily exported in a variety of ways -- I
> use a Graphite Metrics reporter that couples to another metric
> aggregator/store for the Pipeline Scalability Lab (some may know it as
> "Hydra"). Other *cough* proprietary systems may already consume this format
> of data. I would not be surprised if a StatsD reporter is pretty easy to
> hack together using https://github.com/ReadyTalk/metrics-statsd, and you get
> a lot of goodies "for free."
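For what it's worth, the wire format a StatsD reporter has to emit is trivially simple -- plain "name:value|type" strings fired over UDP -- which is why I'd expect that hack to go quickly. A rough standalone sketch (helper names are made up; this is not the metrics-statsd API):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Minimal StatsD emitter sketch: metrics are "name:value|type" lines
// sent fire-and-forget over UDP. Hypothetical helper class, not the
// ReadyTalk metrics-statsd API.
public class StatsdSketch {
    // Format a counter increment in StatsD's plain-text protocol.
    static String counter(String name, long delta) {
        return name + ":" + delta + "|c";
    }

    // Format a timer sample (milliseconds).
    static String timer(String name, long millis) {
        return name + ":" + millis + "|ms";
    }

    // Fire-and-forget UDP send; a StatsD server never replies.
    static void send(String payload, String host, int port) throws Exception {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(bytes, bytes.length,
                    InetAddress.getByName(host), port));
        }
    }

    public static void main(String[] args) {
        System.out.println(counter("jenkins.pipeline.runs", 1));
        System.out.println(timer("jenkins.pipeline.duration", 4200));
    }
}
```

A real reporter would batch these and pull names/values from the Metrics registry, but that's the entire protocol.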
>
> The one catch for implementing metrics is that we want to be cautious about
> adding too much overhead to the execution process.
>
> As far as specific metrics:
>
> > distinct built-in step invocations (i.e. not counting Global Variable
> invocations)
>
> This can't be measured easily from the flow graph due to the potential to
> create multiple block structures for one step. It COULD be added easily
> via a new StepListener API registered in workflow-api (and implemented in
> workflow-cps), though. I think it's valuable.
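Agreed on the value. The bookkeeping such a listener would need is tiny; a sketch of just the counting side (the StepListener extension point itself is the hypothetical part here):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch of the counting a hypothetical StepListener would do: one
// counter per built-in step, keyed by the step's function name (what
// StepDescriptor.getFunctionName() would return in Jenkins).
public class StepInvocationCounter {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    // Called once per step invocation by the (hypothetical) listener.
    public void onStepStart(String functionName) {
        counts.computeIfAbsent(functionName, k -> new LongAdder()).increment();
    }

    public long count(String functionName) {
        LongAdder adder = counts.get(functionName);
        return adder == null ? 0 : adder.sum();
    }
}
```

Exposing the map as Metrics counters would then make it land in any configured reporter for free.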
>
> > configured Declarative Pipelines, configured Script Pipelines
>
> We can get all Pipelines (flavor-agnostic) by iterating over WorkflowJob
> items. Not sure how we'd tell Scripted vs. Declarative -- maybe
> registering a Listener extension point of some sort? I see value here.
>
> I'd *also* like to have a breakdown of which Pipelines have been run in the
> last, say, week and month, by type (easy to do by looking at the most recent
> build). That way we know not just which were created but which are in
> active use.
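Until a proper Listener exists, one crude way to tell the flavors apart is the script text itself: Declarative Pipelines are wrapped in a top-level pipeline { } block. A throwaway heuristic sketch (a real check should ask the Declarative model, not pattern-match text):

```java
// Crude flavor check: Declarative Pipelines are wrapped in a top-level
// "pipeline { ... }" block; everything else is Scripted. A real
// implementation would consult the Declarative model rather than
// pattern-match the script text -- this is only a sketch.
public class PipelineFlavor {
    static boolean looksDeclarative(String script) {
        // Crudely strip line comments, then look for the top-level
        // "pipeline {" opener.
        String trimmed = script.replaceAll("(?m)^\\s*//.*$", "").trim();
        return trimmed.startsWith("pipeline")
                && trimmed.replaceFirst("^pipeline\\s*", "").startsWith("{");
    }
}
```

Iterating WorkflowJob items and bucketing by this predicate would give the configured-count split; checking only jobs whose last build is recent gives the active-use split.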
>
> > Pipeline executions
>
> Rates and counts can be achieved with the existing Metrics Timer type. I'd
> like to see that broken down by Scripted vs. Declarative as well.
>
> > * Global Shared Pipelines configured
> > * Folder-level Shared Pipelines configured
>
> Do you mean Shared Library use? One metric I'd be interested in is how
> many shared libraries are used *per-pipeline* -- easy to measure from the
> count of LoadedScripts I believe (correct me if there's something I'm
> missing here, Jesse).
>
> > Agents used per-Pipeline
>
> I think it should be possible to do this easily via flow graph analysis,
> looking for WorkspaceActionImpl -- nodes and labels are available. We
> might want to count total node *uses* (open/close of node blocks) and
> distinct nodes used.
>
> Best to trigger this as a post-build analysis using a RunListener -- that
> way it's just a quick iteration over the completed Pipeline.
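The distinction between the two counts is just set arithmetic once the graph walk has collected the labels; sketch below (the flow-graph scan for WorkspaceActionImpl is the Jenkins-specific part and is omitted):

```java
import java.util.HashSet;
import java.util.List;

// Given the node labels collected from a flow-graph walk (one entry per
// opened node block), compute the two counts discussed above: total node
// *uses* and *distinct* nodes used. The walk itself (scanning FlowNodes
// for workspace actions) is the Jenkins-specific part, omitted here.
public class AgentUsage {
    static int totalUses(List<String> nodeLabels) {
        return nodeLabels.size();
    }

    static int distinctNodes(List<String> nodeLabels) {
        return new HashSet<>(nodeLabels).size();
    }
}
```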
>
> > Runtime duration per step invocation
>
> This is one of the MOST useful metrics I think.
>
> I already have an implementation used in the Scalability Lab that does this
> on a per-flownode basis using the GraphListener (rather than per-step).
> This is part of a small utility plugin for metrics used in the scalability
> lab (not hosted currently since it's not general-use).
>
> Doing this per-step is somewhat more complex -- trivial for many steps, but
> for a Retry step, for example, there's no logical way to do it because you
> get multiple blocks. Timing for blocks in general is undefined -- do you
> count the block *contents*, just the start, just the end, or start+end
> nodes? Also remember that Groovy logic counts against the Step time with
> the FlowNodes. Usually that shouldn't be a huge issue unless the Groovy is
> complex.
>
> If that's too noisy there might be ways to insert Listeners for the Step
> itself (more complex though) -- I think using the FlowNodes is good enough
> for now and gives us a solid first-order approximation that is useful 99%
> of the time.
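For concreteness, the per-flownode approximation reduces to timestamp gaps: each node records its start time (in Jenkins, via a TimingAction), and a node's elapsed time is roughly the gap to the next node's start. Setting the GraphListener wiring aside, the arithmetic is just:

```java
import java.util.ArrayList;
import java.util.List;

// Per-flownode timing sketch: given start timestamps of consecutive
// nodes (roughly what TimingAction provides in Jenkins), a node's
// approximate duration is the gap to the next node's start. The
// GraphListener that would feed this is the Jenkins-specific part.
public class NodeTiming {
    static List<Long> durations(List<Long> startTimesMillis) {
        List<Long> out = new ArrayList<>();
        for (int i = 0; i + 1 < startTimesMillis.size(); i++) {
            out.add(startTimesMillis.get(i + 1) - startTimesMillis.get(i));
        }
        return out;
    }
}
```

That gap includes any Groovy executed between the two nodes, which is exactly the caveat above about Groovy counting against Step time.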
>
> I would also like to extend this by breaking it down into separate metrics
> per step type, i.e. runtime for sh, runtime for echo, for 'node', etc.
> This is easier than you'd think since you can fetch the StepDescriptor and
> call getFunctionName to get a unique metric key for the step. This is far
> more useful to us than just average step timings, because it helps spot
> performance regressions in the field.
>
> Other aggregates of interest: total time spent in each step type for the
> pipeline and counts of the FlowNode by step per pipeline. This will show
> if we're spending (for example) a LOT of time running
> readFile/writeFile/dir steps due to some sudden bottleneck in the remoting
> interaction and also reveal which step types are used most often. Knowing
> which steps are used heavily helps me know which deserve extra priority for
> bugfixes, features, and optimizations.
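The aggregation itself really is an afternoon's work once you have (functionName, duration) samples; a standalone sketch of the per-step-type rollup (in Jenkins the key would come from StepDescriptor.getFunctionName()):

```java
import java.util.HashMap;
import java.util.Map;

// Per-step-type aggregates: total time spent and invocation count,
// keyed by the step's function name (e.g. "sh", "echo", "node"). In
// Jenkins the key would come from StepDescriptor.getFunctionName();
// here samples are fed in directly, so this is only a sketch.
public class StepAggregates {
    final Map<String, Long> totalMillis = new HashMap<>();
    final Map<String, Long> invocations = new HashMap<>();

    // Record one completed step invocation.
    void record(String functionName, long millis) {
        totalMillis.merge(functionName, millis, Long::sum);
        invocations.merge(functionName, 1L, Long::sum);
    }
}
```

Backed by per-key Metrics Timers instead of plain maps, the same shape would also give you rates and percentiles for free.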
>
> It actually *sounds* far more complicated than it really is -- this would
> be a pretty trivial afternoon project I think.
>
> > Runtime duration per Pipeline
>
> I already have an implementation. Same plugin as above. It's exposed as a
> DropWizard histogram as well, so you get rates + aggregate times with
> median, mean, etc.
>
> *Other desired metrics: *I think we want FlowNodes created as a rate per
> unit time (I already have an implementation in the same plugin above).
>
> If we could find a way, I'd really like to have a counter of how many
> elements of GroovyCPS logic are run and how many function calls are made
> (for off-master execution you obviously wouldn't get this data). This is
> useful for measuring the real complexity of a user's Groovy -- even better
> than Liam's Cyclomatic Complexity metric because it directly tracks runtime
> operations, not just code structure. I have notions of how we'd accomplish
> this.
>
> On Friday, March 16, 2018 at 6:55:58 PM UTC-4, Andrew Bayer wrote:
> >
> > It's a normal step - what I'm talking about is counting Pipelines
> To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/9842785f-49ee-43fc-ab61-d9e7b45dc3db%40googlegroups.com.