(replies inline)
On Sat, 17 Feb 2018, Baptiste Mathus wrote:
> On 16 Feb 2018 at 15:51, "R. Tyler Croy" <[1] ty...@monkeypox.org> wrote:
>
>
> One of the necessary details, in my opinion, to make Jenkins Essentials [0]
> successful is providing near-real-time error telemetry. Coupled with the
> "Evergreen" distribution system [1], error telemetry "post-deploy" will be
> absolutely crucial to determine whether or not we have just pushed out bad
> code worthy of reverting.
>
> I currently define "error telemetry" to include:
>
>  * Uncaught exceptions which cause the Evil Jenkins 500 page
>  * Logged ERROR messages, with or without exceptions
>  * Logged WARN messages, with or without exceptions
>
>
>
> Totally agreed, automated reporting is a must.
>
> Shouldn't the evergreen client send feedback too? Like if it triggered a
> Jenkins restart and never heard back since?
Your questions are definitely on the right track but I have been mentally
segmenting Jenkins _error_ telemetry from "generalized telemetry." For example,
my thinking recently evolved to change an "update" service to a "status"
service to more thoroughly accommodate the "status" from evergreen-client (for
example, is Jenkins online, what version, how long has it been online,
etc).
> How about also a less automated /form/ in the Jenkins UI itself, to be used by
> humans in case something is clearly wrong but didn't cause logs or outages?
> On that note, probably also a clear web UI somewhere in case everything went wrong.
I like the idea in theory, but in practice I believe we would get a tremendous
amount of low-signal "bug reports" through any such functionality and I don't
have the capacity to triage and handle that kind of feedback from users, thus
the automated routes :)
> General thought/note: this will probably require some setup to prevent attackers
> from triggering an auto-revert by sending bad reports to the telemetry endpoint.
Well certainly, "don't 100% trust client data" should be a foundational
principle for most applications :)
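One way that principle could look in practice, purely as an illustration: only treat a release as a revert candidate once error reports have arrived from several distinct instances, so that one client flooding the endpoint cannot trip a revert by itself. The class name `RevertGate`, the instance identifiers, and the threshold are all hypothetical; nothing like this exists in Jenkins Essentials today as far as this thread says, and it assumes the instance identity has already been authenticated elsewhere.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical gate: counts distinct reporting instances per version. */
class RevertGate {
    private final int minDistinctInstances;
    // version -> set of instance ids which reported errors for that version
    private final Map<String, Set<String>> reportersByVersion = new HashMap<>();

    RevertGate(int minDistinctInstances) {
        this.minDistinctInstances = minDistinctInstances;
    }

    /** Records one error report; repeats from the same instance count once. */
    void record(String version, String instanceId) {
        reportersByVersion
            .computeIfAbsent(version, v -> new HashSet<>())
            .add(instanceId);
    }

    /** True once enough distinct instances have reported errors for the version. */
    boolean shouldConsiderRevert(String version) {
        Set<String> reporters =
            reportersByVersion.getOrDefault(version, Collections.emptySet());
        return reporters.size() >= minDistinctInstances;
    }
}
```

A hundred reports from one instance still count as a single reporter here; the decision to revert would remain a separate step on top of this.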
As an aside, your mail client sure likes to do non-standard quoting and inline
replies :/
>
>
>
> This list is by no means set in stone, and it is expected that there's going
> to be some "noise" in the system, so tooling upstream of this error telemetry
> won't be looking for the presence of errors but rather tracking patterns over
> time [2].
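To make "tracking patterns over time" concrete, here is a minimal sketch that compares the latest error count against a baseline built from earlier windows, instead of alerting on any error at all. The class name, the window counts, and the `k` multiplier are assumptions for illustration; a real system would likely use something more robust than mean plus standard deviation.

```java
import java.util.List;

/** Hypothetical baseline check over per-window error counts. */
class ErrorRateBaseline {
    /**
     * Returns true when the latest count exceeds mean + k * stddev of the
     * historical counts. With fewer than two data points there is no
     * baseline to compare against, so nothing is flagged.
     */
    static boolean isAnomalous(List<Integer> history, int latest, double k) {
        if (history.size() < 2) {
            return false;
        }
        double mean = history.stream()
            .mapToInt(Integer::intValue)
            .average()
            .orElse(0.0);
        // Sample variance (divide by n - 1)
        double variance = history.stream()
            .mapToDouble(c -> (c - mean) * (c - mean))
            .sum() / (history.size() - 1);
        double threshold = mean + k * Math.sqrt(variance);
        return latest > threshold;
    }
}
```

With a history of 10, 12, 11, 9 and 13 errors per window and `k = 3.0`, a new window with 14 errors stays under the threshold while one with 40 is flagged.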
>
>
> The big challenge that we have, for which I wanted feedback, is *how* we can
> acquire this error telemetry.
>
>
> My first prototype in this area was a plugin which integrates with the
> Sentry[3] error reporting service: [2] https://github.com/jenkinsci/sentry-plugin
> This approach basically spins up a background busy-waiting thread which loops
> over all the loggers in the JVM, and adds the SentryHandler to loggers. Not the
> prettiest solution but it mostly works. There is an opportunity to miss
> logged errors before the SentryHandler is added, but it's hard to quantify how
> serious a gap that might be.
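For readers unfamiliar with the technique described above, a minimal sketch using only `java.util.logging` follows. It is not the sentry-plugin's code: `CapturingHandler` is a hypothetical stand-in for the SentryHandler that keeps records in memory, and `HandlerAttacher` is an assumed name for the loop over registered loggers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical stand-in for SentryHandler: keeps WARNING and above in memory. */
class CapturingHandler extends Handler {
    final List<LogRecord> captured =
        Collections.synchronizedList(new ArrayList<>());

    CapturingHandler() {
        setLevel(Level.WARNING);
    }

    @Override
    public void publish(LogRecord record) {
        // isLoggable applies the handler level set in the constructor
        if (isLoggable(record)) {
            captured.add(record);
        }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
}

class HandlerAttacher {
    /**
     * Adds the handler to every logger currently registered with the
     * LogManager that does not already have it. Returns how many loggers
     * were newly given the handler. Note: a record may reach the handler
     * more than once when both a logger and its parent carry it.
     */
    static int attachToAll(Handler handler) {
        LogManager manager = LogManager.getLogManager();
        int attached = 0;
        for (String name : Collections.list(manager.getLoggerNames())) {
            Logger logger = manager.getLogger(name);
            if (logger == null) {
                continue; // logger was garbage collected in the meantime
            }
            boolean present = false;
            for (Handler existing : logger.getHandlers()) {
                if (existing == handler) {
                    present = true;
                }
            }
            if (!present) {
                logger.addHandler(handler);
                attached++;
            }
        }
        return attached;
    }
}
```

Loggers created after a pass are only picked up by the next pass, which is the gap mentioned above; calling `attachToAll` on a schedule narrows that window but cannot close it.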
>
> I am not /thrilled/ with this approach, but it meets a very important
> criterion in that it's non-invasive to core and other plugins and can simply be
> installed in a Jenkins instance in order to work.
>
>
> I wanted to ask for more thoughts on alternative approaches, if they exist,
> which would enable the collection of the error telemetry discussed above. I'm
> sure there's something I'm missing.
>
>
>
>
> [0] [3] https://github.com/jenkinsci/jep/tree/master/jep/300
> [1] [4] https://github.com/jenkinsci/jep/tree/master/jep/300#auto-update
> [2] For example: [5] https://itmonitor.zenoss.com/is-your-performance-normal-how-do-you-know/
> [3] [6] https://sentry.io
>
>
> Cheers
> - R. Tyler Croy
>
> ------------------------------------------------------
>    Code: <[7] https://github.com/rtyler>
> Chatter: <[8] https://twitter.com/agentdero>
>    xmpp: [9] rty...@jabber.org
>
>  % gpg --keyserver [10] keys.gnupg.net --recv-key 1426C7DC3F51E16F
> ------------------------------------------------------
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [11]
jenkinsci-de...@googlegroups.com.
> To view this discussion on the web visit [12]
https://groups.google.com/d/
> msgid/jenkinsci-dev/20180216145116.yizslgftmjgnhwmn%40blackberry.
>
coupleofllamas.com.
> For more options, visit [13]
https://groups.google.com/d/optout.
>
>
>
> References:
>
> [1] mailto:ty...@monkeypox.org
> [2] https://github.com/jenkinsci/sentry-plugin
> [3] https://github.com/jenkinsci/jep/tree/master/jep/300
> [4] https://github.com/jenkinsci/jep/tree/master/jep/300#auto-update
> [5] https://itmonitor.zenoss.com/is-your-performance-normal-how-do-you-know/
> [6] https://sentry.io/
> [7] https://github.com/rtyler
> [8] https://twitter.com/agentdero
> [9] mailto:rty...@jabber.org
> [10] http://keys.gnupg.net/
> [11] mailto:jenkinsci-dev%2Bunsu...@googlegroups.com
> [12] https://groups.google.com/d/msgid/jenkinsci-dev/20180216145116.yizslgftmjgnhwmn%40blackberry.coupleofllamas.com
> [13] https://groups.google.com/d/optout
> [14] mailto:jenkinsci-de...@googlegroups.com
> [15] https://groups.google.com/d/msgid/jenkinsci-dev/CANWgJS6xJnYhcuxTzPwtP%3DSrgymJmc6gKOAsb-ThMbK4YrGcLg%40mail.gmail.com?utm_medium=email&utm_source=footer
> [16] https://groups.google.com/d/optout