I totally understand your frustration. The current situation is that ci.jenkins.io is very unstable due to numerous factors. Most CI runs fail, and on Friday we had an ACI outage which has not been resolved yet. On my side, I had to postpone merging Jenkins core PRs into the weekly release because of CI, and it is not the first time. I would say that ci.jenkins.io is basically unusable at the moment.
Why? As usual, we need contributors there. Olivier and other infra team members are doing a great job keeping things afloat, but right now it is non-stop firefighting: JIRA, Confluence, ci.jenkins.io, and many other stories. We need more contributors in the infra team to share the maintenance load and to finally improve the setup. Ideally, we should make ci.jenkins.io a reference setup that other Jenkins users can reuse.
Starting from November 15th, I personally commit to focusing on infrastructure and, specifically, on ci.jenkins.io as a top priority. I will have less time for other areas like Jenkins Core, but I consider the current state a grave danger to the project.
I invite everyone else to consider contributing to the infrastructure: https://jenkins.io/projects/infrastructure/#contributing
Best regards,
Oleg
--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/e310142d-4770-4c0d-a18d-89de3673ad96%40googlegroups.com.
> Do we have the support-core plugin installed on this instance ? If no, can
> we add it ?
We had it installed for years and nobody cared. After it showed up recently in a thread dump as the likely cause of excessive load while generating background bundles, I disabled it.
I filed https://issues.jenkins-ci.org/browse/INFRA-2308 after Arnaud rightly recommended having something public and central for us to fix this.
Pretty please, anyone who's been suffering from the recent ci.jenkins.io instabilities, do not hesitate to provide a status check here.
We're especially interested in common stack traces or errors you're seeing.
We have started analyzing support bundles. The first step is https://github.com/jenkins-infra/jenkins-infra/pull/1375
More to come.
Thanks!
On Wed, Oct 30, 2019 at 14:56, Arnaud Héritier <aher...@gmail.com> wrote:
I still don't have admin access to the instance, but I have system-level access.
I was able to grab a few bundles, and we will see if they help us start diagnosing the issue.
On Wed, Oct 30, 2019 at 12:47 AM Daniel Beck <m...@beckweb.net> wrote:
> On 27. Oct 2019, at 12:58, Daniel Beck <m...@beckweb.net> wrote:
>
>
>
>> On 27. Oct 2019, at 11:29, Arnaud Héritier <aher...@gmail.com> wrote:
>>
>> Do we have the support-core plugin installed on this instance ? If no, can
>> we add it ?
>
> We had it installed for years and nobody cared. After it showed up recently in a thread dump as the likely cause of excessive load while generating background bundles, I disabled it.
And since support-core is disabled, the next restart will disable the Metrics plugin too.
We're affected by JENKINS-59793, with currently (AFAICT ~30 minutes after restart) more than 700 of these threads.
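To see which kinds of threads are piling up in a thread dump, a small script can group the thread header lines by name. This is only a sketch: it assumes a HotSpot-style dump (for example from `jstack`) where each thread entry starts with the thread name in double quotes, and collapsing digits to `N` is an illustrative simplification, not something taken from JENKINS-59793.

```python
import re
from collections import Counter

# Assumption: HotSpot-style dumps start each thread entry with the quoted thread name.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')


def count_threads_by_group(dump_text):
    """Count threads per name, with every run of digits collapsed to 'N'."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            # "pool-3-thread-12" and "pool-3-thread-13" land in the same group.
            group = re.sub(r"\d+", "N", match.group("name"))
            counts[group] += 1
    return counts
```

Calling `count_threads_by_group(text).most_common(5)` on the contents of a dump lists the five largest groups, which is usually enough to spot a leak of several hundred similarly named threads.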
---------
Arnaud Héritier
Mail/GTalk: aheritier AT gmail DOT com
Twitter/Skype: aheritier
One thing to add here is that the Metrics plugin was disabled by Daniel before the restart.
Right now the instance is indeed stable, but the load has been pretty low. Now we have a regular massive Jenkins core rebuild, so it should put some pressure on CI again.
If it goes down overnight, we can just restart it tomorrow.
Best regards,
Oleg
On Thu, Oct 31, 2019 at 11:07 PM Baptiste Mathus <m...@batmat.net> wrote:
I guess a comment in the JIRA Epic would be good, especially if you have still seen failures in the last ~2 days.
Earlier today we applied the change to switch back from ZGC to G1 and bumped Xmx to a bigger value. But AFAIK the situation in the two days since the last restart was quite OKish.
Hence we're more or less waiting for the situation to degrade again, if it does, so we can get a new support bundle to analyze, fix things or adjust the config, then rinse and repeat.
Given things have looked OK since the restart, we suspect something like a thread or memory leak that takes some time to surface.
BTW, we had a bit of an impromptu meeting earlier today on this.
We do plan to schedule recurring public sync-ups (probably during the weekly infra meeting), so anyone willing to contribute can do so (contrary to today, facepalm).
Stay tuned, and feel fully free to ask any question or provide any insight on what misbehaviors you're still seeing.
Thanks
About support core: based on what I see in the current bundle, the instance is running a very recent version of the plugin, which should include most of the improvements made.
Another thing I can see, though, is that the support bundles contain all GC logs since March 2019. That's 244 files of 20 MB each (the file size limit), which means we have about 4.75 GB of GC logs in the support bundle. This could account for the load when generating the bundle (which happens every hour by default but could be made less frequent or disabled: https://github.com/jenkinsci/support-core-plugin/blob/support-core-2.61/src/main/java/com/cloudbees/jenkins/support/SupportPlugin.java#L130-L135).
* Maybe we should set up log rotation on the server?
The GC logs end up in the bundle because they are written directly under $JENKINS_HOME/, a location that one of the support-core components scans: https://github.com/jenkinsci/support-core-plugin/blob/support-core-2.61/src/main/java/com/cloudbees/jenkins/support/impl/JenkinsLogs.java#L101-L130.
* So a workaround (I would say a better practice, actually) is to change the location of the GC logs to something like `$JENKINS_HOME/gc/`, or even to put them outside $JENKINS_HOME.
Maybe the plugin could be improved to not include files older than a certain number of days, or to have some limits (https://issues.jenkins-ci.org/browse/JENKINS-59030).
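To check how much GC log data sits directly under a directory such as $JENKINS_HOME, and how much of it is old, something like the following could be used. It is a minimal sketch, not part of support-core: the `gc*.log*` file name pattern and the 30-day threshold are assumptions and need to be adjusted to whatever the JVM's GC logging options actually produce on the server.

```python
import time
from pathlib import Path


def summarize_gc_logs(directory, pattern="gc*.log*", max_age_days=30, now=None):
    """Return (file_count, total_bytes, stale_names) for GC logs directly under directory.

    The pattern and the age threshold are illustrative assumptions.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    count = 0
    total = 0
    stale = []
    # A plain (non-"**") glob pattern only matches entries at the top level.
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        info = path.stat()
        count += 1
        total += info.st_size
        if info.st_mtime < cutoff:
            stale.append(path.name)
    return count, total, stale
```

Running `summarize_gc_logs(os.environ["JENKINS_HOME"])` (after `import os`) on the server would show how many files match, their combined size in bytes, and which of them have not been modified within the threshold. Files moved into a subdirectory are not counted, since the glob is not recursive.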