Loggregator logging issues in v198 / v199


Erik Jasiak

Feb 13, 2015, 2:04:41 PM
to vcap...@cloudfoundry.org
Hi all,

   The doppler component of Loggregator [1] is experiencing occasional crashes in v198/v199 - on the order of one to three doppler crashes per day for large cf message loads.

   Because the doppler instances are distributed, users will not lose all cf messages during a crash; however, messages sent to a restarting instance will be lost.  Doppler instance restarts typically take less than one minute, and are usually on the order of seconds.

   The Loggregator team already has a fix in place for the issue that will go out with v200; only the frequency of the crash has changed.

   Thank you for your patience and understanding as we resolve these issues.  If you have any additional questions, please feel free to follow up to this email.

Erik Jasiak and John Tuley,
CF Loggregator PM and Anchor

[1] "Architecture" picture in Readme of https://github.com/cloudfoundry/loggregator

simon.j...@springer.com

Feb 13, 2015, 2:28:16 PM
to vcap...@cloudfoundry.org
Thanks! Was just about to start an upgrade process, guess I'll wait for 200. :)

Phil Whelan

Feb 13, 2015, 5:13:16 PM
to vcap...@cloudfoundry.org
Hi Erik,

Thanks for the update! Could you clarify this...

The Loggregator team already has a fix in place for the issue that will go out with v200; only the frequency of the crash has changed.

Does that mean the fix is only partial and in v200 the issue still occurs, but less frequently? Do you have a link to the commit for the fix?

Thanks,
Phil

Erik Jasiak

Feb 13, 2015, 6:01:17 PM
to vcap...@cloudfoundry.org
Hi Phil,

  The original bug, as we're seeing it today, is discussed in Tracker here [1]; it relates to syslog disconnects.

  At the time that we first saw the issue, it would happen infrequently (I believe at most once a month, but John could clarify.)  With the changes introduced in v198 and v199, the bug now appears to be happening more frequently under load.

   I'll let John add more detail as needed.  We're also thoroughly vetting the conditions that trigger the event, in addition to making sure we handle them safely.

Thanks,
Erik