just noticed the following message on our E220R (2x450MHz CPUs):
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 536986 kern.info] NOTICE: [AFT2] errID 0x00107585.b2508988 CBI event on CPU2
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 759624 kern.info] [AFT2] errID 0x00107585.b2508988 PA=0x00000000.003cbd40
Jul 30 01:24:06 titan E$tag 0x00000000.0c400007 E$State: Shared E$parity 0x06
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000 *Bad* PSYND=0x0004
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000 *Bad* PSYND=0x0004
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000030.3cfc200b *Bad* PSYND=0x0004
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.000000c8
Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x38): 0x00000067.00000000 *Bad* PSYND=0x0004
The machine continued working and prtdiag -v still reports "No failures
found in System". What does the above message mean and what shall we do
about it?
Thanks and bye, Dragan
--
Dragan Cvetkovic,
To be or not to be is true. G. Boole No it isn't. L. E. J. Brouwer
!!! Sender/From address is bogus. Use reply-to one !!!
> Hi,
>
> just noticed the following message on our E220R (2x450MHz CPUs):
>
> Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 536986 kern.info] NOTICE: [AFT2] errID 0x00107585.b2508988 CBI event on CPU2
[snip]
> The machine continued working and prtdiag -v still reports "No failures
> found in System". What does the above message mean and what shall we do
> about it?
>
OK, some googling helped a bit.
http://sunportal.sunmanagers.org/pipermail/summaries/2002-November/002767.html
has the following to offer:
A CBI event is an Ecache error on a cache line that can occur without the
system panicking. CBI stands for Clean Bad Idle. Clean means that the cache
line is clean, i.e. has not been modified. If it had been modified, it would
be a dirty line, whose changes would have had to be flushed out to
memory. Idle indicates that this cache line was not in use by the CPU at
the time. Bad means that an error was detected on it.
This is a corrected ("scrubbed") Ecache event. It should be handled just
like any Ecache event, that is, swap only on the second event.
It appears that Ecache error reporting has changed (again). Solaris 8
kernel patch 108528-13 introduces the changes detailed in bug 4385694. E$
errors seem to be reported as "xBy events" where x is C for "clean" or D
for "dirty", and y is I for "idle" or B for "busy" (so DBI event, CBB event
and so on), reflecting the state of the cache line when the error was
detected. So basically, a CBI event is telling us that the scrubbing
algorithm has found a bad line of Ecache data and scrubbed it.
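
To make the naming scheme concrete, here is a minimal sketch of a decoder for these three-letter event names. It assumes only what the summary above says (first letter C or D, middle letter B for "bad", last letter I or B); the function name decode_event is made up for illustration and is not part of any Solaris tool.

```python
# Hypothetical helper: decode an "xBy" E$ event name as described in the
# quoted summary. The letter meanings are taken from that summary only.
LINE_STATE = {"C": "clean", "D": "dirty"}
LINE_USAGE = {"I": "idle", "B": "busy"}


def decode_event(name):
    """Return (line_state, line_usage) for a name such as 'CBI'."""
    # The middle letter is always B ("bad") in the scheme described above.
    if len(name) != 3 or name[1] != "B":
        raise ValueError(f"not an xBy event name: {name!r}")
    try:
        return LINE_STATE[name[0]], LINE_USAGE[name[2]]
    except KeyError:
        raise ValueError(f"not an xBy event name: {name!r}") from None


print(decode_event("CBI"))  # ('clean', 'idle')
```

So the CBI event in our log decodes to a clean, idle line, which matches the description of a line the scrubber found and fixed rather than one the CPU was actively using.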
Since we are running kernel patch 117000-01 on the system, it seems that
the scrubber is doing its job.
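
To keep an eye on the "swap on the second event only" advice, something like the following rough sketch could count these events per CPU in the messages file. The regular expression is an assumption based solely on the single NOTICE line shown in my log above; other Solaris releases or patch levels may format the line differently. The names count_events and cpus_at_threshold are made up for illustration.

```python
import re
from collections import Counter

# Assumed format, derived from the one NOTICE line in the log above:
#   ... [AFT2] errID 0x00107585.b2508988 CBI event on CPU2
EVENT_RE = re.compile(r"\[AFT\d\] errID (\S+) ([CD]B[IB]) event on CPU(\d+)")


def count_events(lines):
    """Count E$ event NOTICE lines per CPU number."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts


def cpus_at_threshold(lines, threshold=2):
    """Return the CPUs that have logged at least `threshold` events."""
    return sorted(cpu for cpu, n in count_events(lines).items() if n >= threshold)


sample = [
    "Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 536986 kern.info] "
    "NOTICE: [AFT2] errID 0x00107585.b2508988 CBI event on CPU2",
    "Jul 30 01:24:06 titan SUNW,UltraSPARC-II: [ID 759624 kern.info] "
    "[AFT2] errID 0x00107585.b2508988 PA=0x00000000.003cbd40",
]
print(count_events(sample))       # Counter({2: 1})
print(cpus_at_threshold(sample))  # []
```

In practice the lines would come from the system log (for example by iterating over an open /var/adm/messages), and a CPU showing up in the second list would be the candidate for a swap.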
Bye, Dragan