
Mixed -current MP results

Hauke Fath

May 6, 2011, 4:19:12 AM
All,

so I've taken the plunge, and upgraded my netbsd-4 SPARCstation 20 (2x
SM71) to -current two weeks ago. Mixed results, quite mixed.

On the up side, I don't see any "random" userland crashes. And when the
machine crashes, it doesn't lock up, as netbsd-4 used to, but reboots. And
it reboots quickly, thanks to "-o log", instead of spending fscking 20
minutes on the 70 GB disk. Most of the installed netbsd-4 pkgsrc userland
is fine, with the notable exception of sendmail dumping core, squid working
on 5_99_49, but silently failing on 5_99_51, and XEmacs being dodgy.

Kudos to those who pulled NetBSD/sparc kicking and screaming back to a
usable state.

On the down side... if the machine doesn't keel over running the daily /
security script, it will certainly die during the following Amanda backup
run. So far, I got one (1) successful Amanda run out of the last two weeks.
The (fairly reproducible) panic is

<snip>
Mutex error: mutex_vector_enter: locking against myself

lock address : 0x00000000f4ce8170 type : sleep/adaptive
initialized : 0x00000000f02258f4
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 2
current cpu : 0 last held: 0
current lwp : 0x00000000f358d580 last held: 000000000000000000
last locked : 0x00000000f0214bec unlocked*: 0x00000000f0214c48
owner field : 0x00000000f358d580 wait/spin: 1/0

Turnstile chain at 0xf02e572c.
=> Turnstile at 0xf358e9d8 (wrq=0xf358e9e8, rdq=0xf358e9f0).
=> 0 waiting readers:
=> 1 waiting writers: 0xf359a8e0

panic: LOCKDEBUG
cpu0: Begin traceback...
0x0(0xf4ccc240, 0x0, 0xf02782d0, 0xf0296050, 0x1, 0xf02e3800) at
netbsd:mutex_enter+0x364
mutex_enter(0xf4ce8170, 0xf358d580, 0xf4ce8170, 0xf02e45d0, 0xf02db218, 0x1) at
netbsd:biodone2+0x8
biodone2(0xf12cbc88, 0x0, 0x0, 0x0, 0x51f06dbf, 0xa0000020) at
netbsd:biointr+0x44
biointr(0x0, 0x0, 0xf00e99c4, 0x0, 0xf358cd40, 0xf00027e0) at
netbsd:softint_thread+0x74
softint_thread(0xf3630008, 0xf358d580, 0xf02d13b0, 0xf02cb7c0, 0xf3582974,
0xf02e0d2e) at netbsd:lwp_setfunc_trampoline
cpu0: End traceback...
Frame pointer is at 0xf3635c00
Call traceback:
pc = 0xf00fe2b0 args = (0xf02dd224, 0x0, 0xf02dd224, 0xf02d0400, 0x75,
0xffffffff, 0xf3635c68) fp = 0xf3635c68
pc = 0xf01ad1e0 args = (0x104, 0x0, 0xefffffff, 0xf3635f20, 0xf01ac4dc,
0x1,0xf3635cd8) fp = 0xf3635cd8
pc = 0xf01a55b4 args = (0xf02a63e0, 0xf01ac594, 0xf02782d0, 0xf02c8800,
0xf02cdc00, 0x104, 0xf3635d48) fp = 0xf3635d48
pc = 0xf00d96c0 args = (0xf4ccc240, 0x0, 0xf02782d0, 0xf0296050, 0x1,
0xf02e3800, 0xf3635db0) fp = 0xf3635db0
pc = 0xf0214bec args = (0xf4ce8170, 0xf358d580, 0xf4ce8170, 0xf02e45d0,
0xf02db218, 0x1, 0xf3635e18) fp = 0xf3635e18
pc = 0xf0214cf8 args = (0xf12cbc88, 0x0, 0x0, 0x0, 0x51f06dbf,
0xa0000020, 0xf3635e80) fp = 0xf3635e80
pc = 0xf00e9898 args = (0x0, 0x0, 0xf00e99c4, 0x0, 0xf358cd40,
0xf00027e0, 0xf3635ee8) fp = 0xf3635ee8
pc = 0xf0007d28 args = (0xf3630008, 0xf358d580, 0xf02d13b0, 0xf02cb7c0,
0xf3582974, 0xf02e0d2e, 0xf3635f50) fp = 0xf3635f50
pc = 0x0 args = (0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) fp = 0x0

dump to dev 7,1 not possible
sd0: cache synchronization failed
rebooting
</snip>
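
For readers unfamiliar with the message: "locking against myself" is reported when a thread tries to take a non-recursive mutex it already owns. In the dump above, the "owner field" and "current lwp" values are both 0x00000000f358d580. The following is only a user-space analogy in Python, not NetBSD kernel code; `OwnerCheckedLock` and `relock_is_detected` are made-up names for illustration.

```python
import threading


class OwnerCheckedLock:
    """Non-recursive lock that remembers its owner (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread currently holding the lock

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            # Without this check the thread would block forever on itself.
            raise RuntimeError("locking against myself")
        self._lock.acquire()
        self._owner = me

    def release(self):
        self._owner = None
        self._lock.release()


def relock_is_detected():
    """Take the lock twice from one thread and report whether it was caught."""
    lock = OwnerCheckedLock()
    lock.acquire()
    try:
        lock.acquire()  # second acquire by the same thread
    except RuntimeError:
        return True
    finally:
        lock.release()
    return False
```

The sketch only shows the kind of ownership check involved; it says nothing about why biodone2 ends up in that state on this machine.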

After reboot, I get a mildly disquieting

<snip>
WARNING: negative runtime; monotonic clock has gone backwards
</snip>

Every now and then, I see dig(1) and nsupdate(8) busy-looping at 100% cpu,
and have to "kill -9" them. named(8) generally seems to be dodgy, and
should probably be built without threads on sparc.

Last, but not least: -current is slow. The machine runs a custom !DEBUG,
!DIAGNOSTIC, LOCKDEBUG kernel. The /etc/daily cron job used to take about
an hour; now it takes about four hours. Typical Amanda backup times:

<snip>
amanda run on -current

HOSTNAME  DISK   L   ORIG-KB   OUT-KB  COMP%  MMM:SS   KB/s  MMM:SS  KB/s
------------------  ---------------------------------------  ------------
pizza     ccd0b  1      5940     5940     --    0:45  132.3     N/A   N/A
pizza     ccd0d  1     10977      955    8.7    1:57    8.2     N/A   N/A
pizza     ccd0e  0  12103632  2679703   22.1  403:21  110.7     N/A   N/A


amanda run on netbsd-4

HOSTNAME  DISK   L   ORIG-KB   OUT-KB  COMP%  MMM:SS   KB/s  MMM:SS  KB/s
------------------  ---------------------------------------  ------------
pizza     ccd0b  1      5940     5940     --    0:08  719.5     N/A   N/A
pizza     ccd0d  1     10967      953    8.7    0:35   26.9     N/A   N/A
pizza     ccd0e  0  12105495  2681853   22.2  290:03  154.1     N/A   N/A
</snip>
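
To put a number on the difference, the dump times from the two reports can be compared directly. This is a rough sketch: the data sizes differ slightly between the runs (e.g. 10977 vs. 10967 KB for ccd0d), so the ratios are approximate, and `to_seconds`, `RUNS` and `slowdowns` are names made up here.

```python
def to_seconds(mmm_ss):
    """Convert an Amanda MMM:SS duration string to seconds."""
    minutes, seconds = mmm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


# (disk, dump time on -current, dump time on netbsd-4), copied from the reports
RUNS = [
    ("ccd0b", "0:45", "0:08"),
    ("ccd0d", "1:57", "0:35"),
    ("ccd0e", "403:21", "290:03"),
]


def slowdowns(runs):
    """Return, per disk, how many times longer the dump took on -current."""
    return {disk: to_seconds(cur) / to_seconds(old) for disk, cur, old in runs}


if __name__ == "__main__":
    for disk, factor in slowdowns(RUNS).items():
        print(f"{disk}: {factor:.2f}x longer on -current")
```

By these figures the two small level-1 dumps took roughly 5.6 and 3.3 times as long on -current, and the large level-0 dump about 1.4 times as long.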

Generally, I see a much higher percentage of system time than I've been
used to even under moderate load, and - a bit disturbing - a much higher
percentage of interrupt time, especially during disk activity (which seems
to be slower than netbsd-4, see the Amanda numbers), with spikes up to 30%.
I don't know, though, if the latter is a property of the -current MP kernel
changes, or a quirk of MD sparc code.

Given the loss of speed, I am seriously thinking about going back to
netbsd-4. I'll sorely miss "-o log", though...

Comments?

hauke


--
"It's never straight up and down" (DEVO)

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-...@muc.de

Hauke Fath

May 9, 2011, 2:57:20 PM
Another one, FWIW:

<snip>

NetBSD/sparc (pizza.causeuse.org) (constty)

login: Mutex error: mutex_vector_enter: locking against myself

lock address : 0x00000000f4ceb170 type : sleep/adaptive
initialized : 0x00000000f02258f4
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 2
current cpu : 0 last held: 0
current lwp : 0x00000000f358d580 last held: 000000000000000000
last locked : 0x00000000f0214bec unlocked*: 0x00000000f0214c48
owner field : 0x00000000f358d580 wait/spin: 1/0

Turnstile chain at 0xf02e572c.
=> Turnstile at 0xf358e930 (wrq=0xf358e940, rdq=0xf358e948).
=> 0 waiting readers:
=> 1 waiting writers: 0xf359a0a0

panic: LOCKDEBUG
Stopped in pid 0.4 (system) at netbsd:cpu_Debugger+0x4: or %o7, %g0, %g1
db{0}> t
cpu_Debugger(0xf02a63e0, 0xf01ac594, 0xf02782d0, 0xf02c8800, 0xf02f6400,
0x104) at netbsd:lockdebug_abort1+0xa4
lockdebug_abort1(0xf4ccc900, 0x0, 0xf02782d0, 0xf0296050, 0x1, 0xf02e3800)
at netbsd:mutex_enter+0x364
mutex_enter(0xf4ceb170, 0xf00b5f18, 0xf4ceb170, 0x0, 0xf02db218, 0x1) at
netbsd:biodone2+0x8
biodone2(0xf13945e8, 0x0, 0x0, 0x0, 0xf2a9fc6e, 0xa0000020) at
netbsd:biointr+0x44
biointr(0x0, 0x0, 0xf00e99c4, 0x0, 0xf358cd40, 0xf00027e0) at
netbsd:softint_thread+0x74
softint_thread(0xf3630008, 0xf358d580, 0xf02d13b0, 0xf02cb7c0, 0xf3582974,
0xf02e0d2e) at netbsd:lwp_setfunc_trampoline

db{0}> reboot 0x04
rebooting

Resetting ...
</snip>

And some performance data: a Retrospect backup from a System 7.0.1 Mac via
ftp to the netbsd-4 ss20 usually ran at an average of 30 MB/min. Now I see
16 to 20 MB/min, and the writes are considerably slower than the reads.

matthew green

May 9, 2011, 4:59:17 PM

i haven't had a chance to look at these yet, to see what is up with biodone2.

one comment wrt the sleep you're seeing, LOCKDEBUG is *extremely* expensive
with netbsd-5 and newer. the slowdowns you're seeing are unfortunately to
be expected. in at least one case, we've observed at least a 90% drop of
perf of an x86 quadcore system.


.mrg.

matthew green

May 9, 2011, 6:14:25 PM

>
> i haven't had a chance to look at these yet, to see what is up with biodone2.
>
> one comment wrt the sleep you're seeing, LOCKDEBUG is *extremely* expensive

i mean "slowness you're seeing".

Hauke Fath

May 10, 2011, 1:59:39 PM
At 6:59 +1000 on 10.5.2011, matthew green wrote:
>one comment wrt the sleep you're seeing, LOCKDEBUG is *extremely* expensive
>with netbsd-5 and newer. the slowdowns you're seeing are unfortunately to
>be expected. in at least one case, we've observed at least a 90% drop of
>perf of an x86 quadcore system.

Ah, okay. Kind of to be expected with fine-grained locking spreading...
more locks, more debug code. My sparc isn't faster, yet, but I am less
worried. ;)

How mandatory is LOCKDEBUG these days?

hauke


