freebsd-stable Digest, Vol 350, Issue 3

freebsd-sta...@freebsd.org

Mar 31, 2010, 8:00:39 AM

to freebsd...@freebsd.org

Send freebsd-stable mailing list submissions to
freebsd...@freebsd.org

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
or, via email, send a message with subject or body 'help' to
freebsd-sta...@freebsd.org

You can reach the person managing the list at
freebsd-st...@freebsd.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of freebsd-stable digest..."

Today's Topics:

1. Re: FreeBSD 8 and quotas on root partition error (Andriy Gapon)
2. Re: ZFS Tuning - arc_summary.pl (jhell)
3. Re: Strange NFS-related messages (related to lockd/statd)
(Rick Macklem)
4. Re: ZFS Tuning - arc_summary.pl (Barry Pederson)
5. Re: Strange NFS-related messages (related to lockd/statd)
(Jeremy Chadwick)
6. Re: Re: boot and boot0cfg problem (Andrey V. Elsukov)
7. Re: Re: boot and boot0cfg problem (N.J. Mann)
8. Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
(Attila Nagy)
9. Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
(Jonathan Feally)
10. Re: boot and boot0cfg problem (Daniel Braniss)
11. Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
(Pyun YongHyeon)
12. Re: ahcich timeouts, only with ahci, not with ataahci
(Harald Schmalzbauer)
13. Re: ahcich timeouts, only with ahci, not with ataahci
(Alexander Motin)
14. Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
(Attila Nagy)

----------------------------------------------------------------------

Message: 1
Date: Tue, 30 Mar 2010 15:39:49 +0300
From: Andriy Gapon <a...@icyb.net.ua>
Subject: Re: FreeBSD 8 and quotas on root partition error
To: "M. Vale" <maur...@gmail.com>
Cc: Craig Rodrigues <rod...@freebsd.org>, freebsd...@freebsd.org
Message-ID: <4BB1F115...@icyb.net.ua>
Content-Type: text/plain; charset=UTF-8

on 30/03/2010 01:26 M. Vale said the following:
> Hi, on FreeBSD 8.0 (i386 or AMD64) if we configure to use quotas on root
> partition.
>
> It stops on boot with the following message:
>
> Trying to mount root from ufs:/dev/ad0s1a
> mount option <userquota> is unknown
> mount option <groupquota> is unknown
> ROOT MOUNT ERROR: mount option <groupquota> is unknown
> If you have invalid mount options, reboot, and first try the following from
>
> the loader prompt:
>
> set vfs.root.mountfrom.options=rw
>
> and then remove invalid mount options from /etc/fstab.
>
> Loader variables:
> vfs.root.mountfrom=ufs:/dev/ad0s1a
> vfs.root.mountfrom.options=rw,userquota,groupquota,acls
>
>
> Manual root filesystem specification:
> <fstype>:<device> Mount <device> using filesystem <fstype>
> eg. ufs:/dev/da0s1a
> eg. cd9660:/dev/acd0
>
> This is equivalent to: mount -t cd9660 /dev/acd0 /
>
> ? List valid disk boot devices
> <empty line> Abort manual input
>
> mountroot>
>
>
> If I do:
>
> ufs:/dev/ad0s1a
>
> Then the boot continues and it mounts the quotas OK, but if I reboot the same
> thing happens again.
>
> This only occurs on FreeBSD 8.
>
> Does anyone have a clue about the problem ?

Yes, it's a known problem.
It is caused by having the userquota/groupquota options for the root filesystem
in your fstab.
Previously this was OK, but it broke when a new feature was implemented in r193192.

--
Andriy Gapon
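
As a rough illustration of the diagnosis above, the following sketch checks whether the root entry of an fstab carries the userquota or groupquota options, and shows the option list with them removed. The function names root_quota_opts and strip_quota_opts are invented for this example, and it assumes the usual whitespace-separated fstab layout (device, mount point, type, options). Whether quotas on the root filesystem still come up after removing the options is not covered in the thread.

```sh
# Print the options field of the root (/) entry, but only if it
# contains userquota or groupquota. Reads /etc/fstab unless a path is given.
root_quota_opts() {
    awk '$1 !~ /^#/ && $2 == "/" && ("," $4 ",") ~ /,(userquota|groupquota),/ { print $4 }' \
        "${1:-/etc/fstab}"
}

# Print a comma-separated option list with the two quota options removed.
strip_quota_opts() {
    printf '%s\n' "$1" | awk -F, '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "userquota" || $i == "groupquota")
                continue
            sep = (out == "") ? "" : ","
            out = out sep $i
        }
        print out
    }'
}
```

For the loader variables shown in the report, strip_quota_opts rw,userquota,groupquota,acls would print rw,acls.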

------------------------------

Message: 2
Date: Tue, 30 Mar 2010 10:30:05 -0400
From: jhell <jh...@dataix.net>
Subject: Re: ZFS Tuning - arc_summary.pl
To: freebsd...@freebsd.org
Message-ID: <4BB20AE...@dataix.net>
Content-Type: text/plain; charset=ISO-8859-1

On 03/29/2010 10:43, Barry Pederson wrote:
> I've been using the arc_summary.pl script from here:
>
> http://jhell.googlecode.com/svn/base/head/scripts/zfs/arc_summary/arc_summary.pl
>
>
> and noticed some odd numbers, with the ARC Current Size being larger
> than the Max Size, and the breakdown adding up to less than the current
> size as shown below
>
> --------
> ARC Size:
> Current Size: 992.71M (arcsize)
> Target Size: (Adaptive) 512.00M (c)
> Min Size (Hard Limit): 81.82M (arc_min)
> Max Size (Hard Limit): 512.00M (arc_max)
>
> ARC Size Breakdown:
> Recently Used Cache Size: 99.84% 511.18M (p)
> Frequently Used Cache Size: 0.16% 0.82M (c-p)
> --------
>
>
> From another thread I saw, it sounds like arc_max isn't really
> a "Hard Limit" but rather some kind of high water mark. If that's
> the case then I wonder if this might make more sense....
>
>
>
> ---------
> --- arc_summary.pl.original 2010-02-25 19:23:13.000000000 -0600
> +++ arc_summary.pl 2010-03-29 09:32:28.000000000 -0500
> @@ -121,20 +121,20 @@
>
> my $arc_size = ${Kstat}->{zfs}->{0}->{arcstats}->{size};
> my $arc_size_MiB = ($arc_size / 1048576);
> -my $mfu_size = $target_size - $mru_size;
> +my $mfu_size = $arc_size - $mru_size;
> my $mfu_size_MiB = ($mfu_size / 1048576);
> -my $mru_perc = 100*($mru_size / $target_size);
> -my $mfu_perc = 100*($mfu_size / $target_size);
> +my $mru_perc = 100*($mru_size / $arc_size);
> +my $mfu_perc = 100*($mfu_size / $arc_size);
>
> print "ARC Size:\n";
> printf("\tCurrent Size:\t\t\t\t%0.2fM (arcsize)\n", $arc_size_MiB);
> printf("\tTarget Size: (Adaptive)\t\t\t%0.2fM (c)\n", $target_size_MiB);
> printf("\tMin Size (Hard Limit):\t\t\t%0.2fM (arc_min)\n",
> $arc_min_size_MiB);
> -printf("\tMax Size (Hard Limit):\t\t\t%0.2fM (arc_max)\n",
> $arc_max_size_MiB);
> +printf("\tMax Size :\t\t\t%0.2fM (arc_max)\n",
> $arc_max_size_MiB);
>
> print "\nARC Size Breakdown:\n";
> printf("\tRecently Used Cache Size:\t%0.2f%%\t%0.2fM (p)\n", $mru_perc,
> $mru_size_MiB);
> -printf("\tFrequently Used Cache Size:\t%0.2f%%\t%0.2fM (c-p)\n",
> $mfu_perc, $mfu_size_MiB);
> +printf("\tFrequently Used Cache Size:\t%0.2f%%\t%0.2fM (arcsize-p)\n",
> $mfu_perc, $mfu_size_MiB);
> print "\n";
>
> ### ARC Efficency ###
>
> -----------
>
>
> Giving something like this...
>
> --------
> ARC Size:
> Current Size: 992.88M (arcsize)
> Target Size: (Adaptive) 512.00M (c)
> Min Size (Hard Limit): 81.82M (arc_min)
> Max Size : 512.00M (arc_max)
>
> ARC Size Breakdown:
> Recently Used Cache Size: 51.48% 511.18M (p)
> Frequently Used Cache Size: 48.52% 481.70M (arcsize-p)
> --------
>
> Barry

I'll mark this up on my todo. Thanks for the feedback & I should have
something committed back within the next couple days, possibly even tonight.

I never recalculated for this difference as that area of the code was
just a formatting fix.

But as Jeremy has pointed out, it would have to verify against
__FreeBSD_version. Since I already pull in the sysctl MIB
kern.osreldate, I should be able to compare against that and, for 700000
or higher, make the above correction.

As this is mainly ZFS v13 dependent now I don't feel too bad doing what I
have stated above.

Thanks again,

Regards,

jhell
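
The version check proposed above could look roughly like this. It is only a sketch of the idea, not the change that went into arc_summary.pl: the function name breakdown_base is invented here, and the sample kern.osreldate values are hypothetical.

```sh
# Choose the value the ARC breakdown is computed against:
# the current ARC size when kern.osreldate is 700000 or higher,
# otherwise the target size, as the script did originally.
breakdown_base() {
    osreldate=$1
    arc_size=$2
    target_size=$3
    if [ "$osreldate" -ge 700000 ]; then
        printf '%s\n' "$arc_size"
    else
        printf '%s\n' "$target_size"
    fi
}

# Hypothetical osreldate values; sizes in MiB from the report.
breakdown_base 800500 992.88 512.00   # prints 992.88
breakdown_base 602000 992.88 512.00   # prints 512.00
```

On a FreeBSD host the first argument would come from sysctl -n kern.osreldate.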

------------------------------

Message: 3
Date: Tue, 30 Mar 2010 10:45:09 -0400 (EDT)
From: Rick Macklem <rmac...@uoguelph.ca>
Subject: Re: Strange NFS-related messages (related to lockd/statd)
To: Jeremy Chadwick <fre...@jdc.parodius.com>
Cc: freebsd...@freebsd.org
Message-ID: <Pine.GSO.4.63.10...@muncher.cs.uoguelph.ca>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Mon, 29 Mar 2010, Jeremy Chadwick wrote:

> I recently brought up rpc.lockd and rpc.statd on all of our NFS clients
> (mixed RELENG_6, RELENG_7, and RELENG_8), and our NFS server (RELENG_8).
>
> All clients had nfs_client_enable="yes" in rc.conf prior to their last
> reboot, but lacked rpcbind_enable="yes", rpc_lockd_enable="yes", and
> rpc_statd_enable="yes" prior to the below.
>
> The 8.x clients started rpcbind, rpc.lockd, rpc.statd -- then said:
>
> NLM: failed to contact remote rpcbind, stat = 0, port = 0
> Can't start NLM - unable to contact NSM
>
> The 7.x clients started rpcbind, rpc.lockd, rpc.statd -- then said:
>
> Can't start NLM - unable to contact NSM
>
Oh, I forgot to mention..I can't help much, but these protocols/daemons
are SunRPC, so they will be using portmapper (now called rpcbind) to get
port #s assigned dynamically. I also believe (not sure, don't know much
about it) that the NSM will poll for other machines, so it needs to be
able to talk to all clients and server(s), including doing IP broadcast
that gets to them all. (These were designed in the 1980s for a LAN, which
was just a chunk of coax in those days:-)

Hope this helps, rick

------------------------------

Message: 4
Date: Tue, 30 Mar 2010 10:21:45 -0500
From: Barry Pederson <b...@barryp.org>
Subject: Re: ZFS Tuning - arc_summary.pl
To: jhell <jh...@dataix.net>
Cc: FreeBSD Stable <freebsd...@freebsd.org>
Message-ID: <4BB21709...@barryp.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 3/30/10 9:30 AM, jhell wrote:
> I'll mark this up on my todo. Thanks for the feedback & I should have
> something committed back within the next couple days, possibly even tonight.
>
> I never recalculated for this difference as that area of the code was
> just a formatting fix.
>
> But as Jeremy has pointed out, it would have to verify against
> __FreeBSD_version. Since I already pull in the sysctl MIB
> kern.osreldate, I should be able to compare against that and, for 700000
> or higher, make the above correction.
>
> As this is mainly ZFS v13 dependent now I don't feel too bad doing what I
> have stated above.
>
> Thanks again,
>
> Regards,

Shouldn't the "ARC Size Breakdown" be based on the current size (instead
of the target size) no matter what?

If the current size was *less* than arc_max, you'd also get a
nonsensical breakdown - in that case with the "recently used" and
"frequently used" adding up to *more* than the current size.

It seems to me that the only thing that would be different between newer
and older FreeBSD versions would be whether arc_max would be described as
a "Hard Limit" or not - which is maybe not that important to show.

If that's all true, then the patch should work for both older and newer
versions of FreeBSD.

Barry
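
The arithmetic behind this argument can be checked with the figures quoted earlier in the thread. The sketch below is not the Perl from arc_summary.pl; arc_breakdown is a made-up helper that takes the base size and the MRU size (p) in MiB and prints both shares.

```sh
# arc_breakdown BASE_MiB MRU_MiB
# MFU is taken as BASE - MRU; both are shown as a percentage of BASE.
arc_breakdown() {
    awk -v base="$1" -v mru="$2" 'BEGIN {
        mfu = base - mru
        printf "MRU %.2f%% %.2fM, MFU %.2f%% %.2fM\n",
            100 * mru / base, mru, 100 * mfu / base, mfu
    }'
}

# Base = target size (c), as in the unpatched script:
arc_breakdown 512.00 511.18   # MRU 99.84% 511.18M, MFU 0.16% 0.82M

# Base = current size (arcsize), as in the patch:
arc_breakdown 992.88 511.18   # MRU 51.48% 511.18M, MFU 48.52% 481.70M
```

With the target size as the base, the two parts always sum to the target rather than to what the ARC currently holds, which is the mismatch described in this message.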

------------------------------

Message: 5
Date: Tue, 30 Mar 2010 08:38:39 -0700
From: Jeremy Chadwick <fre...@jdc.parodius.com>
Subject: Re: Strange NFS-related messages (related to lockd/statd)
To: Rick Macklem <rmac...@uoguelph.ca>
Cc: freebsd...@freebsd.org
Message-ID: <20100330153...@icarus.home.lan>
Content-Type: text/plain; charset=us-ascii

On Tue, Mar 30, 2010 at 10:45:09AM -0400, Rick Macklem wrote:
>
>
> On Mon, 29 Mar 2010, Jeremy Chadwick wrote:
>
> >I recently brought up rpc.lockd and rpc.statd on all of our NFS clients
> >(mixed RELENG_6, RELENG_7, and RELENG_8), and our NFS server (RELENG_8).
> >
> >All clients had nfs_client_enable="yes" in rc.conf prior to their last
> >reboot, but lacked rpcbind_enable="yes", rpc_lockd_enable="yes", and
> >rpc_statd_enable="yes" prior to the below.
> >
> >The 8.x clients started rpcbind, rpc.lockd, rpc.statd -- then said:
> >
> >NLM: failed to contact remote rpcbind, stat = 0, port = 0
> >Can't start NLM - unable to contact NSM
> >
> >The 7.x clients started rpcbind, rpc.lockd, rpc.statd -- then said:
> >
> >Can't start NLM - unable to contact NSM
> >
> Oh, I forgot to mention..I can't help much, but these protocols/daemons
> are SunRPC, so they will be using portmapper (now called rpcbind) to get
> port #s assigned dynamically. I also believe (not sure, don't know much
> about it) that the NSM will poll for other machines, so it needs to be
> able to talk to all clients and server(s), including doing IP broadcast
> that gets to them all. (These were designed in the 1980s for a LAN, which
> was just a chunk of coax in those days:-)
>
> Hope this helps, rick

In fact it did! Your hint led me to try my earlier idea: using the -h
flag to rpcbind.

Turns out lockd wasn't running on any of the systems (rpcinfo didn't
show it, and ps didn't show it). I ended up modifying all of the boxes
to use:

rpcbind_flags="-h <ipaddr of em1>"

(Where em1=LAN, em0=WAN. em0 contains the default route as well)

Restarted rpcbind + statd + lockd (in that order). Voila, everything
started up, and no messages. rpcinfo shows all correct services. So my
guess is that by binding to INADDR_ANY by default, packets were going
out the primary interface (em0) or going to broadcast on em0 -- which
would return nothing, since pf blocked such packets. Makes sense to me
anyway.

Thanks!
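
Collected in one place, the settings described in this message might look like the output of the sketch below. The helper name lockd_rc_conf and the address 192.0.2.10 are placeholders (the thread only says "ipaddr of em1"), and the rc.d script names in the comments are the stock FreeBSD ones, assumed rather than quoted from the thread.

```sh
# Print the rc.conf lines named in the thread, with rpcbind bound
# to a single address instead of INADDR_ANY.
lockd_rc_conf() {
    lan_ip=$1   # address of the LAN-facing interface (em1 in the thread)
    cat <<EOF
nfs_client_enable="yes"
rpcbind_enable="yes"
rpcbind_flags="-h ${lan_ip}"
rpc_statd_enable="yes"
rpc_lockd_enable="yes"
EOF
}

lockd_rc_conf 192.0.2.10   # documentation address, not from the thread

# Restart order that worked according to the message:
#   /etc/rc.d/rpcbind restart
#   /etc/rc.d/statd restart
#   /etc/rc.d/lockd restart
#   rpcinfo -p    # then check that the expected services are registered
```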

------------------------------

Message: 6
Date: Tue, 30 Mar 2010 19:44:10 +0400
From: Andrey V. Elsukov <bu7...@yandex.ru>
Subject: Re: Re: boot and boot0cfg problem
To: Daniel Braniss <da...@cs.huji.ac.il>
Cc: freebsd...@freebsd.org
Message-ID: <7316126...@web103.yandex.ru>
Content-Type: text/plain

30.03.10, 14:03, "Daniel Braniss" <da...@cs.huji.ac.il>:

> > On 30.03.2010 12:05, Daniel Braniss wrote:
> > > so it seems that someone is preventing changes to the partition table!
> > > btw, this problem was not present in older boot0 (1.0) where the active
> > > partition flag is ignored.
> >
> > You can change active partition via gpart(8).
> >
> Hi Andrey,
> I'm sorry, I've reread the manual, and can't find the right magic.

Yes, I also don't remember where it is documented. Only in g_part_mbr.c :)
Try this:
# gpart set -a active -i 1 ada2
This will mark the first partition on ada2 as active:
# gpart show ada2
=>        63  1250263665  ada2  MBR  (596G)
          63    40965687     1  !7  [active]  (20G)
    40965750  1209292875     2  !7  (577G)
  1250258625        5103        - free -  (2.5M)

> btw, boot0cfg does call geom but something seems to be broken.
I'll look at the boot0cfg code today and will probably make a patch.

--
WBR, Andrey V. Elsukov
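
To confirm which slice ended up active after running the command above, the gpart show output can be filtered for the [active] attribute. This is a small sketch; active_index is a made-up name, and it assumes the partition index is the third column, as in the sample output in this message.

```sh
# Read `gpart show` output on stdin and print the index of every
# partition carrying the [active] attribute.
active_index() {
    awk '/\[active\]/ { print $3 }'
}

# Sample input taken from the message above.
active_index <<'EOF'
=>        63  1250263665  ada2  MBR  (596G)
          63    40965687     1  !7  [active]  (20G)
    40965750  1209292875     2  !7  (577G)
  1250258625        5103        - free -  (2.5M)
EOF
# prints: 1
```

On a live system the same filter would be fed from the command itself: gpart show ada2 | active_index.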

------------------------------

Message: 7
Date: Tue, 30 Mar 2010 16:50:30 +0100
From: "N.J. Mann" <n...@njm.me.uk>
Subject: Re: Re: boot and boot0cfg problem
To: "Andrey V. Elsukov" <bu7...@yandex.ru>
Cc: freebsd...@freebsd.org
Message-ID: <2010033015...@titania.njm.me.uk>
Content-Type: text/plain; charset=us-ascii

In message <7316126...@web103.yandex.ru>,
Andrey V. Elsukov (bu7...@yandex.ru) wrote:
> 30.03.10, 14:03, "Daniel Braniss" <da...@cs.huji.ac.il>:
>
> > > On 30.03.2010 12:05, Daniel Braniss wrote:
> > > > so it seems that someone is preventing changes to the partition table!
> > > > btw, this problem was not present in older boot0 (1.0) where the active
> > > > partition flag is ignored.
> > >
> > > You can change active partition via gpart(8).
> > >
> > Hi Andrey,
> > I'm sorry, I've reread the manual, and can't find the right magic.
>
> Yes, I also don't remember where it is documented. Only in g_part_mbr.c :)
> Try this:
> # gpart set -a active -i 1 ada2
> This will mark the first partition on ada2 as active:
> # gpart show ada2
> =>        63  1250263665  ada2  MBR  (596G)
>           63    40965687     1  !7  [active]  (20G)
>     40965750  1209292875     2  !7  (577G)
>   1250258625        5103        - free -  (2.5M)
>
> > btw, boot0cfg does call geom but something seems to be broken.
> I'll look at the boot0cfg code today and will probably make a patch.

Do you need to disable the geom anti-foot-shooting feature? I seem to
remember it is something like:

sysctl kern.geom.debugflags=16

Cheers,
Nick.
--

------------------------------

Message: 8
Date: Tue, 30 Mar 2010 17:57:45 +0200
From: Attila Nagy <b...@fsn.hu>
Subject: Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
To: Jonathan Feally <vul...@netvulture.com>
Cc: pyu...@gmail.com, Mailing List FreeBSD Stable
<freebsd...@freebsd.org>, Michael Loftis <mlo...@wgops.com>
Message-ID: <4BB21F79...@fsn.hu>
Content-Type: text/plain; charset=ISO-8859-1

Jonathan Feally wrote:
> Attila Nagy wrote:
>>>>>> Bingo, this solved the problem. The current uptime nears four days.
>>>>>> Previously I couldn't go further than a day.
>>>>>>
>>>>>> The machine gets very light TCP load (and other machines which
>>>>>> get work
>>>>>> well), so I guess it's UDP RX or TX checksum related
>>>>>>
> I also have had my network go dead on a recent 8.0-STABLE on bge
> system. Console is alive, but network just stops. I am running it as a
> router with untagged on bge0 and nat of traffic on vlan201 tagged on
> top of bge1. I haven't had it lock up in 3 days, but I will try the
> -txcsum and -rxcsum on both interfaces to see if the problem still
> persists or not. I do have a lot of tcp traffic, but there is also
> unsolicited udp flying in as well.
Well, it's a short time to judge from, but with rxcsum and txcsum disabled,
the machine froze nearly instantly (less than one hour of uptime), while
with tso disabled, it still works.
So for now I think tso causes the problems.
BTW, now that we are talking about that, I remember that I've disabled
it on a lot of machines previously, because I've had strange issues.
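
For reference, the two experiments compared here correspond to different ifconfig(8) feature flags. The sketch below only builds and prints the command line instead of running it; offload_off_cmd is a made-up helper, and bce0 is assumed from the dmesg output later in this digest.

```sh
# Print (do not run) the ifconfig command that turns off the named
# offload features on an interface, e.g. tso, rxcsum, txcsum.
offload_off_cmd() {
    ifname=$1
    shift
    cmd="ifconfig $ifname"
    for feature in "$@"; do
        cmd="$cmd -$feature"
    done
    printf '%s\n' "$cmd"
}

offload_off_cmd bce0 tso             # ifconfig bce0 -tso
offload_off_cmd bce0 rxcsum txcsum   # ifconfig bce0 -rxcsum -txcsum
```

Printing the command first makes it easy to review before changing a remote machine's only network interface.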

------------------------------

Message: 9
Date: Tue, 30 Mar 2010 08:54:15 -0700
From: Jonathan Feally <vul...@netvulture.com>
Subject: Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
To: Attila Nagy <b...@fsn.hu>
Cc: pyu...@gmail.com, Mailing List FreeBSD Stable
<freebsd...@freebsd.org>, Michael Loftis <mlo...@wgops.com>
Message-ID: <4BB21EA7...@netvulture.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Attila Nagy wrote:
>>>>> Bingo, this solved the problem. The current uptime nears four days.
>>>>> Previously I couldn't go further than a day.
>>>>>
>>>>> The machine gets very light TCP load (and other machines which get work
>>>>> well), so I guess it's UDP RX or TX checksum related
>>>>>
I also have had my network go dead on a recent 8.0-STABLE on bge system.
Console is alive, but network just stops. I am running it as a router
with untagged on bge0 and nat of traffic on vlan201 tagged on top of
bge1. I haven't had it lock up in 3 days, but I will try the -txcsum and
-rxcsum on both interfaces to see if the problem still persists or not.
I do have a lot of tcp traffic, but there is also unsolicited udp flying
in as well.

-Jon


------------------------------

Message: 10
Date: Tue, 30 Mar 2010 19:48:52 +0300
From: Daniel Braniss <da...@cs.huji.ac.il>
Subject: Re: boot and boot0cfg problem
To: "Andrey V. Elsukov" <bu7...@yandex.ru>
Cc: freebsd...@freebsd.org
Message-ID: <E1NwecW-...@kabab.cs.huji.ac.il>
Content-Type: text/plain; charset=us-ascii

> 30.03.10, 14:03, "Daniel Braniss" <da...@cs.huji.ac.il>:
>
> > > On 30.03.2010 12:05, Daniel Braniss wrote:
> > > > so it seems that someone is preventing changes to the partition table!
> > > > btw, this problem was not present in older boot0 (1.0) where the active
> > > > partition flag is ignored.
> > >
> > > You can change active partition via gpart(8).
> > >
> > Hi Andrey,
> > I'm sorry, I've reread the manual, and can't find the right magic.
>
> Yes, I also don't remember where it is documented. Only in g_part_mbr.c :)
> Try this:
> # gpart set -a active -i 1 ada2
> This will mark the first partition on ada2 as active:
> # gpart show ada2
> =>        63  1250263665  ada2  MBR  (596G)
>           63    40965687     1  !7  [active]  (20G)
>     40965750  1209292875     2  !7  (577G)
>   1250258625        5103        - free -  (2.5M)
>
> > btw, boot0cfg does call geom but something seems to be broken.
> I'll look at the boot0cfg code today and will probably make a patch.
ok, that worked!
now if you can get boot0cfg to work that would really be nice.
thanks,
danny

------------------------------

Message: 11
Date: Tue, 30 Mar 2010 11:07:10 -0700
From: Pyun YongHyeon <pyu...@gmail.com>
Subject: Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
To: Attila Nagy <b...@fsn.hu>
Cc: Jonathan Feally <vul...@netvulture.com>, Mailing List FreeBSD
Stable <freebsd...@freebsd.org>, Michael Loftis
<mlo...@wgops.com>
Message-ID: <2010033018...@michelle.cdnetworks.com>
Content-Type: text/plain; charset=us-ascii

On Tue, Mar 30, 2010 at 05:57:45PM +0200, Attila Nagy wrote:
> Jonathan Feally wrote:
> > Attila Nagy wrote:
> >>>>>> Bingo, this solved the problem. The current uptime nears four days.
> >>>>>> Previously I couldn't go further than a day.
> >>>>>>
> >>>>>> The machine gets very light TCP load (and other machines which
> >>>>>> get work
> >>>>>> well), so I guess it's UDP RX or TX checksum related
> >>>>>>
> > I also have had my network go dead on a recent 8.0-STABLE on bge
> > system. Console is alive, but network just stops. I am running it as a
> > router with untagged on bge0 and nat of traffic on vlan201 tagged on
> > top of bge1. I haven't had it lock up in 3 days, but I will try the
> > -txcsum and -rxcsum on both interfaces to see if the problem still
> > persists or not. I do have a lot of tcp traffic, but there is also
> > unsolicited udp flying in as well.
> Well, it's a short time to judge from, but with rx,txcsum disabled, the
> machine froze nearly instantly (less than one hour of uptime), while
> with tso disabled, it still works.
> So for now I think tso causes the problems.
> BTW, now that we are talking about that, I remember that I've disabled
> it on a lot of machines previously, because I've had strange issues.

Would you show me the dmesg output (only the bce(4) part)?

------------------------------

Message: 12
Date: Tue, 30 Mar 2010 22:13:07 +0200
From: Harald Schmalzbauer <h.schma...@omnilan.de>
Subject: Re: ahcich timeouts, only with ahci, not with ataahci
To: Alexander Motin <m...@FreeBSD.org>
Cc: freebsd...@FreeBSD.org
Message-ID: <4BB25B5...@omnilan.de>
Content-Type: text/plain; charset="iso-8859-15"

Alexander Motin schrieb am 29.03.2010 21:25 (localtime):
> Harald Schmalzbauer wrote:
>> I have the drives now running in another server, ich7 chipset.
>> Using UFS, the complete machine locks up for ~30 secs with disk load of
>> 3.5MB/s. But I don't get any timeout messages and the machine always
>> recovered.
>
> Most ICH7s do not support AHCI. What about yours?

It does, it's a FujitsuSiemens Server and has ERST-II (LSI Software
RAID) along with AHCI.

>> Changing to the old ata driver solves the problem.
...
>> Any chance to get this problem fixed? I couldn't see lockups on another
>> OS with NCQ in AHCI mode enabled. I'd ship such a disk to anyone who is
>> willing to debug.
>
> It's difficult to fix something, until problem could be reproduced.

I understand!
The report of the machine locking up at 3.5MB/s was wrong, sorry. The
culprit was not the HD but an intermediate router...
But there is still the problem that ZFS stalls if I use these drives
with ahci, not with ataahci.

> If you wish to send drive - my address is:
> Topol-2, b34, f150, Dnepropetrovsk, 49040, Ukraine.
> Phone: +380503622312.
> Do not use courier services, only regular mail. Ask for tracking number.

Can you use such a drive? I mean for yourself. If yes, then I'll ship
it, but if you say "na, thanks, no such crap" then I don't want to waste
your time and highly appreciated skills to bother with vendor-specific
problems.

Thanks,

-Harry

--
OmniLAN - UNIX & Windows Netze + Systeme
Harald Schmalzbauer
Flintsbacher Str. 3
80686 München
+49 (0) 89 18947781
+49 (0) 160 93860101
USt-IdNr.: DE253184753
http://www.omnilan.de/

------------------------------

Message: 13
Date: Wed, 31 Mar 2010 00:04:13 +0300
From: Alexander Motin <m...@FreeBSD.org>
Subject: Re: ahcich timeouts, only with ahci, not with ataahci
To: Harald Schmalzbauer <h.schma...@omnilan.de>
Cc: freebsd...@FreeBSD.org
Message-ID: <4BB2674D...@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-15

Harald Schmalzbauer wrote:
> Alexander Motin schrieb am 29.03.2010 21:25 (localtime):
>> If you wish to send drive - my address is:
>
> Can you use such a drive? I mean for yourself. If yes, then I'll ship
> it, but if you say "na, thanks, no such crap" then I don't want to waste
> your time and highly appreciated skills to bother with vendor-specific
> problems.

I have enough drives, so I don't really need one more. If it really is a
vendor-specific problem (looking at the messages, I have little doubt),
then I am not sure I can do much about it, except making sure that the
error recovery code handles the situation well and writing some quirks to
work around the problem for this particular drive, if that is possible.

--
Alexander Motin

------------------------------

Message: 14
Date: Wed, 31 Mar 2010 10:40:48 +0200
From: Attila Nagy <b...@fsn.hu>
Subject: Re: 8-STABLE freezes on UDP traffic (DNS), 7.x doesn't
To: pyu...@gmail.com
Cc: Jonathan Feally <vul...@netvulture.com>, Mailing List FreeBSD
Stable <freebsd...@freebsd.org>, Michael Loftis
<mlo...@wgops.com>
Message-ID: <4BB30A90...@fsn.hu>
Content-Type: text/plain; charset=ISO-8859-1

Pyun YongHyeon wrote:
> On Tue, Mar 30, 2010 at 05:57:45PM +0200, Attila Nagy wrote:
>
>> Jonathan Feally wrote:
>>
>>> Attila Nagy wrote:
>>>
>>>>>>>> Bingo, this solved the problem. The current uptime nears four days.
>>>>>>>> Previously I couldn't go further than a day.
>>>>>>>>
>>>>>>>> The machine gets very light TCP load (and other machines which
>>>>>>>> get work
>>>>>>>> well), so I guess it's UDP RX or TX checksum related
>>>>>>>>
>>>>>>>>
>>> I also have had my network go dead on a recent 8.0-STABLE on bge
>>> system. Console is alive, but network just stops. I am running it as a
>>> router with untagged on bge0 and nat of traffic on vlan201 tagged on
>>> top of bge1. I haven't had it lock up in 3 days, but I will try the
>>> -txcsum and -rxcsum on both interfaces to see if the problem still
>>> persists or not. I do have a lot of tcp traffic, but there is also
>>> unsolicited udp flying in as well.
>>>
>> Well, it's a short time to judge from, but with rx,txcsum disabled, the
>> machine froze nearly instantly (less than one hour of uptime), while
>> with tso disabled, it still works.
>> So for now I think tso causes the problems.
>> BTW, now that we are talking about that, I remember that I've disabled
>> it on a lot of machines previously, because I've had strange issues.
>>
>
> Would you show me the dmesg output (only the bce(4) part)?
>
Sure:
bce0: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem
0xfa000000-0xfbffffff irq 16 at device 0.0 on pci7
miibus0: <MII bus> on bce0
brgphy0: <BCM5708S 1000/2500BaseSX PHY> PHY 2 on miibus0
brgphy0: 1000baseSX-FDX, 2500baseSX-FDX, auto
bce0: Ethernet address: 00:1b:78:75:f0:34
bce0: [ITHREAD]
bce0: ASIC (0x57081021); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C
(4.4.1); Flags (MSI|2.5G)
bce1: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem
0xf6000000-0xf7ffffff irq 16 at device 0.0 on pci3
miibus1: <MII bus> on bce1
brgphy1: <BCM5708S 1000/2500BaseSX PHY> PHY 2 on miibus1
brgphy1: 1000baseSX-FDX, 2500baseSX-FDX, auto
bce1: Ethernet address: 00:1b:78:75:f0:38
bce1: [ITHREAD]
bce1: ASIC (0x57081021); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C
(4.4.1); Flags (MSI|2.5G)

The NIC's firmware is up to date (latest available on HP firmware update
CD).

------------------------------

End of freebsd-stable Digest, Vol 350, Issue 3
**********************************************
