Recently I've started to experience kernel crashes under heavy network
load. It may be caused by switching to .32-rc0, but since .30 crashed
too, it may instead be caused by switching from Vuze to Deluge (which
perhaps uses the network more actively). Unfortunately, most crashes
can't be captured with netconsole, and they usually don't fit on a
1280x1024 screen (I think the kernel prints two, three or more
backtraces of random processes like hostapd and scmpc and then locks up
completely). Here are two screenshots that look like complete
one-screen traces:
http://wrar.name/temp/P1010714.JPG - this is from .32-rc0
(v2.6.32-7123-g75b0803)
http://wrar.name/temp/P1010711.JPG - this is from 2.6.30 (stock ALT Linux)
Of course, I see lots of "eth0: Transmit timed out" messages and some
"WARNING: at net/sched/sch_generic.c:255 dev_watchdog" warnings before
the crash, as described at
http://bugzilla.kernel.org/show_bug.cgi?id=11663
--
WBR, wRAR (ALT Linux Team)
On Mon, 21 Dec 2009 at 01:03, Andrey Rahmatullin wrote:
> Here are the proper screenshots from 2.6.33-rc1:
> http://wrar.name/temp/P1010724.JPG
> http://wrar.name/temp/P1010725.JPG
Wow, I had to rotate my laptop here. In your first posting [0] you said
you "upgraded to .32" but had crashes in .30 too. What was the last
working kernel?
In the .jpgs one can see something along the lines of schedule_timeout,
napi_disable and rhine_tx_timeout - the only patch I could find touching
all of these things [1] is rather old, from 2.6.24. Is 2.6.29 working
for you?
Christian.
[0] http://lkml.org/lkml/2009/12/19/36
[1] http://lists.openwall.net/netdev/2007/08/28/13
--
BOFH excuse #19:
floating point processor overflow
> In the .jpgs one can see something along the lines of schedule_timeout,
> napi_disable and rhine_tx_timeout - the only patch I could find touching
> all of these things [1] is rather old, from 2.6.24. Is 2.6.29 working
> for you?
Well, I can try some old versions.
> So, 2.6.27 crashed as well.
Was the backtrace similar to those on 2.6.30?
I know it's a long shot, but since you seem to be able to reproduce this
pretty reliably, can you try 2.6.23? Or at least something before
bea3348eef27e6044b6161fd04c3152215f96411 [0]?
Christian.
[0] I'm *really* guessing here; if some netdev guru has a better
understanding of the backtraces Andrey sent, please step forward.
--
BOFH excuse #433:
error: one bad user found in front of screen
It looks like napi_disable() should be illegal in ndo_tx_timeout().
Here is a patch which moves most of the timeout work to a workqueue,
similarly to tg3 etc. It should prevent at least one of the reported
bugs. Alas, I can't even compile-test it at the moment, so let me know
of any problems.
Jarek P.
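To make the idea easier to follow, here is a minimal sketch of the
deferral pattern the patch implements, using hypothetical foo_* names
rather than the actual via-rhine code: the netdev watchdog calls
ndo_tx_timeout() from a timer in atomic context, so anything that may
sleep, such as napi_disable(), has to be pushed to process context via
a work item.

#include <linux/netdevice.h>
#include <linux/workqueue.h>

struct foo_priv {
	struct net_device *dev;
	struct napi_struct napi;
	struct work_struct reset_task;	/* INIT_WORK() is done at probe time */
};

/* Runs in process context (keventd), so sleeping is allowed here. */
static void foo_reset_task(struct work_struct *work)
{
	struct foo_priv *fp = container_of(work, struct foo_priv, reset_task);
	struct net_device *dev = fp->dev;

	napi_disable(&fp->napi);	/* may msleep() until the poll handler is done */
	/* ... reset descriptor rings and chip registers ... */
	napi_enable(&fp->napi);
	netif_wake_queue(dev);
}

/* Called from the netdev watchdog timer (softirq context): must not sleep. */
static void foo_tx_timeout(struct net_device *dev)
{
	struct foo_priv *fp = netdev_priv(dev);

	schedule_work(&fp->reset_task);	/* atomic-safe, returns immediately */
}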
---
drivers/net/via-rhine.c | 41 ++++++++++++++++++++++++++++-------------
1 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c
index 593e01f..125406b 100644
--- a/drivers/net/via-rhine.c
+++ b/drivers/net/via-rhine.c
@@ -102,6 +102,7 @@ static const int multicast_filter_limit = 32;
#include <linux/ethtool.h>
#include <linux/crc32.h>
#include <linux/bitops.h>
+#include <linux/workqueue.h>
#include <asm/processor.h> /* Processor type for cache alignment. */
#include <asm/io.h>
#include <asm/irq.h>
@@ -389,6 +390,7 @@ struct rhine_private {
struct net_device *dev;
struct napi_struct napi;
spinlock_t lock;
+ struct work_struct reset_task;
/* Frequently used values: keep some adjacent for cache effect. */
u32 quirks;
@@ -407,6 +409,7 @@ struct rhine_private {
static int mdio_read(struct net_device *dev, int phy_id, int location);
static void mdio_write(struct net_device *dev, int phy_id, int location, int value);
static int rhine_open(struct net_device *dev);
+static void rhine_reset_task(struct work_struct *work);
static void rhine_tx_timeout(struct net_device *dev);
static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
struct net_device *dev);
@@ -775,6 +778,8 @@ static int __devinit rhine_init_one(struct pci_dev *pdev,
dev->irq = pdev->irq;
spin_lock_init(&rp->lock);
+ INIT_WORK(&rp->reset_task, rhine_reset_task);
+
rp->mii_if.dev = dev;
rp->mii_if.mdio_read = mdio_read;
rp->mii_if.mdio_write = mdio_write;
@@ -1179,22 +1184,18 @@ static int rhine_open(struct net_device *dev)
return 0;
}
-static void rhine_tx_timeout(struct net_device *dev)
+static void rhine_reset_task(struct work_struct *work)
{
- struct rhine_private *rp = netdev_priv(dev);
- void __iomem *ioaddr = rp->base;
-
- printk(KERN_WARNING "%s: Transmit timed out, status %4.4x, PHY status "
- "%4.4x, resetting...\n",
- dev->name, ioread16(ioaddr + IntrStatus),
- mdio_read(dev, rp->mii_if.phy_id, MII_BMSR));
+ struct rhine_private *rp = container_of(work, struct rhine_private,
+ reset_task);
+ struct net_device *dev = rp->dev;
/* protect against concurrent rx interrupts */
disable_irq(rp->pdev->irq);
napi_disable(&rp->napi);
- spin_lock(&rp->lock);
+ spin_lock_irq(&rp->lock);
/* clear all descriptors */
free_tbufs(dev);
@@ -1206,7 +1207,7 @@ static void rhine_tx_timeout(struct net_device *dev)
rhine_chip_reset(dev);
init_registers(dev);
- spin_unlock(&rp->lock);
+ spin_unlock_irq(&rp->lock);
enable_irq(rp->pdev->irq);
dev->trans_start = jiffies;
@@ -1214,6 +1215,19 @@ static void rhine_tx_timeout(struct net_device *dev)
netif_wake_queue(dev);
}
+static void rhine_tx_timeout(struct net_device *dev)
+{
+ struct rhine_private *rp = netdev_priv(dev);
+ void __iomem *ioaddr = rp->base;
+
+ printk(KERN_WARNING "%s: Transmit timed out, status %4.4x, PHY status "
+ "%4.4x, resetting...\n",
+ dev->name, ioread16(ioaddr + IntrStatus),
+ mdio_read(dev, rp->mii_if.phy_id, MII_BMSR));
+
+ schedule_work(&rp->reset_task);
+}
+
static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
struct net_device *dev)
{
@@ -1830,10 +1844,11 @@ static int rhine_close(struct net_device *dev)
struct rhine_private *rp = netdev_priv(dev);
void __iomem *ioaddr = rp->base;
- spin_lock_irq(&rp->lock);
-
- netif_stop_queue(dev);
napi_disable(&rp->napi);
+ cancel_work_sync(&rp->reset_task);
+ netif_stop_queue(dev);
+
+ spin_lock_irq(&rp->lock);
if (debug > 1)
printk(KERN_DEBUG "%s: Shutting down ethercard, "
By the way, it seems I needlessly changed the locking, so here it is again.
Jarek P.
--- (take 2)
drivers/net/via-rhine.c | 37 ++++++++++++++++++++++++++-----------
1 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c
index 593e01f..15a4063 100644
@@ -1179,15 +1184,11 @@ static int rhine_open(struct net_device *dev)
return 0;
}
-static void rhine_tx_timeout(struct net_device *dev)
+static void rhine_reset_task(struct work_struct *work)
{
- struct rhine_private *rp = netdev_priv(dev);
- void __iomem *ioaddr = rp->base;
-
- printk(KERN_WARNING "%s: Transmit timed out, status %4.4x, PHY status "
- "%4.4x, resetting...\n",
- dev->name, ioread16(ioaddr + IntrStatus),
- mdio_read(dev, rp->mii_if.phy_id, MII_BMSR));
+ struct rhine_private *rp = container_of(work, struct rhine_private,
+ reset_task);
+ struct net_device *dev = rp->dev;
/* protect against concurrent rx interrupts */
disable_irq(rp->pdev->irq);
Hmm... On the other hand, it definitely needs at least _bh now...
Sorry,
Jarek P.
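For context, a sketch of why the bottom-half variant is needed, reusing
the hypothetical foo_* names from the earlier sketch and assuming the
private struct also carries a spinlock_t lock and a struct pci_dev *pdev:
the lock is also taken in softirq context (from the transmit and NAPI
poll paths), and disable_irq() only keeps the hard interrupt handler
away, so a plain spin_lock() in the work function could deadlock against
a softirq taking the same lock on the same CPU.

static void foo_reset_task(struct work_struct *work)
{
	struct foo_priv *fp = container_of(work, struct foo_priv, reset_task);

	disable_irq(fp->pdev->irq);	/* fences off only the hard IRQ handler */
	napi_disable(&fp->napi);

	spin_lock_bh(&fp->lock);	/* also keep softirq users of the lock out */
	/* ... clear descriptors, reset the chip, reprogram registers ... */
	spin_unlock_bh(&fp->lock);

	enable_irq(fp->pdev->irq);
	napi_enable(&fp->napi);
	netif_wake_queue(fp->dev);
}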
--- (take 3)
drivers/net/via-rhine.c | 41 ++++++++++++++++++++++++++++-------------
1 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c
index 593e01f..125406b 100644
@@ -1179,22 +1184,18 @@ static int rhine_open(struct net_device *dev)
return 0;
}
-static void rhine_tx_timeout(struct net_device *dev)
+static void rhine_reset_task(struct work_struct *work)
{
- struct rhine_private *rp = netdev_priv(dev);
- void __iomem *ioaddr = rp->base;
-
- printk(KERN_WARNING "%s: Transmit timed out, status %4.4x, PHY status "
- "%4.4x, resetting...\n",
- dev->name, ioread16(ioaddr + IntrStatus),
- mdio_read(dev, rp->mii_if.phy_id, MII_BMSR));
+ struct rhine_private *rp = container_of(work, struct rhine_private,
+ reset_task);
+ struct net_device *dev = rp->dev;
/* protect against concurrent rx interrupts */
disable_irq(rp->pdev->irq);
napi_disable(&rp->napi);
- spin_lock(&rp->lock);
+ spin_lock_bh(&rp->lock);
/* clear all descriptors */
free_tbufs(dev);
@@ -1206,7 +1207,7 @@ static void rhine_tx_timeout(struct net_device *dev)
rhine_chip_reset(dev);
init_registers(dev);
- spin_unlock(&rp->lock);
+ spin_unlock_bh(&rp->lock);
enable_irq(rp->pdev->irq);
dev->trans_start = jiffies;
ACK. Looks like you guys tracked down the crashes and fixed them (thanks!).
I suspect we shouldn't have to reset due to timeouts that often, but that's
another story.
Roger
Thanks everybody,
Jarek P.
------------------->
"Scheduling while atomic" BUGs are triggered by the timer function
rhine_tx_timeout(); they are caused by calling napi_disable() (which may
msleep()). This patch fixes it by moving most of the timer handler's work
to a workqueue function (similarly to other drivers, such as tg3), with
spin_lock() changed to the BH version.

Additionally, spin_lock_irq() in rhine_close() is moved so that it no
longer covers napi_disable() etc., again following tg3.
Reported-by: Andrey Rahmatullin <wr...@altlinux.org>
Tested-by: Andrey Rahmatullin <wr...@altlinux.org>
Signed-off-by: Jarek Poplawski <jar...@gmail.com>
Cc: Christian Kujau <li...@nerdbynature.de>
Cc: Roger Luethi <r...@hellgate.ch>
---
BTW, it seems a change in 2.6.31 might trigger these timeouts more often
than before. Andrey, could you check whether this matters here?
Thanks,
Jarek P.
--- (on top of net-2.6 with the previous "Fix scheduling..." patch)
diff -Nurp a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c
--- a/drivers/net/via-rhine.c 2009-12-23 09:28:25.000000000 +0000
+++ b/drivers/net/via-rhine.c 2009-12-23 09:33:57.000000000 +0000
@@ -1226,6 +1226,7 @@ static void rhine_tx_timeout(struct net_
mdio_read(dev, rp->mii_if.phy_id, MII_BMSR));
schedule_work(&rp->reset_task);
+ netdev_get_tx_queue(dev, 0)->trans_start = jiffies;
}
static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
--
WBR, wRAR (ALT Linux Team)
> On Wed, Dec 23, 2009 at 09:52:03AM +0000, Jarek Poplawski wrote:
>> BTW, it seems a change in 2.6.31 might trigger these timeouts more
>> often than before. Andrey, could you check whether this matters here?
> They appear once per 4 seconds with or without this patch.
>
OK, thanks for checking this.
Jarek P.
> "Scheduling while atomic" BUGs are triggered by the timer function
> rhine_tx_timeout(); they are caused by calling napi_disable() (which may
> msleep()). This patch fixes it by moving most of the timer handler's work
> to a workqueue function (similarly to other drivers, such as tg3), with
> spin_lock() changed to the BH version.
>
> Additionally, spin_lock_irq() in rhine_close() is moved so that it no
> longer covers napi_disable() etc., again following tg3.
>
> Reported-by: Andrey Rahmatullin <wr...@altlinux.org>
> Tested-by: Andrey Rahmatullin <wr...@altlinux.org>
> Signed-off-by: Jarek Poplawski <jar...@gmail.com>
Applied, thanks!