[PATCH] nfsd: wrong index used in inner loop


roel

Mar 8, 2011, 4:40:01 PM
Index i was already used in the outer loop

Signed-off-by: Roel Kluin <roel....@gmail.com>
---
fs/nfsd/nfs4xdr.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

Not 100% sure this one is needed but it looks suspicious.

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 1275b86..615f0a9 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1142,7 +1142,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,

u32 dummy;
char *machine_name;
- int i;
+ int i, j;
int nr_secflavs;

READ_BUF(16);
@@ -1215,7 +1215,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
READ_BUF(4);
READ32(dummy);
READ_BUF(dummy * 4);
- for (i = 0; i < dummy; ++i)
+ for (j = 0; j < dummy; ++j)
READ32(dummy);
break;
case RPC_AUTH_GSS:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

JA Magallón

Mar 8, 2011, 6:50:02 PM
On Tue, 08 Mar 2011 22:32:26 +0100, roel <roel....@gmail.com> wrote:

> Index i was already used in the outer loop
>

Is this as serious as it looks?
Is it worth flagging your own favourite distro to hurry on a kernel
patch?

TIA


--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free

J. Bruce Fields

Mar 8, 2011, 8:00:02 PM
On Tue, Mar 08, 2011 at 10:32:26PM +0100, roel wrote:
> Index i was already used in the outer loop
>
> Signed-off-by: Roel Kluin <roel....@gmail.com>
> ---
> fs/nfsd/nfs4xdr.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> Not 100% sure this one is needed but it looks suspicious.

Looks bad to me, thanks.

nfsd4_decode_create_session should probably really be broken up a little
bit; if it wasn't so long this would have been more obvious.

I'll see if I can slip this into 2.6.38 with a couple other last-minute
patches.... Otherwise, it'll be in 2.6.39.

--b.

Andrew Morton

Mar 9, 2011, 6:50:03 PM
On Tue, 08 Mar 2011 22:32:26 +0100
roel <roel....@gmail.com> wrote:

ooh, big bug.

I wonder why it was not previously detected at runtime. Perhaps
nr_secflavs is always 1.

AFAICT this bug will allow a well-crafted packet to cause an
infinite-until-it-oopses loop in the kernel.
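
To make the failure mode concrete, here is a small userspace model of the loop structure being discussed. It is only a sketch: the function names and the `cap` parameter are hypothetical, the inner loop body stands in for READ32(dummy), and the cap exists purely so the demonstration terminates.

```c
#include <assert.h>

/*
 * Hypothetical model, not kernel code. Counts how often the outer loop
 * body runs when the inner loop reuses the outer index. `cap` bounds the
 * demo so that a runaway loop still terminates.
 */
static int outer_runs_shared_index(int nr_flavors, int inner_len, int cap)
{
    int i, runs = 0;

    assert(cap > 0);
    for (i = 0; i < nr_flavors && runs < cap; ++i) {
        runs++;
        /* Inner loop clobbers the outer index, as in the unpatched code. */
        for (i = 0; i < inner_len; ++i)
            ;
    }
    return runs;
}

/* Same structure with a separate inner index, as in the patch. */
static int outer_runs_separate_index(int nr_flavors, int inner_len, int cap)
{
    int i, j, runs = 0;

    assert(cap > 0);
    for (i = 0; i < nr_flavors && runs < cap; ++i) {
        runs++;
        for (j = 0; j < inner_len; ++j)
            ;
    }
    return runs;
}
```

In this model, three flavors with an empty inner list never finish on their own (the outer index is reset to 0 on every pass, so only the cap stops it), three flavors with a five-entry inner list stop after one pass and skip the rest, and a single flavor behaves correctly, which would fit the guess that nr_secflavs is always 1 in practice.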

J. Bruce Fields

Mar 10, 2011, 1:10:03 PM

Yeah, no client uses this callback security information yet.

Mi Jinlong, do you think this is something we could have caught with
another pynfs test?

--b.

Greg KH

Mar 10, 2011, 7:00:01 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: roel <roel....@gmail.com>

commit 3ec07aa9522e3d5e9d5ede7bef946756e623a0a0 upstream.

Index i was already used in the outer loop

Signed-off-by: Roel Kluin <roel....@gmail.com>
Signed-off-by: J. Bruce Fields <bfi...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
fs/nfsd/nfs4xdr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1107,7 +1107,7 @@ nfsd4_decode_create_session(struct nfsd4

u32 dummy;
char *machine_name;
- int i;
+ int i, j;
int nr_secflavs;

READ_BUF(16);

@@ -1180,7 +1180,7 @@ nfsd4_decode_create_session(struct nfsd4
READ_BUF(4);
READ32(dummy);
READ_BUF(dummy * 4);
- for (i = 0; i < dummy; ++i)
+ for (j = 0; j < dummy; ++j)
READ32(dummy);
break;
case RPC_AUTH_GSS:

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Sven Barth <pascal...@googlemail.com>

commit 1e6406b8f0dc1ae7d7c39c9e1ac6ca78e016ebfb upstream.

Fix the probing of cx2583x chips, because two controls were clustered
that are not created for these chips.

This regression was introduced in 2.6.36.

Signed-off-by: Sven Barth <pascal...@googlemail.com>
Signed-off-by: Andy Walls <awa...@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mch...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/media/video/cx25840/cx25840-core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/media/video/cx25840/cx25840-core.c
+++ b/drivers/media/video/cx25840/cx25840-core.c
@@ -2031,7 +2031,8 @@ static int cx25840_probe(struct i2c_clie
kfree(state);
return err;
}
- v4l2_ctrl_cluster(2, &state->volume);
+ if (!is_cx2583x(state))
+ v4l2_ctrl_cluster(2, &state->volume);
v4l2_ctrl_handler_setup(&state->hdl);

cx25840_ir_probe(sd);

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Michael <mi...@rsy.com>

commit d213ad08362909ab50fbd6568fcc9fd568268d29 upstream.

After upgrading the kernel from stock Ubuntu 7.10 to
10.04, with no hardware changes, I started getting the dreaded DMA
TIMEOUT errors, followed by inability to encode until the machine was
rebooted.

I came across a post from Andy in March
(http://www.gossamer-threads.com/lists/ivtv/users/40943#40943) where he
speculates that perhaps the corrective actions being taken after a DMA
ERROR are not sufficient to recover the situation. After some testing
I suspect that this is indeed the case, and that in fact the corrective
action may be what hangs the card's DMA engine, rather than the
original error.

Specifically these DMA ERROR IRQs seem to present with two different
values in the IVTV_REG_DMASTATUS register: 0x11 and 0x13. The current
corrective action is to clear that status register back to 0x01 or
0x03, and then issue the next DMA request. In the case of a 0x13 this
seems to result in a minor glitch in the encoded stream due to the
failed transfer that was not retried, but otherwise things continue OK.
In the case of a 0x11 the card's DMA write engine is never heard from
again, and a DMA TIMEOUT follows shortly after. 0x11 is the killer.

I suspect that the two cases need to be handled differently. The
difference is in bit 1 (0x02), which is set when the error is about to
be successfully recovered, and clear when things are about to go bad.

Bit 1 of DMASTATUS is described differently in different places either
as a positive "write finished", or an inverted "write busy". If we
take the first definition, then when an error arises with state 0x11,
it means that the write did not complete. It makes sense to start a
new transfer, as in the current code. But if we take the second
definition, then 0x11 means "an error but the write engine is still
busy". Trying to feed it a new transfer in this situation might not be
a good idea.

As an experiment, I added code to ignore the DMA ERROR IRQ if DMASTATUS
is 0x11. I.e., don't start a new transfer, don't clear our flags, etc.
The hope was that the card would complete the transfer and issue an ENC
DMA COMPLETE, either successfully or with an error condition there.
However the card still hung.

The only remaining corrective action being taken with a 0x11 status was
then the write back to the status register to clear the error, i.e.
DMASTATUS = DMASTATUS & 3. This would have the effect of clearing the
error bit 4, while leaving the lower bits indicating DMA write busy.

Strangely enough, removing this write to the status register solved the
problem! If the DMA ERROR IRQ with DMASTATUS=0x11 is completely
ignored, with no corrective action at all, then the card will complete
the transfer and issue a new IRQ. If the status register is written to
when it has the value 0x11, then the DMA engine hangs. Perhaps it's
illegal to write to DMASTATUS while the read or write busy bit is set?
At any rate, it
appears that the current corrective action is indeed making things
worse rather than better.

I put together a patch that modifies ivtv_irq_dma_err to do the
following:

- Don't write back to IVTV_REG_DMASTATUS.
- If write-busy is asserted, leave the card alone. Just extend the
timeout slightly.
- If write-busy is de-asserted, retry the current transfer.
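
The three rules above can be sketched as a pure decision function. This is an illustration only, not the driver code: the enum, the macro names and the function are hypothetical, bit 1 is read in the inverted "write busy" sense proposed in this message, and the retry budget is an extra assumption (the committed diff uses a limit of 3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the low DMASTATUS bits discussed above. */
#define DEMO_DMA_READ_DONE  0x01u
#define DEMO_DMA_WRITE_DONE 0x02u   /* clear: write engine still busy */

enum demo_dma_err_action {
    DEMO_DMA_ERR_WAIT,      /* leave the card alone, extend the timeout */
    DEMO_DMA_ERR_RETRY,     /* write has ended, retry the transfer */
    DEMO_DMA_ERR_GIVE_UP    /* retry budget exhausted */
};

/* Decide what to do for an encoder DMA error, without touching DMASTATUS. */
static enum demo_dma_err_action
demo_encoder_dma_err_action(uint32_t dmastatus, int retries, int max_retries)
{
    assert(retries >= 0 && max_retries >= 0);

    if ((dmastatus & DEMO_DMA_WRITE_DONE) == 0)
        return DEMO_DMA_ERR_WAIT;
    if (retries < max_retries)
        return DEMO_DMA_ERR_RETRY;
    return DEMO_DMA_ERR_GIVE_UP;
}
```

With these assumptions a status of 0x11 maps to "wait" and 0x13 maps to "retry", matching the two cases described earlier.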

This has completely fixed my DMA TIMEOUT woes. DMA ERR events still
occur, but now they seem to be correctly handled. 0x11 events no
longer hang the card, and 0x13 events no longer result in a glitch in
the stream, as the failed transfer is retried. I'm happy.

I've inlined the patch below in case it is of interest. As described
above, I have a theory about why it works (based on a different
interpretation of bit 1 of DMASTATUS), but I can't guarantee that my
theory is correct. There may be another explanation, or it may be a
fluke. Maybe ignoring that IRQ entirely would be equally effective?
Maybe the status register read/writeback sequence is a race condition if
the card changes it in the meantime? Also as I am using a PVR-150
only, I have not been able to test it on other cards, which may be
especially relevant for 350s that support concurrent decoding.
Hopefully the patch does not break the DMA READ path.

Mike

[awa...@md.metrocast.net: Modified patch to add a verbose comment, make minor
brace reformats, and clear the error flags in the IVTV_REG_DMASTATUS iff both
read and write DMA were not in progress. Mike's conjecture about a race
condition with the writeback is correct; it can confuse the DMA engine.]

[Comment and analysis from the ML post by Michael <mi...@rsy.com>]


Signed-off-by: Andy Walls <awa...@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mch...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/media/video/ivtv/ivtv-irq.c | 58 +++++++++++++++++++++++++++++++-----
1 file changed, 51 insertions(+), 7 deletions(-)

--- a/drivers/media/video/ivtv/ivtv-irq.c
+++ b/drivers/media/video/ivtv/ivtv-irq.c
@@ -628,22 +628,66 @@ static void ivtv_irq_enc_pio_complete(st
static void ivtv_irq_dma_err(struct ivtv *itv)
{
u32 data[CX2341X_MBOX_MAX_DATA];
+ u32 status;

del_timer(&itv->dma_timer);
+
ivtv_api_get_data(&itv->enc_mbox, IVTV_MBOX_DMA_END, 2, data);
+ status = read_reg(IVTV_REG_DMASTATUS);
IVTV_DEBUG_WARN("DMA ERROR %08x %08x %08x %d\n", data[0], data[1],
- read_reg(IVTV_REG_DMASTATUS), itv->cur_dma_stream);
- write_reg(read_reg(IVTV_REG_DMASTATUS) & 3, IVTV_REG_DMASTATUS);
+ status, itv->cur_dma_stream);
+ /*
+ * We do *not* write back to the IVTV_REG_DMASTATUS register to
+ * clear the error status, if either the encoder write (0x02) or
+ * decoder read (0x01) bus master DMA operation do not indicate
+ * completed. We can race with the DMA engine, which may have
+ * transitioned to completed status *after* we read the register.
+ * Setting a IVTV_REG_DMASTATUS flag back to "busy" status, after the
+ * DMA engine has completed, will cause the DMA engine to stop working.
+ */
+ status &= 0x3;
+ if (status == 0x3)
+ write_reg(status, IVTV_REG_DMASTATUS);
+
if (!test_bit(IVTV_F_I_UDMA, &itv->i_flags) &&
itv->cur_dma_stream >= 0 && itv->cur_dma_stream < IVTV_MAX_STREAMS) {
struct ivtv_stream *s = &itv->streams[itv->cur_dma_stream];

- /* retry */
- if (s->type >= IVTV_DEC_STREAM_TYPE_MPG)
+ if (s->type >= IVTV_DEC_STREAM_TYPE_MPG) {
+ /* retry */
+ /*
+ * FIXME - handle cases of DMA error similar to
+ * encoder below, except conditioned on status & 0x1
+ */
ivtv_dma_dec_start(s);
- else
- ivtv_dma_enc_start(s);
- return;
+ return;
+ } else {
+ if ((status & 0x2) == 0) {
+ /*
+ * CX2341x Bus Master DMA write is ongoing.
+ * Reset the timer and let it complete.
+ */
+ itv->dma_timer.expires =
+ jiffies + msecs_to_jiffies(600);
+ add_timer(&itv->dma_timer);
+ return;
+ }
+
+ if (itv->dma_retries < 3) {
+ /*
+ * CX2341x Bus Master DMA write has ended.
+ * Retry the write, starting with the first
+ * xfer segment. Just retrying the current
+ * segment is not sufficient.
+ */
+ s->sg_processed = 0;
+ itv->dma_retries++;
+ ivtv_dma_enc_start_xfer(s);
+ return;
+ }
+ /* Too many retries, give up on this one */
+ }
+
}
if (test_bit(IVTV_F_I_UDMA, &itv->i_flags)) {
ivtv_udma_start(itv);

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Francois Romieu <rom...@fr.zoreil.com>

commit 1519e57fe81c14bb8fa4855579f19264d1ef63b4 upstream.

Some experiment-based action to prevent my 8168 chipsets locking-up hard
in the irq handler under load (pktgen ~1Mpps). Apparently a reset is not
always mandatory (is it at all ?).

- RTL_GIGA_MAC_VER_12
- RTL_GIGA_MAC_VER_25
Missed ~55% packets. Note:
- this is an old SiS 965L motherboard
- the 8168 chipset emits (lots of) control frames towards the sender

- RTL_GIGA_MAC_VER_26
The chipset does not go into a frenzy of mac control pause when it
crashes yet but it can still be crashed. It needs more work.

Signed-off-by: Francois Romieu <rom...@fr.zoreil.com>
Cc: Ivan Vecera <ive...@redhat.com>
Cc: Hayes <haye...@realtek.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/r8169.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -758,7 +758,8 @@ static void __rtl8169_check_link_status(
if (pm)
pm_request_resume(&tp->pci_dev->dev);
netif_carrier_on(dev);
- netif_info(tp, ifup, dev, "link up\n");
+ if (net_ratelimit())
+ netif_info(tp, ifup, dev, "link up\n");
} else {
netif_carrier_off(dev);
netif_info(tp, ifdown, dev, "link down\n");
@@ -4603,13 +4604,24 @@ static irqreturn_t rtl8169_interrupt(int
break;
}

- /* Work around for rx fifo overflow */
- if (unlikely(status & RxFIFOOver) &&
- (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
- tp->mac_version == RTL_GIGA_MAC_VER_22)) {
- netif_stop_queue(dev);
- rtl8169_tx_timeout(dev);
- break;
+ if (unlikely(status & RxFIFOOver)) {
+ switch (tp->mac_version) {
+ /* Work around for rx fifo overflow */
+ case RTL_GIGA_MAC_VER_11:
+ case RTL_GIGA_MAC_VER_22:
+ case RTL_GIGA_MAC_VER_26:
+ netif_stop_queue(dev);
+ rtl8169_tx_timeout(dev);
+ goto done;
+ /* Experimental science. Pktgen proof. */
+ case RTL_GIGA_MAC_VER_12:
+ case RTL_GIGA_MAC_VER_25:
+ if (status == RxFIFOOver)
+ goto done;
+ break;
+ default:
+ break;
+ }
}

if (unlikely(status & SYSErr)) {
@@ -4645,7 +4657,7 @@ static irqreturn_t rtl8169_interrupt(int
(status & RxFIFOOver) ? (status | RxOverflow) : status);
status = RTL_R16(IntrStatus);
}
-
+done:
return IRQ_RETVAL(handled);

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Balbir Singh <bal...@linux.vnet.ibm.com>

commit 0c3b9168017cbad2c4af3dd65ec93fe646eeaa62 upstream.

The current sched rt code is broken when it comes to hierarchical
scheduling. This patch fixes two problems:

1. It adds redundant enqueuing (harmless) when it finds a queue
has tasks enqueued, but it has no run time and it is not
throttled.

2. The most important change is in sched_rt_rq_enqueue/dequeue.
The code just picks the rt_rq belonging to the current cpu
on which the period timer runs, the patch fixes it, so that
the correct rt_se is enqueued/dequeued.
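
As a rough userspace model of the second problem (all type and function names here are hypothetical stand-ins, not scheduler code): a group keeps one entity per CPU, and the lookup has to be indexed by the CPU that owns the runqueue rather than by the CPU the period timer happens to run on.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4  /* hypothetical CPU count for the model */

struct demo_entity {
    int cpu;
};

struct demo_group {
    struct demo_entity *se[DEMO_NR_CPUS];   /* one entity per CPU */
};

struct demo_rq {
    int cpu;                    /* CPU that owns this queue */
    struct demo_group *tg;
};

/* Old behaviour: index by whichever CPU is running the timer. */
static struct demo_entity *
demo_pick_by_running_cpu(const struct demo_rq *rq, int running_cpu)
{
    assert(running_cpu >= 0 && running_cpu < DEMO_NR_CPUS);
    return rq->tg->se[running_cpu];
}

/* Fixed behaviour: index by the CPU that owns the queue. */
static struct demo_entity *demo_pick_by_owner_cpu(const struct demo_rq *rq)
{
    assert(rq->cpu >= 0 && rq->cpu < DEMO_NR_CPUS);
    return rq->tg->se[rq->cpu];
}
```

The two lookups agree only when the timer runs on the owning CPU, which would explain why the symptom depends on where the timer fires.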

Tested with a simple hierarchy

/c/d, with c and d assigned similar runtimes of 50,000 and a while(1)
loop running within "d". Both c and d get throttled. Without the
patch, the task just stops running and never runs again (depending
on where the sched_rt b/w timer runs). With the patch, the
task is throttled and runs as expected.

[ bharata, suggestions on how to pick the rt_se belonging to the
rt_rq and correct cpu ]

Signed-off-by: Balbir Singh <bal...@linux.vnet.ibm.com>
Acked-by: Bharata B Rao <bha...@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zi...@chello.nl>
LKML-Reference: <2011030311...@balbir.in.ibm.com>
Signed-off-by: Ingo Molnar <mi...@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
kernel/sched_rt.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -199,11 +199,12 @@ static void dequeue_rt_entity(struct sch

static void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
{
- int this_cpu = smp_processor_id();
struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
struct sched_rt_entity *rt_se;

- rt_se = rt_rq->tg->rt_se[this_cpu];
+ int cpu = cpu_of(rq_of_rt_rq(rt_rq));
+
+ rt_se = rt_rq->tg->rt_se[cpu];

if (rt_rq->rt_nr_running) {
if (rt_se && !on_rt_rq(rt_se))
@@ -215,10 +216,10 @@ static void sched_rt_rq_enqueue(struct r

static void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
{
- int this_cpu = smp_processor_id();
struct sched_rt_entity *rt_se;
+ int cpu = cpu_of(rq_of_rt_rq(rt_rq));

- rt_se = rt_rq->tg->rt_se[this_cpu];
+ rt_se = rt_rq->tg->rt_se[cpu];

if (rt_se && on_rt_rq(rt_se))
dequeue_rt_entity(rt_se);
@@ -546,8 +547,11 @@ static int do_sched_rt_period_timer(stru
if (rt_rq->rt_time || rt_rq->rt_nr_running)
idle = 0;
raw_spin_unlock(&rt_rq->rt_runtime_lock);
- } else if (rt_rq->rt_nr_running)
+ } else if (rt_rq->rt_nr_running) {
idle = 0;
+ if (!rt_rq_throttled(rt_rq))
+ enqueue = 1;
+ }

if (enqueue)
sched_rt_rq_enqueue(rt_rq);

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Neil Horman <nho...@tuxdriver.com>

commit e9e3d724e2145f5039b423c290ce2b2c3d8f94bc upstream.

The "bad_page()" page allocator sanity check was reported recently (call
chain as follows):

bad_page+0x69/0x91
free_hot_cold_page+0x81/0x144
skb_release_data+0x5f/0x98
__kfree_skb+0x11/0x1a
tcp_ack+0x6a3/0x1868
tcp_rcv_established+0x7a6/0x8b9
tcp_v4_do_rcv+0x2a/0x2fa
tcp_v4_rcv+0x9a2/0x9f6
do_timer+0x2df/0x52c
ip_local_deliver+0x19d/0x263
ip_rcv+0x539/0x57c
netif_receive_skb+0x470/0x49f
:virtio_net:virtnet_poll+0x46b/0x5c5
net_rx_action+0xac/0x1b3
__do_softirq+0x89/0x133
call_softirq+0x1c/0x28
do_softirq+0x2c/0x7d
do_IRQ+0xec/0xf5
default_idle+0x0/0x50
ret_from_intr+0x0/0xa
default_idle+0x29/0x50
cpu_idle+0x95/0xb8
start_kernel+0x220/0x225
_sinittext+0x22f/0x236

It occurs because an skb with a fraglist was freed from the tcp
retransmit queue when it was acked, but a page on that fraglist had
PG_Slab set (indicating it was allocated from the slab allocator), which
means the free path above can't safely free it via put_page.

We tracked this back to an nfsv4 setacl operation, in which the nfs code
attempted to convert the passed-in buffer to an array of pages in
__nfs4_proc_set_acl, which gets used by the skb->frags list in
xs_sendpages. __nfs4_proc_set_acl just converts each page in the buffer
to a page struct via virt_to_page, but the vfs allocates the buffer via
kmalloc, meaning the PG_slab bit is set. We can't create a buffer with
kmalloc and free it later in the tcp ack path with put_page, so we need
to either:

1) ensure that when we create the list of pages, no page struct has
PG_Slab set

or

2) not use a page list to send this data

Given that these buffers can be multiple pages and arbitrarily sized, I
think (1) is the right way to go. I've written the below patch to
allocate a page from the buddy allocator directly and copy the data over
to it. This ensures that we have a put_page free-able page for every
entry that winds up on an skb frag list, so it can be safely freed when
the frame is acked. We do a put_page on each entry after the
rpc_call_sync call so as to drop our own reference count to the page,
leaving only the ref count taken by tcp_sendpages. This way the data
will be properly freed when the ack comes in.
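
The copy-into-fresh-pages approach described above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel helper: malloc() and free() stand in for alloc_page() and put_page(), DEMO_PAGE_SIZE is an assumed page size, and the function names and the max_pages bound are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u    /* assumed page size for the model */

/*
 * Copy buf into freshly allocated page-sized chunks so that no chunk
 * aliases the caller's buffer. Returns the number of chunks filled,
 * or -1 on allocation failure or if max_pages is too small.
 */
static int demo_copy_to_fresh_pages(const void *buf, size_t buflen,
                                    void **pages, size_t max_pages)
{
    const unsigned char *src = buf;
    size_t n = 0;

    while (buflen != 0) {
        size_t len = buflen < DEMO_PAGE_SIZE ? buflen : DEMO_PAGE_SIZE;
        void *page;

        if (n == max_pages)
            goto unwind;
        page = malloc(DEMO_PAGE_SIZE);
        if (page == NULL)
            goto unwind;
        memcpy(page, src, len);
        src += len;
        buflen -= len;
        pages[n++] = page;
    }
    return (int)n;

unwind:
    /* Release everything allocated so far, newest first. */
    while (n > 0)
        free(pages[--n]);
    return -1;
}

/* Drop the caller's hold on each chunk once the data has been handed off. */
static void demo_release_pages(void **pages, int count)
{
    while (count > 0)
        free(pages[--count]);
}
```

Unlike the patch, this model bounds the number of chunks explicitly and allocates nothing for an empty buffer; both are choices made for the illustration only.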

Successfully tested by myself to solve the above oops.

Note, as this is the result of a setacl operation that exceeded a page
of data, I think this amounts to a local DoS triggerable by an
unprivileged user, so I'm CCing security on this as well.

Signed-off-by: Neil Horman <nho...@tuxdriver.com>
CC: Trond Myklebust <Trond.M...@netapp.com>
CC: Jeff Layton <jla...@redhat.com>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
fs/nfs/nfs4proc.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 42 insertions(+), 2 deletions(-)

--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -49,6 +49,7 @@
#include <linux/mount.h>
#include <linux/module.h>
#include <linux/sunrpc/bc_xprt.h>
+#include <linux/mm.h>

#include "nfs4_fs.h"
#include "delegation.h"
@@ -3216,6 +3217,35 @@ static void buf_to_pages(const void *buf
}
}

+static int buf_to_pages_noslab(const void *buf, size_t buflen,
+ struct page **pages, unsigned int *pgbase)
+{
+ struct page *newpage, **spages;
+ int rc = 0;
+ size_t len;
+ spages = pages;
+
+ do {
+ len = min(PAGE_CACHE_SIZE, buflen);
+ newpage = alloc_page(GFP_KERNEL);
+
+ if (newpage == NULL)
+ goto unwind;
+ memcpy(page_address(newpage), buf, len);
+ buf += len;
+ buflen -= len;
+ *pages++ = newpage;
+ rc++;
+ } while (buflen != 0);
+
+ return rc;
+
+unwind:
+ for(; rc > 0; rc--)
+ __free_page(spages[rc-1]);
+ return -ENOMEM;
+}
+
struct nfs4_cached_acl {
int cached;
size_t len;
@@ -3384,13 +3414,23 @@ static int __nfs4_proc_set_acl(struct in
.rpc_argp = &arg,
.rpc_resp = &res,
};
- int ret;
+ int ret, i;

if (!nfs4_server_supports_acls(server))
return -EOPNOTSUPP;
+ i = buf_to_pages_noslab(buf, buflen, arg.acl_pages, &arg.acl_pgbase);
+ if (i < 0)
+ return i;
nfs_inode_return_delegation(inode);
- buf_to_pages(buf, buflen, arg.acl_pages, &arg.acl_pgbase);
ret = nfs4_call_sync(server, &msg, &arg, &res, 1);
+
+ /*
+ * Free each page after tx, so the only ref left is
+ * held by the network stack
+ */
+ for (; i > 0; i--)
+ put_page(pages[i-1]);
+
/*
* Acl update can result in inode attribute update.
* so mark the attribute cache invalid.

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Ivan Vecera <ive...@redhat.com>

commit 0d672e9f8ac320c6d1ea9103db6df7f99ea20361 upstream.

Without a call to netif_carrier_off at the end of the probe, the operstate
is unknown when the device is initially opened. By default the carrier is
on, so when the device is opened and netif_carrier_on is called, the link
watch event is not fired and operstate remains zero (unknown).

This patch fixes this behavior in forcedeth and r8169.
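
A toy state model may help show why the extra call matters. It is a deliberate simplification with hypothetical names, not the networking core: the carrier flag starts out "on", and the operstate is only updated when the carrier flag actually changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Zero means "unknown", as in the description above. */
enum demo_operstate { DEMO_OPER_UNKNOWN = 0, DEMO_OPER_DOWN, DEMO_OPER_UP };

struct demo_netdev {
    bool carrier_ok;
    enum demo_operstate operstate;
};

/* A freshly set up device: carrier defaults to on, operstate unknown. */
static void demo_setup(struct demo_netdev *dev)
{
    assert(dev != NULL);
    dev->carrier_ok = true;
    dev->operstate = DEMO_OPER_UNKNOWN;
}

/* The state only changes (and the event only fires) on a real transition. */
static void demo_carrier_off(struct demo_netdev *dev)
{
    assert(dev != NULL);
    if (dev->carrier_ok) {
        dev->carrier_ok = false;
        dev->operstate = DEMO_OPER_DOWN;
    }
}

static void demo_carrier_on(struct demo_netdev *dev)
{
    assert(dev != NULL);
    if (!dev->carrier_ok) {
        dev->carrier_ok = true;
        dev->operstate = DEMO_OPER_UP;
    }
}
```

In this model, calling demo_carrier_on() straight after demo_setup() is a no-op and the state stays unknown, whereas inserting demo_carrier_off() first lets the later "on" call register as a transition.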

Signed-off-by: Ivan Vecera <ive...@redhat.com>
Acked-by: Francois Romieu <rom...@fr.zoreil.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/forcedeth.c | 2 ++
drivers/net/r8169.c | 2 ++
2 files changed, 4 insertions(+)

--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -5816,6 +5816,8 @@ static int __devinit nv_probe(struct pci
goto out_error;
}

+ netif_carrier_off(dev);
+
dev_printk(KERN_INFO, &pci_dev->dev, "ifname %s, PHY OUI 0x%x @ %d, "
"addr %2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x\n",
dev->name,
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3236,6 +3236,8 @@ rtl8169_init_one(struct pci_dev *pdev, c
if (pci_dev_run_wake(pdev))
pm_runtime_put_noidle(&pdev->dev);

+ netif_carrier_off(dev);
+
out:
return rc;

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Carpenter <err...@gmail.com>

commit b652277b09d3d030cb074cc6a98ba80b34244c03 upstream.

The "ct" variable should be an unsigned int. Both struct kbdiacrs
->kb_cnt and struct kbd_data ->accent_table_size are unsigned ints.

Making it signed causes a problem in KBDIACRUC because the user could
set the sign bit and cause a buffer overflow.
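
The signedness problem can be illustrated with a minimal upper-bound check. This is a sketch: DEMO_MAX_ENTRIES and both functions are hypothetical, and the exact check in keyboard.c is not shown in this mail, so the comparison below is an assumption about its general shape.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ENTRIES 256    /* hypothetical table capacity */

/* Signed count: a negative value slips past an upper-bound-only check. */
static bool demo_count_accepted_signed(int ct)
{
    return !(ct >= DEMO_MAX_ENTRIES);
}

/* Unsigned count: the same bit pattern is a huge value and is rejected. */
static bool demo_count_accepted_unsigned(unsigned int ct)
{
    return !(ct >= DEMO_MAX_ENTRIES);
}
```

A count of -1 passes the signed check but, once used as a size, becomes enormous; the unsigned variant rejects the equivalent value up front.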

Signed-off-by: Dan Carpenter <err...@gmail.com>
Signed-off-by: Martin Schwidefsky <schwi...@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/s390/char/keyboard.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/s390/char/keyboard.c
+++ b/drivers/s390/char/keyboard.c
@@ -460,7 +460,8 @@ kbd_ioctl(struct kbd_data *kbd, struct f
unsigned int cmd, unsigned long arg)
{
void __user *argp;
- int ct, perm;
+ unsigned int ct;
+ int perm;

argp = (void __user *)arg;

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Andy Walls <awa...@md.metrocast.net>

commit 67914b5c400d6c213f9e56d7547a2038ab5c06f4 upstream.

This reverts commit 44835f197bf1e3f57464f23dfb239fef06cf89be.

With the CX23885 hardware I2C master, checking for I2C slave ACK/NAK
is not valid when the I2C_EXTEND or I2C_NOSTOP bits are set.
Revert the commit that checks for I2C slave ACK/NAK on all transactions,
so that XC5000 tuners work with the CX23885 again.

Thanks go to Mark Zimmerman for reporting and bisecting this problem.

Bisected-by: Mark Zimmerman <mark...@frii.com>
Reported-by: Mark Zimmerman <mark...@frii.com>
Signed-off-by: Andy Walls <awa...@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mch...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/media/video/cx23885/cx23885-i2c.c | 8 --------
1 file changed, 8 deletions(-)

--- a/drivers/media/video/cx23885/cx23885-i2c.c
+++ b/drivers/media/video/cx23885/cx23885-i2c.c
@@ -122,10 +122,6 @@ static int i2c_sendbytes(struct i2c_adap

if (!i2c_wait_done(i2c_adap))
goto eio;
- if (!i2c_slave_did_ack(i2c_adap)) {
- retval = -ENXIO;
- goto err;
- }
if (i2c_debug) {
printk(" <W %02x %02x", msg->addr << 1, msg->buf[0]);
if (!(ctrl & I2C_NOSTOP))
@@ -209,10 +205,6 @@ static int i2c_readbytes(struct i2c_adap

if (!i2c_wait_done(i2c_adap))
goto eio;
- if (cnt == 0 && !i2c_slave_did_ack(i2c_adap)) {
- retval = -ENXIO;
- goto err;
- }
msg->buf[cnt] = cx_read(bus->reg_rdata) & 0xff;
if (i2c_debug) {
dprintk(1, " %02x", msg->buf[cnt]);

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Ivan Vecera <ive...@redhat.com>

commit b5ba6d12bdac21bc0620a5089e0f24e362645efd upstream.

I found that one of the 8168c chipsets (specifically XID 1c4000c0) starts
generating RxFIFO overflow errors. The result is an infinite loop in the
interrupt handler, as RxFIFOOver is handled only for ...MAC_VER_11.
With the workaround everything works fine.

Signed-off-by: Ivan Vecera <ive...@redhat.com>
Acked-by: Francois Romieu <rom...@fr.zoreil.com>
Cc: Hayes <haye...@realtek.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/r8169.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3725,7 +3725,8 @@ static void rtl_hw_start_8168(struct net
RTL_W16(IntrMitigate, 0x5151);

/* Work around for RxFIFO overflow. */
- if (tp->mac_version == RTL_GIGA_MAC_VER_11) {
+ if (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
+ tp->mac_version == RTL_GIGA_MAC_VER_22) {
tp->intr_event |= RxFIFOOver | PCSTimeout;
tp->intr_event &= ~RxOverflow;
}
@@ -4604,7 +4605,8 @@ static irqreturn_t rtl8169_interrupt(int

/* Work around for rx fifo overflow */
if (unlikely(status & RxFIFOOver) &&
- (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
+ (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
+ tp->mac_version == RTL_GIGA_MAC_VER_22)) {
netif_stop_queue(dev);
rtl8169_tx_timeout(dev);
break;

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Olivier Grenie <olivier...@dibcom.fr>

commit e192a7cf0effe7680264a5bc35c0ad1bdcdc921c upstream.

This patch adds PID filtering for the dib7000M demod. It also
corrects the PID filtering for the dib7700-based board. It should
prevent an oops when using a dib7700p-based board.

References: https://bugzilla.novell.com/show_bug.cgi?id=644807

Signed-off-by: Olivier Grenie <olivier...@dibcom.fr>
Signed-off-by: Patrick Boettcher <patrick....@dibcom.fr>
Tested-by: Pavel SKARKA <pau...@seznam.cz>
Signed-off-by: Mauro Carvalho Chehab <mch...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/media/dvb/dvb-usb/dib0700_devices.c | 21 +++++++++++++++++++--
drivers/media/dvb/frontends/dib7000m.c | 19 +++++++++++++++++++
drivers/media/dvb/frontends/dib7000m.h | 15 +++++++++++++++
3 files changed, 53 insertions(+), 2 deletions(-)

--- a/drivers/media/dvb/dvb-usb/dib0700_devices.c
+++ b/drivers/media/dvb/dvb-usb/dib0700_devices.c
@@ -870,6 +870,23 @@ static int dib7070p_tuner_attach(struct
return 0;
}

+static int stk7700p_pid_filter(struct dvb_usb_adapter *adapter, int index,
+ u16 pid, int onoff)
+{
+ struct dib0700_state *st = adapter->dev->priv;
+ if (st->is_dib7000pc)
+ return dib7000p_pid_filter(adapter->fe, index, pid, onoff);
+ return dib7000m_pid_filter(adapter->fe, index, pid, onoff);
+}
+
+static int stk7700p_pid_filter_ctrl(struct dvb_usb_adapter *adapter, int onoff)
+{
+ struct dib0700_state *st = adapter->dev->priv;
+ if (st->is_dib7000pc)
+ return dib7000p_pid_filter_ctrl(adapter->fe, onoff);
+ return dib7000m_pid_filter_ctrl(adapter->fe, onoff);
+}
+
static int stk70x0p_pid_filter(struct dvb_usb_adapter *adapter, int index, u16 pid, int onoff)
{
return dib7000p_pid_filter(adapter->fe, index, pid, onoff);
@@ -1875,8 +1892,8 @@ struct dvb_usb_device_properties dib0700
{
.caps = DVB_USB_ADAP_HAS_PID_FILTER | DVB_USB_ADAP_PID_FILTER_CAN_BE_TURNED_OFF,
.pid_filter_count = 32,
- .pid_filter = stk70x0p_pid_filter,
- .pid_filter_ctrl = stk70x0p_pid_filter_ctrl,
+ .pid_filter = stk7700p_pid_filter,
+ .pid_filter_ctrl = stk7700p_pid_filter_ctrl,
.frontend_attach = stk7700p_frontend_attach,
.tuner_attach = stk7700p_tuner_attach,

--- a/drivers/media/dvb/frontends/dib7000m.c
+++ b/drivers/media/dvb/frontends/dib7000m.c
@@ -1285,6 +1285,25 @@ struct i2c_adapter * dib7000m_get_i2c_ma
}
EXPORT_SYMBOL(dib7000m_get_i2c_master);

+int dib7000m_pid_filter_ctrl(struct dvb_frontend *fe, u8 onoff)
+{
+ struct dib7000m_state *state = fe->demodulator_priv;
+ u16 val = dib7000m_read_word(state, 294 + state->reg_offs) & 0xffef;
+ val |= (onoff & 0x1) << 4;
+ dprintk("PID filter enabled %d", onoff);
+ return dib7000m_write_word(state, 294 + state->reg_offs, val);
+}
+EXPORT_SYMBOL(dib7000m_pid_filter_ctrl);
+
+int dib7000m_pid_filter(struct dvb_frontend *fe, u8 id, u16 pid, u8 onoff)
+{
+ struct dib7000m_state *state = fe->demodulator_priv;
+ dprintk("PID filter: index %x, PID %d, OnOff %d", id, pid, onoff);
+ return dib7000m_write_word(state, 300 + state->reg_offs + id,
+ onoff ? (1 << 13) | pid : 0);
+}
+EXPORT_SYMBOL(dib7000m_pid_filter);
+
#if 0
/* used with some prototype boards */
int dib7000m_i2c_enumeration(struct i2c_adapter *i2c, int no_of_demods,
--- a/drivers/media/dvb/frontends/dib7000m.h
+++ b/drivers/media/dvb/frontends/dib7000m.h
@@ -46,6 +46,8 @@ extern struct dvb_frontend *dib7000m_att
extern struct i2c_adapter *dib7000m_get_i2c_master(struct dvb_frontend *,
enum dibx000_i2c_interface,
int);
+extern int dib7000m_pid_filter(struct dvb_frontend *, u8 id, u16 pid, u8 onoff);
+extern int dib7000m_pid_filter_ctrl(struct dvb_frontend *fe, u8 onoff);
#else
static inline
struct dvb_frontend *dib7000m_attach(struct i2c_adapter *i2c_adap,
@@ -63,6 +65,19 @@ struct i2c_adapter *dib7000m_get_i2c_mas
printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);
return NULL;
}
+static inline int dib7000m_pid_filter(struct dvb_frontend *fe, u8 id,
+ u16 pid, u8 onoff)
+{
+ printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);
+ return -ENODEV;
+}
+
+static inline int dib7000m_pid_filter_ctrl(struct dvb_frontend *fe,
+ uint8_t onoff)
+{
+ printk(KERN_WARNING "%s: driver disabled by Kconfig\n", __func__);
+ return -ENODEV;
+}
#endif

/* TODO

Greg KH

Mar 10, 2011, 7:00:03 PM
This is the start of the stable review cycle for the 2.6.37.4 release.
There are 29 patches in this series, all will be posted as a response to
this one. If anyone has any issues with these being applied, please let
us know. If anyone is a maintainer of the proper subsystem, and wants
to add a Signed-off-by: line to the patch, please respond with it.

Responses should be made by Saturday, March 12, 24:00:00 UTC.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h


Makefile | 2 +-
arch/x86/kernel/cpu/cpufreq/pcc-cpufreq.c | 2 +-
drivers/char/virtio_console.c | 8 ++++
drivers/hid/hid-mosart.c | 4 ++
drivers/media/dvb/dvb-usb/dib0700_devices.c | 21 +++++++++-
drivers/media/dvb/frontends/dib7000m.c | 19 +++++++++
drivers/media/dvb/frontends/dib7000m.h | 15 +++++++
drivers/media/video/cx23885/cx23885-i2c.c | 8 ----
drivers/media/video/cx25840/cx25840-core.c | 3 +-
drivers/media/video/ivtv/ivtv-irq.c | 58 +++++++++++++++++++++++---
drivers/misc/bmp085.c | 1 +
drivers/net/forcedeth.c | 2 +
drivers/net/ixgbe/ixgbe_main.c | 4 ++
drivers/net/r8169.c | 42 +++++++++++++++----
drivers/net/wireless/ath/ath9k/ath9k.h | 3 -
drivers/net/wireless/ath/ath9k/init.c | 4 --
drivers/net/wireless/ath/ath9k/main.c | 4 --
drivers/s390/char/keyboard.c | 3 +-
fs/nfs/nfs4proc.c | 44 +++++++++++++++++++-
fs/nfs/nfs4xdr.c | 3 -
fs/nfsd/nfs4xdr.c | 4 +-
include/keys/rxrpc-type.h | 1 -
include/linux/netdevice.h | 3 +
kernel/cpuset.c | 7 ++-
kernel/sched_rt.c | 14 ++++--
mm/mremap.c | 4 +-
net/core/dev.c | 12 +++++-
net/ipv4/ip_gre.c | 1 +
net/ipv4/ipip.c | 1 +
net/ipv4/netfilter/arpt_mangle.c | 6 +-
net/ipv6/sit.c | 2 +-
net/netfilter/ipvs/ip_vs_ctl.c | 4 +-
net/netfilter/nf_log.c | 4 ++
sound/pci/hda/patch_cirrus.c | 2 +
sound/pci/hda/patch_realtek.c | 7 +--
sound/soc/codecs/wm9081.c | 5 ++
36 files changed, 255 insertions(+), 72 deletions(-)

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Francois Romieu <rom...@fr.zoreil.com>

commit f60ac8e7ab7cbb413a0131d5665b053f9f386526 upstream.

While the RxFIFO interruption is masked for most 8168, nothing prevents
it from appearing in the irq status word. This is no excuse to crash.

Signed-off-by: Francois Romieu <rom...@fr.zoreil.com>
Cc: Ivan Vecera <ive...@redhat.com>
Cc: Hayes <haye...@realtek.com>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/r8169.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4613,6 +4613,14 @@ static irqreturn_t rtl8169_interrupt(int
netif_stop_queue(dev);
rtl8169_tx_timeout(dev);
goto done;
+ /* Testers needed. */
+ case RTL_GIGA_MAC_VER_17:
+ case RTL_GIGA_MAC_VER_19:
+ case RTL_GIGA_MAC_VER_20:
+ case RTL_GIGA_MAC_VER_21:
+ case RTL_GIGA_MAC_VER_23:
+ case RTL_GIGA_MAC_VER_24:
+ case RTL_GIGA_MAC_VER_27:
/* Experimental science. Pktgen proof. */
case RTL_GIGA_MAC_VER_12:
case RTL_GIGA_MAC_VER_25:

Greg KH

Mar 10, 2011, 7:00:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Li Zefan <li...@cn.fujitsu.com>

commit b75f38d659e6fc747eda64cb72f3920e29dd44a4 upstream.

Don't forget to release cgroup_mutex if alloc_trial_cpuset() fails.
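
As a rough user-space illustration of the single-exit pattern this patch
adopts (not the kernel code itself): fake_lock(), fake_unlock(),
fake_alloc() and write_resmask() below are hypothetical stand-ins for
cgroup_lock(), cgroup_unlock(), alloc_trial_cpuset() and
cpuset_write_resmask().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for cgroup_mutex: counts how often it is held. */
static int lock_depth;

static void fake_lock(void)   { lock_depth++; }
static void fake_unlock(void) { lock_depth--; }

/* Simulates alloc_trial_cpuset(): returns NULL when alloc_fails is set. */
static void *fake_alloc(int alloc_fails)
{
	return alloc_fails ? NULL : malloc(1);
}

/* Single exit point: every path after fake_lock() reaches fake_unlock(). */
static int write_resmask(int alloc_fails)
{
	int retval = 0;
	void *trial;

	fake_lock();
	trial = fake_alloc(alloc_fails);
	if (!trial) {
		retval = -ENOMEM;
		goto out;	/* an early "return -ENOMEM" here would leak the lock */
	}
	/* ... the real work on the trial copy would go here ... */
	free(trial);
out:
	fake_unlock();
	return retval;
}
```

After either outcome lock_depth is back to 0, which is the property the
early return broke.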

[ak...@linux-foundation.org: avoid multiple return points]
Signed-off-by: Li Zefan <li...@cn.fujitsu.com>
Cc: Paul Menage <men...@google.com>
Acked-by: David Rientjes <rien...@google.com>
Cc: Miao Xie <mi...@cn.fujitsu.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
kernel/cpuset.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1575,8 +1575,10 @@ static int cpuset_write_resmask(struct c
return -ENODEV;

trialcs = alloc_trial_cpuset(cs);
- if (!trialcs)
- return -ENOMEM;
+ if (!trialcs) {
+ retval = -ENOMEM;
+ goto out;
+ }

switch (cft->private) {
case FILE_CPULIST:
@@ -1591,6 +1593,7 @@ static int cpuset_write_resmask(struct c
}

free_trial_cpuset(trialcs);
+out:
cgroup_unlock();
return retval;

Greg KH

Mar 10, 2011, 7:00:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Pablo Neira Ayuso <pa...@netfilter.org>

commit 9d0db8b6b1da9e3d4c696ef29449700c58d589db upstream.

In 135367b "netfilter: xtables: change xt_target.checkentry return type",
the type returned by checkentry was changed from boolean to int, but the
return values were not adjusted.

arptables: Input/output error

This broke arptables with the mangle target since it returns true
under success, which is interpreted by xtables as >0, thus
returning EIO.
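
A minimal user-space sketch of the mismatch, assuming a caller that
passes negative values through and flags positive ones as -EIO, as
described above. run_check(), checkentry_old() and checkentry_new() are
hypothetical names, not the xtables API.

```c
#include <errno.h>
#include <stdbool.h>

/* Old style: written for a bool return, but the hook now returns int. */
static int checkentry_old(bool valid)
{
	if (!valid)
		return false;	/* 0: looks like success to an int caller */
	return true;		/* 1: an unexpected positive value */
}

/* New style: 0 on success, negative errno on failure. */
static int checkentry_new(bool valid)
{
	if (!valid)
		return -EINVAL;
	return 0;
}

/* Hypothetical caller modelled on the behaviour described above. */
static int run_check(int (*checkentry)(bool), bool valid)
{
	int ret = checkentry(valid);

	if (ret < 0)
		return ret;	/* propagate the errno */
	if (ret > 0)
		return -EIO;	/* positive values are flagged as I/O errors */
	return 0;
}
```

In this model the old hook fails twice: a valid entry is rejected with
-EIO, and an invalid one is accepted because false reads as 0.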

Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
Signed-off-by: Patrick McHardy <ka...@trash.net>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
net/ipv4/netfilter/arpt_mangle.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/net/ipv4/netfilter/arpt_mangle.c
+++ b/net/ipv4/netfilter/arpt_mangle.c
@@ -60,12 +60,12 @@ static int checkentry(const struct xt_tg

if (mangle->flags & ~ARPT_MANGLE_MASK ||
!(mangle->flags & ARPT_MANGLE_MASK))
- return false;
+ return -EINVAL;

if (mangle->target != NF_DROP && mangle->target != NF_ACCEPT &&
mangle->target != XT_CONTINUE)
- return false;
- return true;
+ return -EINVAL;
+ return 0;
}

static struct xt_target arpt_mangle_reg __read_mostly = {

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Julian Anastasov <j...@ssi.bg>

commit ff75f40f44ae9b79d520bf32a05d35af74a805c0 upstream.

Fix dst_lock usage in __ip_vs_update_dest. We need
_bh locking because destination is updated in user context.
Can cause lockups on frequent destination updates.
Problem reported by Simon Kirby. Bug was introduced
in 2.6.37 from the "ipvs: changes for local real server"
change.

Signed-off-by: Julian Anastasov <j...@ssi.bg>
Signed-off-by: Hans Schillstrom <ha...@schillstrom.com>
Signed-off-by: Simon Horman <ho...@verge.net.au>
Cc: Simon Kirby <s...@hostway.ca>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
net/netfilter/ipvs/ip_vs_ctl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -810,9 +810,9 @@ __ip_vs_update_dest(struct ip_vs_service
dest->u_threshold = udest->u_threshold;
dest->l_threshold = udest->l_threshold;

- spin_lock(&dest->dst_lock);
+ spin_lock_bh(&dest->dst_lock);
ip_vs_dst_reset(dest);
- spin_unlock(&dest->dst_lock);
+ spin_unlock_bh(&dest->dst_lock);

if (add)
ip_vs_new_estimator(&dest->stats);

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 38c07641905c0db58e800ea974cd9158717c6610 upstream.

The errata init verbs for CS42xx codecs contain the verbs to set
the power-state of SPDIF nodes to D3, which seem to break the SPDIF
output on some MacBooks. Since this is executed during the power-up
initialization, we shouldn't turn them down there.

Reported-by: Arun Raghavan <arun.r...@collabora.co.uk>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
sound/pci/hda/patch_cirrus.c | 2 ++
1 file changed, 2 insertions(+)

--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -1039,9 +1039,11 @@ static struct hda_verb cs_errata_init_ve
{0x11, AC_VERB_SET_PROC_COEF, 0x0008},
{0x11, AC_VERB_SET_PROC_STATE, 0x00},

+#if 0 /* Don't to set to D3 as we are in power-up sequence */
{0x07, AC_VERB_SET_POWER_STATE, 0x03}, /* S/PDIF Rx: D3 */
{0x08, AC_VERB_SET_POWER_STATE, 0x03}, /* S/PDIF Tx: D3 */
/*{0x01, AC_VERB_SET_POWER_STATE, 0x03},*/ /* AFG: D3 This is already handled */
+#endif

{} /* terminator */
};

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Chuck Lever <chuck...@oracle.com>

commit d1205f87bbb8040c1408bbd9e0a720310b2b0b9b upstream.

On recent 2.6.38-rc kernels, connectathon basic test 6 fails on
NFSv4 mounts of OpenSolaris with something like:

> ./test6: readdir
> ./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.12' dir entry, pass 0
> ./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.82' dir entry, pass 0
> ./test6: (/mnt/klimt/matisse.test) didn't read expected 'file.164' dir entry, pass 0
> ./test6: (/mnt/klimt/matisse.test) Test failed with 3 errors
> basic tests failed
> Tests failed, leaving /mnt/klimt mounted
> [cel@matisse cthon04]$

I narrowed the problem down to nfs4_decode_dirent() reporting that the
decode buffer had overflowed while decoding the entries for those
missing files.

verify_attr_len() assumes both its pointer arguments reside on the
same page. When these arguments point to locations on two different
pages, verify_attr_len() can report false errors. This can happen now
that a large NFSv4 readdir result can span pages.

We have reasonably good checking in nfs4_decode_dirent() anyway, so
it should be safe to simply remove the extra checking.

At a guess, this was introduced by commit 6650239a, "NFS: Don't use
vm_map_ram() in readdir".

Signed-off-by: Chuck Lever <chuck...@oracle.com>
Signed-off-by: Trond Myklebust <Trond.M...@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
fs/nfs/nfs4xdr.c | 3 ---
1 file changed, 3 deletions(-)

--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -6212,9 +6212,6 @@ __be32 *nfs4_decode_dirent(struct xdr_st
if (entry->fattr->valid & NFS_ATTR_FATTR_TYPE)
entry->d_type = nfs_umode_to_dtype(entry->fattr->mode);

- if (verify_attr_len(xdr, p, len) < 0)
- goto out_overflow;
-
return p;

out_overflow:

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Hugh Dickins <hu...@google.com>

commit a3e8cc643d22d2c8ed36b9be7d9c9ca21efcf7f7 upstream.

Robert Swiecki reported a BUG_ON(page_mapped) from a fuzzer, punching
a hole with madvise(,, MADV_REMOVE). That path is under mutex, and
cannot be explained by lack of serialization in unmap_mapping_range().

Reviewing the code, I found one place where vm_truncate_count handling
should have been updated, when I switched at the last minute from one
way of managing the restart_addr to another: mremap move changes the
virtual addresses, so it ought to adjust the restart_addr.

But rather than exporting the notion of restart_addr from memory.c, or
converting to restart_pgoff throughout, simply reset vm_truncate_count
to 0 to force a rescan if mremap move races with preempted truncation.

We have no confirmation that this fixes Robert's BUG,
but it is a fix that's worth making anyway.

Signed-off-by: Hugh Dickins <hu...@google.com>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Cc: Kerin Millar <kerf...@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
mm/mremap.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -91,9 +91,7 @@ static void move_ptes(struct vm_area_str
*/
mapping = vma->vm_file->f_mapping;
spin_lock(&mapping->i_mmap_lock);
- if (new_vma->vm_truncate_count &&
- new_vma->vm_truncate_count != vma->vm_truncate_count)
- new_vma->vm_truncate_count = 0;
+ new_vma->vm_truncate_count = 0;
}

/*

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Mark Brown <bro...@opensource.wolfsonmicro.com>

commit 3ee845acba58549578d03a46ed307c0a56c7f777 upstream.

It went AWOL in the multi-component conversion.

Signed-off-by: Mark Brown <bro...@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <l...@slimlogic.co.uk>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
sound/soc/codecs/wm9081.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/sound/soc/codecs/wm9081.c
+++ b/sound/soc/codecs/wm9081.c
@@ -15,6 +15,7 @@
#include <linux/moduleparam.h>
#include <linux/init.h>
#include <linux/delay.h>
+#include <linux/device.h>
#include <linux/pm.h>
#include <linux/i2c.h>
#include <linux/platform_device.h>
@@ -1338,6 +1339,10 @@ static __devinit int wm9081_i2c_probe(st
wm9081->control_type = SND_SOC_I2C;
wm9081->control_data = i2c;

+ if (dev_get_platdata(&i2c->dev))
+ memcpy(&wm9081->retune, dev_get_platdata(&i2c->dev),
+ sizeof(wm9081->retune));
+
ret = snd_soc_register_codec(&i2c->dev,
&soc_codec_dev_wm9081, &wm9081_dai, 1);
if (ret < 0)

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Amit Shah <amit...@redhat.com>

commit d7a62cd0332115d4c7c4689abea0d889a30d8349 upstream.

If a virtio-console device gets unplugged while a port is open, a
subsequent close() call on the port accesses vqs to free up buffers.
This can lead to a crash.

The buffers are already freed up as a result of the call to
unplug_ports() from virtcons_remove(). The fix is to simply not access
vq information if port->portdev is NULL.

Reported-by: juzhang <juz...@redhat.com>
Signed-off-by: Amit Shah <amit...@redhat.com>
Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/char/virtio_console.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -387,6 +387,10 @@ static void discard_port_data(struct por
unsigned int len;
int ret;

+ if (!port->portdev) {
+ /* Device has been unplugged. vqs are already gone. */
+ return;
+ }
vq = port->in_vq;
if (port->inbuf)
buf = port->inbuf;
@@ -469,6 +473,10 @@ static void reclaim_consumed_buffers(str
void *buf;
unsigned int len;

+ if (!port->portdev) {
+ /* Device has been unplugged. vqs are already gone. */
+ return;
+ }
while ((buf = virtqueue_get_buf(port->out_vq, &len))) {
kfree(buf);
port->outvq_full = false;

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: David Henningsson <david.he...@canonical.com>

commit f0ce27996217d06207c8bfda1b1bbec2fbab48c6 upstream.

This patch fixes an error in the jack detection reporting,
causing the jack detection sometimes not to be reported
correctly to the input subsystem. It should apply to several
Realtek codecs.
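
To make the difference between the two loop shapes concrete, here is a
small stand-alone model. It is only a sketch: scan_old() and scan_new()
are hypothetical, the plugged[] array replaces snd_hda_jack_detect(),
and the reported counter replaces the calls to alc_report_jack().

```c
#include <stdbool.h>

struct jack_result {
	int reported;	/* how many pins were passed to the report hook */
	bool present;	/* whether any headphone jack was found plugged */
};

/* Old loop shape: stops at the first plugged pin, before reporting it. */
static struct jack_result scan_old(const bool plugged[], int npins)
{
	struct jack_result r = { 0, false };
	int i;

	for (i = 0; i < npins; i++) {
		if (plugged[i]) {
			r.present = true;
			break;
		}
		r.reported++;
	}
	return r;
}

/* New loop shape: report every pin, then accumulate the presence bit. */
static struct jack_result scan_new(const bool plugged[], int npins)
{
	struct jack_result r = { 0, false };
	int i;

	for (i = 0; i < npins; i++) {
		r.reported++;
		r.present |= plugged[i];
	}
	return r;
}
```

With pins { unplugged, plugged, unplugged } the old shape reports only
the first pin, while the new one reports all three. Both agree that a
jack is present.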

Signed-off-by: David Henningsson <david.he...@canonical.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
sound/pci/hda/patch_realtek.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -1127,11 +1127,8 @@ static void alc_automute_speaker(struct
nid = spec->autocfg.hp_pins[i];
if (!nid)
break;
- if (snd_hda_jack_detect(codec, nid)) {
- spec->jack_present = 1;
- break;
- }
- alc_report_jack(codec, spec->autocfg.hp_pins[i]);
+ alc_report_jack(codec, nid);
+ spec->jack_present |= snd_hda_jack_detect(codec, nid);
}

mute = spec->jack_present ? HDA_AMP_MUTE : 0;

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Benjamin Tissoires <benjamin....@enac.fr>

commit ad6d42670279da8f33f633f8a96a67cd7ef3b1da upstream.

This commit allows the device to be recognized as a touchscreen, and not a
touchpad by xf86-input-evdev.

The device has 2 modes. The first one emulates a touchscreen by
sending left and right button events, and the second one is used for
dual-touch (sending trackingID, touch and so on).

That's why there is a HID report containing left and right buttons
(9000001 and 9000002). The point is that xorg relies on these fields to
determine whether it's a touchpad or a touchscreen.
Clearing the report (return -1) makes xorg detect it out of the box
as a quite pleasant (dual-)touchscreen.

Signed-off-by: Benjamin Tissoires <benjamin....@enac.fr>
Acked-by: Chase Douglas <chase....@canonical.com>
Signed-off-by: Jiri Kosina <jko...@suse.cz>
Cc: James Sharam <james....@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/hid/hid-mosart.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/hid/hid-mosart.c
+++ b/drivers/hid/hid-mosart.c
@@ -90,6 +90,10 @@ static int mosart_input_mapping(struct h
case 0xff000000:
/* ignore HID features */
return -1;
+
+ case HID_UP_BUTTON:
+ /* ignore buttons */
+ return -1;
}

return 0;

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Naga Chumbalkar <nagananda....@hp.com>

commit 1f858ef2fbabdc5e645644010a31a40c32e397c9 upstream.

Return 0 on failure. This will cause the initialization of the driver
to fail and prevent the driver from loading if the BIOS cannot handle
the PCC interface command to "get frequency". Otherwise, the driver
will load and display a very high value like "4294967274" (which is
actually -EINVAL) for frequency:

# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
4294967274
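
The odd number falls out of C's integer conversion rules: returning a
negative errno from a function declared unsigned int wraps it around.
Assuming EINVAL is 22 and unsigned int is 32 bits wide, the result is
2^32 - 22 = 4294967274. A minimal sketch, where get_freq_before() and
get_freq_after() are hypothetical stand-ins for pcc_get_freq() and
bios_ok models the outcome of the PCC command:

```c
#include <errno.h>

/* Before the fix: the error code is silently converted to unsigned. */
static unsigned int get_freq_before(int bios_ok)
{
	if (!bios_ok)
		return -EINVAL;	/* wraps around to a huge "frequency" */
	return 2400000;		/* made-up frequency in kHz */
}

/* After the fix: 0 means "no valid frequency". */
static unsigned int get_freq_after(int bios_ok)
{
	if (!bios_ok)
		return 0;
	return 2400000;
}
```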

Signed-off-by: Naga Chumbalkar <nagananda....@hp.com>
Signed-off-by: Dave Jones <da...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/x86/kernel/cpu/cpufreq/pcc-cpufreq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/cpufreq/pcc-cpufreq.c
+++ b/arch/x86/kernel/cpu/cpufreq/pcc-cpufreq.c
@@ -195,7 +195,7 @@ static unsigned int pcc_get_freq(unsigne
cmd_incomplete:
iowrite16(0, &pcch_hdr->status);
spin_unlock(&pcc_lock);
- return -EINVAL;
+ return 0;
}

static int pcc_cpufreq_target(struct cpufreq_policy *policy,

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Don Skidmore <donald.c...@intel.com>

commit a124339ad28389093ed15eca990d39c51c5736cc upstream.

We have found a hardware erratum on 82599 hardware that can lead to
unpredictable behavior when Header Splitting mode is enabled. So
we are no longer enabling this feature on affected hardware.

Please see the 82599 Specification Update for more information.

Signed-off-by: Don Skidmore <donald.c...@intel.com>
Tested-by: Stephen Ko <stephe...@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey....@intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/ixgbe/ixgbe_main.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2923,6 +2923,10 @@ static void ixgbe_set_rx_buffer_len(stru
if (hw->mac.type == ixgbe_mac_82599EB)
adapter->flags &= ~IXGBE_FLAG_RX_PS_ENABLED;

+ /* Disable packet split due to 82599 erratum #45 */
+ if (hw->mac.type == ixgbe_mac_82599EB)
+ adapter->flags &= ~IXGBE_FLAG_RX_PS_ENABLED;
+
/* Set the RX buffer length according to the mode */
if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
rx_buf_len = IXGBE_RX_HDR_SIZE;

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Jan Engelhardt <jen...@medozas.de>

commit 9ef0298a8e5730d9a46d640014c727f3b4152870 upstream.

Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.

[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP: __find_logger+0x6f/0xa0
[ 5954.123979] nf_log_bind_pf+0x2b/0x70
[ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...

The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.
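
The check has the usual shape of a bounds test on a table indexed by a
caller-supplied value. A stand-alone sketch of that pattern follows.
The loggers[] table, its size of 13 and the bind_pf()/unbind_pf() names
are assumptions for illustration, not the nf_log code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table; the size of 13 is arbitrary for this sketch. */
static const char *loggers[13];

static int bind_pf(uint8_t pf, const char *name)
{
	if (pf >= ARRAY_SIZE(loggers))
		return -EINVAL;	/* reject before touching the array */
	loggers[pf] = name;
	return 0;
}

static void unbind_pf(uint8_t pf)
{
	if (pf >= ARRAY_SIZE(loggers))
		return;		/* nothing to unbind for an invalid index */
	loggers[pf] = NULL;
}
```

A u_int8_t index can be as large as 255, so any table with fewer than
256 slots needs this test before it is indexed.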

Reported-by: Miguel Di Ciurcio Filho <miguel...@gmail.com>,
via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt <jen...@medozas.de>
Signed-off-by: Patrick McHardy <ka...@trash.net>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
net/netfilter/nf_log.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -85,6 +85,8 @@ EXPORT_SYMBOL(nf_log_unregister);

int nf_log_bind_pf(u_int8_t pf, const struct nf_logger *logger)
{
+ if (pf >= ARRAY_SIZE(nf_loggers))
+ return -EINVAL;
mutex_lock(&nf_log_mutex);
if (__find_logger(pf, logger->name) == NULL) {
mutex_unlock(&nf_log_mutex);
@@ -98,6 +100,8 @@ EXPORT_SYMBOL(nf_log_bind_pf);

void nf_log_unbind_pf(u_int8_t pf)
{
+ if (pf >= ARRAY_SIZE(nf_loggers))
+ return;
mutex_lock(&nf_log_mutex);
rcu_assign_pointer(nf_loggers[pf], NULL);
mutex_unlock(&nf_log_mutex);

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------


From: Mohammed Shafi Shajakhan <mshaj...@atheros.com>

This is a backport of upstream commit 0f5cd45960173ba5b36727decbb4a241cbd35ef9.

The DMA latency issue is observed only in Intel pinetrail platforms
but in the driver we had a default PM-QOS value of 55. This caused
unnecessary power consumption and battery drain in other platforms.

Remove the pm-qos request from the driver code and address the throughput
issue in Intel pinetrail platforms in user space using any one of
the scripts at the links below:

http://www.kernel.org/pub/linux/kernel/people/mcgrof/scripts/cpudmalatency.c
http://johannes.sipsolutions.net/files/netlatency.c.txt

More details can be found in the following bugzilla link:

https://bugzilla.kernel.org/show_bug.cgi?id=27532

Signed-off-by: Thomas Bächler <tho...@archlinux.org>
Acked-by: Mohammed Shafi Shajakhan <mshaj...@atheros.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/wireless/ath/ath9k/ath9k.h | 3 ---
drivers/net/wireless/ath/ath9k/init.c | 4 ----
drivers/net/wireless/ath/ath9k/main.c | 4 ----
3 files changed, 11 deletions(-)

--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -21,7 +21,6 @@
#include <linux/device.h>
#include <linux/leds.h>
#include <linux/completion.h>
-#include <linux/pm_qos_params.h>

#include "debug.h"
#include "common.h"
@@ -647,8 +646,6 @@ struct ath_softc {
struct ath_descdma txsdma;

struct ath_ant_comb ant_comb;
-
- struct pm_qos_request_list pm_qos_req;
};

struct ath_wiphy {
--- a/drivers/net/wireless/ath/ath9k/init.c
+++ b/drivers/net/wireless/ath/ath9k/init.c
@@ -758,9 +758,6 @@ int ath9k_init_device(u16 devid, struct
ath_init_leds(sc);
ath_start_rfkill_poll(sc);

- pm_qos_add_request(&sc->pm_qos_req, PM_QOS_CPU_DMA_LATENCY,
- PM_QOS_DEFAULT_VALUE);
-
return 0;

error_world:
@@ -829,7 +826,6 @@ void ath9k_deinit_device(struct ath_soft
}

ieee80211_unregister_hw(hw);
- pm_qos_remove_request(&sc->pm_qos_req);
ath_rx_cleanup(sc);
ath_tx_cleanup(sc);
ath9k_deinit_softc(sc);
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -1245,8 +1245,6 @@ static int ath9k_start(struct ieee80211_
ath9k_btcoex_timer_resume(sc);
}

- pm_qos_update_request(&sc->pm_qos_req, 55);
-
mutex_unlock:
mutex_unlock(&sc->mutex);

@@ -1425,8 +1423,6 @@ static void ath9k_stop(struct ieee80211_

sc->sc_flags |= SC_OP_INVALID;

- pm_qos_update_request(&sc->pm_qos_req, PM_QOS_DEFAULT_VALUE);
-
mutex_unlock(&sc->mutex);

ath_print(common, ATH_DBG_CONFIG, "Driver halt\n");

Greg KH

Mar 10, 2011, 7:10:04 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Axel Lin <axel...@gmail.com>

commit 97e419a082461f8a3a0818834eb88ad41219a1da upstream.

The device table is required to load modules based on modaliases.

Signed-off-by: Axel Lin <axel...@gmail.com>
Cc: Shubhrajyoti D <shubhr...@ti.com>
Cc: Christoph Mair <christo...@gmail.com>
Cc: Jonathan Cameron <ji...@cam.ac.uk>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/misc/bmp085.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/misc/bmp085.c
+++ b/drivers/misc/bmp085.c
@@ -449,6 +449,7 @@ static const struct i2c_device_id bmp085
{ "bmp085", 0 },
{ }
};
+MODULE_DEVICE_TABLE(i2c, bmp085_id);

static struct i2c_driver bmp085_driver = {
.driver = {

Greg KH

Mar 10, 2011, 7:10:02 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Anton Blanchard <an...@au1.ibm.com>

commit f009918a1c1bbf8607b8aab3959876913a30193a upstream.

commit 339412841d7 (RxRPC: Allow key payloads to be passed in XDR form)
broke klog for me. I notice the v1 key struct had a kif_version field
added:

-struct rxkad_key {
- u16 security_index; /* RxRPC header security index */
- u16 ticket_len; /* length of ticket[] */
- u32 expiry; /* time at which expires */
- u32 kvno; /* key version number */
- u8 session_key[8]; /* DES session key */
- u8 ticket[0]; /* the encrypted ticket */
-};

+struct rxrpc_key_data_v1 {
+ u32 kif_version; /* 1 */
+ u16 security_index;
+ u16 ticket_length;
+ u32 expiry; /* time_t */
+ u32 kvno;
+ u8 session_key[8];
+ u8 ticket[0];
+};

However the code in rxrpc_instantiate strips it away:

data += sizeof(kver);
datalen -= sizeof(kver);

Removing kif_version fixes my problem.
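
The effect can be reproduced in user space. In the sketch below the
version word is stripped from a payload and the remainder is then read
through both layouts. All names are hypothetical and the structs are cut
down to the first few fields, so this only models the offset shift.

```c
#include <stdint.h>
#include <string.h>

/* Layout that still carries kif_version (before the fix). */
struct key_v1_with_kver {
	uint32_t kif_version;
	uint16_t security_index;
	uint16_t ticket_length;
	uint32_t expiry;
};

/* Layout without kif_version (after the fix). */
struct key_v1 {
	uint16_t security_index;
	uint16_t ticket_length;
	uint32_t expiry;
};

/* Builds [kver][security_index][ticket_length][expiry] in buf (12 bytes). */
static size_t build_payload(unsigned char *buf)
{
	uint32_t kver = 1, expiry = 1000;
	uint16_t sec = 2, tlen = 8;

	memcpy(buf, &kver, sizeof(kver));
	memcpy(buf + 4, &sec, sizeof(sec));
	memcpy(buf + 6, &tlen, sizeof(tlen));
	memcpy(buf + 8, &expiry, sizeof(expiry));
	return 12;
}

/* Strips the version word, then reads through the old layout. */
static uint16_t sec_index_with_kver(const unsigned char *buf, size_t len)
{
	struct key_v1_with_kver k;

	memset(&k, 0, sizeof(k));
	buf += sizeof(uint32_t);
	len -= sizeof(uint32_t);
	memcpy(&k, buf, len < sizeof(k) ? len : sizeof(k));
	return k.security_index;	/* really the first half of expiry */
}

/* Strips the version word, then reads through the fixed layout. */
static uint16_t sec_index_fixed(const unsigned char *buf, size_t len)
{
	struct key_v1 k;

	memset(&k, 0, sizeof(k));
	buf += sizeof(uint32_t);
	len -= sizeof(uint32_t);
	memcpy(&k, buf, len < sizeof(k) ? len : sizeof(k));
	return k.security_index;
}
```

With the old layout every field is read four bytes too late, so
security_index picks up part of expiry instead of the value 2 that was
stored.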

Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: David Howells <dhow...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
include/keys/rxrpc-type.h | 1 -
1 file changed, 1 deletion(-)

--- a/include/keys/rxrpc-type.h
+++ b/include/keys/rxrpc-type.h
@@ -99,7 +99,6 @@ struct rxrpc_key_token {
* structure of raw payloads passed to add_key() or instantiate key
*/
struct rxrpc_key_data_v1 {
- u32 kif_version; /* 1 */
u16 security_index;
u16 ticket_length;
u32 expiry; /* time_t */

Greg KH

Mar 10, 2011, 7:10:03 PM
2.6.37-stable review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <seg...@openwall.com>

commit 8909c9ad8ff03611c9c96c9a92656213e4bb495b upstream.

Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with
CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE, as modules are
limited to /lib/modules/**. However, the CAP_NET_ADMIN capability
shouldn't allow anybody to load modules unrelated to networking.

This patch restricts autoloading to netdev modules with explicit
aliases. This fixes CVE-2011-1019.

Arnd Bergmann suggested leaving untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE, to maintain compatibility with network scripts
that autoload netdev modules through aliases like "eth0" or "wlan0".

Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.
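
The resulting policy can be modelled in a few lines of user-space C.
This is only a sketch of the decision logic: known_aliases[],
fake_request_module() and model_dev_load() are hypothetical, and the
model assumes the requested device does not exist yet.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical alias database standing in for what modprobe can resolve. */
static const char *const known_aliases[] = {
	"netdev-sit0", "netdev-tunl0", "netdev-gre0", "xfs",
};

/* Returns 0 if the alias resolves to a module, -1 otherwise. */
static int fake_request_module(const char *alias)
{
	size_t i;

	for (i = 0; i < sizeof(known_aliases) / sizeof(known_aliases[0]); i++)
		if (strcmp(known_aliases[i], alias) == 0)
			return 0;
	return -1;
}

/* Returns 1 if some module was loaded for the missing device "name". */
static int model_dev_load(const char *name, int cap_net_admin,
			  int cap_sys_module)
{
	char alias[64];
	int no_module = 1;	/* the device is assumed not to exist */

	if (no_module && cap_net_admin) {
		/* CAP_NET_ADMIN may only use the prefixed netdev alias. */
		snprintf(alias, sizeof(alias), "netdev-%s", name);
		no_module = fake_request_module(alias);
	}
	if (no_module && cap_sys_module)
		/* Legacy path: load by bare name, CAP_SYS_MODULE only. */
		no_module = fake_request_module(name);
	return !no_module;
}
```

With CAP_NET_ADMIN only, "sit0" loads and "xfs" or "sit" do not. With
CAP_SYS_MODULE as well, "xfs" still loads through the legacy path.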

root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: fffffff800001000
CapEff: fffffff800001000
CapBnd: fffffff800001000
root@albatros:~# modprobe xfs
FATAL: Error inserting xfs
(/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
root@albatros:~# lsmod | grep xfs
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit
sit: error fetching interface information: Device not found
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit0
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1

root@albatros:~# lsmod | grep sit
sit 10457 0
tunnel4 2957 1 sit

For CAP_SYS_MODULE module loading is still relaxed:

root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
xfs 745319 0

Reference: https://lkml.org/lkml/2011/2/24/203

Signed-off-by: Vasiliy Kulikov <seg...@openwall.com>
Signed-off-by: Michael Tokarev <m...@tls.msk.ru>
Acked-by: David S. Miller <da...@davemloft.net>
Acked-by: Kees Cook <kees...@canonical.com>
Signed-off-by: James Morris <jmo...@namei.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
include/linux/netdevice.h | 3 +++
net/core/dev.c | 12 ++++++++++--
net/ipv4/ip_gre.c | 1 +
net/ipv4/ipip.c | 1 +
net/ipv6/sit.c | 2 +-
5 files changed, 16 insertions(+), 3 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2336,6 +2336,9 @@ extern int netdev_notice(const struct ne
extern int netdev_info(const struct net_device *dev, const char *format, ...)
__attribute__ ((format (printf, 2, 3)));

+#define MODULE_ALIAS_NETDEV(device) \
+ MODULE_ALIAS("netdev-" device)
+
#if defined(DEBUG)
#define netdev_dbg(__dev, format, args...) \
netdev_printk(KERN_DEBUG, __dev, format, ##args)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1116,13 +1116,21 @@ EXPORT_SYMBOL(netdev_bonding_change);
void dev_load(struct net *net, const char *name)
{
struct net_device *dev;
+ int no_module;

rcu_read_lock();
dev = dev_get_by_name_rcu(net, name);
rcu_read_unlock();

- if (!dev && capable(CAP_NET_ADMIN))
- request_module("%s", name);
+ no_module = !dev;
+ if (no_module && capable(CAP_NET_ADMIN))
+ no_module = request_module("netdev-%s", name);
+ if (no_module && capable(CAP_SYS_MODULE)) {
+ if (!request_module("%s", name))
+ pr_err("Loading kernel module for a network device "
+"with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%s "
+"instead\n", name);
+ }
}
EXPORT_SYMBOL(dev_load);

--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -1775,3 +1775,4 @@ module_exit(ipgre_fini);
MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("gre");
MODULE_ALIAS_RTNL_LINK("gretap");
+MODULE_ALIAS_NETDEV("gre0");
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -921,3 +921,4 @@ static void __exit ipip_fini(void)
module_init(ipip_init);
module_exit(ipip_fini);
MODULE_LICENSE("GPL");
+MODULE_ALIAS_NETDEV("tunl0");
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1292,4 +1292,4 @@ static int __init sit_init(void)
module_init(sit_init);
module_exit(sit_cleanup);
MODULE_LICENSE("GPL");
-MODULE_ALIAS("sit0");
+MODULE_ALIAS_NETDEV("sit0");

Mi Jinlong

Mar 10, 2011, 11:00:01 PM

J. Bruce Fields:
> On Wed, Mar 09, 2011 at 03:42:30PM -0800, Andrew Morton wrote:
>> On Tue, 08 Mar 2011 22:32:26 +0100
>> roel <roel....@gmail.com> wrote:
>>
>>> Index i was already used in the outer loop
>>>
>>> Signed-off-by: Roel Kluin <roel....@gmail.com>
>>> ---
>>> fs/nfsd/nfs4xdr.c | 4 ++--
>>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> Not 100% sure this one is needed but it looks suspicious.
>>>
>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>>> index 1275b86..615f0a9 100644
>>> --- a/fs/nfsd/nfs4xdr.c
>>> +++ b/fs/nfsd/nfs4xdr.c
>>> @@ -1142,7 +1142,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
>>>
>>> u32 dummy;
>>> char *machine_name;
>>> - int i;
>>> + int i, j;
>>> int nr_secflavs;
>>>
>>> READ_BUF(16);
>>> @@ -1215,7 +1215,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
>>> READ_BUF(4);
>>> READ32(dummy);
>>> READ_BUF(dummy * 4);
>>> - for (i = 0; i < dummy; ++i)
>>> + for (j = 0; j < dummy; ++j)
>>> READ32(dummy);
>>> break;
>>> case RPC_AUTH_GSS:
>> ooh, big bug.
>>
>> I wonder why it was not previously detected at runtime. Perhaps
>> nr_secflavs is always 1.
>
> Yeah, no client uses this callback security information yet.
>
> Mi Jinlong, do you think this is something we could have caught with
> another pynfs test?

Yes, we must test it.

I have tested the following test case, and it works.

--
thanks,
Mi Jinlong


From 1afac3444b37bac66970f19c409660a304a53fb4 Mon Sep 17 00:00:00 2001
From: Mi Jinlong <miji...@cn.fujitsu.com>
Date: Sun, 11 Mar 2011 09:05:22 +0800
Subject: [PATCH] CLNT: test a decode problem which use wrong index

Signed-off-by: Mi Jinlong <miji...@cn.fujitsu.com>
---
nfs4.1/server41tests/st_create_session.py | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/nfs4.1/server41tests/st_create_session.py b/nfs4.1/server41tests/st_create_session.py
index ff55d10..e3a8421 100644
--- a/nfs4.1/server41tests/st_create_session.py
+++ b/nfs4.1/server41tests/st_create_session.py
@@ -252,6 +252,22 @@ def testCbSecParms(t, env):
c1 = env.c1.new_client(env.testname(t))
sess1 = c1.create_session(sec=sec)

+def testCbSecParmsDec(t, env):
+ """A decode problem was found at NFS server that
+ wrong index used in inner loop,
+ http://marc.info/?l=linux-kernel&m=129961996327640&w=2
+
+ FLAGS: create_session all
+ CODE: CSESS16a
+ """
+ sec = [callback_sec_parms4(AUTH_NONE),
+ callback_sec_parms4(RPCSEC_GSS, cbsp_gss_handles=gss_cb_handles4(RPC_GSS_SVC_PRIVACY, "Handle from server", "Client handle")),
+ callback_sec_parms4(AUTH_SYS, cbsp_sys_cred=authsys_parms(5, "Random machine name", 7, 11, [])),
+ ]
+
+ c1 = env.c1.new_client(env.testname(t))
+ sess1 = c1.create_session(sec=sec)
+
def testRdmaArray0(t, env):
"""Test 0 length rdma arrays

--
1.7.4.1

Mi Jinlong

unread,
Mar 10, 2011, 11:20:02 PM3/10/11
to

J. Bruce Fields:


> On Tue, Mar 08, 2011 at 10:32:26PM +0100, roel wrote:
>> Index i was already used in the outer loop
>>
>> Signed-off-by: Roel Kluin <roel....@gmail.com>
>> ---
>> fs/nfsd/nfs4xdr.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> Not 100% sure this one is needed but it looks suspicious.
>

> Looks bad to me, thanks.
>
> nfsd4_decode_create_session should probably really be broken up a little
> bit; if it wasn't so long this would have been more obvious.
>
> I'll see if I can slip this into 2.6.38 with a couple other last-minute
> patches.... Otherwise, it'll be in 2.6.39.
>
> --b.


>
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 1275b86..615f0a9 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -1142,7 +1142,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
>>
>> u32 dummy;
>> char *machine_name;
>> - int i;
>> + int i, j;
>> int nr_secflavs;
>>
>> READ_BUF(16);
>> @@ -1215,7 +1215,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
>> READ_BUF(4);
>> READ32(dummy);
>> READ_BUF(dummy * 4);
>> - for (i = 0; i < dummy; ++i)
>> + for (j = 0; j < dummy; ++j)
>> READ32(dummy);

We must not read into dummy inside this loop, because dummy is also the
loop bound. After the first iteration, READ32(dummy) will have overwritten it.
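
For anyone reading along, here is a minimal user-space sketch of that failure mode. It is not the nfsd code: next_word() is a hypothetical stand-in for the READ32 macro, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for READ32: fetch the next 32-bit word
 * from a buffer and advance the cursor. */
static uint32_t next_word(const uint32_t **p)
{
    return *(*p)++;
}

/* Buggy shape: every read overwrites the loop bound `count`.
 * Returns the number of words actually consumed. */
size_t skip_gids_buggy(const uint32_t *p, uint32_t count)
{
    const uint32_t *start = p;
    uint32_t j;

    for (j = 0; j < count; ++j)
        count = next_word(&p);  /* clobbers the bound */
    return (size_t)(p - start);
}

/* Fixed shape: reads land in a scratch variable, so the bound survives. */
size_t skip_gids_fixed(const uint32_t *p, uint32_t count)
{
    const uint32_t *start = p;
    uint32_t j, tmp;

    for (j = 0; j < count; ++j) {
        tmp = next_word(&p);
        (void)tmp;              /* value is intentionally discarded */
    }
    return (size_t)(p - start);
}
```

With four gids that are all zero, the buggy variant stops after one word because the bound becomes 0, while the fixed variant consumes all four. A large gid value would instead push the buggy loop past the end of the data.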

The following patch fixes this problem.

--
thanks,
Mi Jinlong
============================================================

We must not read into dummy inside the loop, because dummy is also the
loop bound. After the first iteration, READ32(dummy) will have overwritten it.

Signed-off-by: Mi Jinlong <miji...@cn.fujitsu.com>
---

fs/nfsd/nfs4xdr.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 615f0a9..8dd70d0 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1140,7 +1140,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
{
DECODE_HEAD;

- u32 dummy;
+ u32 dummy, tmp;
char *machine_name;
int i, j;
int nr_secflavs;

@@ -1216,7 +1216,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
READ32(dummy);
READ_BUF(dummy * 4);
for (j = 0; j < dummy; ++j)
- READ32(dummy);
+ READ32(tmp);
break;
case RPC_AUTH_GSS:
dprintk("RPC_AUTH_GSS callback secflavor "

Stefan Lippers-Hollmann

unread,
Mar 11, 2011, 10:10:02 AM3/11/11
to
Hi

On Friday 11 March 2011, Greg KH wrote:
> This is the start of the stable review cycle for the 2.6.37.4 release.
> There are 29 patches in this series, all will be posted as a response to
> this one. If anyone has any issues with these being applied, please let
> us know. If anyone is a maintainer of the proper subsystem, and wants
> to add a Signed-off-by: line to the patch, please respond with it.
>
> Responses should be made by Saturday, March 12, 24:00:00 UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz

This seems to be still missing on kernel.org, is there an issue with
the mirroring?

$ LANG= wget kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
--2011-03-11 16:01:24-- http://kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
Resolving kernel.org... 130.239.17.4, 149.20.20.133, 199.6.1.164, ...
Connecting to kernel.org|130.239.17.4|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2011-03-11 16:01:24 ERROR 404: Not Found.

Regards
Stefan Lippers-Hollmann

Greg KH

unread,
Mar 11, 2011, 11:00:03 AM3/11/11
to
On Fri, Mar 11, 2011 at 04:02:02PM +0100, Stefan Lippers-Hollmann wrote:
> Hi
>
> On Friday 11 March 2011, Greg KH wrote:
> > This is the start of the stable review cycle for the 2.6.37.4 release.
> > There are 29 patches in this series, all will be posted as a response to
> > this one. If anyone has any issues with these being applied, please let
> > us know. If anyone is a maintainer of the proper subsystem, and wants
> > to add a Signed-off-by: line to the patch, please respond with it.
> >
> > Responses should be made by Saturday, March 12, 24:00:00 UTC.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
>
> This seems to be still missing on kernel.org, is there an issue with
> the mirroring?
>
> $ LANG= wget kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
> --2011-03-11 16:01:24-- http://kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.37.4-rc1.gz
> Resolving kernel.org... 130.239.17.4, 149.20.20.133, 199.6.1.164, ...
> Connecting to kernel.org|130.239.17.4|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2011-03-11 16:01:24 ERROR 404: Not Found.

Sorry, right after I sent out this announcement, the power went out here
where I live, and when it came on 3 hours later, the network wasn't back
on until this morning. The file is now copied there and should show up
on the mirrors within 30 minutes.

thanks,

greg k-h

Greg KH

unread,
Mar 11, 2011, 3:50:01 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Ivan Vecera <ive...@redhat.com>

commit b5ba6d12bdac21bc0620a5089e0f24e362645efd upstream.

I found that one of the 8168c chipsets (concretely XID 1c4000c0) starts
generating RxFIFO overflow errors. The result is an infinite loop in
interrupt handler as the RxFIFOOver is handled only for ...MAC_VER_11.
With the workaround everything goes fine.

Signed-off-by: Ivan Vecera <ive...@redhat.com>
Acked-by: Francois Romieu <rom...@fr.zoreil.com>
Cc: Hayes <haye...@realtek.com>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---


drivers/net/r8169.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3741,7 +3741,8 @@ static void rtl_hw_start_8168(struct net


RTL_W16(IntrMitigate, 0x5151);

/* Work around for RxFIFO overflow. */
- if (tp->mac_version == RTL_GIGA_MAC_VER_11) {
+ if (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
+ tp->mac_version == RTL_GIGA_MAC_VER_22) {
tp->intr_event |= RxFIFOOver | PCSTimeout;
tp->intr_event &= ~RxOverflow;
}

@@ -4633,7 +4634,8 @@ static irqreturn_t rtl8169_interrupt(int



/* Work around for rx fifo overflow */
if (unlikely(status & RxFIFOOver) &&
- (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
+ (tp->mac_version == RTL_GIGA_MAC_VER_11 ||
+ tp->mac_version == RTL_GIGA_MAC_VER_22)) {
netif_stop_queue(dev);
rtl8169_tx_timeout(dev);
break;

Greg KH

unread,
Mar 11, 2011, 3:50:01 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Anton Blanchard <an...@samba.org>

commit 5d7a87217de48b234b3c8ff8a73059947d822e07 upstream.

I saw this in a kdump kernel:

IOMMU table initialized, virtual merging enabled
Interrupt 155954 (real) is invalid, disabling it.
Interrupt 155953 (real) is invalid, disabling it.

i.e. we took some spurious interrupts. default_machine_crash_shutdown tries
to disable all interrupt sources but uses chip->disable which maps to
the default action of:

static void default_disable(unsigned int irq)
{
}

If we use chip->shutdown, then we actually mask the IRQ:

static void default_shutdown(unsigned int irq)
{
struct irq_desc *desc = irq_to_desc(irq);

desc->chip->mask(irq);
desc->status |= IRQ_MASKED;
}

Not sure why we don't implement a ->disable action for xics.c, or why
default_disable doesn't mask the interrupt.

Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Kamalesh babulal <kama...@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/kernel/crash.c | 2 +-


1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -381,7 +381,7 @@ void default_machine_crash_shutdown(stru
desc->chip->eoi(i);

if (!(desc->status & IRQ_DISABLED))
- desc->chip->disable(i);
+ desc->chip->shutdown(i);
}

/*

Greg KH

unread,
Mar 11, 2011, 3:50:01 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Dan Carpenter <err...@gmail.com>

commit b652277b09d3d030cb074cc6a98ba80b34244c03 upstream.

The "ct" variable should be an unsigned int. Both struct kbdiacrs
->kb_cnt and struct kbd_data ->accent_table_size are unsigned ints.

Making it signed causes a problem in KBDIACRUC because the user could
set the sign bit and cause a buffer overflow.
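
A minimal sketch of why the signedness matters, assuming the count is checked against an upper bound and then used to size a copy. MAX_ENTRIES, struct entry and the function names are hypothetical, not the s390 keyboard driver's definitions.

```c
#include <stddef.h>

#define MAX_ENTRIES 256u                 /* hypothetical table capacity */

struct entry { unsigned int a, b, c; };  /* hypothetical record */

/* Signed count: a negative value is never >= the limit, so it is accepted. */
int accepts_signed(int ct)
{
    return !(ct >= (int)MAX_ENTRIES);
}

/* Unsigned count: the same bit pattern is a huge value and is rejected. */
int accepts_unsigned(unsigned int ct)
{
    return !(ct >= MAX_ENTRIES);
}

/* Byte length derived from a signed count: a negative count converts
 * to an enormous size_t, far larger than the table. */
size_t copy_len(int ct)
{
    return (size_t)ct * sizeof(struct entry);
}
```

So a count with the sign bit set passes the signed check and then yields a copy length far beyond the table, which is the overflow the changelog describes.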

Signed-off-by: Dan Carpenter <err...@gmail.com>
Signed-off-by: Martin Schwidefsky <schwi...@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/s390/char/keyboard.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/s390/char/keyboard.c
+++ b/drivers/s390/char/keyboard.c
@@ -462,7 +462,8 @@ kbd_ioctl(struct kbd_data *kbd, struct f
unsigned int cmd, unsigned long arg)
{
void __user *argp;
- int ct, perm;
+ unsigned int ct;
+ int perm;

argp = (void __user *)arg;

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Ian Abbott <abb...@mev.co.uk>

commit fa5c5f4ce0c9ba03a670c640cad17e14cb35678b upstream.

For the JR3/PCI cards, the size of the PCIBAR0 region depends on the
number of channels. Don't try and ioremap space for 4 channels if the
card has fewer channels. Also check for ioremap failure.
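
As a rough illustration of the sizing idea, assuming a register block that is just an array of per-channel windows: struct chan, struct board and map_len() below are hypothetical and do not reflect the real jr3_t layout.

```c
#include <stddef.h>

struct chan { unsigned char regs[64]; };   /* hypothetical per-channel window */
struct board { struct chan channel[4]; };  /* hypothetical 4-channel layout */

/* Length to map for a card with n_channels channels: everything up to,
 * but not including, channel[n_channels]. */
size_t map_len(unsigned int n_channels)
{
    return offsetof(struct board, channel)
           + (size_t)n_channels * sizeof(struct chan);
}
```

For a 4-channel card this equals sizeof(struct board); for a 1-channel card it is a quarter of that, so the mapping no longer extends past the card's BAR.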

Thanks to Anders Blomdell for input and Sami Hussein for testing.

Signed-off-by: Ian Abbott <abb...@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/staging/comedi/drivers/jr3_pci.c | 7 +++++--


1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/staging/comedi/drivers/jr3_pci.c
+++ b/drivers/staging/comedi/drivers/jr3_pci.c
@@ -856,8 +856,11 @@ static int jr3_pci_attach(struct comedi_
}

devpriv->pci_enabled = 1;
- devpriv->iobase =
- ioremap(pci_resource_start(card, 0), sizeof(struct jr3_t));
+ devpriv->iobase = ioremap(pci_resource_start(card, 0),
+ offsetof(struct jr3_t, channel[devpriv->n_channels]));
+ if (!devpriv->iobase)
+ return -ENOMEM;
+
result = alloc_subdevices(dev, devpriv->n_channels);
if (result < 0)
goto out;

Greg KH

unread,
Mar 11, 2011, 3:50:03 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Hugh Dickins <hu...@google.com>

commit a3e8cc643d22d2c8ed36b9be7d9c9ca21efcf7f7 upstream.

Robert Swiecki reported a BUG_ON(page_mapped) from a fuzzer, punching
a hole with madvise(,, MADV_REMOVE). That path is under mutex, and
cannot be explained by lack of serialization in unmap_mapping_range().

Reviewing the code, I found one place where vm_truncate_count handling
should have been updated, when I switched at the last minute from one
way of managing the restart_addr to another: mremap move changes the
virtual addresses, so it ought to adjust the restart_addr.

But rather than exporting the notion of restart_addr from memory.c, or
converting to restart_pgoff throughout, simply reset vm_truncate_count
to 0 to force a rescan if mremap move races with preempted truncation.

We have no confirmation that this fixes Robert's BUG,
but it is a fix that's worth making anyway.

Signed-off-by: Hugh Dickins <hu...@google.com>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Cc: Kerin Millar <kerf...@gmail.com>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---


mm/mremap.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -92,9 +92,7 @@ static void move_ptes(struct vm_area_str


*/
mapping = vma->vm_file->f_mapping;
spin_lock(&mapping->i_mmap_lock);
- if (new_vma->vm_truncate_count &&
- new_vma->vm_truncate_count != vma->vm_truncate_count)
- new_vma->vm_truncate_count = 0;
+ new_vma->vm_truncate_count = 0;
}

/*

Greg KH

unread,
Mar 11, 2011, 3:50:03 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <seg...@openwall.com>

commit 8909c9ad8ff03611c9c96c9a92656213e4bb495b upstream.

Reference: https://lkml.org/lkml/2011/2/24/203

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
include/linux/netdevice.h | 4 ++++


net/core/dev.c | 12 ++++++++++--
net/ipv4/ip_gre.c | 1 +
net/ipv4/ipip.c | 1 +
net/ipv6/sit.c | 2 +-

5 files changed, 17 insertions(+), 3 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2015,6 +2015,10 @@ static inline u32 dev_ethtool_get_flags(
return 0;
return dev->ethtool_ops->get_flags(dev);
}
+


+#define MODULE_ALIAS_NETDEV(device) \
+ MODULE_ALIAS("netdev-" device)
+

#endif /* __KERNEL__ */

#endif /* _LINUX_NETDEVICE_H */
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1037,13 +1037,21 @@ EXPORT_SYMBOL(netdev_bonding_change);


void dev_load(struct net *net, const char *name)
{
struct net_device *dev;
+ int no_module;

read_lock(&dev_base_lock);
dev = __dev_get_by_name(net, name);
read_unlock(&dev_base_lock);



- if (!dev && capable(CAP_NET_ADMIN))
- request_module("%s", name);
+ no_module = !dev;
+ if (no_module && capable(CAP_NET_ADMIN))
+ no_module = request_module("netdev-%s", name);
+ if (no_module && capable(CAP_SYS_MODULE)) {
+ if (!request_module("%s", name))
+ pr_err("Loading kernel module for a network device "
+"with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%s "
+"instead\n", name);
+ }
}
EXPORT_SYMBOL(dev_load);

--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c

@@ -1708,3 +1708,4 @@ module_exit(ipgre_fini);


MODULE_LICENSE("GPL");
MODULE_ALIAS_RTNL_LINK("gre");
MODULE_ALIAS_RTNL_LINK("gretap");
+MODULE_ALIAS_NETDEV("gre0");
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c

@@ -853,3 +853,4 @@ static void __exit ipip_fini(void)


module_init(ipip_init);
module_exit(ipip_fini);
MODULE_LICENSE("GPL");
+MODULE_ALIAS_NETDEV("tunl0");
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c

@@ -1101,4 +1101,4 @@ static int __init sit_init(void)


module_init(sit_init);
module_exit(sit_cleanup);
MODULE_LICENSE("GPL");
-MODULE_ALIAS("sit0");
+MODULE_ALIAS_NETDEV("sit0");

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Anton Blanchard <an...@au1.ibm.com>

commit f009918a1c1bbf8607b8aab3959876913a30193a upstream.

Signed-off-by: Anton Blanchard <an...@samba.org>


Signed-off-by: David Howells <dhow...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---


include/keys/rxrpc-type.h | 1 -
1 file changed, 1 deletion(-)

--- a/include/keys/rxrpc-type.h
+++ b/include/keys/rxrpc-type.h
@@ -99,7 +99,6 @@ struct rxrpc_key_token {
* structure of raw payloads passed to add_key() or instantiate key
*/
struct rxrpc_key_data_v1 {
- u32 kif_version; /* 1 */
u16 security_index;
u16 ticket_length;
u32 expiry; /* time_t */

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Michael Neuling <mi...@neuling.org>

commit d504bed676caad29a3dba3d3727298c560628f5c upstream.

Currently for kexec the PTE tear down on 1TB segment systems normally
requires 3 hcalls for each PTE removal. On a machine with 32GB of
memory it can take around a minute to remove all the PTEs.

This optimises the path so that we only remove PTEs that are valid.
It also uses the HCALL that reads 4 PTEs at once. For the common case
where a PTE is invalid in a 1TB segment, this turns the 3 HCALLs per PTE
down to 1 HCALL per 4 PTEs.
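
A back-of-the-envelope sketch of the hcall counts described above. The cost model is an assumption taken from the changelog (3 hcalls per PTE before, one batched read per 4 PTEs plus one remove per valid non-VRMA PTE after); the function names are hypothetical.

```c
/* Old path: roughly 3 hcalls for every PTE slot. */
unsigned long hcalls_old(unsigned long n_ptes)
{
    return 3UL * n_ptes;
}

/* New path: one read per batch of 4 slots, plus one remove
 * for each valid entry that has to be torn down. */
unsigned long hcalls_new(unsigned long n_ptes, unsigned long n_valid)
{
    return n_ptes / 4 + n_valid;
}
```

Under this model, a table whose slots are all invalid needs 12 times fewer hcalls, which is in line with the >10x figure quoted in the changelog.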

This gives a >10x speedup in kexec times on PHYP, taking a 32GB
machine from around 1 minute down to a few seconds.

Signed-off-by: Michael Neuling <mi...@neuling.org>


Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Kamalesh babulal <kama...@linux.vnet.ibm.com>

cc: Anton Blanchard <an...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/platforms/pseries/lpar.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)

--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -366,21 +366,28 @@ static void pSeries_lpar_hptab_clear(voi
{
unsigned long size_bytes = 1UL << ppc64_pft_size;
unsigned long hpte_count = size_bytes >> 4;
- unsigned long dummy1, dummy2, dword0;
+ struct {
+ unsigned long pteh;
+ unsigned long ptel;
+ } ptes[4];
long lpar_rc;


- int i;
+ int i, j;

- /* TODO: Use bulk call */
- for (i = 0; i < hpte_count; i++) {
- /* dont remove HPTEs with VRMA mappings */
- lpar_rc = plpar_pte_remove_raw(H_ANDCOND, i, HPTE_V_1TB_SEG,
- &dummy1, &dummy2);
- if (lpar_rc == H_NOT_FOUND) {
- lpar_rc = plpar_pte_read_raw(0, i, &dword0, &dummy1);
- if (!lpar_rc && ((dword0 & HPTE_V_VRMA_MASK)
- != HPTE_V_VRMA_MASK))
- /* Can be hpte for 1TB Seg. So remove it */
- plpar_pte_remove_raw(0, i, 0, &dummy1, &dummy2);
+ /* Read in batches of 4,
+ * invalidate only valid entries not in the VRMA
+ * hpte_count will be a multiple of 4
+ */
+ for (i = 0; i < hpte_count; i += 4) {
+ lpar_rc = plpar_pte_read_4_raw(0, i, (void *)ptes);
+ if (lpar_rc != H_SUCCESS)
+ continue;
+ for (j = 0; j < 4; j++){
+ if ((ptes[j].pteh & HPTE_V_VRMA_MASK) ==
+ HPTE_V_VRMA_MASK)
+ continue;
+ if (ptes[j].pteh & HPTE_V_VALID)
+ plpar_pte_remove_raw(0, i + j, 0,
+ &(ptes[j].pteh), &(ptes[j].ptel));

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Michael Neuling <mi...@neuling.org>

commit f90ece28c1f5b3ec13fe481406857fe92f4bc7d1 upstream.

This adds plpar_pte_read_4_raw() which can be used to read 4 PTEs from
PHYP at a time, while in real mode.

It also creates a new hcall9 which can be used in real mode. It's the
same as plpar_hcall9 but minus the tracing hcall statistics which may
require variables outside the RMO.

Signed-off-by: Michael Neuling <mi...@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Kamalesh babulal <kama...@linux.vnet.ibm.com>

Cc: Anton Blanchard <an...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/include/asm/hvcall.h | 1
arch/powerpc/platforms/pseries/hvCall.S | 38 ++++++++++++++++++++++++
arch/powerpc/platforms/pseries/plpar_wrappers.h | 18 +++++++++++
3 files changed, 57 insertions(+)

--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -268,6 +268,7 @@ long plpar_hcall_raw(unsigned long opcod
*/
#define PLPAR_HCALL9_BUFSIZE 9
long plpar_hcall9(unsigned long opcode, unsigned long *retbuf, ...);
+long plpar_hcall9_raw(unsigned long opcode, unsigned long *retbuf, ...);

/* For hcall instrumentation. One structure per-hcall, per-CPU */
struct hcall_stats {
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -202,3 +202,41 @@ _GLOBAL(plpar_hcall9)
mtcrf 0xff,r0

blr /* return r3 = status */
+
+/* See plpar_hcall_raw to see why this is needed */
+_GLOBAL(plpar_hcall9_raw)
+ HMT_MEDIUM
+
+ mfcr r0
+ stw r0,8(r1)
+
+ std r4,STK_PARM(r4)(r1) /* Save ret buffer */
+
+ mr r4,r5
+ mr r5,r6
+ mr r6,r7
+ mr r7,r8
+ mr r8,r9
+ mr r9,r10
+ ld r10,STK_PARM(r11)(r1) /* put arg7 in R10 */
+ ld r11,STK_PARM(r12)(r1) /* put arg8 in R11 */
+ ld r12,STK_PARM(r13)(r1) /* put arg9 in R12 */
+
+ HVSC /* invoke the hypervisor */
+
+ mr r0,r12
+ ld r12,STK_PARM(r4)(r1)
+ std r4, 0(r12)
+ std r5, 8(r12)
+ std r6, 16(r12)
+ std r7, 24(r12)
+ std r8, 32(r12)
+ std r9, 40(r12)
+ std r10,48(r12)
+ std r11,56(r12)
+ std r0, 64(r12)
+
+ lwz r0,8(r1)
+ mtcrf 0xff,r0
+
+ blr /* return r3 = status */
--- a/arch/powerpc/platforms/pseries/plpar_wrappers.h
+++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h
@@ -169,6 +169,24 @@ static inline long plpar_pte_read_raw(un
return rc;
}

+/*
+ * plpar_pte_read_4_raw can be called in real mode.
+ * ptes must be 8*sizeof(unsigned long)
+ */
+static inline long plpar_pte_read_4_raw(unsigned long flags, unsigned long ptex,
+ unsigned long *ptes)
+
+{
+ long rc;
+ unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
+
+ rc = plpar_hcall9_raw(H_READ, retbuf, flags | H_READ_4, ptex);
+
+ memcpy(ptes, retbuf, 8*sizeof(unsigned long));
+
+ return rc;
+}
+
static inline long plpar_pte_protect(unsigned long flags, unsigned long ptex,
unsigned long avpn)
{

Greg KH

unread,
Mar 11, 2011, 3:50:03 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Jan Engelhardt <jen...@medozas.de>

commit 9ef0298a8e5730d9a46d640014c727f3b4152870 upstream.

Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.

[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP: __find_logger+0x6f/0xa0
[ 5954.123979] nf_log_bind_pf+0x2b/0x70
[ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...

The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.
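
The kind of check being added can be sketched in user space as follows. The table, its size and bind_logger() are hypothetical stand-ins; only the ARRAY_SIZE idiom and the -EINVAL return mirror the patch.

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const char *loggers[13];  /* hypothetical per-family table */

/* Reject an out-of-range family before it is used as an array index. */
int bind_logger(unsigned char pf, const char *name)
{
    if (pf >= ARRAY_SIZE(loggers))
        return -EINVAL;
    loggers[pf] = name;
    return 0;
}
```

Without the early return, a family value such as 200 would index well past the end of the table.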

Reported-by: Miguel Di Ciurcio Filho <miguel...@gmail.com>,
via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt <jen...@medozas.de>
Signed-off-by: Patrick McHardy <ka...@trash.net>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---


net/netfilter/nf_log.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -83,6 +83,8 @@ EXPORT_SYMBOL(nf_log_unregister);



int nf_log_bind_pf(u_int8_t pf, const struct nf_logger *logger)
{
+ if (pf >= ARRAY_SIZE(nf_loggers))
+ return -EINVAL;
mutex_lock(&nf_log_mutex);
if (__find_logger(pf, logger->name) == NULL) {
mutex_unlock(&nf_log_mutex);

@@ -96,6 +98,8 @@ EXPORT_SYMBOL(nf_log_bind_pf);



void nf_log_unbind_pf(u_int8_t pf)
{
+ if (pf >= ARRAY_SIZE(nf_loggers))
+ return;
mutex_lock(&nf_log_mutex);
rcu_assign_pointer(nf_loggers[pf], NULL);
mutex_unlock(&nf_log_mutex);

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: roel <roel....@gmail.com>

commit 3ec07aa9522e3d5e9d5ede7bef946756e623a0a0 upstream.

Index i was already used in the outer loop

Signed-off-by: Roel Kluin <roel....@gmail.com>
Signed-off-by: J. Bruce Fields <bfi...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
fs/nfsd/nfs4xdr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1114,7 +1114,7 @@ nfsd4_decode_create_session(struct nfsd4

u32 dummy;
char *machine_name;
- int i;
+ int i, j;
int nr_secflavs;

READ_BUF(16);
@@ -1187,7 +1187,7 @@ nfsd4_decode_create_session(struct nfsd4
READ_BUF(4);
READ32(dummy);
READ_BUF(dummy * 4);
- for (i = 0; i < dummy; ++i)
+ for (j = 0; j < dummy; ++j)
READ32(dummy);
break;
case RPC_AUTH_GSS:

Greg KH

unread,
Mar 11, 2011, 3:50:03 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Anton Blanchard <an...@samba.org>

commit 0644079410065567e3bb31fcb8e6441f2b7685a9 upstream.

We wrap the crash_shutdown_handles[] calls with longjmp/setjmp, so if any
of them fault we can recover. The problem is we add a hook to the debugger
fault handler hook which calls longjmp unconditionally.

This first part of kdump is run before we marshall the other CPUs, so there
is a very good chance some CPU on the box is going to page fault. And when
it does it hits the longjmp code and assumes the context of the oopsing CPU.
The machine gets very confused when it has 10 CPUs all with the same stack,
all thinking they have the same CPU id. I get even more confused trying
to debug it.

The patch below adds crash_shutdown_cpu and uses it to specify which cpu is
in the protected region. Since it can only be -1 or the oopsing CPU, we don't
need to use memory barriers since it is only valid on the local CPU - no other
CPU will ever see a value that matches its local CPU id.
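
A user-space sketch of the guard, assuming a single recovery buffer shared by all CPUs. CPU ids are passed in as plain integers here; protected_cpu, handle_fault() and run_protected() are hypothetical names, not the powerpc crash code.

```c
#include <setjmp.h>

static jmp_buf recover_buf;
static int protected_cpu = -1;   /* -1: nobody is in the protected region */

/* Fault hook: only the CPU that owns the protected region may jump back. */
static int handle_fault(int cpu)
{
    if (protected_cpu == cpu)
        longjmp(recover_buf, 1);
    return 0;                    /* other CPUs fall through to normal handling */
}

/* Returns 1 if the fault was recovered via longjmp, 0 otherwise. */
int run_protected(int this_cpu, int faulting_cpu)
{
    int recovered;

    protected_cpu = this_cpu;
    if (setjmp(recover_buf) == 0) {
        handle_fault(faulting_cpu);
        recovered = 0;
    } else {
        recovered = 1;
    }
    protected_cpu = -1;
    return recovered;
}
```

A fault on the owning CPU unwinds to the setjmp point, while a fault on any other CPU leaves the saved context alone, which is what keeps several CPUs from ending up on the same stack.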

Eventually we should switch the order and marshall all CPUs before doing the
crash_shutdown_handles[] calls, but that is a bigger fix.

Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Cc: Kamalesh babulal <kama...@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/kernel/crash.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -347,10 +347,12 @@ int crash_shutdown_unregister(crash_shut
EXPORT_SYMBOL(crash_shutdown_unregister);

static unsigned long crash_shutdown_buf[JMP_BUF_LEN];
+static int crash_shutdown_cpu = -1;

static int handle_fault(struct pt_regs *regs)
{
- longjmp(crash_shutdown_buf, 1);
+ if (crash_shutdown_cpu == smp_processor_id())
+ longjmp(crash_shutdown_buf, 1);
return 0;
}

@@ -388,6 +390,7 @@ void default_machine_crash_shutdown(stru
*/
old_handler = __debugger_fault_handler;
__debugger_fault_handler = handle_fault;
+ crash_shutdown_cpu = smp_processor_id();
for (i = 0; crash_shutdown_handles[i]; i++) {
if (setjmp(crash_shutdown_buf) == 0) {
/*
@@ -401,6 +404,7 @@ void default_machine_crash_shutdown(stru
asm volatile("sync; isync");
}
}
+ crash_shutdown_cpu = -1;
__debugger_fault_handler = old_handler;

/*

Greg KH

unread,
Mar 11, 2011, 3:50:03 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Li Zefan <li...@cn.fujitsu.com>

commit b75f38d659e6fc747eda64cb72f3920e29dd44a4 upstream.

Don't forget to release cgroup_mutex if alloc_trial_cpuset() fails.
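
The shape of the fix, a single exit label that always drops the lock, can be sketched like this. The lock is modelled as a plain counter, and demo_lock(), demo_unlock() and update() are hypothetical names rather than the cpuset code.

```c
#include <errno.h>
#include <stdlib.h>

static int lock_depth;           /* stands in for a held mutex */

static void demo_lock(void)   { lock_depth++; }
static void demo_unlock(void) { lock_depth--; }

/* fail_alloc forces the allocation-failure path for demonstration. */
int update(int fail_alloc)
{
    int retval = 0;
    int *trial;

    demo_lock();
    trial = fail_alloc ? NULL : malloc(sizeof(*trial));
    if (!trial) {
        retval = -ENOMEM;
        goto out;                /* do not return with the lock held */
    }
    *trial = 1;
    free(trial);
out:
    demo_unlock();
    return retval;
}
```

Both the success path and the failure path leave lock_depth at 0, whereas an early return in the failure branch would leave it at 1.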

[ak...@linux-foundation.org: avoid multiple return points]
Signed-off-by: Li Zefan <li...@cn.fujitsu.com>
Cc: Paul Menage <men...@google.com>
Acked-by: David Rientjes <rien...@google.com>
Cc: Miao Xie <mi...@cn.fujitsu.com>


Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
kernel/cpuset.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1514,8 +1514,10 @@ static int cpuset_write_resmask(struct c
return -ENODEV;

trialcs = alloc_trial_cpuset(cs);
- if (!trialcs)
- return -ENOMEM;
+ if (!trialcs) {
+ retval = -ENOMEM;
+ goto out;
+ }

switch (cft->private) {
case FILE_CPULIST:
@@ -1530,6 +1532,7 @@ static int cpuset_write_resmask(struct c
}

free_trial_cpuset(trialcs);
+out:
cgroup_unlock();
return retval;

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
This is the start of the longterm review cycle for the 2.6.32.33 release.
There are 17 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let us know. If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

Responses should be made by Sunday, March 13, 2011, 20:00:00 UTC.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v2.6/longterm-review/patch-2.6.32.33-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

 Makefile                                        |  2 +-
 arch/powerpc/include/asm/hvcall.h               |  1 +
 arch/powerpc/kernel/crash.c                     | 11 +++++-
 arch/powerpc/kernel/machine_kexec_64.c          | 25 +++++++++++++++
 arch/powerpc/kernel/setup_64.c                  | 17 ++++++++--
 arch/powerpc/platforms/pseries/hvCall.S         | 38 +++++++++++++++++++++++
 arch/powerpc/platforms/pseries/lpar.c           | 35 ++++++++++++--------
 arch/powerpc/platforms/pseries/plpar_wrappers.h | 18 +++++++++++
 drivers/net/ixgbe/ixgbe_main.c                  |  4 ++
 drivers/net/r8169.c                             |  6 ++-
 drivers/s390/char/keyboard.c                    |  3 +-
 drivers/staging/comedi/drivers/jr3_pci.c        |  7 +++-
 fs/nfsd/nfs4xdr.c                               |  4 +-
 include/keys/rxrpc-type.h                       |  1 -
 include/linux/netdevice.h                       |  4 ++
 kernel/cpuset.c                                 |  7 +++-
 mm/mremap.c                                     |  4 +--
 net/core/dev.c                                  | 12 ++++++-
 net/ipv4/ip_gre.c                               |  1 +
 net/ipv4/ipip.c                                 |  1 +
 net/ipv6/sit.c                                  |  2 +-
 net/netfilter/nf_log.c                          |  4 ++
 22 files changed, 170 insertions(+), 37 deletions(-)

Greg KH

unread,
Mar 11, 2011, 3:50:02 PM3/11/11
to
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Anton Blanchard <an...@samba.org>

commit 095c7965f4dc870ed2b65143b1e2610de653416c upstream.

Author: Milton Miller <mil...@bga.com>

On large machines we are running out of room below 256MB. In some cases we
only need to ensure the allocation is in the first segment, which may be
256MB or 1TB.

Add slb0_limit and use it to specify the upper limit for the irqstack and
emergency stacks.

On a large ppc64 box, this fixes a panic at boot when the crashkernel=
option is specified (previously we would run out of memory below 256MB).

Signed-off-by: Milton Miller <mil...@bga.com>


Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>

Cc: Kamalesh Babulal <kama...@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/kernel/setup_64.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -432,9 +432,18 @@ void __init setup_system(void)
DBG(" <- setup_system()\n");
}

+static u64 slb0_limit(void)
+{
+ if (cpu_has_feature(CPU_FTR_1T_SEGMENT)) {
+ return 1UL << SID_SHIFT_1T;
+ }
+ return 1UL << SID_SHIFT;
+}
+
#ifdef CONFIG_IRQSTACKS
static void __init irqstack_early_init(void)
{
+ u64 limit = slb0_limit();
unsigned int i;

/*
@@ -444,10 +453,10 @@ static void __init irqstack_early_init(v
for_each_possible_cpu(i) {
softirq_ctx[i] = (struct thread_info *)
__va(lmb_alloc_base(THREAD_SIZE,
- THREAD_SIZE, 0x10000000));
+ THREAD_SIZE, limit));
hardirq_ctx[i] = (struct thread_info *)
__va(lmb_alloc_base(THREAD_SIZE,
- THREAD_SIZE, 0x10000000));
+ THREAD_SIZE, limit));
}
}
#else
@@ -478,7 +487,7 @@ static void __init exc_lvl_early_init(vo
*/
static void __init emergency_stack_init(void)
{
- unsigned long limit;
+ u64 limit;
unsigned int i;

/*
@@ -490,7 +499,7 @@ static void __init emergency_stack_init(
* bringup, we need to get at them in real mode. This means they
* must also be within the RMO region.
*/
- limit = min(0x10000000ULL, lmb.rmo_size);
+ limit = min(slb0_limit(), lmb.rmo_size);

for_each_possible_cpu(i) {
unsigned long sp;
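
A rough userspace sketch of the limit calculation this patch introduces, for readers without the kernel tree at hand. The function names and the `has_1t_segments` flag are illustrative stand-ins for `slb0_limit()` and `cpu_has_feature(CPU_FTR_1T_SEGMENT)`; the shift values 28 and 40 are assumed from the 256MB and 1TB segment sizes named in the description, and the zero-size assertion is an addition for the demo only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SID_SHIFT     28   /* assumed: 256MB segment = 1 << 28 bytes */
#define SID_SHIFT_1T  40   /* assumed: 1TB segment   = 1 << 40 bytes */

/* Upper bound of the first segment, depending on 1TB segment support. */
static uint64_t first_segment_limit(bool has_1t_segments)
{
	if (has_1t_segments)
		return UINT64_C(1) << SID_SHIFT_1T;
	return UINT64_C(1) << SID_SHIFT;
}

/*
 * Emergency stacks must sit in the first segment AND inside the
 * real-mode (RMO) region, so the limit is the smaller of the two.
 */
static uint64_t emergency_stack_limit(bool has_1t_segments, uint64_t rmo_size)
{
	uint64_t seg = first_segment_limit(has_1t_segments);

	assert(rmo_size != 0);   /* demo-only sanity check */
	return seg < rmo_size ? seg : rmo_size;
}
```

On a machine with 1TB segments the irqstack limit rises from 256MB to 1TB, while the emergency stack limit is still capped by the RMO size when that is smaller.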

Greg KH

Mar 11, 2011, 3:50:03 PM
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Don Skidmore <donald.c...@intel.com>

commit a124339ad28389093ed15eca990d39c51c5736cc upstream.

We have found a hardware erratum on 82599 hardware that can lead to
unpredictable behavior when Header Splitting mode is enabled. So
we are no longer enabling this feature on affected hardware.

Please see the 82599 Specification Update for more information.

Signed-off-by: Don Skidmore <donald.c...@intel.com>
Tested-by: Stephen Ko <stephe...@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey....@intel.com>

Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
drivers/net/ixgbe/ixgbe_main.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2134,6 +2134,10 @@ static void ixgbe_configure_rx(struct ix
/* Decide whether to use packet split mode or not */
adapter->flags |= IXGBE_FLAG_RX_PS_ENABLED;

+ /* Disable packet split due to 82599 erratum #45 */
+ if (hw->mac.type == ixgbe_mac_82599EB)
+ adapter->flags &= ~IXGBE_FLAG_RX_PS_ENABLED;
+
/* Set the RX buffer length according to the mode */
if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
rx_buf_len = IXGBE_RX_HDR_SIZE;

Greg KH

Mar 11, 2011, 3:50:02 PM
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Maxim Uvarov <muv...@gmail.com>

commit 426b6cb478e60352a463a0d1ec75c1c9fab30b13 upstream.

Signed-off-by: Maxim Uvarov <muv...@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Kamalesh Babulal <kama...@linux.vnet.ibm.com>
Cc: Anton Blanchard <an...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>

---
arch/powerpc/kernel/crash.c | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -377,6 +377,9 @@ void default_machine_crash_shutdown(stru
for_each_irq(i) {
struct irq_desc *desc = irq_desc + i;

+ if (!desc || !desc->chip || !desc->chip->eoi)
+ continue;
+
if (desc->status & IRQ_INPROGRESS)
desc->chip->eoi(i);

Greg KH

Mar 11, 2011, 3:50:02 PM
2.6.32-longterm review patch. If anyone has any objections, please let us know.

------------------

From: Matt Evans <ma...@ozlabs.org>

Commit: e8e5c2155b0035b6e04f29be67f6444bc914005b upstream

When CPU hotplug is used, some CPUs may be offline at the time a kexec is
performed. The subsequent kernel may expect these CPUs to be already running,
and will declare them stuck. On pseries, there's also a soft-offline (cede)
state that CPUs may be in; this can also cause problems as the kexeced kernel
may ask RTAS if they're online -- and RTAS would say they are. The CPU will
either appear stuck, or will cause a crash as we replace its cede loop beneath
it.

This patch kicks each present offline CPU awake before the kexec, so that
none are forever lost to these assumptions in the subsequent kernel.

Now, the behaviour is that all available CPUs that were offlined are now
online & usable after the kexec. This mimics the behaviour of a full reboot
(on which all CPUs will be restarted).

Signed-off-by: Matt Evans <ma...@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Kamalesh Babulal <kama...@linux.vnet.ibm.com>
Cc: Anton Blanchard <an...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@suse.de>
---

arch/powerpc/kernel/machine_kexec_64.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -15,6 +15,7 @@
#include <linux/thread_info.h>
#include <linux/init_task.h>
#include <linux/errno.h>
+#include <linux/cpu.h>

#include <asm/page.h>
#include <asm/current.h>
@@ -169,10 +170,34 @@ static void kexec_smp_down(void *arg)
/* NOTREACHED */
}

+/*
+ * We need to make sure each present CPU is online. The next kernel will scan
+ * the device tree and assume primary threads are online and query secondary
+ * threads via RTAS to online them if required. If we don't online primary
+ * threads, they will be stuck. However, we also online secondary threads as we
+ * may be using 'cede offline'. In this case RTAS doesn't see the secondary
+ * threads as offline -- and again, these CPUs will be stuck.
+ *
+ * So, we online all CPUs that should be running, including secondary threads.
+ */
+static void wake_offline_cpus(void)
+{
+ int cpu = 0;
+
+ for_each_present_cpu(cpu) {
+ if (!cpu_online(cpu)) {
+ printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
+ cpu);
+ cpu_up(cpu);
+ }
+ }
+}
+
static void kexec_prepare_cpus(void)
{
int my_cpu, i, notified=-1;

+ wake_offline_cpus();
smp_call_function(kexec_smp_down, NULL, /* wait */0);
my_cpu = get_cpu();

Tim Gardner

Mar 11, 2011, 5:30:02 PM

I agree that fixing the index in this loop is a good thing, but it's
caused me to look at the result:

for (j = 0; j < dummy; ++j)
READ32(dummy);

It seems to me that this loop might never terminate if the original
buffer is maliciously constructed, e.g., 0, 1, 2, 3, ... Is the data in
this buffer really that well vetted?

rtg
--
Tim Gardner tim.g...@canonical.com
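
To make the concern concrete, here is a small userspace model of a loop whose bound is overwritten by the data it reads. It is only a sketch: `words_consumed` and its arguments are hypothetical names, the extra `j < buf_words` guard exists purely so the demo stays in bounds, and byte-order conversion is ignored. It does not model the preceding READ_BUF(dummy * 4) length check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns how many words the loop reads when each read also replaces
 * the loop bound. "count" is the length the client declared.
 */
static size_t words_consumed(const uint32_t *buf, size_t buf_words,
			     uint32_t count)
{
	uint32_t dummy = count;   /* loop bound, as in the patched code */
	size_t j;

	assert(buf != NULL);
	for (j = 0; j < dummy && j < buf_words; ++j)
		dummy = buf[j];   /* stands in for READ32(dummy): clobbers the bound */
	return j;
}
```

With the words 2, 3, 4, ... and a declared count of 2, every read raises the bound again, so the loop walks the whole buffer; with all-zero words it stops after a single read. Either way the iteration count is decided by the payload rather than by the declared count.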

J. Bruce Fields

Mar 14, 2011, 3:40:02 PM

I've pushed out a tree with your additional tests to

git://linux-nfs.org/~bfields/pynfs41.git

Let me know what I'm missing.

Thanks again.

--b.

J. Bruce Fields

Mar 14, 2011, 6:30:03 PM

Actually, wait, this is kind of silly. I don't see why we couldn't just
skip the loop and do

p += dummy;

Also, your new test is still failing with a BAD_XDR error. Well, maybe
the test should fail--we don't really implement this yet anyway--but it
should at least be getting past the xdr decoding. So something else is
still wrong.

--b.

Trond Myklebust

Mar 14, 2011, 8:00:02 PM

This is exactly why I _hate_ the READ*() macros and their ilk, and am
really happy we got rid of them in the client.

READ_BUF() _sets_ p to whatever the value of argp->p is, and then
updates argp->p. It is just very very very hard to see that due to the
lack of transparency.

IOW: You don't need the "p += dummy" either. That happens automatically
when you next invoke READ_BUF().

Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.M...@netapp.com
www.netapp.com
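
The READ_BUF behaviour described above can be modelled in plain C. This is a simplified sketch, not the kernel macro: `struct decode_args`, `reserve_words` and `skip_counted_array` are hypothetical names, values are read in host byte order, and the error path is reduced to a return code. The point is only that each reservation hands back the old cursor and advances the stored one, so nothing needs to step a local pointer by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct decode_args {          /* hypothetical stand-in for the compound args */
	const uint32_t *p;    /* next undecoded word */
	const uint32_t *end;  /* one past the last word */
};

/*
 * Reserve nbytes (rounded up to whole 32-bit words). Returns the cursor
 * as it was before the call and advances args->p past the reservation,
 * or returns NULL if the buffer is too short.
 */
static const uint32_t *reserve_words(struct decode_args *args, uint64_t nbytes)
{
	uint64_t nwords = nbytes / 4 + (nbytes % 4 != 0);
	const uint32_t *p;

	assert(args != NULL && args->p <= args->end);
	if (nwords > (uint64_t)(args->end - args->p))
		return NULL;
	p = args->p;
	args->p += (size_t)nwords;
	return p;
}

/* Skip "<count> <count words>" without looping over the words. */
static int skip_counted_array(struct decode_args *args)
{
	const uint32_t *p = reserve_words(args, 4);
	uint32_t count;

	if (p == NULL)
		return -1;
	count = *p;
	/* Reserving the payload already moves args->p past it. */
	if (reserve_words(args, (uint64_t)count * 4) == NULL)
		return -1;
	return 0;
}
```

After `skip_counted_array` returns 0, the stored cursor already points at the word following the array, which is why neither the loop nor a manual `p += dummy` is required in this model.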

Mi Jinlong

Mar 14, 2011, 10:40:02 PM

J. Bruce Fields:

How did you modify it?

When testing it, I modified the code as

- for (j = 0; j < dummy; ++j)
- READ32(dummy);
+ p += dummy;

or

- for (j = 0; j < dummy; ++j)
- READ32(dummy);

Test cases CSESS16 and CSESS16a both PASS;
I can't reproduce the BAD_XDR error you mentioned.

--
thanks,
Mi Jinlong


J. Bruce Fields

Mar 16, 2011, 7:00:02 PM
On Mon, Mar 14, 2011 at 07:52:11PM -0400, Trond Myklebust wrote:
> On Mon, 2011-03-14 at 18:22 -0400, J. Bruce Fields wrote:
> > On Fri, Mar 11, 2011 at 12:13:55PM +0800, Mi Jinlong wrote:
> > >
> > >
> > > J. Bruce Fields:
> > > > On Tue, Mar 08, 2011 at 10:32:26PM +0100, roel wrote:
> > > >> @@ -1215,7 +1215,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
> > > >> READ_BUF(4);
> > > >> READ32(dummy);
> > > >> READ_BUF(dummy * 4);
> > > >> - for (i = 0; i < dummy; ++i)
> > > >> + for (j = 0; j < dummy; ++j)
> > > >> READ32(dummy);
> > >
> > > We must not use dummy for index here.
> > > After the first index, READ32(dummy) will change dummy!!!!
> >
> > Actually, wait, this is kind of silly. I don't see why we couldn't just
> > skip the loop and do
> >
> > p += dummy;
>
> This is exactly why I _hate_ the READ*() macros and their ilk, and am
> really happy we got rid of them in the client.

Agreed, I'm all for getting rid of them completely.

> READ_BUF() _sets_ p to whatever the value of argp->p is, and then
> updates argp->p. It is just very very very hard to see that due to the
> lack of transparency.
>
> IOW: You don't need the "p += dummy" either. That happens automatically
> when you next invoke READ_BUF().

Yes, you're right, we could remove that silly loop entirely.

--b.

J. Bruce Fields

Mar 17, 2011, 2:00:02 PM
On Tue, Mar 15, 2011 at 10:31:45AM +0800, Mi Jinlong wrote:
>
>
> J. Bruce Fields:
> > Actually, wait, this is kind of silly. I don't see why we couldn't just
> > skip the loop and do
> >
> > p += dummy;
> >
> > Also, your new test is still failing with a BAD_XDR error. Well, maybe
> > the test should fail--we don't really implement this yet anyway--but it
> > should at least be getting past the xdr decoding. So something else is
> > still wrong.
>
> How did you modify it?
>
> When testing it, I modified the code as
>
> - for (j = 0; j < dummy; ++j)
> - READ32(dummy);
> + p += dummy;
>
> or
>
> - for (j = 0; j < dummy; ++j)
> - READ32(dummy);
>
> Test cases CSESS16 and CSESS16a both PASS;
> I can't reproduce the BAD_XDR error you mentioned.

Yes, I thought I had the former, but perhaps I had the wrong kernel
running on my test server. I've confirmed those tests pass after the
following patch.

--b.

commit 5a02ab7c3c4580f94d13c683721039855b67cda6
Author: Mi Jinlong <miji...@cn.fujitsu.com>
Date: Fri Mar 11 12:13:55 2011 +0800

nfsd: wrong index used in inner loop

We must not use dummy for index.
After the first index, READ32(dummy) will change dummy!!!!

Signed-off-by: Mi Jinlong <miji...@cn.fujitsu.com>
[bfi...@redhat.com: Trond points out READ_BUF alone is sufficient.]
Cc: sta...@kernel.org
Signed-off-by: J. Bruce Fields <bfi...@redhat.com>

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 615f0a9..c6766af 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1142,7 +1142,7 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,

u32 dummy;
char *machine_name;
- int i, j;
+ int i;
int nr_secflavs;

READ_BUF(16);
@@ -1215,8 +1215,6 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
READ_BUF(4);
READ32(dummy);
READ_BUF(dummy * 4);
- for (j = 0; j < dummy; ++j)
- READ32(dummy);
break;
case RPC_AUTH_GSS:
dprintk("RPC_AUTH_GSS callback secflavor "
@@ -1232,7 +1230,6 @@ nfsd4_decode_create_session(struct nfsd4_compoundargs *argp,
READ_BUF(4);
READ32(dummy);
READ_BUF(dummy);
- p += XDR_QUADLEN(dummy);
break;
default:
dprintk("Illegal callback secflavor\n");
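
The last hunk drops a manual `p += XDR_QUADLEN(dummy)` on the same grounds as the loop: per the note in the commit message, READ_BUF alone is sufficient. As a reminder of the arithmetic involved, XDR pads opaque data to a multiple of four bytes, and the kernel macro is, as far as I know, defined as `(((l) + 3) >> 2)`. The helper name `xdr_quadlen` and the overflow assertion below are illustrative additions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit XDR words needed to hold nbytes, rounding up. */
static uint32_t xdr_quadlen(uint32_t nbytes)
{
	assert(nbytes <= UINT32_MAX - 3);   /* keep the +3 from wrapping */
	return (nbytes + 3) >> 2;
}
```

So a 5-byte opaque value occupies two words on the wire, and a 4-byte value occupies exactly one.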

J. Bruce Fields

Mar 17, 2011, 7:10:01 PM

Agreed, the code's still clearly bogus. In fact, we can just delete
that loop entirely; I have a patch queued up to send to Linus soon.

(But go ahead and apply this anyway, and then you'll get the followup
patch when it lands.)

--b.

Tim Gardner

Mar 17, 2011, 9:00:03 PM

Will do. Thanks for the update.

rtg
--
Tim Gardner tim.g...@canonical.com
