[PATCH 3.11 203/208] drm/nouveau/bios: make jump conditional

Luis Henriques

Jan 13, 2014, 11:10:01 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Ilia Mirkin <imi...@alum.mit.edu>

commit 6d60792ec059d9f2139828f9f017679abb81aa73 upstream.

This fixes a hang in VBIOS scripts of the form "condition; jump".
The jump used to always be executed, while now it will only be
executed if the condition is true.
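
The difference can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct script_state` and both helper functions are hypothetical stand-ins, the `exec` flag plays the role of the `init_exec()` result, and the 3-byte step mirrors the `init->offset += 3` in the diff below (one opcode byte plus a 16-bit target).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the interpreter state; not the nouveau struct. */
struct script_state {
	uint16_t offset;  /* current position in the script */
	bool     exec;    /* outcome of the most recent condition */
};

/* Old behaviour: the jump is taken regardless of the condition. */
static void jump_unconditional(struct script_state *s, uint16_t target)
{
	assert(s != NULL);
	s->offset = target;
}

/* New behaviour: jump only while execution is enabled, otherwise
 * step over the 3-byte opcode (1 opcode byte + 16-bit target). */
static void jump_conditional(struct script_state *s, uint16_t target)
{
	assert(s != NULL);
	if (s->exec)
		s->offset = target;
	else
		s->offset += 3;
}
```

With `exec` false and a target that points back at or before the jump, the unconditional variant never advances, which is one way a "condition; jump" script could spin forever.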

See https://bugs.freedesktop.org/show_bug.cgi?id=72943

Reported-by: Darcy Brás da Silva <darde...@cidadecool.com>
Signed-off-by: Ilia Mirkin <imi...@alum.mit.edu>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/gpu/drm/nouveau/core/subdev/bios/init.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/init.c b/drivers/gpu/drm/nouveau/core/subdev/bios/init.c
index 8f06cca..892070a 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/bios/init.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/bios/init.c
@@ -1294,7 +1294,11 @@ init_jump(struct nvbios_init *init)
u16 offset = nv_ro16(bios, init->offset + 1);

trace("JUMP\t0x%04x\n", offset);
- init->offset = offset;
+
+ if (init_exec(init))
+ init->offset = offset;
+ else
+ init->offset += 3;
}

/**
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Ard Biesheuvel <ard.bie...@linaro.org>

commit f60900f2609e893c7f8d0bccc7ada4947dac4cd5 upstream.

Commit 2171364d1a92 ("powerpc: Add HWCAP2 aux entry") introduced a new
AT_ auxv entry type AT_HWCAP2 but failed to update AT_VECTOR_SIZE_BASE
accordingly.
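
Why the count matters can be sketched with a fixed-size vector whose capacity is derived from an entry count. Everything here is hypothetical (`DEMO_VECTOR_SIZE_BASE`, `struct demo_auxv`, `demo_aux_ent()`), and the numbers are illustrative rather than the kernel's; the sketch only assumes that storage is sized from the count, so an entry type added without bumping the count no longer fits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical count for illustration; not the kernel's value. */
#define DEMO_VECTOR_SIZE_BASE 3
/* Two slots (id, value) per entry, plus one terminating pair. */
#define DEMO_VECTOR_SIZE (2 * (DEMO_VECTOR_SIZE_BASE + 1))

struct demo_auxv {
	unsigned long slots[DEMO_VECTOR_SIZE];
	size_t used;  /* number of slots filled so far */
};

/* Append one (id, value) pair. Returns 0 on success, or -1 if the pair
 * would not leave room for the terminating pair. */
static int demo_aux_ent(struct demo_auxv *v, unsigned long id, unsigned long val)
{
	assert(v != NULL);
	if (v->used + 2 > DEMO_VECTOR_SIZE - 2)
		return -1;
	v->slots[v->used++] = id;
	v->slots[v->used++] = val;
	return 0;
}
```

In this sketch a fourth entry is rejected because the count says three; raising the count by one is the analogue of the one-line change in the diff below.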

Signed-off-by: Ard Biesheuvel <ard.bie...@linaro.org>
Fixes: 2171364d1a92 (powerpc: Add HWCAP2 aux entry)
Acked-by: Michael Neuling <mic...@neuling.org>
Cc: Nishanth Aravamudan <na...@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/auxvec.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/auxvec.h b/include/linux/auxvec.h
index 669fef5..3e0fbe4 100644
--- a/include/linux/auxvec.h
+++ b/include/linux/auxvec.h
@@ -3,6 +3,6 @@

#include <uapi/linux/auxvec.h>

-#define AT_VECTOR_SIZE_BASE 19 /* NEW_AUX_ENT entries in auxiliary table */
+#define AT_VECTOR_SIZE_BASE 20 /* NEW_AUX_ENT entries in auxiliary table */
/* number of "#define AT_.*" above, minus {AT_NULL, AT_IGNORE, AT_NOTELF} */
#endif /* _LINUX_AUXVEC_H */

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jan Kiszka <jan.k...@siemens.com>

commit e66d2ae7c67bd9ac982a3d1890564de7f7eabf4b upstream.

Update arch.apic_base before triggering recalculate_apic_map. Otherwise
the recalculation will work against the previous state of the APIC and
will fail to build the correct map when an APIC is hardware-enabled
again.
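
The ordering problem can be shown with a small user-space model. It is a sketch under assumptions: `struct demo_vcpu`, `demo_recalculate_map()` and both setters are hypothetical, and the "map" is reduced to a single flag. The one property it relies on is that the recalculation reads the stored state rather than the value being written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_APICBASE_ENABLE (1u << 11)  /* enable bit used by the sketch */

struct demo_vcpu {
	uint64_t apic_base;  /* stored register value */
	bool     in_map;     /* result of the last map recalculation */
};

/* The recalculation looks at the stored state, not at the new value. */
static void demo_recalculate_map(struct demo_vcpu *v)
{
	assert(v != NULL);
	v->in_map = (v->apic_base & DEMO_APICBASE_ENABLE) != 0;
}

/* Recalculate first, store afterwards: the map is built from stale state. */
static void demo_set_base_stale(struct demo_vcpu *v, uint64_t value)
{
	uint64_t old = v->apic_base;

	if ((old ^ value) & DEMO_APICBASE_ENABLE)
		demo_recalculate_map(v);
	v->apic_base = value;
}

/* Store first, then recalculate; the comparison uses the saved old value. */
static void demo_set_base_fixed(struct demo_vcpu *v, uint64_t value)
{
	uint64_t old = v->apic_base;

	v->apic_base = value;
	if ((old ^ value) & DEMO_APICBASE_ENABLE)
		demo_recalculate_map(v);
}
```

Enabling a previously disabled `demo_vcpu` leaves `in_map` false with the stale ordering and sets it to true with the fixed one.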

This fixes a regression of 1e08ec4a13.

Signed-off-by: Jan Kiszka <jan.k...@siemens.com>
Signed-off-by: Marcelo Tosatti <mtos...@redhat.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/x86/kvm/lapic.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 81d7b23..dedd7c2 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1364,6 +1364,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
return;
}

+ if (!kvm_vcpu_is_bsp(apic->vcpu))
+ value &= ~MSR_IA32_APICBASE_BSP;
+ vcpu->arch.apic_base = value;
+
/* update jump label if enable bit changes */
if ((vcpu->arch.apic_base ^ value) & MSR_IA32_APICBASE_ENABLE) {
if (value & MSR_IA32_APICBASE_ENABLE)
@@ -1373,10 +1377,6 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
recalculate_apic_map(vcpu->kvm);
}

- if (!kvm_vcpu_is_bsp(apic->vcpu))
- value &= ~MSR_IA32_APICBASE_BSP;
-
- vcpu->arch.apic_base = value;
if ((old_value ^ value) & X2APIC_ENABLE) {
if (value & X2APIC_ENABLE) {
u32 id = kvm_apic_id(apic);

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Deucher <alexande...@amd.com>

commit e2f6c88fb903e123edfd1106b0b8310d5117f774 upstream.

Fixes gfx corruption on certain TN/RL parts.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=60389

Signed-off-by: Alex Deucher <alexande...@amd.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/gpu/drm/radeon/ni.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 263d14b..b2a76f5 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -896,6 +896,10 @@ static void cayman_gpu_init(struct radeon_device *rdev)
(rdev->pdev->device == 0x999C)) {
rdev->config.cayman.max_simds_per_se = 6;
rdev->config.cayman.max_backends_per_se = 2;
+ rdev->config.cayman.max_hw_contexts = 8;
+ rdev->config.cayman.sx_max_export_size = 256;
+ rdev->config.cayman.sx_max_export_pos_size = 64;
+ rdev->config.cayman.sx_max_export_smx_size = 192;
} else if ((rdev->pdev->device == 0x9903) ||
(rdev->pdev->device == 0x9904) ||
(rdev->pdev->device == 0x990A) ||
@@ -906,6 +910,10 @@ static void cayman_gpu_init(struct radeon_device *rdev)
(rdev->pdev->device == 0x999D)) {
rdev->config.cayman.max_simds_per_se = 4;
rdev->config.cayman.max_backends_per_se = 2;
+ rdev->config.cayman.max_hw_contexts = 8;
+ rdev->config.cayman.sx_max_export_size = 256;
+ rdev->config.cayman.sx_max_export_pos_size = 64;
+ rdev->config.cayman.sx_max_export_smx_size = 192;
} else if ((rdev->pdev->device == 0x9919) ||
(rdev->pdev->device == 0x9990) ||
(rdev->pdev->device == 0x9991) ||
@@ -916,9 +924,17 @@ static void cayman_gpu_init(struct radeon_device *rdev)
(rdev->pdev->device == 0x99A0)) {
rdev->config.cayman.max_simds_per_se = 3;
rdev->config.cayman.max_backends_per_se = 1;
+ rdev->config.cayman.max_hw_contexts = 4;
+ rdev->config.cayman.sx_max_export_size = 128;
+ rdev->config.cayman.sx_max_export_pos_size = 32;
+ rdev->config.cayman.sx_max_export_smx_size = 96;
} else {
rdev->config.cayman.max_simds_per_se = 2;
rdev->config.cayman.max_backends_per_se = 1;
+ rdev->config.cayman.max_hw_contexts = 4;
+ rdev->config.cayman.sx_max_export_size = 128;
+ rdev->config.cayman.sx_max_export_pos_size = 32;
+ rdev->config.cayman.sx_max_export_smx_size = 96;
}
rdev->config.cayman.max_texture_channel_caches = 2;
rdev->config.cayman.max_gprs = 256;
@@ -926,10 +942,6 @@ static void cayman_gpu_init(struct radeon_device *rdev)
rdev->config.cayman.max_gs_threads = 32;
rdev->config.cayman.max_stack_entries = 512;
rdev->config.cayman.sx_num_of_sets = 8;
- rdev->config.cayman.sx_max_export_size = 256;
- rdev->config.cayman.sx_max_export_pos_size = 64;
- rdev->config.cayman.sx_max_export_smx_size = 192;
- rdev->config.cayman.max_hw_contexts = 8;
rdev->config.cayman.sq_num_cf_insts = 2;

rdev->config.cayman.sc_prim_fifo_size = 0x40;

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Rik van Riel <ri...@redhat.com>

commit 4eb919825e6c3c7fb3630d5621f6d11e98a18b3a upstream.

remap_file_pages calls mmap_region, which may merge the VMA with other
existing VMAs, and free "vma". This can lead to a use-after-free bug.
Avoid the bug by remembering vm_flags before calling mmap_region, and
not trying to dereference vma later.
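
A minimal user-space sketch of the same pattern, assuming a hypothetical `demo_region()` that stands in for a callee which may free its argument (here it always does, to make the hazard explicit). The names are illustrative and not taken from mm/fremap.c.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_vma {
	unsigned long flags;
};

/* Hypothetical callee that may merge and free the object it is given.
 * It always frees here, so any later dereference would be a bug. */
static unsigned long demo_region(struct demo_vma *vma, unsigned long flags)
{
	free(vma);
	return flags;  /* only the by-value copy is used */
}

/* Copy what is needed out of the object before the call that may free it. */
static unsigned long demo_remap(struct demo_vma *vma)
{
	assert(vma != NULL);
	unsigned long flags = vma->flags;  /* grab the info now */

	(void)demo_region(vma, flags);
	return flags;                      /* vma is not touched again */
}
```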

Signed-off-by: Rik van Riel <ri...@redhat.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
Cc: PaX Team <page...@freemail.hu>
Cc: Kees Cook <kees...@chromium.org>
Cc: Michel Lespinasse <wal...@google.com>
Cc: Cyrill Gorcunov <gorc...@openvz.org>
Cc: Hugh Dickins <hu...@google.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/fremap.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/fremap.c b/mm/fremap.c
index 5bff081..bbc4d66 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -208,9 +208,10 @@ get_write_lock:
if (mapping_cap_account_dirty(mapping)) {
unsigned long addr;
struct file *file = get_file(vma->vm_file);
+ /* mmap_region may free vma; grab the info now */
+ vm_flags = vma->vm_flags;

- addr = mmap_region(file, start, size,
- vma->vm_flags, pgoff);
+ addr = mmap_region(file, start, size, vm_flags, pgoff);
fput(file);
if (IS_ERR_VALUE(addr)) {
err = addr;
@@ -218,7 +219,7 @@ get_write_lock:
BUG_ON(addr != start);
err = 0;
}
- goto out;
+ goto out_freed;
}
mutex_lock(&mapping->i_mmap_mutex);
flush_dcache_mmap_lock(mapping);
@@ -253,6 +254,7 @@ get_write_lock:
out:
if (vma)
vm_flags = vma->vm_flags;
+out_freed:
if (likely(!has_write_lock))
up_read(&mm->mmap_sem);
else

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt <ros...@goodmis.org>

commit 3dc91d4338d698ce77832985f9cb183d8eeaf6be upstream.

While running stress tests on adding and deleting ftrace instances I hit
this bug:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0
Oops: 0000 [#1] PREEMPT
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Call Trace:
security_inode_permission+0x1c/0x30
__inode_permission+0x41/0xa0
inode_permission+0x18/0x50
link_path_walk+0x66/0x920
path_openat+0xa6/0x6c0
do_filp_open+0x43/0xa0
do_sys_open+0x146/0x240
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
RIP selinux_inode_permission+0x85/0x160
CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

isec = inode->i_security;

rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced) is that the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor are
the inode's ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here. (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).
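
The deferred-free idea can be modelled in user space as below. This is a sketch, not the kernel API: `struct demo_head`, `demo_call_deferred()` and `demo_grace_period_elapsed()` are hypothetical stand-ins for `struct rcu_head`, `call_rcu()` and the end of a grace period, and `demo_container_of` is a simplified version of the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-in for a deferred-callback head. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *);
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_isec {
	unsigned int sid;
	struct demo_head rcu;  /* embedded, so no extra allocation is needed */
};

static struct demo_head *demo_pending;  /* callbacks awaiting the grace period */
static int demo_freed;                  /* number of objects actually freed */

/* Queue the free instead of performing it immediately. */
static void demo_call_deferred(struct demo_head *h,
			       void (*func)(struct demo_head *))
{
	assert(h != NULL && func != NULL);
	h->func = func;
	h->next = demo_pending;
	demo_pending = h;
}

/* Callback: recover the enclosing object from the embedded head and free it. */
static void demo_isec_free(struct demo_head *h)
{
	free(demo_container_of(h, struct demo_isec, rcu));
	demo_freed++;
}

/* Run once no reader can still hold a pointer to the queued objects. */
static void demo_grace_period_elapsed(void)
{
	while (demo_pending) {
		struct demo_head *h = demo_pending;

		demo_pending = h->next;  /* read before the callback frees h */
		h->func(h);
	}
}
```

Between `demo_call_deferred()` and `demo_grace_period_elapsed()` the object stays readable through the old pointer, which is why the pointer does not have to be cleared.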

Note, this is a hack, but it fixes the problem at hand. A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback. But that is a major job to do, and requires a
lot of work. For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101...@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182...@gandalf.local.home

Signed-off-by: Steven Rostedt <ros...@goodmis.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
security/selinux/hooks.c | 20 ++++++++++++++++++--
security/selinux/include/objsec.h | 5 ++++-
2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 56f7621..bf64893 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -220,6 +220,14 @@ static int inode_alloc_security(struct inode *inode)
return 0;
}

+static void inode_free_rcu(struct rcu_head *head)
+{
+ struct inode_security_struct *isec;
+
+ isec = container_of(head, struct inode_security_struct, rcu);
+ kmem_cache_free(sel_inode_cache, isec);
+}
+
static void inode_free_security(struct inode *inode)
{
struct inode_security_struct *isec = inode->i_security;
@@ -230,8 +238,16 @@ static void inode_free_security(struct inode *inode)
list_del_init(&isec->list);
spin_unlock(&sbsec->isec_lock);

- inode->i_security = NULL;
- kmem_cache_free(sel_inode_cache, isec);
+ /*
+ * The inode may still be referenced in a path walk and
+ * a call to selinux_inode_permission() can be made
+ * after inode_free_security() is called. Ideally, the VFS
+ * wouldn't do this, but fixing that is a much harder
+ * job. For now, simply free the i_security via RCU, and
+ * leave the current inode->i_security pointer intact.
+ * The inode will be freed after the RCU grace period too.
+ */
+ call_rcu(&isec->rcu, inode_free_rcu);
}

static int file_alloc_security(struct file *file)
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index aa47bca..6fd9dd2 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -38,7 +38,10 @@ struct task_security_struct {

struct inode_security_struct {
struct inode *inode; /* back pointer to inode object */
- struct list_head list; /* list of inode_security_struct */
+ union {
+ struct list_head list; /* list of inode_security_struct */
+ struct rcu_head rcu; /* for freeing the inode_security_struct */
+ };
u32 task_sid; /* SID of creating task */
u32 sid; /* SID of this object */
u16 sclass; /* security class of this object */

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

commit b1aac815c0891fe4a55a6b0b715910142227700f upstream.

Jakub reported while working with nlmon netlink sniffer that parts of
the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6.
That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3].

In fact, it seems that we can leak 6 * sizeof(u32) bytes of kernel [slab]
memory through this. At least, in udp_dump_one(), we allocate a skb in ...

rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL);

... and then pass that to inet_sk_diag_fill() that puts the whole struct
inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0],
r->id.idiag_dst[0] and leave the rest untouched:

r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;

struct inet_diag_msg embeds struct inet_diag_sockid, which is fully
filled out in the IPv6 case but not in the IPv4 case.

So just zero them out by using plain memset (for this little amount of
bytes it's probably not worth the extra check for idiag_family == AF_INET).
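
A small sketch of the zero-then-fill approach, assuming a reduced hypothetical layout (`struct demo_sockid`) rather than the real struct inet_diag_sockid, which has more fields:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced layout with four 32-bit words per address. */
struct demo_sockid {
	uint32_t src[4];
	uint32_t dst[4];
};

/* Fill an IPv4 socket id into a buffer that may still hold stale bytes. */
static void demo_fill_v4(struct demo_sockid *id, uint32_t saddr, uint32_t daddr)
{
	assert(id != NULL);
	/* Clear all four words first so words 1..3 cannot leak old data. */
	memset(id->src, 0, sizeof(id->src));
	memset(id->dst, 0, sizeof(id->dst));
	id->src[0] = saddr;
	id->dst[0] = daddr;
}
```

Clearing unconditionally costs 32 bytes of writes per call in this sketch, which is the trade-off the paragraph above accepts in place of a family check.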

Similarly, also fix the other places where we fill that out.

Reported-by: Jakub Zawadzki <darkja...@darkjames.pl>
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
[ luis: backported to 3.11: used davem's backport to 3.10 ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ipv4/inet_diag.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 5f64875..31cf54d 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -106,6 +106,10 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,

r->id.idiag_sport = inet->inet_sport;
r->id.idiag_dport = inet->inet_dport;
+
+ memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+ memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;

@@ -240,12 +244,19 @@ static int inet_twsk_diag_fill(struct inet_timewait_sock *tw,

r->idiag_family = tw->tw_family;
r->idiag_retrans = 0;
+
r->id.idiag_if = tw->tw_bound_dev_if;
sock_diag_save_cookie(tw, r->id.idiag_cookie);
+
r->id.idiag_sport = tw->tw_sport;
r->id.idiag_dport = tw->tw_dport;
+
+ memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+ memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
r->id.idiag_src[0] = tw->tw_rcv_saddr;
r->id.idiag_dst[0] = tw->tw_daddr;
+
r->idiag_state = tw->tw_substate;
r->idiag_timer = 3;
r->idiag_expires = DIV_ROUND_UP(tmo * 1000, HZ);
@@ -732,8 +743,13 @@ static int inet_diag_fill_req(struct sk_buff *skb, struct sock *sk,

r->id.idiag_sport = inet->inet_sport;
r->id.idiag_dport = ireq->rmt_port;
+
+ memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+ memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
r->id.idiag_src[0] = ireq->loc_addr;
r->id.idiag_dst[0] = ireq->rmt_addr;
+
r->idiag_expires = jiffies_to_msecs(tmo);
r->idiag_rqueue = 0;
r->idiag_wqueue = 0;

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

commit 4d231b76eef6c4a6bd9c96769e191517765942cb upstream.

While commit 30a584d944fb fixes datagram interface in LLC, a use
after free bug has been introduced for SOCK_STREAM sockets that do
not make use of MSG_PEEK.

The flow is as follows ...

if (!(flags & MSG_PEEK)) {
...
sk_eat_skb(sk, skb, false);
...
}
...
if (used + offset < skb->len)
continue;

... where sk_eat_skb() calls __kfree_skb(). Therefore, cache
original length and work on skb_len to check partial reads.
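
The fix can be sketched in user space as follows, with a hypothetical `struct demo_buf` in place of the skb and `free()` standing in for sk_eat_skb(). The function name and parameters are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_buf {
	size_t len;
};

/* Returns true when a read of `used` bytes at `offset` left data behind.
 * When not peeking, the buffer is consumed (freed) before that check. */
static bool demo_partial_read(struct demo_buf *buf, size_t offset,
			      size_t used, bool peek)
{
	assert(buf != NULL);
	size_t buf_len = buf->len;  /* cache before the buffer can go away */

	if (!peek)
		free(buf);          /* stand-in for sk_eat_skb() */

	return used + offset < buf_len;  /* relies on the cached length only */
}
```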

Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Cc: Stephen Hemminger <ste...@networkplumber.org>
Cc: Arnaldo Carvalho de Melo <ac...@ghostprotocols.net>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/llc/af_llc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 8870988..c3ee805 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -715,7 +715,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
unsigned long cpu_flags;
size_t copied = 0;
u32 peek_seq = 0;
- u32 *seq;
+ u32 *seq, skb_len;
unsigned long used;
int target; /* Read at least this many bytes */
long timeo;
@@ -812,6 +812,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
}
continue;
found_ok_skb:
+ skb_len = skb->len;
/* Ok so how much can we use? */
used = skb->len - offset;
if (len < used)
@@ -844,7 +845,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
}

/* Partial read */
- if (used + offset < skb->len)
+ if (used + offset < skb_len)
continue;
} while (len > 0);

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Kunilov <dmitry....@gmail.com>

commit 52d0dc7597c89b2ab779f3dcb9b9bf0800dd9218 upstream.

ZTE AC2726 EVDO modem drops ppp connection every minute when driven by
zte_ev but works fine when driven by option. Move the support for AC2726
back to option driver.

Signed-off-by: Dmitry Kunilov <dmitry....@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/usb/serial/option.c | 2 ++
drivers/usb/serial/zte_ev.c | 3 +--
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 496b7e39..cc7a241 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -251,6 +251,7 @@ static void option_instat_callback(struct urb *urb);
#define ZTE_PRODUCT_MF628 0x0015
#define ZTE_PRODUCT_MF626 0x0031
#define ZTE_PRODUCT_MC2718 0xffe8
+#define ZTE_PRODUCT_AC2726 0xfff1

#define BENQ_VENDOR_ID 0x04a5
#define BENQ_PRODUCT_H10 0x4068
@@ -1453,6 +1454,7 @@ static const struct usb_device_id option_ids[] = {
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x01) },
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x02, 0x05) },
{ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0xff, 0x86, 0x10) },
+ { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, ZTE_PRODUCT_AC2726, 0xff, 0xff, 0xff) },

{ USB_DEVICE(BENQ_VENDOR_ID, BENQ_PRODUCT_H10) },
{ USB_DEVICE(DLINK_VENDOR_ID, DLINK_PRODUCT_DWM_652) },
diff --git a/drivers/usb/serial/zte_ev.c b/drivers/usb/serial/zte_ev.c
index fca4c75..eae2c87 100644
--- a/drivers/usb/serial/zte_ev.c
+++ b/drivers/usb/serial/zte_ev.c
@@ -281,8 +281,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(0x19d2, 0xfffd) },
{ USB_DEVICE(0x19d2, 0xfffc) },
{ USB_DEVICE(0x19d2, 0xfffb) },
- /* AC2726, AC8710_V3 */
- { USB_DEVICE_AND_INTERFACE_INFO(0x19d2, 0xfff1, 0xff, 0xff, 0xff) },
+ /* AC8710_V3 */
{ USB_DEVICE(0x19d2, 0xfff6) },
{ USB_DEVICE(0x19d2, 0xfff7) },
{ USB_DEVICE(0x19d2, 0xfff8) },

Luis Henriques

Jan 13, 2014, 11:10:02 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torv...@linux-foundation.org>

commit 26bef1318adc1b3a530ecc807ef99346db2aa8b0 upstream.

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.

Reported-by: halfdog <m...@halfdog.net>
Tested-by: Borislav Petkov <b...@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy...@mail.gmail.com
Signed-off-by: H. Peter Anvin <h...@zytor.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/x86/include/asm/fpu-internal.h | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 4d0bda7..5be9f87 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -293,12 +293,13 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
is pending. Clear the x87 state here by setting it to fixed
values. "m" is a random variable that should be in L1 */
- alternative_input(
- ASM_NOP8 ASM_NOP2,
- "emms\n\t" /* clear stack tags */
- "fildl %P[addr]", /* set F?P to defined value */
- X86_FEATURE_FXSAVE_LEAK,
- [addr] "m" (tsk->thread.fpu.has_fpu));
+ if (unlikely(static_cpu_has(X86_FEATURE_FXSAVE_LEAK))) {
+ asm volatile(
+ "fnclex\n\t"
+ "emms\n\t"
+ "fildl %P[addr]" /* set F?P to defined value */
+ : : [addr] "m" (tsk->thread.fpu.has_fpu));
+ }

return fpu_restore_checking(&tsk->thread.fpu);

Luis Henriques

Jan 13, 2014, 11:10:03 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jason Wang <jaso...@redhat.com>

commit ce232ce01d61b184202bb185103d119820e1260c upstream.

macvtap_put_user() never returns a value greater than the iov length, which in
fact bypasses the truncation check in macvtap_recvmsg(). Fix this by always
returning the size of the packet plus the possible vlan header so that the
truncation check works.
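
A minimal sketch of the return-value convention, using hypothetical helpers (`demo_put_user()`, `demo_truncated()`) and a flat buffer instead of an iovec:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most `cap` bytes of a packet into `dst`, but report the full
 * packet length so the caller can tell that data was cut off. */
static size_t demo_put_user(char *dst, size_t cap,
			    const char *pkt, size_t pkt_len)
{
	assert(dst != NULL && pkt != NULL);
	size_t copied = pkt_len < cap ? pkt_len : cap;

	memcpy(dst, pkt, copied);
	return pkt_len;  /* total size, not the number of bytes copied */
}

/* Caller-side check: the packet was truncated if it did not fit. */
static int demo_truncated(size_t total, size_t cap)
{
	return total > cap;
}
```

Had `demo_put_user()` returned `copied`, the result could never exceed `cap` and `demo_truncated()` would always report 0.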

Cc: Vlad Yasevich <vyas...@gmail.com>
Cc: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Jason Wang <jaso...@redhat.com>
Acked-by: Vlad Yasevich <vyas...@gmail.com>
Acked-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/macvtap.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index fa54f0e..8ce7075 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -873,7 +873,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
int ret;
int vnet_hdr_len = 0;
int vlan_offset = 0;
- int copied;
+ int copied, total;

if (q->flags & IFF_VNET_HDR) {
struct virtio_net_hdr vnet_hdr;
@@ -888,7 +888,8 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
}
- copied = vnet_hdr_len;
+ total = copied = vnet_hdr_len;
+ total += skb->len;

if (!vlan_tx_tag_present(skb))
len = min_t(int, skb->len, len);
@@ -903,6 +904,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,

vlan_offset = offsetof(struct vlan_ethhdr, h_vlan_proto);
len = min_t(int, skb->len + VLAN_HLEN, len);
+ total += VLAN_HLEN;

copy = min_t(int, vlan_offset, len);
ret = skb_copy_datagram_const_iovec(skb, 0, iv, copied, copy);
@@ -920,10 +922,9 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
}

ret = skb_copy_datagram_const_iovec(skb, vlan_offset, iv, copied, len);
- copied += len;

done:
- return ret ? ret : copied;
+ return ret ? ret : total;
}

static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
@@ -978,7 +979,7 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
}

ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
- ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+ ret = min_t(ssize_t, ret, len);
if (ret > 0)
iocb->ki_pos = ret;
out:

Luis Henriques

Jan 13, 2014, 11:10:03 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <ty...@mit.edu>

commit ae1495b12df1897d4f42842a7aa7276d920f6290 upstream.

While it's true that errors can only happen if there is a bug in
jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt
the kernel or remount the file system read-only in order to avoid
further data loss. The ext4_journal_abort_handle() function doesn't
do any of this, and while this call (since it doesn't adjust
refcounts) will likely result in the file system eventually
deadlocking since the current transaction will never be able to close,
it's much cleaner to let ext4's error handling system deal with
this situation.

There's a separate bug here, which is that if certain jbd2 errors
occur and the file system is mounted errors=continue, the file
system will probably eventually grind to a halt as described
above. But things have been this way for a long time, and usually when
we have these sorts of errors it's pretty much a disaster --- and
that's why the jbd2 layer aggressively retries memory allocations,
which is the most likely cause of these jbd2 errors.

Signed-off-by: "Theodore Ts'o" <ty...@mit.edu>
Reviewed-by: Jan Kara <ja...@suse.cz>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ext4/ext4_jbd2.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 17ac112..3fe29de 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -259,6 +259,15 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
if (WARN_ON_ONCE(err)) {
ext4_journal_abort_handle(where, line, __func__, bh,
handle, err);
+ ext4_error_inode(inode, where, line,
+ bh->b_blocknr,
+ "journal_dirty_metadata failed: "
+ "handle type %u started at line %u, "
+ "credits %u/%u, errcode %d",
+ handle->h_type,
+ handle->h_line_no,
+ handle->h_requested_credits,
+ handle->h_buffer_credits, err);
}
} else {
if (inode)

Luis Henriques

Jan 13, 2014, 11:10:03 AM

3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver Neukum <one...@suse.de>

commit 455f58925247e8a1a1941e159f3636ad6ee4c90b upstream.

It has been reported that this chipset really cannot
sleep without this extraordinary delay.
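
The quirk mechanism amounts to a flag bit that scales a timeout. The sketch below assumes a hypothetical base value (`DEMO_MAX_HALT_USEC`) and reuses only the bit position and the factor of 10 from the diff below; it is not the driver code.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_MAX_HALT_USEC      16000u      /* hypothetical base timeout */
#define DEMO_QUIRK_SLOW_SUSPEND (1u << 17)  /* same bit position as the patch */

/* Return the halt timeout in microseconds for a given quirk mask. */
static unsigned int demo_halt_timeout(unsigned int quirks)
{
	unsigned int delay = DEMO_MAX_HALT_USEC;

	assert(delay <= UINT_MAX / 10);  /* the scaled value must not overflow */

	/* Stretch the timeout tenfold only for hosts flagged as slow. */
	delay *= (quirks & DEMO_QUIRK_SLOW_SUSPEND) ? 10 : 1;
	return delay;
}
```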

This patch should be backported, in order to ensure this host functions
under stable kernels. The last quirk for Fresco Logic hosts (commit
bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 "xhci: Extend Fresco Logic MSI
quirk.") was backported to stable kernels as old as 2.6.36.

Signed-off-by: Oliver Neukum <one...@suse.de>
Signed-off-by: Sarah Sharp <sarah....@linux.intel.com>
[ luis: backported to 3.11:
- replaced xhci_dbg_trace() by xhci_dbg() ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/usb/host/xhci-pci.c | 7 +++++++
drivers/usb/host/xhci.c | 7 ++++++-
drivers/usb/host/xhci.h | 1 +
3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 159e3c6d..6480158 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -67,6 +67,13 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci_dbg(xhci, "QUIRK: Fresco Logic xHC needs configure"
" endpoint cmd after reset endpoint\n");
}
+ if (pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK &&
+ pdev->revision == 0x4) {
+ xhci->quirks |= XHCI_SLOW_SUSPEND;
+ xhci_dbg(xhci, "QUIRK: Fresco Logic xHC revision %u"
+ "must be suspended extra slowly",
+ pdev->revision);
+ }
/* Fresco Logic confirms: all revisions of this chip do not
* support MSI, even though some of them claim to in their PCI
* capabilities.
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index eeb479f..5b341e5 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -887,6 +887,7 @@ static void xhci_clear_command_ring(struct xhci_hcd *xhci)
int xhci_suspend(struct xhci_hcd *xhci)
{
int rc = 0;
+ unsigned int delay = XHCI_MAX_HALT_USEC;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
u32 command;

@@ -909,8 +910,12 @@ int xhci_suspend(struct xhci_hcd *xhci)
command = xhci_readl(xhci, &xhci->op_regs->command);
command &= ~CMD_RUN;
xhci_writel(xhci, command, &xhci->op_regs->command);
+
+ /* Some chips from Fresco Logic need an extraordinary delay */
+ delay *= (xhci->quirks & XHCI_SLOW_SUSPEND) ? 10 : 1;
+
if (xhci_handshake(xhci, &xhci->op_regs->status,
- STS_HALT, STS_HALT, XHCI_MAX_HALT_USEC)) {
+ STS_HALT, STS_HALT, delay)) {
xhci_warn(xhci, "WARN: xHC CMD_RUN timeout\n");
spin_unlock_irq(&xhci->lock);
return -ETIMEDOUT;
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 46ce776..b72601c 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1544,6 +1544,7 @@ struct xhci_hcd {
#define XHCI_COMP_MODE_QUIRK (1 << 14)
#define XHCI_AVOID_BEI (1 << 15)
#define XHCI_PLAT (1 << 16)
+#define XHCI_SLOW_SUSPEND (1 << 17)
unsigned int num_active_eps;
unsigned int limit_active_eps;
/* There are two roothubs to keep track of bus suspend info for */

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt <ros...@goodmis.org>

commit 1739f09e33d8f66bf48ddbc3eca615574da6c4f6 upstream.

Function tracing callbacks expect to have the ftrace_ops that registered it
passed to them, not the address of the variable that holds the ftrace_ops
that registered it.

Use a mov instead of a lea to store the ftrace_ops into the parameter
of the function tracing callback.
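
For illustration only, the lea/mov distinction can be restated in C. This is a user-space sketch, not kernel code, and every name ending in _sketch is hypothetical: "lea" corresponds to taking the address of the pointer variable, "mov" to reading the pointer stored in it.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops. */
struct ftrace_ops_sketch {
	int id;
};

static struct ftrace_ops_sketch my_ops_sketch = { 42 };

/* Plays the role of function_trace_op: a variable holding a pointer. */
static struct ftrace_ops_sketch *function_trace_op_sketch = &my_ops_sketch;

/* What "lea function_trace_op" produces: the address of the variable. */
static void *arg_via_lea_sketch(void)
{
	return (void *)&function_trace_op_sketch;
}

/* What "mov function_trace_op" produces: the value held by the variable. */
static void *arg_via_mov_sketch(void)
{
	return (void *)function_trace_op_sketch;
}
```

A callback that dereferences the lea result as an ops structure would read the pointer variable's own bytes; only the mov result points at the registered ops.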

Signed-off-by: Steven Rostedt <ros...@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hi...@hitachi.com>
Link: http://lkml.kernel.org/r/20131113152...@gandalf.local.home
Signed-off-by: H. Peter Anvin <h...@linux.intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/x86/kernel/entry_32.S | 4 ++--
arch/x86/kernel/entry_64.S | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 2cfbc3a..bbc89cf 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1085,7 +1085,7 @@ ENTRY(ftrace_caller)
pushl $0 /* Pass NULL as regs pointer */
movl 4*4(%esp), %eax
movl 0x4(%ebp), %edx
- leal function_trace_op, %ecx
+ movl function_trace_op, %ecx
subl $MCOUNT_INSN_SIZE, %eax

.globl ftrace_call
@@ -1143,7 +1143,7 @@ ENTRY(ftrace_regs_caller)
movl 12*4(%esp), %eax /* Load ip (1st parameter) */
subl $MCOUNT_INSN_SIZE, %eax /* Adjust ip */
movl 0x4(%ebp), %edx /* Load parent ip (2nd parameter) */
- leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+ movl function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
pushl %esp /* Save pt_regs as 4th parameter */

GLOBAL(ftrace_regs_call)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1b69951..bcdb3ca 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -88,7 +88,7 @@ END(function_hook)
MCOUNT_SAVE_FRAME \skip

/* Load the ftrace_ops into the 3rd parameter */
- leaq function_trace_op, %rdx
+ movq function_trace_op(%rip), %rdx

/* Load ip into the first parameter */
movq RIP(%rsp), %rdi

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Lan Tianyu <tiany...@intel.com>

commit a90b40385735af0d3031f98e97b439e8944a31b3 upstream.

The AML method _BIX of NEC LZ750/LS returns a broken package which
skips the first member "Revision" (ACPI 5.0, Table 10-234).

Add a quirk for this machine to skip the member "Revision" while parsing
the package returned by _BIX.
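
A minimal sketch of the idea, assuming a shortened three-entry field table (the real extended_info_offsets table is longer, and all names below are hypothetical): on a broken machine the parser starts one entry into the table, so the first value in the package is assigned to the second field.

```c
#include <stddef.h>

/* Hypothetical, shortened field list; the real table has more members. */
enum { F_REVISION, F_POWER_UNIT, F_DESIGN_CAPACITY, F_COUNT };

struct battery_sketch {
	int fields[F_COUNT];
};

/* Order in which a well-formed package lists its members. */
static const int field_order_sketch[] = {
	F_REVISION, F_POWER_UNIT, F_DESIGN_CAPACITY
};

/* Copy n package elements into the fields named by order[0..n-1]. */
static void extract_sketch(struct battery_sketch *b, const int *pkg,
			   const int *order, size_t n)
{
	for (size_t i = 0; i < n; i++)
		b->fields[order[i]] = pkg[i];
}

/* With the quirk set, skip the first table entry ("Revision"). */
static void parse_bix_sketch(struct battery_sketch *b, const int *pkg,
			     int broken_package)
{
	size_t n = sizeof(field_order_sketch) / sizeof(field_order_sketch[0]);

	if (broken_package)
		extract_sketch(b, pkg, field_order_sketch + 1, n - 1);
	else
		extract_sketch(b, pkg, field_order_sketch, n);
}
```

The caller must pass a package with at least as many elements as the table entries consumed: two in the broken case, three otherwise.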

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=67351
Reported-and-tested-by: Francisco Castro <f...@adinet.com.uy>
Signed-off-by: Lan Tianyu <tiany...@intel.com>
Reviewed-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
[ luis: backported to 3.11: adjusted context ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/acpi/battery.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index d405fba..cb2ba88 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -68,6 +68,7 @@ MODULE_AUTHOR("Alexey Starikovskiy <astari...@suse.de>");
MODULE_DESCRIPTION("ACPI Battery Driver");
MODULE_LICENSE("GPL");

+static int battery_bix_broken_package;
static unsigned int cache_time = 1000;
module_param(cache_time, uint, 0644);
MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
@@ -443,7 +444,12 @@ static int acpi_battery_get_info(struct acpi_battery *battery)
ACPI_EXCEPTION((AE_INFO, status, "Evaluating %s", name));
return -ENODEV;
}
- if (test_bit(ACPI_BATTERY_XINFO_PRESENT, &battery->flags))
+
+ if (battery_bix_broken_package)
+ result = extract_package(battery, buffer.pointer,
+ extended_info_offsets + 1,
+ ARRAY_SIZE(extended_info_offsets) - 1);
+ else if (test_bit(ACPI_BATTERY_XINFO_PRESENT, &battery->flags))
result = extract_package(battery, buffer.pointer,
extended_info_offsets,
ARRAY_SIZE(extended_info_offsets));
@@ -1064,6 +1070,17 @@ static int battery_notify(struct notifier_block *nb,
return 0;
}

+static struct dmi_system_id bat_dmi_table[] = {
+ {
+ .ident = "NEC LZ750/LS",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "NEC"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "PC-LZ750LS"),
+ },
+ },
+ {},
+};
+
static int acpi_battery_add(struct acpi_device *device)
{
int result = 0;
@@ -1174,6 +1191,8 @@ static void __init acpi_battery_init_async(void *unused, async_cookie_t cookie)
if (!acpi_battery_dir)
return;
#endif
+ if (dmi_check_system(bat_dmi_table))
+ battery_bix_broken_package = 1;
if (acpi_bus_register_driver(&acpi_battery_driver) < 0) {
#ifdef CONFIG_ACPI_PROCFS_POWER
acpi_unlock_battery_dir(acpi_battery_dir);

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Dirk Brandewie <dirk.j.b...@intel.com>

commit 6cbd7ee10e2842a3d1f9b60abede1c8f3d1f1130 upstream.

KVM environments do not support APERF/MPERF MSRs. intel_pstate cannot
operate without these registers.

The previous validity checks in intel_pstate_msrs_not_valid() are
insufficient in nested KVMs.
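
As a rough sketch of the effect (the types and constants below are hypothetical simplifications, not the kernel's x86_cpu_id matching code): once a table entry names a required feature, a CPU of the right model that lacks the feature no longer matches, so the driver does not bind.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature selectors for this sketch. */
#define FEATURE_ANY_SKETCH        0
#define FEATURE_APERFMPERF_SKETCH 1

struct cpu_sketch {
	int model;
	bool has_aperfmperf;	/* false in a guest that hides the MSRs */
};

struct cpu_id_sketch {
	int model;
	int feature;	/* feature the entry requires, or FEATURE_ANY_SKETCH */
};

/* Return true if any table entry matches both the model and its feature. */
static bool match_cpu_sketch(const struct cpu_sketch *cpu,
			     const struct cpu_id_sketch *ids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (ids[i].model != cpu->model)
			continue;
		if (ids[i].feature == FEATURE_APERFMPERF_SKETCH &&
		    !cpu->has_aperfmperf)
			continue;
		return true;
	}
	return false;
}
```

With FEATURE_ANY_SKETCH in the table the guest CPU would still match; requiring the feature makes the match fail before any MSR is touched.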

References: https://bugzilla.redhat.com/show_bug.cgi?id=1046317
Signed-off-by: Dirk Brandewie <dirk.j.b...@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/cpufreq/intel_pstate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 85f02a0..8d10dda 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -520,7 +520,8 @@ static void intel_pstate_timer_func(unsigned long __data)
}

#define ICPU(model, policy) \
- { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&policy }
+ { X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF,\
+ (unsigned long)&policy }

static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
ICPU(0x2a, default_policy),

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christia...@amd.com>

commit 1b3abef830db98c11d7f916a483abaf2501f3323 upstream.

Otherwise we end up with a rather strange looking result.

Signed-off-by: Christian König <christia...@amd.com>
Tested-by: Tom Stellard <thomas....@amd.com>
Signed-off-by: Alex Deucher <alexande...@amd.com>
[ luis: backported to 3.11:
- adjusted filename: cik_sdma.c -> cik.c ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/gpu/drm/radeon/cik.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 4a315da..e4182a6 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -3643,7 +3643,7 @@ int cik_copy_dma(struct radeon_device *rdev,
radeon_ring_write(ring, 0); /* src/dst endian swap */
radeon_ring_write(ring, src_offset & 0xffffffff);
radeon_ring_write(ring, upper_32_bits(src_offset) & 0xffffffff);
- radeon_ring_write(ring, dst_offset & 0xfffffffc);
+ radeon_ring_write(ring, dst_offset & 0xffffffff);
radeon_ring_write(ring, upper_32_bits(dst_offset) & 0xffffffff);
src_offset += cur_size_in_bytes;
dst_offset += cur_size_in_bytes;

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: James Hogan <james...@imgtec.com>

commit 778037e1ccb75609846deca9e419449c1dc137fa upstream.

Commit 6d9252bd9a4bb (clk: Add support for power of two type dividers)
merged in v3.6 added the _get_val function to convert a divisor value to
a register field value depending on the flags. However it used the type
u8 for the div field, causing divisors larger than 255 to be masked
and the resultant clock rate to be too high.

E.g. in my case an 11-bit divider was supposed to divide 24.576 MHz down
to 32.768 kHz. The divisor was correctly calculated as 750 (0x2ee). This
was masked to 238 (0xee), resulting in a frequency of 103.26 kHz.
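
The arithmetic above can be checked with a small user-space sketch. The helper names are hypothetical and only model the parameter types of the old and new _get_val(); they are not the kernel functions themselves.

```c
#include <stdint.h>

/* Models the old u8 parameter: the divisor is narrowed to 8 bits. */
static unsigned int get_val_u8_sketch(unsigned int div)
{
	uint8_t narrowed = (uint8_t)div;	/* 0x2ee becomes 0xee */

	return narrowed;
}

/* Models the fixed unsigned int parameter: the divisor is preserved. */
static unsigned int get_val_uint_sketch(unsigned int div)
{
	return div;
}

/* Output rate in Hz for a parent rate and an integer divisor. */
static unsigned long rate_hz_sketch(unsigned long parent_hz, unsigned int div)
{
	return parent_hz / div;
}
```

Feeding 750 through the 8-bit variant gives 238, and 24576000 / 238 is about 103260 Hz, matching the 103.26 kHz reported above; the unsigned int variant keeps 750 and yields exactly 32768 Hz.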

Signed-off-by: James Hogan <james...@imgtec.com>
Cc: Rajendra Nayak <rna...@ti.com>
Cc: linux-ar...@lists.infradead.org
Signed-off-by: Mike Turquette <mturq...@linaro.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/clk/clk-divider.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index 6d55eb2..ae76b0f 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -87,7 +87,7 @@ static unsigned int _get_table_val(const struct clk_div_table *table,
return 0;
}

-static unsigned int _get_val(struct clk_divider *divider, u8 div)
+static unsigned int _get_val(struct clk_divider *divider, unsigned int div)
{
if (divider->flags & CLK_DIVIDER_ONE_BASED)
return div;

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Nathaniel Yazdani <n1ght....@gmail.com>

commit c338c07c51e3106711fad5eb599e375eadb6855d upstream.

When register_session() is given an out-of-range argument for mds,
ceph_mdsmap_get_addr() will return a null pointer, which would be given to
ceph_con_open() & be dereferenced, causing a kernel oops. This fixes bug #4685
in the Ceph bug tracker <http://tracker.ceph.com/issues/4685>.
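
A simplified sketch of the guard, using hypothetical stand-in types rather than the real Ceph structures: the range check runs before the address lookup, so the caller gets an error code instead of a null pointer to dereference. The negative-index check is an addition of this sketch and is not part of the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the MDS map: a count plus an address table. */
struct mdsmap_sketch {
	int m_max_mds;
	const char *const *addrs;
};

/* Returns NULL for an out-of-range rank, like the lookup described above. */
static const char *mdsmap_get_addr_sketch(const struct mdsmap_sketch *m,
					  int mds)
{
	if (mds < 0 || mds >= m->m_max_mds)
		return NULL;
	return m->addrs[mds];
}

/* Rejects a bad rank up front instead of passing NULL further down. */
static int register_session_sketch(const struct mdsmap_sketch *m, int mds,
				   const char **addr_out)
{
	if (mds < 0 || mds >= m->m_max_mds)
		return -EINVAL;

	*addr_out = mdsmap_get_addr_sketch(m, mds);
	return 0;
}
```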

Signed-off-by: Nathaniel Yazdani <n1ght....@gmail.com>
Reviewed-by: Sage Weil <sa...@inktank.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ceph/mds_client.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 17b97ff..96d03db 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -414,6 +414,9 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc,
{
struct ceph_mds_session *s;

+ if (mds >= mdsc->mdsmap->m_max_mds)
+ return ERR_PTR(-EINVAL);
+
s = kzalloc(sizeof(*s), GFP_NOFS);
if (!s)
return ERR_PTR(-ENOMEM);

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Nat Gurumoorthy <na...@google.com>

commit 388d3335575f4c056dcf7138a30f1454e2145cd8 upstream.

The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
uninitialized. From power on reset this register may have garbage in it. The
Register Base Address register defines the device local address of a
register. The data pointed to by this location is read or written using
the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
garbage any read or write of Register Data Register (PCI 128) will cause the
PCI bus to lock up. The TCO watchdog will fire and bring down the system.

Signed-off-by: Nat Gurumoorthy <na...@google.com>
Acked-by: Michael Chan <mc...@broadcom.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/ethernet/broadcom/tg3.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index e7af885..f9aec90 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -16436,6 +16436,9 @@ static int tg3_get_invariants(struct tg3 *tp, const struct pci_device_id *ent)
/* Clear this out for sanity. */
tw32(TG3PCI_MEM_WIN_BASE_ADDR, 0);

+ /* Clear TG3PCI_REG_BASE_ADDR to prevent hangs. */
+ tw32(TG3PCI_REG_BASE_ADDR, 0);
+
pci_read_config_dword(tp->pdev, TG3PCI_PCISTATE,
&pci_state_reg);
if ((pci_state_reg & PCISTATE_CONV_PCI_MODE) == 0 &&

Luis Henriques

Jan 13, 2014, 11:10:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit af2c1401e6f9177483be4fad876d0073669df9df upstream.

According to documentation on barriers, stores issued before a LOCK can
complete after the lock, implying that it's possible tlb_flush_pending
can be visible after a page table update. As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.
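
A loose user-space analogue using C11 atomics, offered only as a sketch: the struct, the spinlock built from atomic_flag, and the seq_cst fence standing in for smp_mb__before_spinlock() are all assumptions of this example, not the kernel implementation. The point it shows is the placement of a full fence between the flag store and the acquire-only lock operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant parts of struct mm_struct. */
struct mm_sketch {
	atomic_bool tlb_flush_pending;
	atomic_flag page_table_lock;
	int pte;	/* stands in for a page table entry */
};

static void mm_init_sketch(struct mm_sketch *mm)
{
	atomic_init(&mm->tlb_flush_pending, false);
	atomic_flag_clear(&mm->page_table_lock);
	mm->pte = 0;
}

static void set_tlb_flush_pending_sketch(struct mm_sketch *mm)
{
	atomic_store_explicit(&mm->tlb_flush_pending, true,
			      memory_order_relaxed);
	/*
	 * Full fence: an acquire-only lock operation by itself would not
	 * stop the store above from moving into the critical section.
	 */
	atomic_thread_fence(memory_order_seq_cst);
}

static void change_protection_sketch(struct mm_sketch *mm, int new_pte)
{
	set_tlb_flush_pending_sketch(mm);

	/* Acquire the lock, update the "page table", release the lock. */
	while (atomic_flag_test_and_set_explicit(&mm->page_table_lock,
						 memory_order_acquire))
		;
	mm->pte = new_pte;
	atomic_flag_clear_explicit(&mm->page_table_lock,
				   memory_order_release);
}
```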

Signed-off-by: Mel Gorman <mgo...@suse.de>
Acked-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/mm_types.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ef9dc52..e8d77fb 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -477,7 +477,12 @@ static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
static inline void set_tlb_flush_pending(struct mm_struct *mm)
{
mm->tlb_flush_pending = true;
- barrier();
+
+ /*
+ * Guarantee that the tlb_flush_pending store does not leak into the
+ * critical section updating the page tables
+ */
+ smp_mb__before_spinlock();
}
/* Clearing is done after a TLB flush, which also provides a barrier. */
static inline void clear_tlb_flush_pending(struct mm_struct *mm)

Luis Henriques

Jan 13, 2014, 11:10:05 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jiang Liu <jian...@linux.intel.com>

commit df45c712d1f4ef37714245fb75de726f4ca2bf8d upstream.

In function ppi_callback(), memory allocated by acpi_get_name() will get
leaked when the current device isn't the desired TPM device, so fix the
memory leak.
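
The shape of the fix can be sketched in user space, with malloc/free standing in for the ACPI buffer allocation and a counter tracking live buffers; all names here are hypothetical. The buffer is released on both the matching and the non-matching path, and only the return status differs.

```c
#include <stdlib.h>
#include <string.h>

/* Number of name buffers currently allocated (demo bookkeeping only). */
static int live_buffers_sketch;

/* Stand-in for the name lookup: returns a heap copy of the path. */
static char *get_name_sketch(const char *path)
{
	size_t len = strlen(path) + 1;
	char *copy = malloc(len);

	if (copy) {
		memcpy(copy, path, len);
		live_buffers_sketch++;
	}
	return copy;
}

static void put_name_sketch(char *name)
{
	free(name);
	live_buffers_sketch--;
}

/* Returns 1 to stop the walk on a match, 0 to continue. */
static int ppi_callback_sketch(const char *path, const char *wanted)
{
	int status = 0;
	char *name = get_name_sketch(path);

	if (name) {
		if (strstr(name, wanted) != NULL)
			status = 1;
		put_name_sketch(name);	/* freed whether or not it matched */
	}
	return status;
}
```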

Signed-off-by: Jiang Liu <jian...@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/char/tpm/tpm_ppi.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/char/tpm/tpm_ppi.c b/drivers/char/tpm/tpm_ppi.c
index 2168d15..57a818b 100644
--- a/drivers/char/tpm/tpm_ppi.c
+++ b/drivers/char/tpm/tpm_ppi.c
@@ -27,15 +27,18 @@ static char *tpm_device_name = "TPM";
static acpi_status ppi_callback(acpi_handle handle, u32 level, void *context,
void **return_value)
{
- acpi_status status;
+ acpi_status status = AE_OK;
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
- status = acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
- if (strstr(buffer.pointer, context) != NULL) {
- *return_value = handle;
+
+ if (ACPI_SUCCESS(acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer))) {
+ if (strstr(buffer.pointer, context) != NULL) {
+ *return_value = handle;
+ status = AE_CTRL_TERMINATE;
+ }
kfree(buffer.pointer);
- return AE_CTRL_TERMINATE;
}
- return AE_OK;
+
+ return status;
}

static inline void ppi_assign_params(union acpi_object params[4],

Luis Henriques

Jan 13, 2014, 11:10:04 AM
This is the start of the review cycle for the Linux 3.11.10.3 stable kernel.

This version contains 208 new patches, summarized below. The new patches are
posted as replies to this message and also available in this git branch:

http://kernel.ubuntu.com/git?p=ubuntu/linux.git;h=linux-3.11.y-review;a=shortlog

git://kernel.ubuntu.com/ubuntu/linux.git linux-3.11.y-review

The review period for version 3.11.10.3 will be open for the next three days.
To report a problem, please reply to the relevant follow-up patch message.

For more information about the Linux 3.11.y.z extended stable kernel version,
see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable .

-Luis

--
Documentation/kernel-parameters.txt | 2 +
Documentation/networking/packet_mmap.txt | 10 ++
arch/arm/kernel/traps.c | 8 +-
arch/arm/mach-footbridge/dc21285-timer.c | 5 +-
arch/arm/mach-omap2/omap_hwmod.c | 43 +++++++-
arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c | 4 +-
arch/arm/mach-omap2/omap_hwmod_3xxx_data.c | 19 ++--
arch/arm/mm/flush.c | 6 +-
arch/arm64/boot/dts/foundation-v8.dts | 2 +
arch/arm64/include/asm/pgtable.h | 2 +-
arch/arm64/include/asm/syscall.h | 6 ++
arch/arm64/kernel/fpsimd.c | 2 +
arch/arm64/kernel/ptrace.c | 38 ++++---
arch/arm64/mm/proc.S | 4 -
arch/parisc/include/asm/cacheflush.h | 12 +--
arch/parisc/include/asm/page.h | 5 +-
arch/parisc/kernel/cache.c | 35 ------
arch/powerpc/include/asm/exception-64s.h | 2 +-
arch/powerpc/kernel/head_64.S | 1 +
arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 +-
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 +
arch/sh/kernel/sh_ksyms_32.c | 5 +
arch/sh/lib/Makefile | 2 +-
arch/sparc/include/asm/pgtable_64.h | 4 +-
arch/x86/include/asm/fpu-internal.h | 13 +--
arch/x86/include/asm/pgtable.h | 11 +-
arch/x86/kernel/cpu/intel.c | 3 +-
arch/x86/kernel/crash.c | 2 +-
arch/x86/kernel/entry_32.S | 4 +-
arch/x86/kernel/entry_64.S | 2 +-
arch/x86/kernel/reboot.c | 8 +-
arch/x86/kvm/lapic.c | 8 +-
arch/x86/mm/gup.c | 13 +++
drivers/acpi/acpi_lpss.c | 1 +
drivers/acpi/battery.c | 21 +++-
drivers/ata/ahci.c | 3 +
drivers/ata/ahci_imx.c | 3 +-
drivers/ata/libata-core.c | 4 +
drivers/ata/libata-scsi.c | 21 ++++
drivers/block/rbd.c | 95 +++++++++++-----
drivers/char/tpm/tpm_ppi.c | 15 +--
drivers/clk/clk-divider.c | 2 +-
drivers/clocksource/dw_apb_timer_of.c | 7 +-
drivers/clocksource/em_sti.c | 2 +-
drivers/cpufreq/intel_pstate.c | 8 +-
drivers/dma/Kconfig | 1 +
drivers/firewire/sbp2.c | 1 -
drivers/gpio/gpio-msm-v2.c | 4 +-
drivers/gpio/gpio-twl4030.c | 15 ++-
drivers/gpu/drm/drm_edid.c | 8 ++
drivers/gpu/drm/i915/i915_dma.c | 10 ++
drivers/gpu/drm/i915/i915_drv.c | 1 +
drivers/gpu/drm/i915/i915_gem_context.c | 16 ++-
drivers/gpu/drm/i915/intel_display.c | 10 +-
drivers/gpu/drm/nouveau/core/subdev/bios/init.c | 6 +-
drivers/gpu/drm/radeon/atombios_crtc.c | 4 +-
drivers/gpu/drm/radeon/cik.c | 12 +--
drivers/gpu/drm/radeon/evergreen_hdmi.c | 2 +-
drivers/gpu/drm/radeon/ni.c | 20 +++-
drivers/gpu/drm/radeon/radeon_uvd.c | 2 +-
drivers/gpu/drm/radeon/rs690.c | 10 ++
drivers/gpu/drm/radeon/rv770_dpm.c | 6 ++
drivers/gpu/drm/radeon/si.c | 10 +-
drivers/idle/intel_idle.c | 3 +
drivers/iio/adc/ad7887.c | 16 ++-
drivers/iio/imu/adis16400_core.c | 7 +-
drivers/infiniband/hw/qib/qib_user_sdma.c | 6 +-
drivers/infiniband/ulp/isert/ib_isert.c | 16 ++-
drivers/input/input.c | 4 +
drivers/input/touchscreen/usbtouchscreen.c | 22 +++-
drivers/md/bcache/btree.c | 3 +-
drivers/media/dvb-frontends/cxd2820r_core.c | 4 +-
drivers/mfd/rtsx_pcr.c | 10 +-
drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 3 +
drivers/net/ethernet/broadcom/tg3.c | 5 +-
drivers/net/ethernet/freescale/fec_main.c | 4 +-
drivers/net/ethernet/ibm/ehea/ehea_main.c | 2 +-
drivers/net/ethernet/tehuti/tehuti.c | 1 -
drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +-
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 +-
drivers/net/hamradio/hdlcdrv.c | 2 +
drivers/net/hamradio/yam.c | 1 +
drivers/net/hyperv/netvsc_drv.c | 1 -
drivers/net/macvtap.c | 23 ++--
drivers/net/tun.c | 2 +
drivers/net/usb/dm9601.c | 34 ++++--
drivers/net/virtio_net.c | 119 +++++++++++++++------
drivers/net/wireless/ath/ath9k/ar9002_mac.c | 52 +++++++--
drivers/net/wireless/ath/ath9k/htc_drv_main.c | 25 +++--
drivers/net/wireless/ath/ath9k/main.c | 5 +-
drivers/net/wireless/rtlwifi/pci.c | 4 +-
drivers/net/xen-netback/interface.c | 7 +-
drivers/of/address.c | 8 --
drivers/pinctrl/pinctrl-baytrail.c | 1 +
drivers/platform/x86/dell-wmi.c | 7 +-
drivers/s390/char/tty3270.c | 2 +-
drivers/scsi/qla2xxx/qla_target.c | 9 +-
drivers/staging/comedi/drivers.c | 2 +-
drivers/staging/comedi/drivers/8255_pci.c | 15 ++-
drivers/staging/comedi/drivers/amplc_pc263.c | 3 +
drivers/staging/comedi/drivers/amplc_pci263.c | 3 +
drivers/staging/comedi/drivers/pcmuio.c | 11 +-
drivers/staging/comedi/drivers/ssv_dnp.c | 6 +-
drivers/target/iscsi/iscsi_target.c | 26 +++--
drivers/target/target_core_device.c | 5 +
drivers/target/target_core_file.c | 8 +-
drivers/target/target_core_file.h | 5 +-
drivers/tty/serial/8250/8250_dw.c | 2 +
drivers/tty/serial/pmac_zilog.c | 3 +
drivers/usb/class/cdc-wdm.c | 8 +-
drivers/usb/host/xhci-pci.c | 29 +++++
drivers/usb/host/xhci.c | 14 ++-
drivers/usb/host/xhci.h | 2 +
drivers/usb/serial/option.c | 2 +
drivers/usb/serial/zte_ev.c | 3 +-
drivers/watchdog/sc1200wdt.c | 3 +-
fs/btrfs/acl.c | 2 +-
fs/btrfs/inode.c | 47 ++++----
fs/btrfs/tree-log.c | 2 +-
fs/btrfs/volumes.c | 2 +
fs/ceph/addr.c | 8 +-
fs/ceph/file.c | 47 ++++----
fs/ceph/ioctl.c | 8 +-
fs/ceph/mds_client.c | 11 +-
fs/cifs/cifsproto.h | 7 +-
fs/cifs/inode.c | 6 +-
fs/cifs/link.c | 26 ++---
fs/ext2/super.c | 1 +
fs/ext4/ext4.h | 10 ++
fs/ext4/ext4_jbd2.c | 9 ++
fs/ext4/extents.c | 45 +++++---
fs/ext4/inode.c | 12 ---
fs/ext4/mballoc.c | 21 ++--
fs/ext4/super.c | 21 ++--
fs/gfs2/aops.c | 30 ++++++
fs/gfs2/ops_fstype.c | 12 ++-
fs/jbd2/transaction.c | 6 +-
fs/xfs/xfs_qm.c | 71 ++++++++----
include/asm-generic/pgtable.h | 2 +-
include/drm/drm_pciids.h | 2 +-
include/linux/auxvec.h | 2 +-
include/linux/ceph/osd_client.h | 2 +
include/linux/migrate.h | 10 +-
include/linux/mm_types.h | 49 +++++++++
include/linux/net.h | 2 +-
include/linux/netdevice.h | 9 ++
include/linux/reboot.h | 1 +
include/linux/skbuff.h | 5 +
include/target/target_core_base.h | 1 +
kernel/fork.c | 1 +
kernel/freezer.c | 6 ++
kernel/kexec.c | 1 +
kernel/reboot.c | 2 +-
kernel/sched/core.c | 9 +-
kernel/sched/fair.c | 55 +++++++---
kernel/sched/rt.c | 14 +++
kernel/sched/sched.h | 3 +-
kernel/trace/ftrace.c | 2 +-
mm/compaction.c | 4 +
mm/fremap.c | 8 +-
mm/huge_memory.c | 49 +++++++--
mm/memcontrol.c | 2 +-
mm/memory-failure.c | 24 ++++-
mm/migrate.c | 69 ++++++++++--
mm/mprotect.c | 13 ++-
mm/pgtable-generic.c | 8 +-
mm/rmap.c | 4 +
net/8021q/vlan_dev.c | 19 +++-
net/bridge/br_multicast.c | 4 +-
net/ceph/osd_client.c | 25 +++--
net/core/drop_monitor.c | 1 -
net/core/neighbour.c | 2 +-
net/core/netpoll.c | 11 +-
net/core/skbuff.c | 1 +
net/core/sock.c | 2 +-
net/ipv4/inet_diag.c | 16 +++
net/ipv4/ip_gre.c | 1 +
net/ipv6/route.c | 34 +++---
net/ipv6/udp_offload.c | 2 +-
net/llc/af_llc.c | 5 +-
net/mac80211/tx.c | 23 ++--
net/packet/af_packet.c | 65 ++++++-----
net/rds/ib.c | 3 +-
net/rds/ib_send.c | 5 +-
net/rose/af_rose.c | 16 +--
net/unix/af_unix.c | 16 ++-
net/wireless/radiotap.c | 4 +
scripts/link-vmlinux.sh | 4 +-
security/selinux/hooks.c | 73 ++++++++++---
security/selinux/include/objsec.h | 5 +-
security/selinux/include/xfrm.h | 9 +-
security/selinux/xfrm.c | 53 +++++++--
sound/core/pcm_lib.c | 2 +
sound/pci/hda/hda_intel.c | 4 +
sound/pci/hda/patch_realtek.c | 3 +
sound/soc/codecs/wm5110.c | 2 +-
sound/soc/codecs/wm8904.c | 2 +-
sound/soc/codecs/wm_adsp.c | 10 +-
sound/soc/tegra/tegra20_i2s.c | 6 +-
sound/soc/tegra/tegra20_spdif.c | 10 +-
sound/soc/tegra/tegra30_i2s.c | 6 +-
tools/power/cpupower/utils/cpupower-set.c | 6 +-
202 files changed, 1690 insertions(+), 696 deletions(-)

AKASHI Takahiro (1):
arm64: check for number of arguments in syscall_get/set_arguments()

Al Viro (1):
ext4: fix del_timer() misuse for ->s_err_report

Alan (1):
sc1200_wdt: Fix oops

Alex Deucher (6):
drm/radeon: Fix sideport problems on certain RS690 boards
drm/radeon: add missing display tiling setup for oland
drm/radeon/dpm: disable ss on Cayman
drm/radeon: check for 0 count in speaker allocation and SAD code
drm/radeon: fix asic gfx values for scrapper asics
drm/radeon: 0x9649 is SUMO2 not SUMO

Alex Hung (1):
dell-wmi: Add KEY_MICMUTE to bios_to_linux_keycode

Andrey Vagin (1):
virtio: delete napi structures from netdev before releasing memory

Anton Blanchard (1):
powerpc: Align p_end

Ard Biesheuvel (1):
auxvec.h: account for AT_HWCAP2 in AT_VECTOR_SIZE_BASE

Ben Segall (3):
sched: Fix race on toggling cfs_bandwidth_used
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
sched: Fix hrtimer_cancel()/rq->lock deadlock

Bjørn Mork (1):
usb: cdc-wdm: manage_power should always set needs_remote_wakeup

Bo Shen (1):
ASoC: wm8904: fix DSP mode B configuration

Catalin Marinas (3):
arm64: dts: Reserve the memory used for secondary CPU release address
arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S
arm64: Use Normal NonCacheable memory for writecombine

Chad Hanson (1):
selinux: fix broken peer recv check

Changli Gao (1):
net: drop_monitor: fix the value of maxattr

Charles Keepax (2):
ASoC: wm5110: Correct HPOUT3 DAPM route typo
ASoC: wm_adsp: Add small delay while polling DSP RAM start

Chris Wilson (3):
drm/i915: Do not clobber config status after a forced restore of hw state
drm/i915: Hold mutex across i915_gem_release
drm/i915: Use the correct GMCH_CTRL register for Sandybridge+

Christian Engelmayer (1):
Input: usbtouchscreen - separate report and transmit buffer size handling

Christian König (2):
drm/radeon: fix typo in cik_copy_dma
drm/radeon: fix UVD 256MB check

Curt Brune (1):
bridge: use spin_lock_bh() in br_multicast_set_hash_max

Dan Carpenter (4):
ceph: cleanup types in striped_read()
libceph: fix error handling in handle_reply()
libceph: potential NULL dereference in ceph_osdc_handle_map()
libceph: create_singlethread_workqueue() doesn't return ERR_PTRs

Dan Williams (1):
net_dma: mark broken

Daniel Borkmann (3):
packet: fix send path when running with proto == 0
net: inet_diag: zero out uninitialized idiag_{src,dst} fields
net: llc: fix use after free in llc_ui_recvmsg

Daniel Vetter (2):
drm/i915: Fix use-after-free in do_switch
drm/i915: don't update the dri1 breadcrumb with modesetting

David Henningsson (1):
ALSA: hda - Add enable_msi=0 workaround for four HP machines

David S. Miller (2):
vlan: Fix header ops passthru when doing TX VLAN offload.
netpoll: Fix missing TXQ unlock and and OOPS.

Dinh Nguyen (2):
clocksource: dw_apb_timer_of: Fix read_sched_clock
clocksource: dw_apb_timer_of: Fix support for dts binding "snps,dw-apb-timer"

Dirk Brandewie (1):
intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.

Dmitry Kunilov (1):
usb: serial: zte_ev: move support for ZTE AC2726 from zte_ev back to option

Dmitry Torokhov (1):
Input: allocate absinfo data when setting ABS capability

Eric Dumazet (2):
net: do not pretend FRAGLIST support
net: fec: fix potential use after free

Eric Whitney (1):
ext4: fix bigalloc regression

Eryu Guan (1):
ext4: check for overlapping extents in ext4_valid_extent_entries()

Felix Fietkau (1):
mac80211: move "bufferable MMPDU" check to fix AP mode scan

Fenghua Yu (1):
x86/apic: Disable I/O APIC before shutdown of the local APIC

Filipe David Borba Manana (1):
Btrfs: fix incorrect inode acl reset

Florian Westphal (1):
net: rose: restore old recvmsg behavior

Geert Uytterhoeven (2):
TTY: pmac_zilog, check existence of ports in pmz_console_init()
sh: always link in helper functions extracted from libgcc

H Hartley Sweeten (1):
staging: comedi: drivers: fix return value of comedi_load_firmware()

Hannes Frederic Sowa (3):
net: clear local_df when passing skb between namespaces
ipv6: don't count addrconf generated routes against gc limit
ipv6: fix illegal mac_header comparison on 32bit

Hans Verkuil (1):
[media] cxd2820r_core: fix sparse warnings

Hui Wang (1):
ALSA: hda - Add Dell headset detection quirk for three laptop models

Ian Abbott (4):
staging: comedi: pcmuio: fix possible NULL deref on detach
staging: comedi: ssv_dnp: use comedi_dio_update_state()
staging: comedi: drivers: use comedi_dio_update_state() for simple cases
staging: comedi: 8255_pci: fix for newer PCI-DIO48H

Ilia Mirkin (1):
drm/nouveau/bios: make jump conditional

James Hogan (1):
clk: clk-divider: fix divisor > 255 bug

Jan Kara (4):
IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast()
ext2: Fix oops in ext2_get_block() called from ext2_quota_write()
ext4: Do not reserve clusters when fs doesn't support extents
ext4: fix deadlock when writing in ENOSPC conditions

Jan Kiszka (1):
KVM: x86: Fix APIC map calculation after re-enabling

Jason Wang (3):
macvtap: signal truncated packets
netvsc: don't flush peers notifying work during setting mtu
virtio-net: fix refill races during restore

Jiang Liu (2):
arm64: fix possible invalid FPSIMD initialization state
ACPI / TPM: fix memory leak when walking ACPI namespace

Jianguo Wu (2):
mm/hugetlb: check for pte NULL pointer in __page_check_address()
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully

Jie Liu (1):
xfs: fix infinite loop by detaching the group/project hints from user dquot

Johannes Berg (1):
radiotap: fix bitmap-end-finding buffer overrun

John David Anglin (1):
parisc: Ensure full cache coherency for kmap/kunmap

Jonathan Cameron (2):
iio:imu:adis16400 fix pressure channel scan type
iio:adc:ad7887 Fix channel reported endianness from cpu to big endian

JongHo Kim (1):
ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function

Joonsoo Kim (1):
mm/compaction: respect ignore_skip_hint in update_pageblock_skip

Josef Bacik (1):
Btrfs: fix hole check in log_one_extent

Josh Boyer (1):
cpupower: Fix segfault due to incorrect getopt_long arugments

Josh Durgin (8):
rbd: fix buffer size for writes to images with snapshots
rbd: fix null dereference in dout
libceph: add function to ensure notifies are complete
rbd: complete notifies before cleaning up osd_client and rbd_dev
rbd: make rbd_obj_notify_ack() synchronous
rbd: fix use-after free of rbd_dev->disk
rbd: ignore unmapped snapshots that no longer exist
rbd: fix error handling from rbd_snap_name()

Junho Ryu (1):
ext4: fix use-after-free in ext4_mb_new_blocks

Kamala R (1):
IPv6: Fixed support for blackhole and prohibit routes

Kent Overstreet (1):
bcache: Fix dirty_data accounting

Kirill Tkhai (1):
sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities

Lan Tianyu (1):
ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS

Larry Finger (1):
rtlwifi: pci: Fix oops on driver unload

Len Brown (1):
x86 idle: Repair large-server 50-watt idle-power regression

Li RongQing (1):
ipv6: always set the new created dst's from in ip6_rt_copy

Li Wang (1):
ceph: Avoid data inconsistency due to d-cache aliasing in readpage()

Linus Torvalds (1):
x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround

Liu Bo (2):
Btrfs: fix memory leak of chunks' extent map
Btrfs: do not run snapshot-aware defragment on error

Lukas Czerner (1):
ext4: fix FITRIM in no journal mode

Magnus Damm (1):
clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast

Marc Kleine-Budde (1):
can: peak_usb: fix mem leak in pcan_usb_pro_init()

Marek Olšák (1):
drm/radeon: fix render backend setup for SI and CIK

Marek Vasut (1):
ahci: imx: Explicitly clear IMX6Q_GPR13_SATA_MPLL_CLK_EN

Martin Schwidefsky (1):
s390/3270: fix allocation of tty3270_screen structure

Mathy Vanhoef (1):
ath9k_htc: properly set MAC address and BSSID mask

Mel Gorman (12):
mm: numa: serialise parallel get_user_page against THP migration
mm: numa: call MMU notifiers on THP migration
mm: clear pmd_numa before invalidating
mm: numa: do not clear PMD during PTE update scan
mm: numa: do not clear PTE for pte_numa update
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
mm: numa: avoid unnecessary work on the failure path
sched: numa: skip inaccessible VMAs
mm: numa: clear numa hinting information on mprotect
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
mm: numa: defer TLB flush for THP migration as long as possible

Miao Xie (1):
ftrace: Initialize the ftrace profiler for each possible cpu

Michael Neuling (1):
powerpc: Fix bad stack check in exception entry

Michael S. Tsirkin (3):
virtio_net: fix error handling for mergeable buffers
virtio-net: make all RX paths handle errors consistently
virtio_net: don't leak memory or block when too many frags

Michele Baldessari (1):
libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for Seagate Momentus SpinPoint M8

Mika Westerberg (1):
serial: 8250_dw: add new ACPI IDs

Ming Lei (1):
scripts/link-vmlinux.sh: only filter kernel symbols for arm

Naoya Horiguchi (1):
mm/memory-failure.c: transfer page count from head page to tail page after split thp

Nat Gurumoorthy (1):
tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0

Nathaniel Yazdani (1):
ceph: fix null pointer dereference

Nicholas Bellinger (2):
iscsi-target: Fix-up all zero data-length CDBs with R/W_BIT set
target/file: Update hw_max_sectors based on current block_size

Nithin Sujir (1):
tg3: Expand 4g_overflow_test workaround to skb fragments of any size.

Nobuhiro Iwamatsu (1):
sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c

Oleg Nesterov (1):
selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()

Oliver Neukum (1):
xhci: quirk for extra long delay for S4

Paul Drews (1):
ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs

Paul Moore (2):
selinux: look for IPsec labels on both inbound and outbound packets
selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()

Paul Turner (1):
sched: Guarantee new group-entities always have weight

Peter Korsgaard (2):
dm9601: fix reception of full size ethernet frames on dm9620/dm9621a
dm9601: work around tx fifo sync issue on dm962x

Rafael J. Wysocki (1):
intel_pstate: Fail initialization if P-state information is missing

Rafał Miłecki (1):
drm/edid: add quirk for BPC in Samsung NP700G7A-S01PL notebook

Rik van Riel (2):
mm: fix use-after-free in sys_remap_file_pages
mm: fix TLB flush race between migration, and change_protection_range

Rob Herring (1):
Revert "of/address: Handle #address-cells > 2 specially"

Robin H. Johnson (1):
libata: disable a disk via libata.force params

Roger Quadros (3):
ARM: OMAP3: hwmod data: Don't prevent RESET of USB Host module
ARM: OMAP2+: hwmod: Fix SOFTRESET logic
gpio: twl4030: Fix regression for twl gpio LED output

Russell King (2):
ARM: fix footbridge clockevent device
ARM: fix "bad mode in ... handler" message for undefined instructions

Sachin Prabhu (1):
cifs: We do not drop reference to tlink in CIFSCheckMFSymlink()

Salva Peiró (1):
hamradio/yam: fix info leak in ioctl

Sasha Levin (3):
net: unix: allow set_peek_off to fail
net: unix: allow bind to fail on mutex lock
rds: prevent dereference of a NULL device

Shivaram Upadhyayula (1):
qla2xxx: Fix schedule_delayed_work() for target timeout calculations

Simon Guinot (1):
ahci: add PCI ID for Marvell 88SE9170 SATA controller

Stefan Richter (1):
firewire: sbp2: bring back WRITE SAME support

Stephen Boyd (1):
gpio: msm: Fix irq mask/unmask by writing bits instead of numbers

Stephen Warren (1):
ASoC: tegra: fix uninitialized variables in set_fmt

Steven Capper (1):
ARM: 7923/1: mm: fix dcache flush logic for compound high pages

Steven Rostedt (2):
ftrace/x86: Load ftrace_ops in parameter not the variable holding it
SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

Steven Whitehouse (2):
GFS2: don't hold s_umount over blkdev_put
GFS2: Fix incorrect invalidation for DIO/buffered I/O

Sujith Manoharan (1):
ath9k: Fix interrupt handling for the AR9002 family

Suman Anna (1):
ARM: OMAP2+: hwmod_data: fix missing OMAP_INTC_START in irq data

Takashi Iwai (2):
xhci: Fix spurious wakeups after S5 on Haswell
xhci: Limit the spurious wakeup fix only to HP machines

Tejun Heo (1):
libata, freezer: avoid block device removal while system is frozen

Theodore Ts'o (3):
ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails
ext4: add explicit casts when masking cluster sizes
jbd2: don't BUG but return ENOSPC if a handle runs out of space

Thomas Gleixner (1):
mfd: rtsx_pcr: Disable interrupts before cancelling delayed works

Timo Teräs (1):
ip_gre: fix msg_name parsing for recvfrom/recvmsg

Venkat Venkatsubra (1):
rds: prevent BUG_ON triggered on congestion update to loopback

Ville Syrjälä (1):
drm/i915: Take modeset locks around intel_modeset_setup_hw_state()

Vivek Goyal (1):
kexec: migrate to reboot cpu

Vlad Yasevich (1):
macvtap: Do not double-count received packets

Vladimir Davydov (1):
memcg: fix memcg_size() calculation

Wei Liu (1):
xen-netback: fix refcnt unbalance for 3.11 and earlier versions

Wei Yongjun (1):
iser-target: fix error return code in isert_create_device_ib_res()

Wenliang Fan (1):
drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()

Will Deacon (1):
arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events

Yan, Zheng (2):
ceph: cleanup aborted requests when re-sending requests.
ceph: wake up 'safe' waiters when unregistering request

Zhi Yong Wu (2):
macvtap: update file current position
tun: update file current position

majianpeng (3):
ceph: Add check returned value on func ceph_calc_ceph_pg.
ceph: fix bugs about handling short-read for sync read mode.
ceph: allow sync_read/write return partial successed size of read/write.

pingfan liu (1):
powerpc: kvm: fix rare but potential deadlock scene

Luis Henriques

Jan 13, 2014, 11:10:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit eb4489f69f224356193364dc2762aa009738ca7f upstream.

If a PMD changes during a THP migration then migration aborts but the
failure path is doing more work than is necessary.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/migrate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 65bbca5..cde4b34 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1723,7 +1723,8 @@ fail_putback:
putback_lru_page(page);
mod_zone_page_state(page_zone(page),
NR_ISOLATED_ANON + page_lru, -HPAGE_PMD_NR);
- goto out_fail;
+
+ goto out_unlock;
}

/*
@@ -1797,6 +1798,7 @@ out_dropref:
}
spin_unlock(&mm->page_table_lock);

+out_unlock:
unlock_page(page);
put_page(page);
return 0;
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <da...@davemloft.net>

commit aca5f58f9ba803ec8c2e6bcf890db17589e8dfcc upstream.

The VLAN tag handling code in netpoll_send_skb_on_dev() has two problems.

1) It exits without unlocking the TXQ.

2) It then tries to queue a NULL skb to npinfo->txq.

Reported-by: Ahmed Tamrawi <atam...@iastate.edu>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/core/netpoll.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index fc75c9e..0c1482c 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -386,8 +386,14 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
!vlan_hw_offload_capable(netif_skb_features(skb),
skb->vlan_proto)) {
skb = __vlan_put_tag(skb, skb->vlan_proto, vlan_tx_tag_get(skb));
- if (unlikely(!skb))
- break;
+ if (unlikely(!skb)) {
+ /* This is actually a packet drop, but we
+ * don't want the code at the end of this
+ * function to try and re-queue a NULL skb.
+ */
+ status = NETDEV_TX_OK;
+ goto unlock_txq;
+ }
skb->vlan_tci = 0;
}

@@ -395,6 +401,7 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
if (status == NETDEV_TX_OK)
txq_trans_update(txq);
}
+ unlock_txq:
__netif_tx_unlock(txq);

if (status == NETDEV_TX_OK)
--
1.8.3.2
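
Both problems listed in the patch description above come from leaving a
locked region with "break" instead of going through the unlock path. The
following is a minimal user-space sketch of the single-exit pattern the fix
adopts, not the kernel code itself. The names sketch_txq, sketch_tag() and
sketch_send() are hypothetical, and a plain integer stands in for the TXQ
lock:

```c
#include <stddef.h>

/* Hypothetical stand-in for a TX queue: a lock flag plus a counter. */
struct sketch_txq {
    int locked; /* 1 while the queue lock is held */
    int sent;   /* number of packets handed to the "hardware" */
};

static void sketch_lock(struct sketch_txq *q)   { q->locked = 1; }
static void sketch_unlock(struct sketch_txq *q) { q->locked = 0; }

/* Hypothetical tagging step that may fail and return NULL,
 * loosely mirroring how __vlan_put_tag() can return NULL. */
static const char *sketch_tag(const char *pkt, int fail)
{
    return fail ? NULL : pkt;
}

/* Returns 0 when the packet was sent or dropped. Every exit path
 * goes through the unlock label, so the queue is never left locked. */
static int sketch_send(struct sketch_txq *q, const char *pkt, int fail_tag)
{
    int status = -1;

    sketch_lock(q);

    pkt = sketch_tag(pkt, fail_tag);
    if (pkt == NULL) {
        /* A drop: report success so the caller does not try to
         * re-queue a NULL packet, then fall through to the unlock. */
        status = 0;
        goto unlock;
    }

    q->sent++;
    status = 0;

unlock:
    sketch_unlock(q);
    return status;
}
```

Calling sketch_send() with fail_tag set to 1 models the failed tagging case:
the call returns 0, nothing is counted as sent, and the lock flag is clear
afterwards.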

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit c3a489cac38d43ea6dc4ac240473b44b46deecf7 upstream.

The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/huge_memory.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ac2546c..0db1517 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1354,6 +1354,13 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
goto out_unlock;
}

+ /* Bail if we fail to protect against THP splits for any reason */
+ if (unlikely(!anon_vma)) {
+ put_page(page);
+ page_nid = -1;
+ goto clear_pmdnuma;
+ }
+
/*
* Migrate the THP to the requested node, returns with page unlocked
* and pmd_numa cleared.
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: John David Anglin <dave....@bell.net>

commit f8dae00684d678afa13041ef170cecfd1297ed40 upstream.

A few weeks ago, Helge Deller noted problems with the AIO support on
parisc. This change is the result of numerous iterations on how best to
deal with this problem.

The solution adopted here is to provide full cache coherency in a
uniform manner on all parisc systems. This involves calling
flush_dcache_page() on kmap operations and flush_kernel_dcache_page() on
kunmap operations. As a result, the copy_user_page() and
clear_user_page() functions can be removed and the overall code is
simpler.

The change ensures that both userspace and kernel aliases to a mapped
page are invalidated and flushed. This is necessary for the correct
operation of PA8800 and PA8900 based systems which do not support
inequivalent aliases.

With this change, I have observed no cache related issues on c8000 and
rp3440. It is now possible for example to do kernel builds with "-j64"
on four way systems.

On systems using XFS file systems, the patch recently posted by Mikulas
Patocka to "fix crash using XFS on loopback" is needed to avoid a hang
caused by an uninitialized lock passed to flush_dcache_page() in the
page struct.

Signed-off-by: John David Anglin <dave....@bell.net>
Signed-off-by: Helge Deller <del...@gmx.de>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/parisc/include/asm/cacheflush.h | 12 ++++--------
arch/parisc/include/asm/page.h | 5 ++---
arch/parisc/kernel/cache.c | 35 -----------------------------------
3 files changed, 6 insertions(+), 46 deletions(-)

diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index f0e2784..2f9b751 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -125,42 +125,38 @@ flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vma
void mark_rodata_ro(void);
#endif

-#ifdef CONFIG_PA8X00
-/* Only pa8800, pa8900 needs this */
-
#include <asm/kmap_types.h>

#define ARCH_HAS_KMAP

-void kunmap_parisc(void *addr);
-
static inline void *kmap(struct page *page)
{
might_sleep();
+ flush_dcache_page(page);
return page_address(page);
}

static inline void kunmap(struct page *page)
{
- kunmap_parisc(page_address(page));
+ flush_kernel_dcache_page_addr(page_address(page));
}

static inline void *kmap_atomic(struct page *page)
{
pagefault_disable();
+ flush_dcache_page(page);
return page_address(page);
}

static inline void __kunmap_atomic(void *addr)
{
- kunmap_parisc(addr);
+ flush_kernel_dcache_page_addr(addr);
pagefault_enable();
}

#define kmap_atomic_prot(page, prot) kmap_atomic(page)
#define kmap_atomic_pfn(pfn) kmap_atomic(pfn_to_page(pfn))
#define kmap_atomic_to_page(ptr) virt_to_page(ptr)
-#endif

#endif /* _PARISC_CACHEFLUSH_H */

diff --git a/arch/parisc/include/asm/page.h b/arch/parisc/include/asm/page.h
index b7adb2a..c53fc63 100644
--- a/arch/parisc/include/asm/page.h
+++ b/arch/parisc/include/asm/page.h
@@ -28,9 +28,8 @@ struct page;

void clear_page_asm(void *page);
void copy_page_asm(void *to, void *from);
-void clear_user_page(void *vto, unsigned long vaddr, struct page *pg);
-void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
- struct page *pg);
+#define clear_user_page(vto, vaddr, page) clear_page_asm(vto)
+#define copy_user_page(vto, vfrom, vaddr, page) copy_page_asm(vto, vfrom)

/* #define CONFIG_PARISC_TMPALIAS */

diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index c035673..a725455 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -388,41 +388,6 @@ void flush_kernel_dcache_page_addr(void *addr)
}
EXPORT_SYMBOL(flush_kernel_dcache_page_addr);

-void clear_user_page(void *vto, unsigned long vaddr, struct page *page)
-{
- clear_page_asm(vto);
- if (!parisc_requires_coherency())
- flush_kernel_dcache_page_asm(vto);
-}
-EXPORT_SYMBOL(clear_user_page);
-
-void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
- struct page *pg)
-{
- /* Copy using kernel mapping. No coherency is needed
- (all in kmap/kunmap) on machines that don't support
- non-equivalent aliasing. However, the `from' page
- needs to be flushed before it can be accessed through
- the kernel mapping. */
- preempt_disable();
- flush_dcache_page_asm(__pa(vfrom), vaddr);
- preempt_enable();
- copy_page_asm(vto, vfrom);
- if (!parisc_requires_coherency())
- flush_kernel_dcache_page_asm(vto);
-}
-EXPORT_SYMBOL(copy_user_page);
-
-#ifdef CONFIG_PA8X00
-
-void kunmap_parisc(void *addr)
-{
- if (parisc_requires_coherency())
- flush_kernel_dcache_page_addr(addr);
-}
-EXPORT_SYMBOL(kunmap_parisc);
-#endif
-
void purge_tlb_entries(struct mm_struct *mm, unsigned long addr)
{
unsigned long flags;
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "Robin H. Johnson" <rob...@gentoo.org>

commit b8bd6dc36186fe99afa7b73e9e2d9a98ad5c4865 upstream.

A user on StackExchange had a failing SSD that's soldered directly
onto the motherboard of his system. The BIOS does not give any option
to disable it at all, so he can't just hide it from the OS via the
BIOS.

The old IDE layer had hdX=noprobe override for situations like this,
but that was never ported to the libata layer.

This patch implements a disable flag for libata.force.

Example use:

libata.force=2.0:disable

[v2 of the patch, removed the nodisable flag per Tejun Heo]

Signed-off-by: Robin H. Johnson <rob...@gentoo.org>
Signed-off-by: Tejun Heo <t...@kernel.org>
Link: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk
Link: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co
Link: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
Documentation/kernel-parameters.txt | 2 ++
drivers/ata/libata-core.c | 1 +
2 files changed, 3 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7f9d4f5..205cc23 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1460,6 +1460,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

* atapi_dmadir: Enable ATAPI DMADIR bridge support

+ * disable: Disable this device.
+
If there are multiple matching configurations changing
the same attribute, the last one is used.

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index bcdbcff..bfd727a 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6509,6 +6509,7 @@ static int __init ata_parse_force_one(char **cur,
{ "norst", .lflags = ATA_LFLAG_NO_HRST | ATA_LFLAG_NO_SRST },
{ "rstonce", .lflags = ATA_LFLAG_RST_ONCE },
{ "atapi_dmadir", .horkage_on = ATA_HORKAGE_ATAPI_DMADIR },
+ { "disable", .horkage_on = ATA_HORKAGE_DISABLE },
};
char *start = *cur, *p = *cur;
char *id, *val, *endp;
--
1.8.3.2
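
For readers unfamiliar with the parameter syntax, the example value
"2.0:disable" can be read as port 2, device 0, value "disable". The sketch
below splits such a string in user space. It is a simplified illustration
that assumes only the PORT.DEVICE:VAL form, not the parser in
ata_parse_force_one(); sketch_force and sketch_parse_force() are
hypothetical names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed force entry. */
struct sketch_force {
    int port;
    int device;
    char val[32];
};

/* Parses "PORT.DEVICE:VAL" (for example "2.0:disable").
 * Returns 0 on success and -1 if the string does not match that form.
 * Other forms accepted by the real parameter are not handled here. */
static int sketch_parse_force(const char *s, struct sketch_force *e)
{
    e->port = -1;
    e->device = -1;
    e->val[0] = '\0';

    if (sscanf(s, "%d.%d:%31s", &e->port, &e->device, e->val) != 3)
        return -1;
    return 0;
}
```

With the input "2.0:disable" the function fills in port 2, device 0 and the
value string "disable"; a bare "disable" is rejected because this sketch
requires the ID part.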

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Magnus Damm <da...@opensource.se>

commit 2199a5574b6d94b9ca26c6345356f45ec60fef8b upstream.

Update the STI driver by setting cpu_possible_mask to make EMEV2
SMP work as expected together with the ARM broadcast timer.

This breakage was introduced by:

f7db706 ARM: 7674/1: smp: Avoid dummy clockevent being preferred over real hardware clock-event

Without this fix SMP operation is broken on EMEV2 since no
broadcast timer interrupts trigger on the secondary CPU cores.

Signed-off-by: Magnus Damm <da...@opensource.se>
Tested-by: Simon Horman <horms+...@verge.net.au>
Reviewed-by: Stephen Boyd <sb...@codeaurora.org>
Signed-off-by: Simon Horman <horms+...@verge.net.au>
Signed-off-by: Daniel Lezcano <daniel....@linaro.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/clocksource/em_sti.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clocksource/em_sti.c b/drivers/clocksource/em_sti.c
index 4329a29..3141849 100644
--- a/drivers/clocksource/em_sti.c
+++ b/drivers/clocksource/em_sti.c
@@ -301,7 +301,7 @@ static void em_sti_register_clockevent(struct em_sti_priv *p)
ced->name = dev_name(&p->pdev->dev);
ced->features = CLOCK_EVT_FEAT_ONESHOT;
ced->rating = 200;
- ced->cpumask = cpumask_of(0);
+ ced->cpumask = cpu_possible_mask;
ced->set_next_event = em_sti_clock_event_next;
ced->set_mode = em_sti_clock_event_mode;

--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

commit 7a2a84518cfb263d2c4171b3d63671f88316adb2 upstream.

skb_tx_timestamp(skb) should be called _before_ TX completion
has a chance to trigger, otherwise it is too late and we access
freed memory.

Signed-off-by: Eric Dumazet <edum...@google.com>
Fixes: de5fb0a05348 ("net: fec: put tx to napi poll function to fix dead lock")
Cc: Frank Li <Fran...@freescale.com>
Cc: Richard Cochran <richard...@gmail.com>
Acked-by: Richard Cochran <richard...@gmail.com>
Acked-by: Frank Li <Fran...@freescale.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/ethernet/freescale/fec_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index c610a27..2cb8401 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -394,6 +394,8 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
else
bdp = fec_enet_get_nextdesc(bdp, fep->bufdesc_ex);

+ skb_tx_timestamp(skb);
+
fep->cur_tx = bdp;

if (fep->cur_tx == fep->dirty_tx)
@@ -402,8 +404,6 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
/* Trigger transmission start */
writel(0, fep->hwp + FEC_X_DES_ACTIVE);

- skb_tx_timestamp(skb);
-
return NETDEV_TX_OK;
}

--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jason Wang <jaso...@redhat.com>

commit 6cd4ce0099da7702f885b6fa9ebb49e3831d90b4 upstream.

During restore, try_fill_recv() was called without holding the napi lock
and without napi being disabled. This can lead to two try_fill_recv() calls
running at the same time. Fix this by refilling before trying to enable napi.

Fixes 0741bcb5584f9e2390ae6261573c4de8314999f2
(virtio: net: Add freeze, restore handlers to support S4).

Cc: Amit Shah <amit...@redhat.com>
Cc: Rusty Russell <ru...@rustcorp.com.au>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Eric Dumazet <eric.d...@gmail.com>
Signed-off-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/virtio_net.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 68692af..a0c05e0 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1745,16 +1745,17 @@ static int virtnet_restore(struct virtio_device *vdev)
if (err)
return err;

- if (netif_running(vi->dev))
+ if (netif_running(vi->dev)) {
+ for (i = 0; i < vi->curr_queue_pairs; i++)
+ if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
+ schedule_delayed_work(&vi->refill, 0);
+
for (i = 0; i < vi->max_queue_pairs; i++)
virtnet_napi_enable(&vi->rq[i]);
+ }

netif_device_attach(vi->dev);

- for (i = 0; i < vi->curr_queue_pairs; i++)
- if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
- schedule_delayed_work(&vi->refill, 0);
-
mutex_lock(&vi->config_lock);
vi->config_enable = true;
mutex_unlock(&vi->config_lock);
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:10:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Josh Durgin <josh....@inktank.com>

commit 9875201e10496612080e7d164acc8f625c18725c upstream.

Removing a device deallocates the disk, unschedules the watch, and
finally cleans up the rbd_dev structure. rbd_dev_refresh(), called
from the watch callback, updates the disk size and rbd_dev
structure. With no locking between them, rbd_dev_refresh() may use the
device or rbd_dev after they've been freed.

To fix this, check whether RBD_DEV_FLAG_REMOVING is set before
updating the disk size in rbd_dev_refresh(). In order to prevent a
race where rbd_dev_refresh() is already revalidating the disk when
rbd_remove() is called, move the call to rbd_bus_del_dev() after the
watch is unregistered and all notifies are complete. It's safe to
defer deleting this structure because no new requests can be submitted
once the RBD_DEV_FLAG_REMOVING is set, since the device cannot be
opened.

Fixes: http://tracker.ceph.com/issues/5636
Signed-off-by: Josh Durgin <josh....@inktank.com>
Reviewed-by: Alex Elder <el...@linaro.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/block/rbd.c | 40 +++++++++++++++++++++++++++++++++-------
1 file changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index d88a27d..0f5307f 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3325,6 +3325,31 @@ static void rbd_exists_validate(struct rbd_device *rbd_dev)
clear_bit(RBD_DEV_FLAG_EXISTS, &rbd_dev->flags);
}

+static void rbd_dev_update_size(struct rbd_device *rbd_dev)
+{
+ sector_t size;
+ bool removing;
+
+ /*
+ * Don't hold the lock while doing disk operations,
+ * or lock ordering will conflict with the bdev mutex via:
+ * rbd_add() -> blkdev_get() -> rbd_open()
+ */
+ spin_lock_irq(&rbd_dev->lock);
+ removing = test_bit(RBD_DEV_FLAG_REMOVING, &rbd_dev->flags);
+ spin_unlock_irq(&rbd_dev->lock);
+ /*
+ * If the device is being removed, rbd_dev->disk has
+ * been destroyed, so don't try to update its size
+ */
+ if (!removing) {
+ size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE;
+ dout("setting size to %llu sectors", (unsigned long long)size);
+ set_capacity(rbd_dev->disk, size);
+ revalidate_disk(rbd_dev->disk);
+ }
+}
+
static int rbd_dev_refresh(struct rbd_device *rbd_dev)
{
u64 mapping_size;
@@ -3344,12 +3369,7 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev)
up_write(&rbd_dev->header_rwsem);

if (mapping_size != rbd_dev->mapping.size) {
- sector_t size;
-
- size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE;
- dout("setting size to %llu sectors", (unsigned long long)size);
- set_capacity(rbd_dev->disk, size);
- revalidate_disk(rbd_dev->disk);
+ rbd_dev_update_size(rbd_dev);
}

return ret;
@@ -5160,7 +5180,6 @@ static ssize_t rbd_remove(struct bus_type *bus,
if (ret < 0 || already)
return ret;

- rbd_bus_del_dev(rbd_dev);
ret = rbd_dev_header_watch_sync(rbd_dev, false);
if (ret)
rbd_warn(rbd_dev, "failed to cancel watch event (%d)\n", ret);
@@ -5171,6 +5190,13 @@ static ssize_t rbd_remove(struct bus_type *bus,
*/
dout("%s: flushing notifies", __func__);
ceph_osdc_flush_notifies(&rbd_dev->rbd_client->client->osdc);
+ /*
+ * Don't free anything from rbd_dev->disk until after all
+ * notifies are completely processed. Otherwise
+ * rbd_bus_del_dev() will race with rbd_watch_cb(), resulting
+ * in a potential use after free of rbd_dev->disk or rbd_dev.
+ */
+ rbd_bus_del_dev(rbd_dev);
rbd_dev_image_release(rbd_dev);
module_put(THIS_MODULE);

--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:01 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/skbuff.h | 5 +++++
net/ipv6/udp_offload.c | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 37b4517..e27c8f6 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1643,6 +1643,11 @@ static inline void skb_set_mac_header(struct sk_buff *skb, const int offset)
skb->mac_header += offset;
}

+static inline void skb_pop_mac_header(struct sk_buff *skb)
+{
+ skb->mac_header = skb->network_header;
+}
+
static inline void skb_probe_transport_header(struct sk_buff *skb,
const int offset_hint)
{
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 657914b..f21df3f 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -86,7 +86,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,

/* Check if there is enough headroom to insert fragment header. */
tnl_hlen = skb_tnl_header_len(skb);
- if (skb->mac_header < (tnl_hlen + frag_hdr_sz)) {
+ if (skb_mac_header(skb) < skb->head + tnl_hlen + frag_hdr_sz) {
if (gso_pskb_expand_head(skb, tnl_hlen + frag_hdr_sz))
goto out;
}
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:01 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Li RongQing <roy.q...@gmail.com>

commit 24f5b855e17df7e355eacd6c4a12cc4d6a6c9ff0 upstream.

ip6_rt_copy() only sets dst.from if ort has both the RTF_ADDRCONF and
RTF_DEFAULT flags set. However, prefix routes that were installed by hand
locally can also have an expiration, and there is no flag combination that
can ensure a potential from never expires, so we should always set the
newly created dst's from.

This also fixes the case where the newly created dst is always expired
because the ort, which is created by an RA, may have RTF_EXPIRES and
RTF_ADDRCONF set but not RTF_DEFAULT.

Suggested-by: Hannes Frederic Sowa <han...@stressinduktion.org>
CC: Gao feng <gao...@cn.fujitsu.com>
Signed-off-by: Li RongQing <roy.q...@gmail.com>
Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ipv6/route.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 9e0f8e1..2a0f219 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1861,9 +1861,7 @@ static struct rt6_info *ip6_rt_copy(struct rt6_info *ort,
else
rt->rt6i_gateway = *dest;
rt->rt6i_flags = ort->rt6i_flags;
- if ((ort->rt6i_flags & (RTF_DEFAULT | RTF_ADDRCONF)) ==
- (RTF_DEFAULT | RTF_ADDRCONF))
- rt6_set_from(rt, ort);
+ rt6_set_from(rt, ort);
rt->rt6i_metric = 0;

#ifdef CONFIG_IPV6_SUBTREES
--
1.8.3.2
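
The condition removed by this patch required both flags at once, which is
why a route carrying RTF_EXPIRES and RTF_ADDRCONF but not RTF_DEFAULT was
missed. The sketch below contrasts the old test with the new unconditional
behaviour. The flag values are assumed to match the kernel's
ipv6_route.h and should be treated as illustrative; the SK_ names and both
functions are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
#define SK_RTF_DEFAULT  0x00010000u
#define SK_RTF_ADDRCONF 0x00040000u
#define SK_RTF_EXPIRES  0x00400000u

/* Old condition: "from" was set only when BOTH flags were present. */
static bool sketch_old_sets_from(uint32_t flags)
{
    const uint32_t need = SK_RTF_DEFAULT | SK_RTF_ADDRCONF;

    return (flags & need) == need;
}

/* New behaviour: "from" is always set, whatever the flags are. */
static bool sketch_new_sets_from(uint32_t flags)
{
    (void)flags;
    return true;
}
```

For flags equal to SK_RTF_EXPIRES | SK_RTF_ADDRCONF the old test is false
while the new one is true, which is the case the second paragraph of the
description refers to.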

Luis Henriques

Jan 13, 2014, 11:20:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Zhi Yong Wu <wu...@linux.vnet.ibm.com>

commit d0b7da8afa079ffe018ab3e92879b7138977fc8f upstream.

Signed-off-by: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/tun.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 171a88f..efc56ca 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1410,6 +1410,8 @@ static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
ret = tun_do_read(tun, tfile, iocb, iv, len,
file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
+ if (ret > 0)
+ iocb->ki_pos = ret;
out:
tun_put(tun);
return ret;
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Felix Fietkau <n...@openwrt.org>

commit 277d916fc2e959c3f106904116bb4f7b1148d47a upstream.

The check needs to apply to both multicast and unicast packets,
otherwise probe requests on AP mode scans are sent through the multicast
buffer queue, which adds long delays (often longer than the scanning
interval).

Signed-off-by: Felix Fietkau <n...@openwrt.org>
Signed-off-by: Johannes Berg <johann...@intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/mac80211/tx.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 4438aed..ebd9471 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -448,7 +448,6 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx)
{
struct sta_info *sta = tx->sta;
struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
- struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
struct ieee80211_local *local = tx->local;

if (unlikely(!sta))
@@ -459,15 +458,6 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx)
!(info->flags & IEEE80211_TX_CTL_NO_PS_BUFFER))) {
int ac = skb_get_queue_mapping(tx->skb);

- /* only deauth, disassoc and action are bufferable MMPDUs */
- if (ieee80211_is_mgmt(hdr->frame_control) &&
- !ieee80211_is_deauth(hdr->frame_control) &&
- !ieee80211_is_disassoc(hdr->frame_control) &&
- !ieee80211_is_action(hdr->frame_control)) {
- info->flags |= IEEE80211_TX_CTL_NO_PS_BUFFER;
- return TX_CONTINUE;
- }
-
ps_dbg(sta->sdata, "STA %pM aid %d: PS buffer for AC %d\n",
sta->sta.addr, sta->sta.aid, ac);
if (tx->local->total_ps_buffered >= TOTAL_MAX_TX_BUFFER)
@@ -510,9 +500,22 @@ ieee80211_tx_h_unicast_ps_buf(struct ieee80211_tx_data *tx)
static ieee80211_tx_result debug_noinline
ieee80211_tx_h_ps_buf(struct ieee80211_tx_data *tx)
{
+ struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
+ struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
+
if (unlikely(tx->flags & IEEE80211_TX_PS_BUFFERED))
return TX_CONTINUE;

+ /* only deauth, disassoc and action are bufferable MMPDUs */
+ if (ieee80211_is_mgmt(hdr->frame_control) &&
+ !ieee80211_is_deauth(hdr->frame_control) &&
+ !ieee80211_is_disassoc(hdr->frame_control) &&
+ !ieee80211_is_action(hdr->frame_control)) {
+ if (tx->flags & IEEE80211_TX_UNICAST)
+ info->flags |= IEEE80211_TX_CTL_NO_PS_BUFFER;
+ return TX_CONTINUE;
+ }
+
if (tx->flags & IEEE80211_TX_UNICAST)
return ieee80211_tx_h_unicast_ps_buf(tx);
else
--
1.8.3.2
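
The test moved by this patch says that management frames other than deauth,
disassoc and action must not be power-save buffered. A stand-alone sketch of
that predicate is shown below. The frame-control constants are assumed to
match the values in linux/ieee80211.h and are applied to a host-order
value here, whereas the kernel helpers work on little-endian fields; the SK_
names and sketch_skip_ps_buffer() are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 802.11 frame-control masks and values (host byte order). */
#define SK_FCTL_FTYPE      0x000c
#define SK_FCTL_STYPE      0x00f0
#define SK_FTYPE_MGMT      0x0000
#define SK_FTYPE_DATA      0x0008
#define SK_STYPE_PROBE_REQ 0x0040
#define SK_STYPE_DISASSOC  0x00a0
#define SK_STYPE_DEAUTH    0x00c0
#define SK_STYPE_ACTION    0x00d0

/* True for management frames that are not bufferable MMPDUs,
 * i.e. frames that should bypass the power-save buffer. */
static bool sketch_skip_ps_buffer(uint16_t fc)
{
    uint16_t stype = fc & SK_FCTL_STYPE;

    if ((fc & SK_FCTL_FTYPE) != SK_FTYPE_MGMT)
        return false; /* not a management frame */

    return stype != SK_STYPE_DEAUTH &&
           stype != SK_STYPE_DISASSOC &&
           stype != SK_STYPE_ACTION;
}
```

A probe request (SK_FTYPE_MGMT | SK_STYPE_PROBE_REQ) bypasses buffering
under this predicate, while a deauth frame and a data frame do not. In the
patch, the same test now runs before the unicast/multicast split, so it
applies to both paths.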

Luis Henriques

Jan 13, 2014, 11:20:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Salva Peiró <spe...@ai2.upv.es>

commit 8e3fbf870481eb53b2d3a322d1fc395ad8b367ed upstream.

The yam_ioctl() code fails to initialise the cmd field
of the struct yamdrv_ioctl_cfg. Add an explicit memset(0)
before filling the structure to avoid the 4-byte info leak.

Signed-off-by: Salva Peiró <spe...@ai2.upv.es>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/hamradio/yam.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c
index 0721e72..82529a2 100644
--- a/drivers/net/hamradio/yam.c
+++ b/drivers/net/hamradio/yam.c
@@ -1058,6 +1058,7 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
break;

case SIOCYAMGCFG:
+ memset(&yi, 0, sizeof(yi));
yi.cfg.mask = 0xffffffff;
yi.cfg.iobase = yp->iobase;
yi.cfg.irq = yp->irq;
--
1.8.3.2
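
The leak described in this patch can be reproduced in miniature outside the kernel. What follows is a minimal userspace sketch, not the driver's code: `struct demo_cfg`, `fill_cfg()` and `leaked_cmd()` are hypothetical stand-ins for `struct yamdrv_ioctl_cfg` and the SIOCYAMGCFG branch, the field values are arbitrary, and the 0xAA fill only simulates stale stack contents.

```c
#include <string.h>

/* Hypothetical stand-in for struct yamdrv_ioctl_cfg: a command word
 * followed by the configuration fields the handler fills in. */
struct demo_cfg {
	int cmd;              /* never assigned by the handler */
	unsigned int mask;
	unsigned int iobase;
	unsigned int irq;
};

/* Fill *out the way the SIOCYAMGCFG branch does. When clear_first is
 * nonzero, zero the whole object first, as the patch does. */
void fill_cfg(struct demo_cfg *out, int clear_first)
{
	if (clear_first)
		memset(out, 0, sizeof(*out));
	out->mask = 0xffffffff;
	out->iobase = 0x3f8;  /* arbitrary example values */
	out->irq = 4;
}

/* Simulate stale stack contents, fill the struct, and report the cmd
 * field that would be copied back to userspace. */
int leaked_cmd(int clear_first)
{
	struct demo_cfg yi;

	memset(&yi, 0xAA, sizeof(yi));
	fill_cfg(&yi, clear_first);
	return yi.cmd;
}
```

With the memset in place `leaked_cmd(1)` yields 0; without it the caller sees whatever bytes were already in that slot.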

Luis Henriques

Jan 13, 2014, 11:20:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "Michael S. Tsirkin" <m...@redhat.com>

We leak an skb when there are too many frags.
We also stop processing the packet in the middle;
the result is almost sure to be loss of networking.

Reported-by: Michael Dalton <mwda...@google.com>
Acked-by: Michael Dalton <mwda...@google.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/virtio_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ec0e9f2..68692af 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -341,7 +341,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
if (i >= MAX_SKB_FRAGS) {
pr_debug("%s: packet too long\n", skb->dev->name);
skb->dev->stats.rx_length_errors++;
- return NULL;
+ goto err_frags;
}
page = virtqueue_get_buf(rq->vq, &len);
if (!page) {
@@ -362,6 +362,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
err_skb:
give_pages(rq, page);
while (--num_buf) {
+err_frags:
buf = virtqueue_get_buf(rq->vq, &len);
if (unlikely(!buf)) {
pr_debug("%s: rx error: %d buffers missing\n",
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Timo Teräs <timo....@iki.fi>

commit 0e3da5bb8da45890b1dc413404e0f978ab71173e upstream.

ipgre_header_parse() needs to parse the tunnel's ip header and it
uses mac_header to locate the iphdr. This got broken when gre tunneling
was refactored as mac_header is no longer updated to point to iphdr.
Introduce skb_pop_mac_header() helper to do the mac_header assignment
and use it in ipgre_rcv() to fix msg_name parsing.

Bug introduced in commit c54419321455 (GRE: Refactor GRE tunneling code.)

Cc: Pravin B Shelar <psh...@nicira.com>
Signed-off-by: Timo Teräs <timo....@iki.fi>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ipv4/ip_gre.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 8d6939e..2977400 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -217,6 +217,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi)
iph->saddr, iph->daddr, tpi->key);

if (tunnel) {
+ skb_pop_mac_header(skb);
ip_tunnel_rcv(tunnel, skb, tpi, log_ecn_error);
return PACKET_RCVD;
}
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Zhi Yong Wu <wu...@linux.vnet.ibm.com>

commit e6ebc7f16ca1434a334647aa56399c546be4e64b upstream.

Signed-off-by: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/macvtap.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 70d5800..fa54f0e 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -979,6 +979,8 @@ static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,

ret = macvtap_do_read(q, iocb, iv, len, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len); /* XXX copied from tun.c. Why? */
+ if (ret > 0)
+ iocb->ki_pos = ret;
out:
return ret;
}
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha...@oracle.com>

commit c2349758acf1874e4c2b93fe41d072336f1a31d0 upstream.

Binding might result in a NULL device, which is dereferenced
causing this BUG:

[ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 0000000000000974
[ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0
[ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1317.264179] Dumping ftrace buffer:
[ 1317.264774] (ftrace buffer empty)
[ 1317.265220] Modules linked in:
[ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4-next-20131218-sasha-00013-g2cebb9b-dirty #4159
[ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000
[ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246
[ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000
[ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286
[ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000
[ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031
[ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:0000000000000000
[ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0
[ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
[ 1317.270230] Stack:
[ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000
[ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160
[ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280
[ 1317.270230] Call Trace:
[ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0
[ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0
[ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0
[ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0
[ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0
[ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10
[ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290
[ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10
[ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2
[ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 74 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02
[ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.270230] RSP <ffff8803cd31bdf8>
[ 1317.270230] CR2: 0000000000000974

Signed-off-by: Sasha Levin <sasha...@oracle.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/rds/ib.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index b4c8b00..ba2dffe 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -338,7 +338,8 @@ static int rds_ib_laddr_check(__be32 addr)
ret = rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
/* due to this, we will claim to support iWARP devices unless we
check node_type. */
- if (ret || cm_id->device->node_type != RDMA_NODE_IB_CA)
+ if (ret || !cm_id->device ||
+ cm_id->device->node_type != RDMA_NODE_IB_CA)
ret = -EADDRNOTAVAIL;

rdsdebug("addr %pI4 ret %d node type %d\n",
--
1.8.3.2
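
The guard added above relies on C's short-circuit evaluation: `cm_id->device->node_type` is only read once `cm_id->device` is known to be non-NULL. A minimal userspace sketch of the same condition follows; the `demo_*` types, `DEMO_NODE_IB_CA` and `DEMO_EADDRNOTAVAIL` are hypothetical placeholders, not the kernel's definitions.

```c
#include <stddef.h>

#define DEMO_NODE_IB_CA     1   /* placeholder for RDMA_NODE_IB_CA */
#define DEMO_EADDRNOTAVAIL  99  /* placeholder errno value */

struct demo_device { int node_type; };
struct demo_cm_id  { struct demo_device *device; };

/* Mirror the patched condition: a failed bind, a missing device, or a
 * device of the wrong type all map to "address not available". The ||
 * operator evaluates left to right and stops at the first true operand,
 * so a NULL device is never dereferenced. */
int demo_laddr_check(int bind_ret, const struct demo_cm_id *cm_id)
{
	int ret = bind_ret;

	if (ret || !cm_id->device ||
	    cm_id->device->node_type != DEMO_NODE_IB_CA)
		ret = -DEMO_EADDRNOTAVAIL;

	return ret;
}
```

A bind that succeeds but leaves `device` unset now returns the error code instead of faulting.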

Luis Henriques

Jan 13, 2014, 11:20:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "Michael S. Tsirkin" <m...@redhat.com>

commit f121159d72091f25afb22007c833e60a6845e912 upstream.

receive_mergeable() now handles errors internally.
Do the same for the big and small packet paths; otherwise
the logic is too hard to follow.

Cc: Jason Wang <jaso...@redhat.com>
Cc: David S. Miller <da...@davemloft.net>
Acked-by: Michael Dalton <mwda...@google.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/virtio_net.c | 56 +++++++++++++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ad9b385..ec0e9f2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -294,6 +294,34 @@ static struct sk_buff *page_to_skb(struct receive_queue *rq,
return skb;
}

+static struct sk_buff *receive_small(void *buf, unsigned int len)
+{
+ struct sk_buff * skb = buf;
+
+ len -= sizeof(struct virtio_net_hdr);
+ skb_trim(skb, len);
+
+ return skb;
+}
+
+static struct sk_buff *receive_big(struct net_device *dev,
+ struct receive_queue *rq,
+ void *buf)
+{
+ struct page *page = buf;
+ struct sk_buff *skb = page_to_skb(rq, page, 0);
+
+ if (unlikely(!skb))
+ goto err;
+
+ return skb;
+
+err:
+ dev->stats.rx_dropped++;
+ give_pages(rq, page);
+ return NULL;
+}
+
static struct sk_buff *receive_mergeable(struct net_device *dev,
struct receive_queue *rq,
void *buf,
@@ -357,7 +385,6 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
struct net_device *dev = vi->dev;
struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
struct sk_buff *skb;
- struct page *page;
struct skb_vnet_hdr *hdr;

if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
@@ -369,26 +396,15 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
dev_kfree_skb(buf);
return;
}
+ if (vi->mergeable_rx_bufs)
+ skb = receive_mergeable(dev, rq, buf, len);
+ else if (vi->big_packets)
+ skb = receive_big(dev, rq, buf);
+ else
+ skb = receive_small(buf, len);

- if (!vi->mergeable_rx_bufs && !vi->big_packets) {
- skb = buf;
- len -= sizeof(struct virtio_net_hdr);
- skb_trim(skb, len);
- } else {
- page = buf;
- if (vi->mergeable_rx_bufs) {
- skb = receive_mergeable(dev, rq, page, len);
- if (unlikely(!skb))
- return;
- } else {
- skb = page_to_skb(rq, page, len);
- if (unlikely(!skb)) {
- dev->stats.rx_dropped++;
- give_pages(rq, page);
- return;
- }
- }
- }
+ if (unlikely(!skb))
+ return;

hdr = skb_vnet_hdr(skb);

--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Curt Brune <cu...@cumulusnetworks.com>

commit fe0d692bbc645786bce1a98439e548ae619269f5 upstream.

br_multicast_set_hash_max() is called from process context in
net/bridge/br_sysfs_br.c by the sysfs store_hash_max() function.

br_multicast_set_hash_max() calls spin_lock(&br->multicast_lock),
which can deadlock the CPU if a softirq that also tries to take the
same lock interrupts br_multicast_set_hash_max() while the lock is
held. This can happen quite easily when any of the bridge multicast
timers expire, which try to take the same lock.

The fix here is to use spin_lock_bh(), preventing other softirqs from
executing on this CPU.

Steps to reproduce:

1. Create a bridge with several interfaces (I used 4).
2. Set the "multicast query interval" to a low number, like 2.
3. Enable the bridge as a multicast querier.
4. Repeatedly set the bridge hash_max parameter via sysfs.

# brctl addbr br0
# brctl addif br0 eth1 eth2 eth3 eth4
# brctl setmcqi br0 2
# brctl setmcquerier br0 1

# while true ; do echo 4096 > /sys/class/net/br0/bridge/hash_max; done

Signed-off-by: Curt Brune <cu...@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfe...@cumulusnetworks.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/bridge/br_multicast.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index fbad619..b02fb47 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1997,7 +1997,7 @@ int br_multicast_set_hash_max(struct net_bridge *br, unsigned long val)
u32 old;
struct net_bridge_mdb_htable *mdb;

- spin_lock(&br->multicast_lock);
+ spin_lock_bh(&br->multicast_lock);
if (!netif_running(br->dev))
goto unlock;

@@ -2029,7 +2029,7 @@ rollback:
}

unlock:
- spin_unlock(&br->multicast_lock);
+ spin_unlock_bh(&br->multicast_lock);

return err;
}
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrey Vagin <ava...@openvz.org>

commit d4fb84eefe5164f6a6ea51d0a9e26280c661a0dd upstream.

free_netdev calls netif_napi_del too, but it's too late, because napi
structures are placed on vi->rq. netif_napi_add() is called from
virtnet_alloc_queues.

general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii
CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000
RIP: 0010:[<ffffffff81322e19>] [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
RSP: 0018:ffff8800379e1dd0 EFLAGS: 00010a83
RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200
RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0
RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90
R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0
FS: 00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0
Stack:
ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0
ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20
ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388
Call Trace:
[<ffffffff8155beab>] netif_napi_del+0x1b/0x80
[<ffffffff8156499b>] free_netdev+0x8b/0x110
[<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net]
[<ffffffff813ae323>] virtio_dev_remove+0x23/0x80
[<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0
[<ffffffff813f6ca0>] driver_detach+0xc0/0xd0
[<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0
[<ffffffff813f72ec>] driver_unregister+0x2c/0x50
[<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10
[<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net]
[<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
[<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
[<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00
RIP [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
RSP <ffff8800379e1dd0>
---[ end trace d5931cd3f87c9763 ]---

Fixes: 986a4f4d452d (virtio_net: multiqueue support)
Cc: Rusty Russell <ru...@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Signed-off-by: Andrey Vagin <ava...@openvz.org>
Acked-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/virtio_net.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 64cf702..5055eb7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1285,6 +1285,11 @@ static void virtnet_config_changed(struct virtio_device *vdev)

static void virtnet_free_queues(struct virtnet_info *vi)
{
+ int i;
+
+ for (i = 0; i < vi->max_queue_pairs; i++)
+ netif_napi_del(&vi->rq[i].napi);
+
kfree(vi->rq);
kfree(vi->sq);
}
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha...@oracle.com>

commit 37ab4fa7844a044dc21fde45e2a0fc2f3c3b6490 upstream.

This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.

This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.

Signed-off-by: Sasha Levin <sasha...@oracle.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/unix/af_unix.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 303dcec..f246812 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -718,7 +718,9 @@ static int unix_autobind(struct socket *sock)
int err;
unsigned int retries = 0;

- mutex_lock(&u->readlock);
+ err = mutex_lock_interruptible(&u->readlock);
+ if (err)
+ return err;

err = 0;
if (u->addr)
@@ -877,7 +879,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
goto out;
addr_len = err;

- mutex_lock(&u->readlock);
+ err = mutex_lock_interruptible(&u->readlock);
+ if (err)
+ goto out;

err = -EINVAL;
if (u->addr)
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha...@oracle.com>

commit 12663bfc97c8b3fdb292428105dd92d563164050 upstream.

unix_dgram_recvmsg() will hold the readlock of the socket until recv
is complete.

At the same time, we may call setsockopt(SO_PEEK_OFF), which will hang until
unix_dgram_recvmsg() completes (which can take a while) without allowing
us to break out of it, triggering a hung task spew.

Instead, allow set_peek_off to fail; this way userspace will not hang.

Signed-off-by: Sasha Levin <sasha...@oracle.com>
Acked-by: Pavel Emelyanov <xe...@parallels.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/net.h | 2 +-
net/core/sock.c | 2 +-
net/unix/af_unix.c | 8 ++++++--
3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index 8bd9d92..41103f8 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -180,7 +180,7 @@ struct proto_ops {
int offset, size_t size, int flags);
ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags);
- void (*set_peek_off)(struct sock *sk, int val);
+ int (*set_peek_off)(struct sock *sk, int val);
};

#define DECLARE_SOCKADDR(type, dst, src) \
diff --git a/net/core/sock.c b/net/core/sock.c
index 8729d91..dca3102 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -887,7 +887,7 @@ set_rcvbuf:

case SO_PEEK_OFF:
if (sock->ops->set_peek_off)
- sock->ops->set_peek_off(sk, val);
+ ret = sock->ops->set_peek_off(sk, val);
else
ret = -EOPNOTSUPP;
break;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6c66e8d..303dcec 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -530,13 +530,17 @@ static int unix_seqpacket_sendmsg(struct kiocb *, struct socket *,
static int unix_seqpacket_recvmsg(struct kiocb *, struct socket *,
struct msghdr *, size_t, int);

-static void unix_set_peek_off(struct sock *sk, int val)
+static int unix_set_peek_off(struct sock *sk, int val)
{
struct unix_sock *u = unix_sk(sk);

- mutex_lock(&u->readlock);
+ if (mutex_lock_interruptible(&u->readlock))
+ return -EINTR;
+
sk->sk_peek_off = val;
mutex_unlock(&u->readlock);
+
+ return 0;
}


--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Westphal <f...@strlen.de>

commit f81152e35001e91997ec74a7b4e040e6ab0acccf upstream.

The recvmsg handler in net/rose/af_rose.c performs a size check on ->msg_namelen.

After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic), we now
always take the else branch due to namelen being initialized to 0.

Digging in the netdev-vger-cvs git repo shows that msg_namelen was
initialized with a fixed size since at least 1995, so the else branch
was never taken.

Compile tested only.

Signed-off-by: Florian Westphal <f...@strlen.de>
Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/rose/af_rose.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 33af772..62ced65 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1253,6 +1253,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,

if (msg->msg_name) {
struct sockaddr_rose *srose;
+ struct full_sockaddr_rose *full_srose = msg->msg_name;

memset(msg->msg_name, 0, sizeof(struct full_sockaddr_rose));
srose = msg->msg_name;
@@ -1260,18 +1261,9 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
srose->srose_addr = rose->dest_addr;
srose->srose_call = rose->dest_call;
srose->srose_ndigis = rose->dest_ndigis;
- if (msg->msg_namelen >= sizeof(struct full_sockaddr_rose)) {
- struct full_sockaddr_rose *full_srose = (struct full_sockaddr_rose *)msg->msg_name;
- for (n = 0 ; n < rose->dest_ndigis ; n++)
- full_srose->srose_digis[n] = rose->dest_digis[n];
- msg->msg_namelen = sizeof(struct full_sockaddr_rose);
- } else {
- if (rose->dest_ndigis >= 1) {
- srose->srose_ndigis = 1;
- srose->srose_digi = rose->dest_digis[0];
- }
- msg->msg_namelen = sizeof(struct sockaddr_rose);
- }
+ for (n = 0 ; n < rose->dest_ndigis ; n++)
+ full_srose->srose_digis[n] = rose->dest_digis[n];
+ msg->msg_namelen = sizeof(struct full_sockaddr_rose);
}

skb_free_datagram(sk, skb);
--
1.8.3.2

Luis Henriques

Jan 13, 2014, 11:20:03 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Wenliang Fan <fanw...@gmail.com>

commit e9db5c21d3646a6454fcd04938dd215ac3ab620a upstream.

The local variable 'bi' comes from userspace. If userspace passed a
large number to 'bi.data.calibrate', there would be an integer overflow
in the following line:
s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;

Signed-off-by: Wenliang Fan <fanw...@gmail.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/hamradio/hdlcdrv.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
index 3169252..5d78c1d 100644
--- a/drivers/net/hamradio/hdlcdrv.c
+++ b/drivers/net/hamradio/hdlcdrv.c
@@ -571,6 +571,8 @@ static int hdlcdrv_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
case HDLCDRVCTL_CALIBRATE:
if(!capable(CAP_SYS_RAWIO))
return -EPERM;
+ if (bi.data.calibrate > INT_MAX / s->par.bitrate)
+ return -EINVAL;
s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;
return 0;

--
1.8.3.2
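
The check added above is the usual "divide before you multiply" overflow guard. The following is a minimal userspace sketch of that guard, not the driver's code: `scale_calibrate()` is a hypothetical helper, the `unsigned int` type for the user-supplied value is an assumption, and the early return for a non-positive bitrate is an extra precaution that the patch itself does not contain.

```c
#include <limits.h>

/* Hypothetical helper mirroring the patched HDLCDRVCTL_CALIBRATE path.
 * Returns 0 and stores calibrate * bitrate / 16 in *out when the
 * product fits in an int, or -1 when it would overflow. */
int scale_calibrate(unsigned int calibrate, int bitrate, int *out)
{
	/* Not part of the patch: avoid dividing by zero in this sketch. */
	if (bitrate <= 0)
		return -1;

	/* calibrate * bitrate <= INT_MAX exactly when
	 * calibrate <= INT_MAX / bitrate (integer division). */
	if (calibrate > (unsigned int)(INT_MAX / bitrate))
		return -1;

	*out = (int)(calibrate * (unsigned int)bitrate / 16);
	return 0;
}
```

The comparison is done with a division because the multiplication itself is the operation that cannot be trusted; checking the product after the fact would be too late.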

Luis Henriques

Jan 13, 2014, 11:20:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Venkat Venkatsubra <venkat.x.v...@oracle.com>

commit 18fc25c94eadc52a42c025125af24657a93638c0 upstream.

After a congestion update on a local connection, when rds_ib_xmit returns
fewer bytes than there are in the message, rds_send_xmit calls
rds_ib_xmit again with an offset that causes BUG_ON(off & RDS_FRAG_SIZE)
to trigger.

For a 4Kb PAGE_SIZE rds_ib_xmit returns min(8240,4096)=4096 when actually
the message contains 8240 bytes. rds_send_xmit thinks there is more to send
and calls rds_ib_xmit again with a data offset "off" of 4096-48(rds header)
=4048 bytes thus hitting the BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].

The commit 6094628bfd94323fc1cea05ec2c6affd98c18f7f
"rds: prevent BUG_ON triggering on congestion map updates" introduced
this regression. That change was addressing the triggering of a different
BUG_ON in rds_send_xmit() on PowerPC architecture with 64Kbytes PAGE_SIZE:
BUG_ON(ret != 0 &&
conn->c_xmit_sg == rm->data.op_nents);
This was the sequence it was going through:
(rds_ib_xmit)
/* Do not send cong updates to IB loopback */
if (conn->c_loopback
&& rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
}
rds_ib_xmit returns 8240
rds_send_xmit:
c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
= 8192
c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit:
c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
rds_ib_xmit returns 8240
On this iteration this sequence causes the BUG_ON in rds_send_xmit:
while (ret) {
tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
[tmp = 65536 - 57632 = 7904]
conn->c_xmit_data_off += tmp;
[c_xmit_data_off = 57632 + 7904 = 65536]
ret -= tmp;
[ret = 8240 - 7904 = 336]
if (conn->c_xmit_data_off == sg->length) {
conn->c_xmit_data_off = 0;
sg++;
conn->c_xmit_sg++;
BUG_ON(ret != 0 &&
conn->c_xmit_sg == rm->data.op_nents);
[c_xmit_sg = 1, rm->data.op_nents = 1]

What the current fix does:
Since the congestion update over loopback is not actually transmitted
as a message, all that rds_ib_xmit needs to do is let the caller think
the full message has been transmitted and not return partial bytes.
It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4Kb.
And 64Kb+48 when page size is 64Kb.

Reported-by: Josh Hunt <joshh...@gmail.com>
Tested-by: Honggang Li <ho...@redhat.com>
Acked-by: Bang Nguyen <bang....@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.v...@oracle.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/rds/ib_send.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
index e590949..37be6e2 100644
--- a/net/rds/ib_send.c
+++ b/net/rds/ib_send.c
@@ -552,9 +552,8 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
&& rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
scat = &rm->data.op_sg[sg];
- ret = sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
- ret = min_t(int, ret, scat->length - conn->c_xmit_data_off);
- return ret;
+ ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
+ return sizeof(struct rds_header) + ret;
}

/* FIXME we may overallocate here */
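
The arithmetic in the commit message can be checked with a small userspace sketch. The constants below are taken from the figures quoted in the message (a 48-byte RDS header and 8240 - 48 = 8192 congestion map bytes); `old_ret()` and `new_ret()` are hypothetical names for the return value before and after this patch, not functions in the RDS code.

```c
#define DEMO_HDR_BYTES       48    /* sizeof(struct rds_header), per the message */
#define DEMO_CONG_MAP_BYTES  8192  /* RDS_CONG_MAP_BYTES, i.e. 8240 - 48 */

/* Return value before the patch: header plus map bytes, clamped to the
 * space left in the current scatterlist entry. */
int old_ret(int sg_length, int xmit_data_off)
{
	int ret = DEMO_HDR_BYTES + DEMO_CONG_MAP_BYTES;
	int room = sg_length - xmit_data_off;

	return ret < room ? ret : room;
}

/* Return value after the patch: header plus the larger of the map size
 * and the scatterlist entry length, so the caller sees the whole
 * message as sent. */
int new_ret(int sg_length)
{
	int ret = DEMO_CONG_MAP_BYTES > sg_length ? DEMO_CONG_MAP_BYTES
						  : sg_length;

	return DEMO_HDR_BYTES + ret;
}
```

With a 4096-byte entry the old code reports 4096 and the new code 8240; with a 65536-byte entry the new code reports 65536 + 48, matching the two cases the message describes.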

Luis Henriques

Jan 13, 2014, 11:20:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Simon Guinot <sgu...@lacie.com>

commit e098f5cbe9d410e7878b50f524dce36cc83ec40e upstream.

This patch adds support for the PCI ID provided by the Marvell 88SE9170
SATA controller.

Signed-off-by: Simon Guinot <sgu...@lacie.com>
Signed-off-by: Tejun Heo <t...@kernel.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/ata/ahci.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 7e4de82..54bf3f8 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -427,6 +427,9 @@ static const struct pci_device_id ahci_pci_tbl[] = {
.driver_data = board_ahci_yes_fbs }, /* 88se9128 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9125),
.driver_data = board_ahci_yes_fbs }, /* 88se9125 */
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_MARVELL_EXT, 0x9178,
+ PCI_VENDOR_ID_MARVELL_EXT, 0x9170),
+ .driver_data = board_ahci_yes_fbs }, /* 88se9170 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x917a),
.driver_data = board_ahci_yes_fbs }, /* 88se9172 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9172),

Luis Henriques

Jan 13, 2014, 11:20:04 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

commit 28e24c62ab3062e965ef1b3bcc244d50aee7fa85 upstream.

Few network drivers really support frag_list: virtual drivers.

Some drivers wrongly advertise NETIF_F_FRAGLIST feature.

If skb with a frag_list is given to them, packet on the wire will be
corrupt.

Remove this flag, as core networking stack will make sure to
provide packets that can be sent without corruption.

Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Thadeu Lima de Souza Cascardo <casc...@linux.vnet.ibm.com>
Cc: Anirudha Sarangi <ani...@xilinx.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/ethernet/ibm/ehea/ehea_main.c | 2 +-
drivers/net/ethernet/tehuti/tehuti.c | 1 -
drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +-
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 +-
4 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
index 35853b4..105868e 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -3022,7 +3022,7 @@ static struct ehea_port *ehea_setup_single_port(struct ehea_adapter *adapter,

dev->hw_features = NETIF_F_SG | NETIF_F_TSO |
NETIF_F_IP_CSUM | NETIF_F_HW_VLAN_CTAG_TX;
- dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_TSO |
+ dev->features = NETIF_F_SG | NETIF_F_TSO |
NETIF_F_HIGHDMA | NETIF_F_IP_CSUM |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXCSUM;
diff --git a/drivers/net/ethernet/tehuti/tehuti.c b/drivers/net/ethernet/tehuti/tehuti.c
index 571452e..61a1540 100644
--- a/drivers/net/ethernet/tehuti/tehuti.c
+++ b/drivers/net/ethernet/tehuti/tehuti.c
@@ -2019,7 +2019,6 @@ bdx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
ndev->features = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO
| NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXCSUM
- /*| NETIF_F_FRAGLIST */
;
ndev->hw_features = NETIF_F_IP_CSUM | NETIF_F_SG |
NETIF_F_TSO | NETIF_F_HW_VLAN_CTAG_TX;
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 96cb897..7032aab 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -1016,7 +1016,7 @@ static int temac_of_probe(struct platform_device *op)
platform_set_drvdata(op, ndev);
SET_NETDEV_DEV(ndev, &op->dev);
ndev->flags &= ~IFF_MULTICAST; /* clear multicast */
- ndev->features = NETIF_F_SG | NETIF_F_FRAGLIST;
+ ndev->features = NETIF_F_SG;
ndev->netdev_ops = &temac_netdev_ops;
ndev->ethtool_ops = &temac_ethtool_ops;
#if 0
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index fb7d1c2..005c650 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1488,7 +1488,7 @@ static int axienet_of_probe(struct platform_device *op)

SET_NETDEV_DEV(ndev, &op->dev);
ndev->flags &= ~IFF_MULTICAST; /* clear multicast */
- ndev->features = NETIF_F_SG | NETIF_F_FRAGLIST;
+ ndev->features = NETIF_F_SG;
ndev->netdev_ops = &axienet_netdev_ops;
ndev->ethtool_ops = &axienet_ethtool_ops;

Luis Henriques

Jan 13, 2014, 11:20:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Changli Gao <xia...@gmail.com>

commit d323e92cc3f4edd943610557c9ea1bb4bb5056e8 upstream.

maxattr in genl_family should hold the max attribute
type, not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/core/drop_monitor.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 5e78d44..f27d126 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -64,7 +64,6 @@ static struct genl_family net_drop_monitor_family = {
.hdrsize = 0,
.name = "NET_DM",
.version = 2,
- .maxattr = NET_DM_CMD_MAX,
};

static DEFINE_PER_CPU(struct per_cpu_dm_data, dm_cpu_data);

Luis Henriques

Jan 13, 2014, 11:20:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "Michael S. Tsirkin" <m...@redhat.com>

commit 8fc3b9e9a229778e5af3aa453c44f1a3857ba769 upstream.

Eric Dumazet noticed that if we encounter an error
when processing a mergeable buffer, we don't
dequeue all of the buffers from this packet;
the result is almost sure to be loss of networking.

Jason Wang noticed that we also leak a page and that we don't decrement
the rq buf count, so we won't repost buffers (a resource leak).

Fix both issues.
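
As a rough userspace illustration of the drain-on-error idea (this is not
the driver code; model_rq, model_get_buf and model_drop_packet are
hypothetical names that only mimic the shape of the fix), the error path
has to hand back the first buffer and then dequeue the remaining
num_buffers - 1 buffers that belong to the same packet:

```c
#include <assert.h>

/* Hypothetical stand-in for a receive queue: counts instead of real buffers. */
struct model_rq {
	int posted;   /* buffers still sitting in the queue */
	int recycled; /* buffers handed back to the free pool */
};

/* Model of a dequeue call: returns 1 if a buffer was taken, 0 if none left. */
int model_get_buf(struct model_rq *rq)
{
	if (rq->posted == 0)
		return 0;
	rq->posted--;
	return 1;
}

/*
 * Error path for a packet spanning num_buffers buffers, the first of which
 * the caller has already dequeued. Recycle the first buffer, then drain and
 * recycle the rest so the next packet starts at a clean boundary.
 * Returns the number of buffers that were missing (0 if all were drained).
 */
int model_drop_packet(struct model_rq *rq, int num_buffers)
{
	int remaining = num_buffers;

	assert(num_buffers >= 1);
	rq->recycled++;                  /* the buffer already in hand */
	while (--remaining) {
		if (!model_get_buf(rq))
			return remaining;        /* queue ran dry early */
		rq->recycled++;
	}
	return 0;
}
```

Without the drain loop, the leftover buffers of the failed packet would be
parsed as the start of the next packet, which matches the "loss of
networking" symptom described above.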

Cc: Rusty Russell <ru...@rustcorp.com.au>
Cc: Michael Dalton <mwda...@google.com>
Reported-by: Eric Dumazet <edum...@google.com>
Reported-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
[ luis: backported to 3.11: used davem's backport to 3.10 ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/virtio_net.c | 66 +++++++++++++++++++++++++++++++++---------------
1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5055eb7..ad9b385 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -294,26 +294,33 @@ static struct sk_buff *page_to_skb(struct receive_queue *rq,
return skb;
}

-static int receive_mergeable(struct receive_queue *rq, struct sk_buff *skb)
+static struct sk_buff *receive_mergeable(struct net_device *dev,
+ struct receive_queue *rq,
+ void *buf,
+ unsigned int len)
{
- struct skb_vnet_hdr *hdr = skb_vnet_hdr(skb);
- struct page *page;
- int num_buf, i, len;
+ struct skb_vnet_hdr *hdr = page_address(buf);
+ int num_buf = hdr->mhdr.num_buffers;
+ struct page *page = buf;
+ struct sk_buff *skb = page_to_skb(rq, page, len);
+ int i;
+
+ if (unlikely(!skb))
+ goto err_skb;

- num_buf = hdr->mhdr.num_buffers;
while (--num_buf) {
i = skb_shinfo(skb)->nr_frags;
if (i >= MAX_SKB_FRAGS) {
pr_debug("%s: packet too long\n", skb->dev->name);
skb->dev->stats.rx_length_errors++;
- return -EINVAL;
+ return NULL;
}
page = virtqueue_get_buf(rq->vq, &len);
if (!page) {
- pr_debug("%s: rx error: %d buffers missing\n",
- skb->dev->name, hdr->mhdr.num_buffers);
- skb->dev->stats.rx_length_errors++;
- return -EINVAL;
+ pr_debug("%s: rx error: %d buffers %d missing\n",
+ dev->name, hdr->mhdr.num_buffers, num_buf);
+ dev->stats.rx_length_errors++;
+ goto err_buf;
}

if (len > PAGE_SIZE)
@@ -323,7 +330,25 @@ static int receive_mergeable(struct receive_queue *rq, struct sk_buff *skb)

--rq->num;
}
- return 0;
+ return skb;
+err_skb:
+ give_pages(rq, page);
+ while (--num_buf) {
+ buf = virtqueue_get_buf(rq->vq, &len);
+ if (unlikely(!buf)) {
+ pr_debug("%s: rx error: %d buffers missing\n",
+ dev->name, num_buf);
+ dev->stats.rx_length_errors++;
+ break;
+ }
+ page = buf;
+ give_pages(rq, page);
+ --rq->num;
+ }
+err_buf:
+ dev->stats.rx_dropped++;
+ dev_kfree_skb(skb);
+ return NULL;
}

static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
@@ -351,17 +376,18 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
skb_trim(skb, len);
} else {
page = buf;
- skb = page_to_skb(rq, page, len);
- if (unlikely(!skb)) {
- dev->stats.rx_dropped++;
- give_pages(rq, page);
- return;
- }
- if (vi->mergeable_rx_bufs)
- if (receive_mergeable(rq, skb)) {
- dev_kfree_skb(skb);
+ if (vi->mergeable_rx_bufs) {
+ skb = receive_mergeable(dev, rq, page, len);
+ if (unlikely(!skb))
+ return;
+ } else {
+ skb = page_to_skb(rq, page, len);
+ if (unlikely(!skb)) {
+ dev->stats.rx_dropped++;
+ give_pages(rq, page);
return;
}
+ }
}

hdr = skb_vnet_hdr(skb);

Luis Henriques

Jan 13, 2014, 11:20:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

commit a3300ef4bbb1f1e33ff0400e1e6cf7733d988f4f upstream.

Brett Ciphery reported that new ipv6 addresses failed to get installed
because the addrconf generated dsts were counted against the dst gc
limit. We don't need to count those routes, just as we currently don't
count administratively added routes.

Because the max_addresses check enforces a limit on unbounded address
generation first in case someone plays with router advertisements, we
are still safe here.

Reported-by: Brett Ciphery <brett....@windriver.com>
Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ipv6/route.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ff3223c..9e0f8e1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2122,12 +2122,10 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
bool anycast)
{
struct net *net = dev_net(idev->dev);
- struct rt6_info *rt = ip6_dst_alloc(net, net->loopback_dev, 0, NULL);
-
- if (!rt) {
- net_warn_ratelimited("Maximum number of routes reached, consider increasing route/max_size\n");
+ struct rt6_info *rt = ip6_dst_alloc(net, net->loopback_dev,
+ DST_NOCOUNT, NULL);
+ if (!rt)
return ERR_PTR(-ENOMEM);
- }

in6_dev_hold(idev);

Luis Henriques

Jan 13, 2014, 11:20:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tg...@linutronix.de>

commit 73beb63d290f961c299526852884846b0d868840 upstream.

This fixes a kernel panic when resuming from suspend to RAM.
Without this fix an interrupt hits after the delayed work is canceled
and thus requeues it. So we end up freeing an armed timer.

Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Samuel Ortiz <sa...@linux.intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/mfd/rtsx_pcr.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/mfd/rtsx_pcr.c b/drivers/mfd/rtsx_pcr.c
index dd186c4..ccf79a1 100644
--- a/drivers/mfd/rtsx_pcr.c
+++ b/drivers/mfd/rtsx_pcr.c
@@ -1200,8 +1200,14 @@ static void rtsx_pci_remove(struct pci_dev *pcidev)

pcr->remove_pci = true;

- cancel_delayed_work(&pcr->carddet_work);
- cancel_delayed_work(&pcr->idle_work);
+ /* Disable interrupts at the pcr level */
+ spin_lock_irq(&pcr->lock);
+ rtsx_pci_writel(pcr, RTSX_BIER, 0);
+ pcr->bier = 0;
+ spin_unlock_irq(&pcr->lock);
+
+ cancel_delayed_work_sync(&pcr->carddet_work);
+ cancel_delayed_work_sync(&pcr->idle_work);

mfd_remove_devices(&pcidev->dev);

Luis Henriques

Jan 13, 2014, 11:20:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

commit 66e56cd46b93ef407c60adcac62cf33b06119d50 upstream.

Commit e40526cb20b5 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.

We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb20b5.

[1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
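
For illustration only, a userspace sketch of the socket setup this commit
is concerned with: a packet socket created with protocol = 0 and bound
with sll_protocol = 0. The helper names fill_tx_only_addr and
open_tx_only_socket are hypothetical, SOCK_RAW is just one possible mode,
and actually opening the socket needs CAP_NET_RAW:

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Fill a sockaddr_ll for a transmit-only bind: sll_protocol stays 0. */
void fill_tx_only_addr(struct sockaddr_ll *sll, int ifindex)
{
	memset(sll, 0, sizeof(*sll));
	sll->sll_family = AF_PACKET;
	sll->sll_ifindex = ifindex;
	sll->sll_protocol = 0;   /* no receive hook wanted */
}

/* Returns a bound fd, or -1 on error (for example, missing CAP_NET_RAW). */
int open_tx_only_socket(int ifindex)
{
	struct sockaddr_ll sll;
	int fd = socket(PF_PACKET, SOCK_RAW, 0);   /* protocol = 0 */

	if (fd < 0)
		return -1;

	fill_tx_only_addr(&sll, ifindex);
	if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The interface index would normally come from if_nametoindex(3) or a
SIOCGIFINDEX ioctl; that lookup is left out to keep the sketch short.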

Fixes: e40526cb20b5 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Tested-by: Salam Noureddine <noure...@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
Documentation/networking/packet_mmap.txt | 10 +++++
net/packet/af_packet.c | 65 ++++++++++++++++++++------------
2 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 8572796..f56c278 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -123,6 +123,16 @@ Transmission process is similar to capture as shown below.
[shutdown] close() --------> destruction of the transmission socket and
deallocation of all associated resources.

+Socket creation and destruction is also straight forward, and is done
+the same way as in capturing described in the previous paragraph:
+
+ int fd = socket(PF_PACKET, mode, 0);
+
+The protocol can optionally be 0 in case we only want to transmit
+via this socket, which avoids an expensive call to packet_rcv().
+In this case, you also need to bind(2) the TX_RING with sll_protocol = 0
+set. Otherwise, htons(ETH_P_ALL) or any other protocol, for example.
+
Binding the socket to your network interface is mandatory (with zero copy) to
know the header size of frames used in the circular buffer.

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 151e8ae..1b6d773 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -237,6 +237,30 @@ struct packet_skb_cb {
static void __fanout_unlink(struct sock *sk, struct packet_sock *po);
static void __fanout_link(struct sock *sk, struct packet_sock *po);

+static struct net_device *packet_cached_dev_get(struct packet_sock *po)
+{
+ struct net_device *dev;
+
+ rcu_read_lock();
+ dev = rcu_dereference(po->cached_dev);
+ if (likely(dev))
+ dev_hold(dev);
+ rcu_read_unlock();
+
+ return dev;
+}
+
+static void packet_cached_dev_assign(struct packet_sock *po,
+ struct net_device *dev)
+{
+ rcu_assign_pointer(po->cached_dev, dev);
+}
+
+static void packet_cached_dev_reset(struct packet_sock *po)
+{
+ RCU_INIT_POINTER(po->cached_dev, NULL);
+}
+
/* register_prot_hook must be invoked with the po->bind_lock held,
* or from a context in which asynchronous accesses to the packet
* socket is not possible (packet_create()).
@@ -246,12 +270,10 @@ static void register_prot_hook(struct sock *sk)
struct packet_sock *po = pkt_sk(sk);

if (!po->running) {
- if (po->fanout) {
+ if (po->fanout)
__fanout_link(sk, po);
- } else {
+ else
dev_add_pack(&po->prot_hook);
- rcu_assign_pointer(po->cached_dev, po->prot_hook.dev);
- }

sock_hold(sk);
po->running = 1;
@@ -270,12 +292,11 @@ static void __unregister_prot_hook(struct sock *sk, bool sync)
struct packet_sock *po = pkt_sk(sk);

po->running = 0;
- if (po->fanout) {
+
+ if (po->fanout)
__fanout_unlink(sk, po);
- } else {
+ else
__dev_remove_pack(&po->prot_hook);
- RCU_INIT_POINTER(po->cached_dev, NULL);
- }

__sock_put(sk);

@@ -2048,19 +2069,6 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
return tp_len;
}

-static struct net_device *packet_cached_dev_get(struct packet_sock *po)
-{
- struct net_device *dev;
-
- rcu_read_lock();
- dev = rcu_dereference(po->cached_dev);
- if (dev)
- dev_hold(dev);
- rcu_read_unlock();
-
- return dev;
-}
-
static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
{
struct sk_buff *skb;
@@ -2077,7 +2085,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)

mutex_lock(&po->pg_vec_lock);

- if (saddr == NULL) {
+ if (likely(saddr == NULL)) {
dev = packet_cached_dev_get(po);
proto = po->num;
addr = NULL;
@@ -2231,7 +2239,7 @@ static int packet_snd(struct socket *sock,
* Get and verify the address.
*/

- if (saddr == NULL) {
+ if (likely(saddr == NULL)) {
dev = packet_cached_dev_get(po);
proto = po->num;
addr = NULL;
@@ -2440,6 +2448,8 @@ static int packet_release(struct socket *sock)

spin_lock(&po->bind_lock);
unregister_prot_hook(sk, false);
+ packet_cached_dev_reset(po);
+
if (po->prot_hook.dev) {
dev_put(po->prot_hook.dev);
po->prot_hook.dev = NULL;
@@ -2495,14 +2505,17 @@ static int packet_do_bind(struct sock *sk, struct net_device *dev, __be16 protoc

spin_lock(&po->bind_lock);
unregister_prot_hook(sk, true);
+
po->num = protocol;
po->prot_hook.type = protocol;
if (po->prot_hook.dev)
dev_put(po->prot_hook.dev);
- po->prot_hook.dev = dev;

+ po->prot_hook.dev = dev;
po->ifindex = dev ? dev->ifindex : 0;

+ packet_cached_dev_assign(po, dev);
+
if (protocol == 0)
goto out_unlock;

@@ -2615,7 +2628,8 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
po = pkt_sk(sk);
sk->sk_family = PF_PACKET;
po->num = proto;
- RCU_INIT_POINTER(po->cached_dev, NULL);
+
+ packet_cached_dev_reset(po);

sk->sk_destruct = packet_sock_destruct;
sk_refcnt_debug_inc(sk);
@@ -3370,6 +3384,7 @@ static int packet_notifier(struct notifier_block *this,
sk->sk_error_report(sk);
}
if (msg == NETDEV_UNREGISTER) {
+ packet_cached_dev_reset(po);
po->ifindex = -1;
if (po->prot_hook.dev)
dev_put(po->prot_hook.dev);

Luis Henriques

Jan 13, 2014, 11:20:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <da...@davemloft.net>

commit 2205369a314e12fcec4781cc73ac9c08fc2b47de upstream.

When the vlan code detects that the real device can do TX VLAN offloads
in hardware, it tries to arrange for the real device's header_ops to
be invoked directly.

But it does so illegally, by simply hooking the real device's
header_ops up to the VLAN device.

This doesn't work because we will end up invoking a set of header_ops
routines which expect a device type which matches the real device, but
will see a VLAN device instead.

Fix this by providing a pass-thru set of header_ops which will arrange
to pass the proper real device instead.

To facilitate this add a dev_rebuild_header(). There are
implementations which provide a ->cache and ->create but not a
->rebuild (e.g. PLIP). So we need a helper function just like
dev_hard_header() to avoid crashes.

Use this helper in the one existing place where the
header_ops->rebuild was being invoked, the neighbour code.
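
The guard pattern behind dev_rebuild_header() can be sketched outside the
kernel as follows. The model_* types and functions are hypothetical
stand-ins, not the real net_device or header_ops definitions; only the
NULL checks mirror the helper added by this patch:

```c
#include <stddef.h>

/* Hypothetical stand-ins for sk_buff, header_ops and net_device. */
struct model_skb {
	int rebuilt;
};

struct model_header_ops {
	int (*create)(struct model_skb *skb);
	int (*rebuild)(struct model_skb *skb);   /* optional, may be NULL */
};

struct model_dev {
	const struct model_header_ops *header_ops;   /* may be NULL as well */
};

/* Call ->rebuild only if the device provides one; otherwise report 0. */
int model_rebuild_header(const struct model_dev *dev, struct model_skb *skb)
{
	if (!dev->header_ops || !dev->header_ops->rebuild)
		return 0;
	return dev->header_ops->rebuild(skb);
}

/* A sample ->rebuild implementation that just records that it ran. */
int model_mark_rebuilt(struct model_skb *skb)
{
	skb->rebuilt = 1;
	return 1;
}
```

A device whose ops lack ->rebuild then yields 0 instead of a NULL function
pointer call.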

With lots of help from Florian Westphal.

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/netdevice.h | 9 +++++++++
net/8021q/vlan_dev.c | 19 ++++++++++++++++++-
net/core/neighbour.c | 2 +-
3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9a41568..1ffe7d7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1836,6 +1836,15 @@ static inline int dev_parse_header(const struct sk_buff *skb,
return dev->header_ops->parse(skb, haddr);
}

+static inline int dev_rebuild_header(struct sk_buff *skb)
+{
+ const struct net_device *dev = skb->dev;
+
+ if (!dev->header_ops || !dev->header_ops->rebuild)
+ return 0;
+ return dev->header_ops->rebuild(skb);
+}
+
typedef int gifconf_func_t(struct net_device * dev, char __user * bufptr, int len);
extern int register_gifconf(unsigned int family, gifconf_func_t * gifconf);
static inline int unregister_gifconf(unsigned int family)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 1cd3d2a..4af64af 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -549,6 +549,23 @@ static const struct header_ops vlan_header_ops = {
.parse = eth_header_parse,
};

+static int vlan_passthru_hard_header(struct sk_buff *skb, struct net_device *dev,
+ unsigned short type,
+ const void *daddr, const void *saddr,
+ unsigned int len)
+{
+ struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+ struct net_device *real_dev = vlan->real_dev;
+
+ return dev_hard_header(skb, real_dev, type, daddr, saddr, len);
+}
+
+static const struct header_ops vlan_passthru_header_ops = {
+ .create = vlan_passthru_hard_header,
+ .rebuild = dev_rebuild_header,
+ .parse = eth_header_parse,
+};
+
static struct device_type vlan_type = {
.name = "vlan",
};
@@ -592,7 +609,7 @@ static int vlan_dev_init(struct net_device *dev)

dev->needed_headroom = real_dev->needed_headroom;
if (real_dev->features & NETIF_F_HW_VLAN_CTAG_TX) {
- dev->header_ops = real_dev->header_ops;
+ dev->header_ops = &vlan_passthru_header_ops;
dev->hard_header_len = real_dev->hard_header_len;
} else {
dev->header_ops = &vlan_header_ops;
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 60533db..72d71c9 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1274,7 +1274,7 @@ int neigh_compat_output(struct neighbour *neigh, struct sk_buff *skb)

if (dev_hard_header(skb, dev, ntohs(skb->protocol), NULL, NULL,
skb->len) < 0 &&
- dev->header_ops->rebuild(skb))
+ dev_rebuild_header(skb))
return 0;

return dev_queue_xmit(skb);

Luis Henriques

Jan 13, 2014, 11:20:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jason Wang <jaso...@redhat.com>

commit 50dc875f2e6e2e04aed3b3033eb0ac99192d6d02 upstream.

There's a possible deadlock if we flush the peer-notifying work while setting
the MTU:

[ 22.991149] ======================================================
[ 22.991173] [ INFO: possible circular locking dependency detected ]
[ 22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted
[ 22.991219] -------------------------------------------------------
[ 22.991243] ip/974 is trying to acquire lock:
[ 22.991261] ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0
[ 22.991307]
but task is already holding lock:
[ 22.991330] (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40
[ 22.991367]
which lock already depends on the new lock.

[ 22.991398]
the existing dependency chain (in reverse order) is:
[ 22.991426]
-> #1 (rtnl_mutex){+.+.+.}:
[ 22.991449] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[ 22.991477] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[ 22.991501] [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0
[ 22.991529] [<ffffffff815392b7>] rtnl_lock+0x17/0x20
[ 22.991552] [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30
[ 22.991579] [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc]
[ 22.991610] [<ffffffff8108d251>] process_one_work+0x211/0x6e0
[ 22.991637] [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0
[ 22.991663] [<ffffffff81095e5d>] kthread+0xed/0x100
[ 22.991686] [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0
[ 22.991715]
-> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}:
[ 22.991715] [<ffffffff810de817>] check_prevs_add+0x967/0x970
[ 22.991715] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[ 22.991715] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[ 22.991715] [<ffffffff8108afde>] flush_work+0x4e/0x2e0
[ 22.991715] [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130
[ 22.991715] [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20
[ 22.991715] [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc]
[ 22.991715] [<ffffffff815233d4>] dev_set_mtu+0x34/0x80
[ 22.991715] [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00
[ 22.991715] [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0
[ 22.991715] [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260
[ 22.991715] [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0
[ 22.991715] [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40
[ 22.991715] [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190
[ 22.991715] [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750
[ 22.991715] [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0
[ 22.991715] [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0
[ 22.991715] [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80
[ 22.991715] [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20
[ 22.991715] [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b

This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush
the work in netvsc_change_mtu(); in the meantime, netdev_notify_peers() may be
called from the worker and also try to take the rtnl_lock. As a result, the
flush will never complete. Solve this by not canceling and flushing the work;
this is safe because the transmission done by NETDEV_NOTIFY_PEERS is
synchronized with the netif_tx_disable() called by netvsc_change_mtu().

Reported-by: Yaju Cao <ya...@redhat.com>
Tested-by: Yaju Cao <ya...@redhat.com>
Cc: K. Y. Srinivasan <k...@microsoft.com>
Cc: Haiyang Zhang <haiy...@microsoft.com>
Signed-off-by: Jason Wang <jaso...@redhat.com>
Acked-by: Haiyang Zhang <haiy...@microsoft.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/hyperv/netvsc_drv.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 23a0fff..aea78fc 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -328,7 +328,6 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
return -EINVAL;

nvdev->start_remove = true;
- cancel_delayed_work_sync(&ndevctx->dwork);
cancel_work_sync(&ndevctx->work);
netif_tx_disable(ndev);
rndis_filter_device_remove(hdev);

Luis Henriques

Jan 13, 2014, 11:20:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 upstream.

Haswell LynxPoint and LynxPoint-LP with the recent Intel BIOS show
mysterious wakeups after shutdown occasionally. After discussing with
BIOS engineers, they explained that the new BIOS expects that the
wakeup sources are cleared and set to D3 for all wakeup devices when
the system is going to sleep or power off, but the current xhci driver
doesn't do this properly (partly intentionally).

This patch introduces a new quirk, XHCI_SPURIOUS_WAKEUP, for
fixing the spurious wakeups at S5 by calling xhci_reset() in the xhci
shutdown ops as done in xhci_stop(), and setting the device to PCI D3
at shutdown and remove ops.

The PCI D3 call is based on the initial fix patch by Oliver Neukum.

[Note: Sarah changed the quirk name from XHCI_HSW_SPURIOUS_WAKEUP to
XHCI_SPURIOUS_WAKEUP, since none of the other quirks have system names
in them. Sarah also fixed a collision with a quirk submitted around the
same time, by changing the xhci->quirks bit from 17 to 18.]
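
To make the bit-collision remark concrete, here is a small standalone
sketch. The MODEL_* macros and model_xhci are hypothetical names; only the
bit positions 17 and 18 are taken from the xhci.h hunk in this patch:

```c
#include <stdbool.h>

/* Each quirk owns one bit; two quirks on the same bit would alias. */
#define MODEL_XHCI_SLOW_SUSPEND    (1u << 17)
#define MODEL_XHCI_SPURIOUS_WAKEUP (1u << 18)

/* Hypothetical stand-in for the quirks field of struct xhci_hcd. */
struct model_xhci {
	unsigned int quirks;
};

/* True only when the spurious-wakeup workaround has been requested. */
bool model_needs_reset_on_shutdown(const struct model_xhci *xhci)
{
	return (xhci->quirks & MODEL_XHCI_SPURIOUS_WAKEUP) != 0;
}
```

Had both quirks shared bit 17, a controller flagged only for the slow
suspend quirk would also have been reset at shutdown.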

This patch should be backported to kernels as old as 3.0, that
contain the commit 1c12443ab8eba71a658fae4572147e56d1f84f66 "xhci: Add
Lynx Point to list of Intel switchable hosts."

Cc: Oliver Neukum <one...@suse.de>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Sarah Sharp <sarah....@linux.intel.com>
[ luis: backported to 3.11: adjusted context ]
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/usb/host/xhci-pci.c | 17 +++++++++++++++++
drivers/usb/host/xhci.c | 7 +++++++
drivers/usb/host/xhci.h | 1 +
3 files changed, 25 insertions(+)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 6480158..c79e3b9 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -34,6 +34,9 @@
#define PCI_VENDOR_ID_ETRON 0x1b6f
#define PCI_DEVICE_ID_ASROCK_P67 0x7023

+#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
+#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
+
static const char hcd_name[] = "xhci_hcd";

/* called after powerup, by probe or system-pm "wakeup" */
@@ -114,6 +117,15 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_SPURIOUS_REBOOT;
xhci->quirks |= XHCI_AVOID_BEI;
}
+ if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
+ (pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI)) {
+ /* Workaround for occasional spurious wakeups from S5 (or
+ * any other sleep) on Haswell machines with LPT and LPT-LP
+ * with the new Intel BIOS
+ */
+ xhci->quirks |= XHCI_SPURIOUS_WAKEUP;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
pdev->device == PCI_DEVICE_ID_ASROCK_P67) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
@@ -220,6 +232,11 @@ static void xhci_pci_remove(struct pci_dev *dev)
usb_put_hcd(xhci->shared_hcd);
}
usb_hcd_pci_remove(dev);
+
+ /* Workaround for spurious wakeups at shutdown with HSW */
+ if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
+ pci_set_power_state(dev, PCI_D3hot);
+
kfree(xhci);
}

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5b341e5..2047fb4 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -780,12 +780,19 @@ void xhci_shutdown(struct usb_hcd *hcd)

spin_lock_irq(&xhci->lock);
xhci_halt(xhci);
+ /* Workaround for spurious wakeups at shutdown with HSW */
+ if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
+ xhci_reset(xhci);
spin_unlock_irq(&xhci->lock);

xhci_cleanup_msix(xhci);

xhci_dbg(xhci, "xhci_shutdown completed - status = %x\n",
xhci_readl(xhci, &xhci->op_regs->status));
+
+ /* Yet another workaround for spurious wakeups at shutdown with HSW */
+ if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
+ pci_set_power_state(to_pci_dev(hcd->self.controller), PCI_D3hot);
}

#ifdef CONFIG_PM
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index b72601c..a2bc934 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1545,6 +1545,7 @@ struct xhci_hcd {
#define XHCI_AVOID_BEI (1 << 15)
#define XHCI_PLAT (1 << 16)
#define XHCI_SLOW_SUSPEND (1 << 17)
+#define XHCI_SPURIOUS_WAKEUP (1 << 18)
unsigned int num_active_eps;
unsigned int limit_active_eps;
/* There are two roothubs to keep track of bus suspend info for */

Luis Henriques

Jan 13, 2014, 11:20:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyas...@redhat.com>

commit 006da7b07bc4d3a7ffabad17cf639eec6849c9dc upstream.

Currently macvlan will count received packets after calling each
vlan's receive handler. Macvtap attempts to count the packet
yet again when the user reads the packet from the tap socket.
This code doesn't do this consistently either. Remove the
counting from macvtap and let only macvlan count received
packets.

Signed-off-by: Vlad Yasevich <vyas...@redhat.com>
Acked-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jason Wang <jaso...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/macvtap.c | 10 ----------
1 file changed, 10 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 3c87e53..70d5800 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -870,7 +870,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
const struct sk_buff *skb,
const struct iovec *iv, int len)
{
- struct macvlan_dev *vlan;
int ret;
int vnet_hdr_len = 0;
int vlan_offset = 0;
@@ -924,15 +923,6 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
copied += len;

done:
- rcu_read_lock();
- vlan = rcu_dereference(q->vlan);
- if (vlan) {
- preempt_disable();
- macvlan_count_rx(vlan, copied - vnet_hdr_len, ret == 0, 0);
- preempt_enable();
- }
- rcu_read_unlock();
-
return ret ? ret : copied;

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit f714f4f20e59ea6eea264a86b9a51fd51b88fc54 upstream.

MMU notifiers must be called on THP page migration or secondary MMUs
will get very confused.
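
The range passed to the notifiers is the huge-page-aligned span around the
faulting address. A worked sketch of that arithmetic, assuming 2 MiB huge
pages (typical on x86-64, but an assumption here) and using hypothetical
MODEL_* names in place of the kernel's HPAGE_PMD_* macros:

```c
#include <stdint.h>

#define MODEL_HPAGE_PMD_SHIFT 21   /* assumption: 2 MiB huge pages */
#define MODEL_HPAGE_PMD_SIZE  ((uint64_t)1 << MODEL_HPAGE_PMD_SHIFT)
#define MODEL_HPAGE_PMD_MASK  (~(MODEL_HPAGE_PMD_SIZE - 1))

struct model_range {
	uint64_t start;   /* plays the role of mmun_start */
	uint64_t end;     /* plays the role of mmun_end */
};

/* Round the address down to a huge page boundary and cover one huge page. */
struct model_range model_notifier_range(uint64_t address)
{
	struct model_range r;

	r.start = address & MODEL_HPAGE_PMD_MASK;
	r.end = r.start + MODEL_HPAGE_PMD_SIZE;
	return r;
}
```

Under that assumption, address 0x7f0000345678 gives the range
[0x7f0000200000, 0x7f0000400000).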

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/migrate.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index b3ffb81..65bbca5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -36,6 +36,7 @@
#include <linux/hugetlb_cgroup.h>
#include <linux/gfp.h>
#include <linux/balloon_compaction.h>
+#include <linux/mmu_notifier.h>

#include <asm/tlbflush.h>

@@ -1652,12 +1653,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
unsigned long address,
struct page *page, int node)
{
- unsigned long haddr = address & HPAGE_PMD_MASK;
pg_data_t *pgdat = NODE_DATA(node);
int isolated = 0;
struct page *new_page = NULL;
struct mem_cgroup *memcg = NULL;
int page_lru = page_is_file_cache(page);
+ unsigned long mmun_start = address & HPAGE_PMD_MASK;
+ unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
pmd_t orig_entry;

/*
@@ -1699,10 +1701,12 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
WARN_ON(PageLRU(new_page));

/* Recheck the target PMD */
+ mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
spin_lock(&mm->page_table_lock);
if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
fail_putback:
spin_unlock(&mm->page_table_lock);
+ mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);

/* Reverse changes made by migrate_page_copy() */
if (TestClearPageActive(new_page))
@@ -1743,15 +1747,16 @@ fail_putback:
* The SetPageUptodate on the new page and page_add_new_anon_rmap
* guarantee the copy is visible before the pagetable update.
*/
- flush_cache_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
- page_add_new_anon_rmap(new_page, vma, haddr);
- pmdp_clear_flush(vma, haddr, pmd);
- set_pmd_at(mm, haddr, pmd, entry);
+ flush_cache_range(vma, mmun_start, mmun_end);
+ page_add_new_anon_rmap(new_page, vma, mmun_start);
+ pmdp_clear_flush(vma, mmun_start, pmd);
+ set_pmd_at(mm, mmun_start, pmd, entry);
+ flush_tlb_range(vma, mmun_start, mmun_end);
update_mmu_cache_pmd(vma, address, &entry);

if (page_count(page) != 2) {
- set_pmd_at(mm, haddr, pmd, orig_entry);
- flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+ set_pmd_at(mm, mmun_start, pmd, orig_entry);
+ flush_tlb_range(vma, mmun_start, mmun_end);
update_mmu_cache_pmd(vma, address, &entry);
page_remove_rmap(new_page);
goto fail_putback;
@@ -1766,6 +1771,7 @@ fail_putback:
*/
mem_cgroup_end_migration(memcg, page, new_page, true);
spin_unlock(&mm->page_table_lock);
+ mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);

unlock_page(new_page);
unlock_page(page);
@@ -1786,7 +1792,7 @@ out_dropref:
spin_lock(&mm->page_table_lock);
if (pmd_same(*pmd, entry)) {
entry = pmd_mknonnuma(entry);
- set_pmd_at(mm, haddr, pmd, entry);
+ set_pmd_at(mm, mmun_start, pmd, entry);
update_mmu_cache_pmd(vma, address, &entry);
}
spin_unlock(&mm->page_table_lock);

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Lukas Czerner <lcze...@redhat.com>

commit 8f9ff189205a6817aee5a1f996f876541f86e07c upstream.

When using FITRIM ioctl on a file system without journal it will
only trim the block group once, no matter how many times you invoke
FITRIM ioctl and how many blocks you release from the block group.

It is because we only clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT in journal
callback. Fix this by clearing the bit in no journal mode as well.

Signed-off-by: Lukas Czerner <lcze...@redhat.com>
Signed-off-by: "Theodore Ts'o" <ty...@mit.edu>
Reported-by: Jorge Fábregas <jorge.f...@gmail.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ext4/mballoc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
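
Reader's note (not part of the upstream commit): the behaviour described
above can be sketched in plain C. This is a minimal model under stated
assumptions, not the ext4 code: struct group_info, trim_group() and
free_blocks() are hypothetical names, and the journal commit callback is
not modelled at all.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-group allocator state. */
struct group_info {
	bool was_trimmed;	/* models EXT4_GROUP_INFO_WAS_TRIMMED_BIT */
	unsigned free_blocks;	/* blocks released since the last trim */
};

/* Models one FITRIM pass; returns how many blocks were discarded. */
static unsigned trim_group(struct group_info *g)
{
	unsigned trimmed;

	if (g->was_trimmed)
		return 0;	/* group is believed to be fully trimmed */
	trimmed = g->free_blocks;
	g->free_blocks = 0;
	g->was_trimmed = true;
	return trimmed;
}

/*
 * Models releasing blocks. With a journal the bit is cleared later, from
 * the commit callback (not modelled). Without a journal the fixed code
 * clears it immediately; clear_without_journal == false reproduces the
 * behaviour before the fix.
 */
static void free_blocks(struct group_info *g, unsigned count,
			bool has_journal, bool clear_without_journal)
{
	g->free_blocks += count;
	if (!has_journal && clear_without_journal)
		g->was_trimmed = false;
}
```

In this model, a second trim after freeing blocks discards nothing unless
the bit was cleared on the no-journal path.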

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index cb1ffce..5f54d05 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4776,8 +4776,8 @@ do_more:
" group:%d block:%d count:%lu failed"
" with %d", block_group, bit, count,
err);
- }
-
+ } else
+ EXT4_MB_GRP_CLEAR_TRIMMED(e4b.bd_info);

ext4_lock_group(sb, block_group);
mb_clear_bits(bitmap_bh->b_data, bit, count_clusters);

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Hung <alex...@canonical.com>

commit cfb743bf6173063b57ef5a8185ea87f130209d4d upstream.

Signed-off-by: Alex Hung <alex...@canonical.com>
Signed-off-by: Matthew Garrett <matthew...@nebula.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/platform/x86/dell-wmi.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/platform/x86/dell-wmi.c b/drivers/platform/x86/dell-wmi.c
index fa9a217..60e0900 100644
--- a/drivers/platform/x86/dell-wmi.c
+++ b/drivers/platform/x86/dell-wmi.c
@@ -130,7 +130,8 @@ static const u16 bios_to_linux_keycode[256] __initconst = {
KEY_BRIGHTNESSUP, KEY_UNKNOWN, KEY_KBDILLUMTOGGLE,
KEY_UNKNOWN, KEY_SWITCHVIDEOMODE, KEY_UNKNOWN, KEY_UNKNOWN,
KEY_SWITCHVIDEOMODE, KEY_UNKNOWN, KEY_UNKNOWN, KEY_PROG2,
- KEY_UNKNOWN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN,
+ KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, KEY_MICMUTE,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
@@ -139,8 +140,8 @@ static const u16 bios_to_linux_keycode[256] __initconst = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- KEY_PROG3
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, KEY_PROG3
};

static struct input_dev *dell_wmi_input_dev;

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Kamala R <kam...@aristanetworks.com>

commit 7150aede5dd241539686e17d9592f5ebd28a2cda upstream.

The behaviour of blackhole and prohibit routes has been corrected by setting
the input and output pointers of the dst variable appropriately. For
blackhole routes, both are set to dst_discard; for prohibit routes, they are
set to ip6_pkt_prohibit and ip6_pkt_prohibit_out respectively.

ipv6: ip6_pkt_prohibit(_out) should not depend on
CONFIG_IPV6_MULTIPLE_TABLES

We need ip6_pkt_prohibit(_out) available without
CONFIG_IPV6_MULTIPLE_TABLES

Signed-off-by: Kamala R <kam...@aristanetworks.com>
Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ipv6/route.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
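
Reader's note (not part of the upstream commit): the per-type handler
selection in the hunk below can be sketched as a small standalone model.
All names here (make_reject_route(), drop_silently() and so on) are
hypothetical; the handlers only report which ICMPv6 reply the real
function would send, as far as the diff shows.

```c
#include <errno.h>

/* What a handler would answer to the sender (simplified). */
enum reply { REPLY_NONE, REPLY_NOROUTE, REPLY_ADM_PROHIBITED };

enum route_type {
	ROUTE_BLACKHOLE, ROUTE_PROHIBIT, ROUTE_THROW, ROUTE_UNREACHABLE
};

typedef enum reply (*pkt_handler)(void);

/* Hypothetical stand-ins for dst_discard, ip6_pkt_discard(_out) and
 * ip6_pkt_prohibit(_out). */
static enum reply drop_silently(void)   { return REPLY_NONE; }
static enum reply drop_noroute(void)    { return REPLY_NOROUTE; }
static enum reply drop_prohibited(void) { return REPLY_ADM_PROHIBITED; }

struct reject_route {
	int error;
	pkt_handler input;
	pkt_handler output;
};

/* Mirrors the shape of the switch in ip6_route_add() after the patch. */
static struct reject_route make_reject_route(enum route_type type)
{
	struct reject_route rt;

	switch (type) {
	case ROUTE_BLACKHOLE:
		rt.error = -EINVAL;
		rt.input = rt.output = drop_silently;
		break;
	case ROUTE_PROHIBIT:
		rt.error = -EACCES;
		rt.input = rt.output = drop_prohibited;
		break;
	case ROUTE_THROW:
	default:
		rt.error = (type == ROUTE_THROW) ? -EAGAIN : -ENETUNREACH;
		rt.input = rt.output = drop_noroute;
		break;
	}
	return rt;
}
```

Before the patch, every reject route would have used the "no route"
handlers regardless of its type.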

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b297e11..ff3223c 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -84,6 +84,8 @@ static int ip6_dst_gc(struct dst_ops *ops);

static int ip6_pkt_discard(struct sk_buff *skb);
static int ip6_pkt_discard_out(struct sk_buff *skb);
+static int ip6_pkt_prohibit(struct sk_buff *skb);
+static int ip6_pkt_prohibit_out(struct sk_buff *skb);
static void ip6_link_failure(struct sk_buff *skb);
static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb, u32 mtu);
@@ -234,9 +236,6 @@ static const struct rt6_info ip6_null_entry_template = {

#ifdef CONFIG_IPV6_MULTIPLE_TABLES

-static int ip6_pkt_prohibit(struct sk_buff *skb);
-static int ip6_pkt_prohibit_out(struct sk_buff *skb);
-
static const struct rt6_info ip6_prohibit_entry_template = {
.dst = {
.__refcnt = ATOMIC_INIT(1),
@@ -1521,21 +1520,24 @@ int ip6_route_add(struct fib6_config *cfg)
goto out;
}
}
- rt->dst.output = ip6_pkt_discard_out;
- rt->dst.input = ip6_pkt_discard;
rt->rt6i_flags = RTF_REJECT|RTF_NONEXTHOP;
switch (cfg->fc_type) {
case RTN_BLACKHOLE:
rt->dst.error = -EINVAL;
+ rt->dst.output = dst_discard;
+ rt->dst.input = dst_discard;
break;
case RTN_PROHIBIT:
rt->dst.error = -EACCES;
+ rt->dst.output = ip6_pkt_prohibit_out;
+ rt->dst.input = ip6_pkt_prohibit;
break;
case RTN_THROW:
- rt->dst.error = -EAGAIN;
- break;
default:
- rt->dst.error = -ENETUNREACH;
+ rt->dst.error = (cfg->fc_type == RTN_THROW) ? -EAGAIN
+ : -ENETUNREACH;
+ rt->dst.output = ip6_pkt_discard_out;
+ rt->dst.input = ip6_pkt_discard;
break;
}
goto install_route;
@@ -2100,8 +2102,6 @@ static int ip6_pkt_discard_out(struct sk_buff *skb)
return ip6_pkt_drop(skb, ICMPV6_NOROUTE, IPSTATS_MIB_OUTNOROUTES);
}

-#ifdef CONFIG_IPV6_MULTIPLE_TABLES
-
static int ip6_pkt_prohibit(struct sk_buff *skb)
{
return ip6_pkt_drop(skb, ICMPV6_ADM_PROHIBITED, IPSTATS_MIB_INNOROUTES);
@@ -2113,8 +2113,6 @@ static int ip6_pkt_prohibit_out(struct sk_buff *skb)
return ip6_pkt_drop(skb, ICMPV6_ADM_PROHIBITED, IPSTATS_MIB_OUTNOROUTES);
}

-#endif
-
/*
* Allocate a dst for local (unicast / anycast) address.
*/

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 67f87463d3a3362424efcbe8b40e4772fd34fc61 upstream.

On x86, PMD entries are similar to _PAGE_PROTNONE protection and are
handled as NUMA hinting faults. The following two page table protection
bits are what define them:

_PAGE_NUMA:set _PAGE_PRESENT:clear

A PMD is considered present if any of the _PAGE_PRESENT, _PAGE_PROTNONE,
_PAGE_PSE or _PAGE_NUMA bits are set. If pmdp_invalidate encounters a
pmd_numa, it clears the present bit leaving _PAGE_NUMA which will be
considered not present by the CPU but present by pmd_present. The
existing caller of pmdp_invalidate should handle it but it's an
inconsistent state for a PMD. This patch keeps the state consistent
when calling pmdp_invalidate.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/pgtable-generic.c | 3 +++
1 file changed, 3 insertions(+)
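
Reader's note (not part of the upstream commit): the inconsistent state
described above can be illustrated with a small bit-flag model. The flag
values and helper names are hypothetical and do not match the real x86
layout. The sketch also feeds the converted entry into the not-present
step, which is an assumption about the intent; the hunk as posted below
still passes *pmdp to pmd_mknotpresent().

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for the illustration. */
#define F_PRESENT  0x1u
#define F_PROTNONE 0x2u
#define F_PSE      0x4u
#define F_NUMA     0x8u

/* What the hardware checks. */
static bool cpu_sees_present(unsigned pmd)
{
	return (pmd & F_PRESENT) != 0;
}

/* What the software check described in the message considers present. */
static bool pmd_is_present(unsigned pmd)
{
	return (pmd & (F_PRESENT | F_PROTNONE | F_PSE | F_NUMA)) != 0;
}

/* NUMA hinting entry: NUMA bit set, present bit clear. */
static bool pmd_is_numa(unsigned pmd)
{
	return (pmd & (F_NUMA | F_PRESENT)) == F_NUMA;
}

static unsigned mk_nonnuma(unsigned pmd)
{
	return (pmd & ~F_NUMA) | F_PRESENT;
}

static unsigned mk_notpresent(unsigned pmd)
{
	return pmd & ~F_PRESENT;
}

/* Old behaviour: a NUMA entry keeps its NUMA bit after invalidation. */
static unsigned invalidate_old(unsigned pmd)
{
	return mk_notpresent(pmd);
}

/* Intended behaviour: drop the NUMA marking first, then invalidate. */
static unsigned invalidate_consistent(unsigned pmd)
{
	if (pmd_is_numa(pmd))
		pmd = mk_nonnuma(pmd);
	return mk_notpresent(pmd);
}
```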

diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index e1a6e4f..0e083c5 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -167,6 +167,9 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
{
+ pmd_t entry = *pmdp;
+ if (pmd_numa(entry))
+ entry = pmd_mknonnuma(entry);
set_pmd_at(vma->vm_mm, address, pmdp, pmd_mknotpresent(*pmdp));
flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Rik van Riel <ri...@redhat.com>

commit 20841405940e7be0617612d521e206e4b6b325db upstream.

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A                       CPU B                       CPU C

                                                        load TLB entry
make entry PTE/PMD_NUMA
                            fault on entry
                                                        read/write old page
                            start migrating page
                            change PTE/PMD to new page
                                                        read/write old page [*]
flush TLB
                                                        reload TLB from new entry
                                                        read/write new page
                                                        lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible is aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.

[mgo...@suse.de: fix build]
Signed-off-by: Rik van Riel <ri...@redhat.com>
Signed-off-by: Mel Gorman <mgo...@suse.de>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/sparc/include/asm/pgtable_64.h | 4 ++--
arch/x86/include/asm/pgtable.h | 11 ++++++++--
include/asm-generic/pgtable.h | 2 +-
include/linux/mm_types.h | 44 +++++++++++++++++++++++++++++++++++++
kernel/fork.c | 1 +
mm/huge_memory.c | 7 ++++++
mm/mprotect.c | 2 ++
mm/pgtable-generic.c | 5 +++--
8 files changed, 69 insertions(+), 7 deletions(-)
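
Reader's note (not part of the upstream commit): the decision made by the
new x86 pte_accessible() can be sketched in isolation. The flag values,
struct mm and pte_still_accessible() are hypothetical stand-ins; only the
shape of the check follows the hunk below.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for the illustration. */
#define F_PRESENT  0x1u
#define F_PROTNONE 0x2u
#define F_NUMA     0x4u

/* Stand-in for the one mm_struct field the check needs. */
struct mm {
	bool tlb_flush_pending;
};

/*
 * A PTE may still be reachable through a stale TLB entry if it is
 * present, or if it was just made PROT_NONE/NUMA and the batched TLB
 * flush for this mm has not happened yet.
 */
static bool pte_still_accessible(const struct mm *mm, unsigned pte_flags)
{
	if (pte_flags & F_PRESENT)
		return true;

	if ((pte_flags & (F_PROTNONE | F_NUMA)) && mm->tlb_flush_pending)
		return true;

	return false;
}
```

A caller that moves a page would flush the TLB whenever this returns
true, which closes the window shown in the diagram above.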

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 3676031..90f289f 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -616,7 +616,7 @@ static inline unsigned long pte_present(pte_t pte)
}

#define pte_accessible pte_accessible
-static inline unsigned long pte_accessible(pte_t a)
+static inline unsigned long pte_accessible(struct mm_struct *mm, pte_t a)
{
return pte_val(a) & _PAGE_VALID;
}
@@ -806,7 +806,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
* SUN4V NOTE: _PAGE_VALID is the same value in both the SUN4U
* and SUN4V pte layout, so this inline test is fine.
*/
- if (likely(mm != &init_mm) && pte_accessible(orig))
+ if (likely(mm != &init_mm) && pte_accessible(mm, orig))
tlb_batch_add(mm, addr, ptep, orig, fullmm);
}

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1c00631..90cf8db4 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -465,9 +465,16 @@ static inline int pte_present(pte_t a)
}

#define pte_accessible pte_accessible
-static inline int pte_accessible(pte_t a)
+static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
{
- return pte_flags(a) & _PAGE_PRESENT;
+ if (pte_flags(a) & _PAGE_PRESENT)
+ return true;
+
+ if ((pte_flags(a) & (_PAGE_PROTNONE | _PAGE_NUMA)) &&
+ mm_tlb_flush_pending(mm))
+ return true;
+
+ return false;
}

static inline int pte_hidden(pte_t pte)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 0807ddf..380acce 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -221,7 +221,7 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
#endif

#ifndef pte_accessible
-# define pte_accessible(pte) ((void)(pte),1)
+# define pte_accessible(mm, pte) ((void)(pte), 1)
#endif

#ifndef flush_tlb_fix_spurious_fault
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index faf4b7c..ef9dc52 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -434,6 +434,14 @@ struct mm_struct {
*/
int first_nid;
#endif
+#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
+ /*
+ * An operation with batched TLB flushing is going on. Anything that
+ * can move process memory needs to flush the TLB when moving a
+ * PROT_NONE or PROT_NUMA mapped page.
+ */
+ bool tlb_flush_pending;
+#endif
struct uprobes_state uprobes_state;
};

@@ -454,4 +462,40 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
return mm->cpu_vm_mask_var;
}

+#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
+/*
+ * Memory barriers to keep this state in sync are graciously provided by
+ * the page table locks, outside of which no page table modifications happen.
+ * The barriers below prevent the compiler from re-ordering the instructions
+ * around the memory barriers that are already present in the code.
+ */
+static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
+{
+ barrier();
+ return mm->tlb_flush_pending;
+}
+static inline void set_tlb_flush_pending(struct mm_struct *mm)
+{
+ mm->tlb_flush_pending = true;
+ barrier();
+}
+/* Clearing is done after a TLB flush, which also provides a barrier. */
+static inline void clear_tlb_flush_pending(struct mm_struct *mm)
+{
+ barrier();
+ mm->tlb_flush_pending = false;
+}
+#else
+static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
+{
+ return false;
+}
+static inline void set_tlb_flush_pending(struct mm_struct *mm)
+{
+}
+static inline void clear_tlb_flush_pending(struct mm_struct *mm)
+{
+}
+#endif
+
#endif /* _LINUX_MM_TYPES_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 200a7a2..f1f82cf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -540,6 +540,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
spin_lock_init(&mm->page_table_lock);
mm_init_aio(mm);
mm_init_owner(mm, p);
+ clear_tlb_flush_pending(mm);

if (likely(!mm_alloc_pgd(mm))) {
mm->def_flags = 0;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 31950d6..37a8cd8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1372,6 +1372,13 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
}

/*
+ * The page_table_lock above provides a memory barrier
+ * with change_protection_range.
+ */
+ if (mm_tlb_flush_pending(mm))
+ flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+
+ /*
* Migrate the THP to the requested node, returns with page unlocked
* and pmd_numa cleared.
*/
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 00edb75..7651a57 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -216,6 +216,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
BUG_ON(addr >= end);
pgd = pgd_offset(mm, addr);
flush_cache_range(vma, addr, end);
+ set_tlb_flush_pending(mm);
do {
next = pgd_addr_end(addr, end);
if (pgd_none_or_clear_bad(pgd))
@@ -227,6 +228,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
/* Only flush the TLB if we actually modified any entries: */
if (pages)
flush_tlb_range(vma, start, end);
+ clear_tlb_flush_pending(mm);

return pages;
}
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 0e083c5..683f476 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -86,9 +86,10 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma,
pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long address,
pte_t *ptep)
{
+ struct mm_struct *mm = (vma)->vm_mm;
pte_t pte;
- pte = ptep_get_and_clear((vma)->vm_mm, address, ptep);
- if (pte_accessible(pte))
+ pte = ptep_get_and_clear(mm, address, ptep);
+ if (pte_accessible(mm, pte))
flush_tlb_page(vma, address);
return pte;

Luis Henriques

Jan 13, 2014, 11:30:02 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Segall <bse...@google.com>

commit 1ee14e6c8cddeeb8a490d7b54cd9016e4bb900b4 upstream.

When we transition cfs_bandwidth_used to false, any currently
throttled groups will incorrectly return false from cfs_rq_throttled.
While tg_set_cfs_bandwidth will unthrottle them eventually, currently
running code (including at least dequeue_task_fair and
distribute_cfs_runtime) will cause errors.

Fix this by turning off cfs_bandwidth_used only after unthrottling all
cfs_rqs.

Tested: toggle bandwidth back and forth on a loaded cgroup. Caused
crashes in minutes without the patch, hasn't crashed with it.

Signed-off-by: Ben Segall <bse...@google.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Cc: p...@google.com
Link: http://lkml.kernel.org/r/20131016181611.2...@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Cc: Chris J Arges <chris....@canonical.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
kernel/sched/core.c | 9 ++++++++-
kernel/sched/fair.c | 16 +++++++++-------
kernel/sched/sched.h | 3 ++-
3 files changed, 19 insertions(+), 9 deletions(-)
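
Reader's note (not part of the upstream commit): the ordering rule can be
sketched with a plain counter in place of the static key. Everything
below is a hypothetical model (bandwidth_users, struct group,
set_bandwidth()); it only records what the unthrottle step would observe.

```c
#include <stdbool.h>

static int bandwidth_users;		/* models __cfs_bandwidth_used */
static bool used_while_unthrottling;	/* what unthrottle() observed */

struct group {
	bool runtime_enabled;
	bool throttled;
};

static bool bandwidth_used(void)
{
	return bandwidth_users > 0;
}

static void unthrottle(struct group *g)
{
	/* Code running here relies on bandwidth_used() still being true. */
	used_while_unthrottling = bandwidth_used();
	g->throttled = false;
}

static void set_bandwidth(struct group *g, bool enable)
{
	bool was_enabled = g->runtime_enabled;

	if (enable && !was_enabled)
		bandwidth_users++;	/* off->on: before related changes */

	g->runtime_enabled = enable;
	if (g->throttled)
		unthrottle(g);

	if (was_enabled && !enable)
		bandwidth_users--;	/* on->off: only after unthrottling */
}
```

Decrementing before the unthrottle step would make bandwidth_used()
report false while a group is still throttled, which is the situation the
commit message describes.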

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 05c39f0..68924be 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7283,7 +7283,12 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota)

runtime_enabled = quota != RUNTIME_INF;
runtime_was_enabled = cfs_b->quota != RUNTIME_INF;
- account_cfs_bandwidth_used(runtime_enabled, runtime_was_enabled);
+ /*
+ * If we need to toggle cfs_bandwidth_used, off->on must occur
+ * before making related changes, and on->off must occur afterwards
+ */
+ if (runtime_enabled && !runtime_was_enabled)
+ cfs_bandwidth_usage_inc();
raw_spin_lock_irq(&cfs_b->lock);
cfs_b->period = ns_to_ktime(period);
cfs_b->quota = quota;
@@ -7309,6 +7314,8 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota)
unthrottle_cfs_rq(cfs_rq);
raw_spin_unlock_irq(&rq->lock);
}
+ if (runtime_was_enabled && !runtime_enabled)
+ cfs_bandwidth_usage_dec();
out_unlock:
mutex_unlock(&cfs_constraints_mutex);

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a07a479..78a904d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2077,13 +2077,14 @@ static inline bool cfs_bandwidth_used(void)
return static_key_false(&__cfs_bandwidth_used);
}

-void account_cfs_bandwidth_used(int enabled, int was_enabled)
+void cfs_bandwidth_usage_inc(void)
{
- /* only need to count groups transitioning between enabled/!enabled */
- if (enabled && !was_enabled)
- static_key_slow_inc(&__cfs_bandwidth_used);
- else if (!enabled && was_enabled)
- static_key_slow_dec(&__cfs_bandwidth_used);
+ static_key_slow_inc(&__cfs_bandwidth_used);
+}
+
+void cfs_bandwidth_usage_dec(void)
+{
+ static_key_slow_dec(&__cfs_bandwidth_used);
}
#else /* HAVE_JUMP_LABEL */
static bool cfs_bandwidth_used(void)
@@ -2091,7 +2092,8 @@ static bool cfs_bandwidth_used(void)
return true;
}

-void account_cfs_bandwidth_used(int enabled, int was_enabled) {}
+void cfs_bandwidth_usage_inc(void) {}
+void cfs_bandwidth_usage_dec(void) {}
#endif /* HAVE_JUMP_LABEL */

/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ef0a7b2..d77f2200 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1305,7 +1305,8 @@ extern void print_rt_stats(struct seq_file *m, int cpu);
extern void init_cfs_rq(struct cfs_rq *cfs_rq);
extern void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq);

-extern void account_cfs_bandwidth_used(int enabled, int was_enabled);
+extern void cfs_bandwidth_usage_inc(void);
+extern void cfs_bandwidth_usage_dec(void);

#ifdef CONFIG_NO_HZ_COMMON
enum rq_nohz_flag_bits {

Luis Henriques

Jan 13, 2014, 11:30:03 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Whitney <enwl...@gmail.com>

commit d0abafac8c9162f39c4f6b2f8141b772a09b3770 upstream.

Commit f5a44db5d2 introduced a regression on filesystems created with
the bigalloc feature (cluster size > blocksize). It causes xfstests
generic/006 and /013 to fail with an unexpected JBD2 failure and
transaction abort that leaves the test file system in a read only state.
Other xfstests run on bigalloc file systems are likely to fail as well.

The cause is the accidental use of a cluster mask where a cluster
offset was needed in ext4_ext_map_blocks().

Signed-off-by: Eric Whitney <enwl...@gmail.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ext4/extents.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
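
Reader's note (not part of the upstream commit): the difference between a
cluster mask and a cluster offset is easy to see with numbers. The two
helpers below are hypothetical and reflect my understanding of what
EXT4_LBLK_CMASK and EXT4_LBLK_COFF compute; cluster_ratio (blocks per
cluster) is assumed to be a power of two.

```c
#include <stdint.h>

/* First logical block of the cluster that contains lblk. */
static uint32_t lblk_cmask(uint32_t cluster_ratio, uint32_t lblk)
{
	return lblk & ~(cluster_ratio - 1);
}

/* Position of lblk inside its cluster. */
static uint32_t lblk_coff(uint32_t cluster_ratio, uint32_t lblk)
{
	return lblk & (cluster_ratio - 1);
}
```

With 16 blocks per cluster, logical block 37 gives a masked value of 32
and an offset of 5. Using the first where the second is expected yields a
value that is off by a whole multiple of the cluster size.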

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index f8d6668..878338f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4145,7 +4145,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
*/
map->m_flags &= ~EXT4_MAP_FROM_CLUSTER;
newex.ee_block = cpu_to_le32(map->m_lblk);
- cluster_offset = EXT4_LBLK_CMASK(sbi, map->m_lblk);
+ cluster_offset = EXT4_LBLK_COFF(sbi, map->m_lblk);

/*
* If we are doing bigalloc, check to see if the extent returned

Luis Henriques

Jan 13, 2014, 11:30:03 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Catalin Marinas <catalin...@arm.com>

commit 4f00130b70e5eee813cc7bc298e0f3fdf79673cc upstream.

This provides better performance compared to Device GRE and also allows
unaligned accesses. Such memory is intended to be used with standard RAM
(e.g. framebuffers) and not I/O.

Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Mark Brown <bro...@kernel.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/arm64/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b27b65..bb969e9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -255,7 +255,7 @@ static inline int has_transparent_hugepage(void)
#define pgprot_noncached(prot) \
__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRnE))
#define pgprot_writecombine(prot) \
- __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_GRE))
+ __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC))
#define pgprot_dmacoherent(prot) \
__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC))
#define __HAVE_PHYS_MEM_ACCESS_PROT

Luis Henriques

Jan 13, 2014, 11:30:03 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Paul Drews <paul....@intel.com>

commit f6308b36c411dc5afd6a6f73e6454722bfde57b7 upstream.

This adds the new ACPI ID (INT33FC) for the BayTrail GPIO
banks as seen on a BayTrail M System-On-Chip platform. This
ACPI ID is used by the BayTrail GPIO (pinctrl) driver to
manage the Low Power Subsystem (LPSS).

Signed-off-by: Paul Drews <paul....@intel.com>
Acked-by: Linus Walleij <linus....@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/acpi/acpi_lpss.c | 1 +
drivers/pinctrl/pinctrl-baytrail.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index fb78bb9..ab19263 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -156,6 +156,7 @@ static const struct acpi_device_id acpi_lpss_device_ids[] = {
{ "80860F14", (unsigned long)&byt_sdio_dev_desc },
{ "80860F41", (unsigned long)&byt_i2c_dev_desc },
{ "INT33B2", },
+ { "INT33FC", },

{ }
};
diff --git a/drivers/pinctrl/pinctrl-baytrail.c b/drivers/pinctrl/pinctrl-baytrail.c
index e9d735d..9041a6e 100644
--- a/drivers/pinctrl/pinctrl-baytrail.c
+++ b/drivers/pinctrl/pinctrl-baytrail.c
@@ -508,6 +508,7 @@ static const struct dev_pm_ops byt_gpio_pm_ops = {

static const struct acpi_device_id byt_gpio_acpi_match[] = {
{ "INT33B2", 0 },
+ { "INT33FC", 0 },
{ }
};
MODULE_DEVICE_TABLE(acpi, byt_gpio_acpi_match);

Luis Henriques

Jan 13, 2014, 11:30:03 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 0c5f83c23ca703d32f930393825487257a5cde6d upstream.

The TLB must be flushed if the PTE is updated but change_pte_range is
clearing the PTE while marking PTEs pte_numa without necessarily
flushing the TLB if it reinserts the same entry. Without the flush,
it's conceivable that two processors have different TLBs for the same
virtual address and at the very least it would generate spurious faults.

This patch only unmaps the pages in change_pte_range for a full
protection change.

[ri...@redhat.com: write pte_numa pte back to the page tables]
Signed-off-by: Mel Gorman <mgo...@suse.de>
Signed-off-by: Rik van Riel <ri...@redhat.com>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Cc: Chegu Vinod <chegu...@hp.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/mprotect.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6c3f56f..3277121 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -54,13 +54,14 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
pte_t ptent;
bool updated = false;

- ptent = ptep_modify_prot_start(mm, addr, pte);
if (!prot_numa) {
+ ptent = ptep_modify_prot_start(mm, addr, pte);
ptent = pte_modify(ptent, newprot);
updated = true;
} else {
struct page *page;

+ ptent = *pte;
page = vm_normal_page(vma, addr, oldpte);
if (page) {
int this_nid = page_to_nid(page);
@@ -73,6 +74,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
if (!pte_numa(oldpte) &&
page_mapcount(page) == 1) {
ptent = pte_mknuma(ptent);
+ set_pte_at(mm, addr, pte, ptent);
updated = true;
}
}
@@ -89,7 +91,10 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,

if (updated)
pages++;
- ptep_modify_prot_commit(mm, addr, pte, ptent);
+
+ /* Only !prot_numa always clears the pte */
+ if (!prot_numa)
+ ptep_modify_prot_commit(mm, addr, pte, ptent);
} else if (IS_ENABLED(CONFIG_MIGRATION) && !pte_file(oldpte)) {
swp_entry_t entry = pte_to_swp_entry(oldpte);

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit b0943d61b8fa420180f92f64ef67662b4f6cc493 upstream.

THP migration can fail for a variety of reasons. Avoid flushing the TLB
to deal with THP migration races until the copy is ready to start.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/huge_memory.c | 7 -------
mm/migrate.c | 3 +++
2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 37a8cd8..31950d6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1372,13 +1372,6 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
}

/*
- * The page_table_lock above provides a memory barrier
- * with change_protection_range.
- */
- if (mm_tlb_flush_pending(mm))
- flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
-
- /*
* Migrate the THP to the requested node, returns with page unlocked
* and pmd_numa cleared.
*/
diff --git a/mm/migrate.c b/mm/migrate.c
index c033a04..5130147 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1702,6 +1702,9 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
goto out_fail;
}

+ if (mm_tlb_flush_pending(mm))
+ flush_tlb_range(vma, mmun_start, mmun_end);
+
/* Prepare a page as a migration target */
__set_page_locked(new_page);
SetPageSwapBacked(new_page);

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

commit 239c78db9c41a8f524cce60507440d72229d73bc upstream.

We must clear local_df when passing the skb between namespaces as the
packet is not local to the new namespace any more and thus may not get
fragmented by local rules. Fred Templin noticed that other namespaces
do fragment IPv6 packets while forwarding. Instead they should have sent
back a PTB (ICMPv6 Packet Too Big).

The same problem should be present when forwarding DF-IPv4 packets
between namespaces.

Reported-by: Templin, Fred L <Fred.L....@boeing.com>
Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/core/skbuff.c | 1 +
1 file changed, 1 insertion(+)
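
Reader's note (not part of the upstream commit): a heavily simplified
model of why the flag must be cleared. struct packet, scrub_packet() and
forward() are hypothetical; the real forwarding path takes many more
conditions into account.

```c
#include <stdbool.h>
#include <stddef.h>

enum verdict { FORWARD, FRAGMENT, SEND_PTB };

struct packet {
	size_t len;
	bool local_df;	/* set by the originating stack: may be fragmented locally */
};

/* Models the relevant part of scrubbing a packet that changes namespace. */
static void scrub_packet(struct packet *p)
{
	p->local_df = false;
}

/* Simplified forwarding decision for a packet that may exceed the MTU. */
static enum verdict forward(const struct packet *p, size_t mtu)
{
	if (p->len <= mtu)
		return FORWARD;
	return p->local_df ? FRAGMENT : SEND_PTB;
}
```

In this model an oversized packet that keeps the flag across the
namespace boundary is fragmented by the forwarder; once the flag is
cleared, the sender gets a Packet Too Big reply instead.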

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a75022e..b2cd9a4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3511,6 +3511,7 @@ void skb_scrub_packet(struct sk_buff *skb)
skb->tstamp.tv64 = 0;
skb->pkt_type = PACKET_HOST;
skb->skb_iif = 0;
+ skb->local_df = 0;
skb_dst_drop(skb);
skb->mark = 0;
secpath_reset(skb);

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Martin Schwidefsky <schwi...@de.ibm.com>

commit 36d9f4d3b68c7035ead3850dc85f310a579ed0eb upstream.

The tty3270_alloc_screen function is called from tty3270_install with
swapped arguments, the number of columns instead of rows and vice versa.
The number of rows is typically smaller than the number of columns which
makes the screen array too big but the individual cell arrays for the
lines too small. Creating lines longer than the number of rows will
clobber the memory after the end of the cell array.
The fix is simple, call tty3270_alloc_screen with the correct argument
order.

Signed-off-by: Martin Schwidefsky <schwi...@de.ibm.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/s390/char/tty3270.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
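
Reader's note (not part of the upstream commit): the effect of swapping
the two arguments can be sketched with a generic rows-by-cols allocator.
struct screen, alloc_screen(), free_screen() and cell_in_bounds() are
hypothetical and unrelated to the actual tty3270 data structures.

```c
#include <stdlib.h>

struct screen {
	unsigned rows, cols;
	char **lines;		/* rows pointers, each to cols cells */
};

static void free_screen(struct screen *s)
{
	unsigned i;

	if (!s)
		return;
	for (i = 0; i < s->rows; i++)
		free(s->lines[i]);	/* free(NULL) is a no-op */
	free(s->lines);
	free(s);
}

static struct screen *alloc_screen(unsigned rows, unsigned cols)
{
	struct screen *s = malloc(sizeof(*s));
	unsigned i;

	if (!s)
		return NULL;
	s->rows = rows;
	s->cols = cols;
	s->lines = calloc(rows, sizeof(*s->lines));
	if (!s->lines) {
		free(s);
		return NULL;
	}
	for (i = 0; i < rows; i++) {
		s->lines[i] = calloc(cols, 1);
		if (!s->lines[i]) {
			free_screen(s);
			return NULL;
		}
	}
	return s;
}

/* Non-zero when a write to (row, col) stays inside the allocation. */
static int cell_in_bounds(const struct screen *s, unsigned row, unsigned col)
{
	return row < s->rows && col < s->cols;
}
```

For a 24x80 terminal, calling alloc_screen(80, 24) yields 80 lines of 24
cells each, so writing column 79 of any line lands past the end of that
line's array.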

diff --git a/drivers/s390/char/tty3270.c b/drivers/s390/char/tty3270.c
index cee69da..4dd71ca 100644
--- a/drivers/s390/char/tty3270.c
+++ b/drivers/s390/char/tty3270.c
@@ -942,7 +942,7 @@ static int tty3270_install(struct tty_driver *driver, struct tty_struct *tty)
return rc;
}

- tp->screen = tty3270_alloc_screen(tp->view.cols, tp->view.rows);
+ tp->screen = tty3270_alloc_screen(tp->view.rows, tp->view.cols);
if (IS_ERR(tp->screen)) {
rc = PTR_ERR(tp->screen);
raw3270_put_view(&tp->view);

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 2b4847e73004c10ae6666c2e27b5c5430aed8698 upstream.

Base pages are unmapped and flushed from cache and TLB during normal
page migration and replaced with a migration entry that causes any
parallel NUMA hinting fault or gup to block until migration completes.

THP does not unmap pages due to a lack of support for migration entries
at a PMD level. This allows races with get_user_pages and
get_user_pages_fast which commit 3f926ab945b6 ("mm: Close races between
THP migration and PMD numa clearing") made worse by introducing a
pmd_clear_flush().

This patch forces get_user_page (fast and normal) on a pmd_numa page to
go through the slow get_user_page path where it will serialise against
THP migration and properly account for the NUMA hinting fault. On the
migration side the page table lock is taken for each PTE update.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/x86/mm/gup.c | 13 +++++++++++++
mm/huge_memory.c | 24 ++++++++++++++++--------
mm/migrate.c | 38 +++++++++++++++++++++++++++++++-------
3 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index dd74e46..0596e8e 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
@@ -83,6 +83,12 @@ static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
pte_t pte = gup_get_pte(ptep);
struct page *page;

+ /* Similar to the PMD case, NUMA hinting must take slow path */
+ if (pte_numa(pte)) {
+ pte_unmap(ptep);
+ return 0;
+ }
+
if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) {
pte_unmap(ptep);
return 0;
@@ -167,6 +173,13 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
if (pmd_none(pmd) || pmd_trans_splitting(pmd))
return 0;
if (unlikely(pmd_large(pmd))) {
+ /*
+ * NUMA hinting faults need to be handled in the GUP
+ * slowpath for accounting purposes and so that they
+ * can be serialised against THP migration.
+ */
+ if (pmd_numa(pmd))
+ return 0;
if (!gup_huge_pmd(pmd, addr, next, write, pages, nr))
return 0;
} else {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 12acb0b..72da10d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1252,6 +1252,10 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
if ((flags & FOLL_DUMP) && is_huge_zero_pmd(*pmd))
return ERR_PTR(-EFAULT);

+ /* Full NUMA hinting faults to serialise migration in fault paths */
+ if ((flags & FOLL_NUMA) && pmd_numa(*pmd))
+ goto out;
+
page = pmd_page(*pmd);
VM_BUG_ON(!PageHead(page));
if (flags & FOLL_TOUCH) {
@@ -1318,23 +1322,27 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
/* If the page was locked, there are no parallel migrations */
if (page_locked)
goto clear_pmdnuma;
+ }

- /*
- * Otherwise wait for potential migrations and retry. We do
- * relock and check_same as the page may no longer be mapped.
- * As the fault is being retried, do not account for it.
- */
+ /*
+ * If there are potential migrations, wait for completion and retry. We
+ * do not relock and check_same as the page may no longer be mapped.
+ * Furtermore, even if the page is currently misplaced, there is no
+ * guarantee it is still misplaced after the migration completes.
+ */
+ if (!page_locked) {
spin_unlock(&mm->page_table_lock);
wait_on_page_locked(page);
page_nid = -1;
goto out;
}

- /* Page is misplaced, serialise migrations and parallel THP splits */
+ /*
+ * Page is misplaced. Page lock serialises migrations. Acquire anon_vma
+ * to serialises splits
+ */
get_page(page);
spin_unlock(&mm->page_table_lock);
- if (!page_locked)
- lock_page(page);
anon_vma = page_lock_anon_vma_read(page);

/* Confirm the PTE did not while locked */
diff --git a/mm/migrate.c b/mm/migrate.c
index d22f6f0..b3ffb81 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1658,6 +1658,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
struct page *new_page = NULL;
struct mem_cgroup *memcg = NULL;
int page_lru = page_is_file_cache(page);
+ pmd_t orig_entry;

/*
* Don't migrate pages that are mapped in multiple processes.
@@ -1699,7 +1700,8 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,

/* Recheck the target PMD */
spin_lock(&mm->page_table_lock);
- if (unlikely(!pmd_same(*pmd, entry))) {
+ if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
+fail_putback:
spin_unlock(&mm->page_table_lock);

/* Reverse changes made by migrate_page_copy() */
@@ -1729,16 +1731,34 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
*/
mem_cgroup_prepare_migration(page, new_page, &memcg);

+ orig_entry = *pmd;
entry = mk_pmd(new_page, vma->vm_page_prot);
- entry = pmd_mknonnuma(entry);
- entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
entry = pmd_mkhuge(entry);
+ entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);

+ /*
+ * Clear the old entry under pagetable lock and establish the new PTE.
+ * Any parallel GUP will either observe the old page blocking on the
+ * page lock, block on the page table lock or observe the new page.
+ * The SetPageUptodate on the new page and page_add_new_anon_rmap
+ * guarantee the copy is visible before the pagetable update.
+ */
+ flush_cache_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+ page_add_new_anon_rmap(new_page, vma, haddr);
pmdp_clear_flush(vma, haddr, pmd);
set_pmd_at(mm, haddr, pmd, entry);
- page_add_new_anon_rmap(new_page, vma, haddr);
update_mmu_cache_pmd(vma, address, &entry);
+
+ if (page_count(page) != 2) {
+ set_pmd_at(mm, haddr, pmd, orig_entry);
+ flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+ update_mmu_cache_pmd(vma, address, &entry);
+ page_remove_rmap(new_page);
+ goto fail_putback;
+ }
+
page_remove_rmap(page);
+
/*
* Finish the charge transaction under the page table lock to
* prevent split_huge_page() from dividing up the charge
@@ -1763,9 +1783,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
out_fail:
count_vm_events(PGMIGRATE_FAIL, HPAGE_PMD_NR);
out_dropref:
- entry = pmd_mknonnuma(entry);
- set_pmd_at(mm, haddr, pmd, entry);
- update_mmu_cache_pmd(vma, address, &entry);
+ spin_lock(&mm->page_table_lock);
+ if (pmd_same(*pmd, entry)) {
+ entry = pmd_mknonnuma(entry);
+ set_pmd_at(mm, haddr, pmd, entry);
+ update_mmu_cache_pmd(vma, address, &entry);
+ }
+ spin_unlock(&mm->page_table_lock);

unlock_page(page);
put_page(page);

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 5a6dac3ec5f583cc8ee7bc53b5500a207c4ca433 upstream.

If the PMD is flushed then a parallel fault in handle_mm_fault() will
enter the pmd_none and do_huge_pmd_anonymous_page() path where it'll
attempt to insert a huge zero page. This is wasteful so the patch
avoids clearing the PMD when setting pmd_numa.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/huge_memory.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 72da10d..ac2546c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1486,20 +1486,22 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,

if (__pmd_trans_huge_lock(pmd, vma) == 1) {
pmd_t entry;
- entry = pmdp_get_and_clear(mm, addr, pmd);
if (!prot_numa) {
+ entry = pmdp_get_and_clear(mm, addr, pmd);
entry = pmd_modify(entry, newprot);
BUG_ON(pmd_write(entry));
+ set_pmd_at(mm, addr, pmd, entry);
} else {
struct page *page = pmd_page(*pmd);
+ entry = *pmd;

/* only check non-shared pages */
if (page_mapcount(page) == 1 &&
!pmd_numa(*pmd)) {
entry = pmd_mknuma(entry);
+ set_pmd_at(mm, addr, pmd, entry);
}
}
- set_pmd_at(mm, addr, pmd, entry);
spin_unlock(&vma->vm_mm->page_table_lock);
ret = 1;

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit de466bd628e8d663fdf3f791bc8db318ee85c714 upstream.

do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
include/linux/migrate.h | 10 +++++++++-
mm/huge_memory.c | 22 ++++++++++++++++------
mm/migrate.c | 12 ++++++++++++
3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index a405d3dc..a7ec0a0 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -92,10 +92,18 @@ static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
#endif /* CONFIG_MIGRATION */

#ifdef CONFIG_NUMA_BALANCING
-extern int migrate_misplaced_page(struct page *page, int node);
+extern bool pmd_trans_migrating(pmd_t pmd);
+extern void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd);
extern int migrate_misplaced_page(struct page *page, int node);
extern bool migrate_ratelimited(int node);
#else
+static inline bool pmd_trans_migrating(pmd_t pmd)
+{
+ return false;
+}
+static inline void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd)
+{
+}
static inline int migrate_misplaced_page(struct page *page, int node)
{
return -EAGAIN; /* can't migrate now */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1385bf9..31950d6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -899,6 +899,10 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
ret = 0;
goto out_unlock;
}
+
+ /* mmap_sem prevents this happening but warn if that changes */
+ WARN_ON(pmd_trans_migrating(pmd));
+
if (unlikely(pmd_trans_splitting(pmd))) {
/* split huge page running from under us */
spin_unlock(&src_mm->page_table_lock);
@@ -1306,6 +1310,17 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
if (unlikely(!pmd_same(pmd, *pmdp)))
goto out_unlock;

+ /*
+ * If there are potential migrations, wait for completion and retry
+ * without disrupting NUMA hinting information. Do not relock and
+ * check_same as the page may no longer be mapped.
+ */
+ if (unlikely(pmd_trans_migrating(*pmdp))) {
+ spin_unlock(&mm->page_table_lock);
+ wait_migrate_huge_page(vma->anon_vma, pmdp);
+ goto out;
+ }
+
page = pmd_page(pmd);
page_nid = page_to_nid(page);
count_vm_numa_event(NUMA_HINT_FAULTS);
@@ -1324,12 +1339,7 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
goto clear_pmdnuma;
}

- /*
- * If there are potential migrations, wait for completion and retry. We
- * do not relock and check_same as the page may no longer be mapped.
- * Furtermore, even if the page is currently misplaced, there is no
- * guarantee it is still misplaced after the migration completes.
- */
+ /* Migration could have started since the pmd_trans_migrating check */
if (!page_locked) {
spin_unlock(&mm->page_table_lock);
wait_on_page_locked(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index cde4b34..c033a04 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1594,6 +1594,18 @@ int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
return 1;
}

+bool pmd_trans_migrating(pmd_t pmd)
+{
+ struct page *page = pmd_page(pmd);
+ return PageLocked(page);
+}
+
+void wait_migrate_huge_page(struct anon_vma *anon_vma, pmd_t *pmd)
+{
+ struct page *page = pmd_page(*pmd);
+ wait_on_page_locked(page);
+}
+
/*
* Attempt to migrate a misplaced page to the specified destination
* node. Caller is expected to have an elevated reference count on

Luis Henriques

Jan 13, 2014, 11:30:04 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit 1874119664dafda3ef2ed9b51b4759a9540d4a1a upstream.

We've tried to fix the error paths in this function before, but there
is still a hidden goto in the ceph_decode_need() macro which goes to the
wrong place. We need to release the "req" and unlock a mutex before
returning.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Reviewed-by: Sage Weil <sa...@inktank.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
net/ceph/osd_client.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index dbc0a73..559a832 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1488,14 +1488,14 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg,
dout("handle_reply %p tid %llu req %p result %d\n", msg, tid,
req, result);

- ceph_decode_need(&p, end, 4, bad);
+ ceph_decode_need(&p, end, 4, bad_put);
numops = ceph_decode_32(&p);
if (numops > CEPH_OSD_MAX_OP)
goto bad_put;
if (numops != req->r_num_ops)
goto bad_put;
payload_len = 0;
- ceph_decode_need(&p, end, numops * sizeof(struct ceph_osd_op), bad);
+ ceph_decode_need(&p, end, numops * sizeof(struct ceph_osd_op), bad_put);
for (i = 0; i < numops; i++) {
struct ceph_osd_op *op = p;
int len;
@@ -1513,7 +1513,7 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg,
goto bad_put;
}

- ceph_decode_need(&p, end, 4 + numops * 4, bad);
+ ceph_decode_need(&p, end, 4 + numops * 4, bad_put);
retry_attempt = ceph_decode_32(&p);
for (i = 0; i < numops; i++)
req->r_reply_op_result[i] = ceph_decode_32(&p);
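
The hazard of a goto hidden inside a macro can be sketched in plain C. This is a simplified illustration, not the ceph code: decode_need_sketch, struct sketch_request, and the reference counter are hypothetical stand-ins, and the caller is assumed to pass p <= end.

```c
#include <stddef.h>

/* Hypothetical bounds check with a hidden goto, in the spirit of
 * ceph_decode_need(): jump to `label` if fewer than n bytes remain. */
#define decode_need_sketch(p, end, n, label)            \
    do {                                                \
        if ((size_t)((end) - (p)) < (size_t)(n))        \
            goto label;                                 \
    } while (0)

struct sketch_request {
    int refcount;
};

/* Returns 0 on success, -1 on short input. In both cases the reference
 * taken at the top is dropped before returning. */
static int handle_reply_sketch(struct sketch_request *req,
                               const char *p, const char *end)
{
    req->refcount++;                        /* take a reference */

    /* The label argument decides which cleanup path a short buffer takes. */
    decode_need_sketch(p, end, 4, bad_put);
    p += 4;                                 /* consume the 4-byte field */

    req->refcount--;                        /* normal release */
    return 0;

bad_put:
    req->refcount--;                        /* release on the error path too */
    return -1;
}
```

Had the macro been given a label that skips the release, the short-input case would return with the reference still held, which is the class of bug the patch addresses.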

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Segall <bse...@google.com>

commit 927b54fccbf04207ec92f669dce6806848cbec7d upstream.

__start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock,
waiting for the hrtimer to finish. However, if sched_cfs_period_timer
runs for another loop iteration, the hrtimer can attempt to take
rq->lock, resulting in deadlock.

Fix this by ensuring that cfs_b->timer_active is cleared only if the
_latest_ call to do_sched_cfs_period_timer is returning as idle. Then
__start_cfs_bandwidth can just call hrtimer_try_to_cancel and wait for
that to succeed or timer_active == 1.

Signed-off-by: Ben Segall <bse...@google.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Cc: p...@google.com
Link: http://lkml.kernel.org/r/20131016181622.2...@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Cc: Chris J Arges <chris....@canonical.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
kernel/sched/fair.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 260f98ec..5c83396 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2459,6 +2459,13 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun)
if (idle)
goto out_unlock;

+ /*
+ * if we have relooped after returning idle once, we need to update our
+ * status as actually running, so that other cpus doing
+ * __start_cfs_bandwidth will stop trying to cancel us.
+ */
+ cfs_b->timer_active = 1;
+
__refill_cfs_bandwidth_runtime(cfs_b);

if (!throttled) {
@@ -2727,11 +2734,11 @@ void __start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
* (timer_active==0 becomes visible before the hrtimer call-back
* terminates). In either case we ensure that it's re-programmed
*/
- while (unlikely(hrtimer_active(&cfs_b->period_timer))) {
+ while (unlikely(hrtimer_active(&cfs_b->period_timer)) &&
+ hrtimer_try_to_cancel(&cfs_b->period_timer) < 0) {
+ /* bounce the lock to allow do_sched_cfs_period_timer to run */
raw_spin_unlock(&cfs_b->lock);
- /* ensure cfs_b->lock is available while we wait */
- hrtimer_cancel(&cfs_b->period_timer);
-
+ cpu_relax();
raw_spin_lock(&cfs_b->lock);
/* if someone else restarted the timer then we're done */
if (cfs_b->timer_active)

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 3c67f474558748b604e247d92b55dfe89654c81d upstream.

Inaccessible VMAs should not be trapping NUMA hint faults. Skip them.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
kernel/sched/fair.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 66dae75..a07a479 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -974,6 +974,13 @@ void task_numa_work(struct callback_head *work)
if (vma->vm_end - vma->vm_start < HPAGE_SIZE)
continue;

+ /*
+ * Skip inaccessible VMAs to avoid any confusion between
+ * PROT_NONE and NUMA hinting ptes
+ */
+ if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
+ continue;
+
do {
start = max(start, vma->vm_start);
end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
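
The test added above can be tried out in isolation. The following is a minimal sketch that assumes stand-in flag constants (the SK_ names are hypothetical and local to this example); a mapping counts as inaccessible when none of the read, write, or execute bits is set.

```c
#include <stdbool.h>

/* Stand-in permission bits for this sketch only. */
#define SK_VM_READ   0x1UL
#define SK_VM_WRITE  0x2UL
#define SK_VM_EXEC   0x4UL

/* True when the mapping grants no read, write, or execute access. */
static bool vma_is_inaccessible_sketch(unsigned long vm_flags)
{
    return !(vm_flags & (SK_VM_READ | SK_VM_EXEC | SK_VM_WRITE));
}
```

Unrelated bits in the flag word do not make a mapping accessible; only the three permission bits are consulted.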

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Vladimir Davydov <vdav...@parallels.com>

commit 695c60830764945cf61a2cc623eb1392d137223e upstream.

The mem_cgroup structure contains nr_node_ids pointers to
mem_cgroup_per_node objects, not the objects themselves.

Signed-off-by: Vladimir Davydov <vdav...@parallels.com>
Acked-by: Michal Hocko <mho...@suse.cz>
Cc: Glauber Costa <glo...@openvz.org>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Balbir Singh <bsing...@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezaw...@jp.fujitsu.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index aa44621..97f2550 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -349,7 +349,7 @@ struct mem_cgroup {
static size_t memcg_size(void)
{
return sizeof(struct mem_cgroup) +
- nr_node_ids * sizeof(struct mem_cgroup_per_node);
+ nr_node_ids * sizeof(struct mem_cgroup_per_node *);
}

/* internal only representation about the status of kmem accounting. */
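
The difference between the two size computations can be shown with a small sketch. The structures below are hypothetical and far smaller than the kernel's; the only assumption carried over from the patch is a structure that ends in an array of pointers, one per node.

```c
#include <stddef.h>

/* Hypothetical per-node object; the real one holds much more state. */
struct sketch_per_node {
    long counters[64];
};

/* Hypothetical group structure ending in an array of per-node pointers. */
struct sketch_group {
    int id;
    struct sketch_per_node *nodeinfo[];  /* nr_nodes pointers follow */
};

/* Old computation: sizes the tail as if it held whole objects. */
static size_t group_size_objects(size_t nr_nodes)
{
    return sizeof(struct sketch_group) +
           nr_nodes * sizeof(struct sketch_per_node);
}

/* Fixed computation: the tail holds pointers only. */
static size_t group_size_pointers(size_t nr_nodes)
{
    return sizeof(struct sketch_group) +
           nr_nodes * sizeof(struct sketch_per_node *);
}
```

In this sketch the old formula over-allocates rather than under-allocates, since each per-node object is larger than a pointer.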

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: majianpeng <majia...@gmail.com>

commit 2fbcbff1d6b9243ef71c64a8ab993bc3c7bb7af1 upstream.

The function ceph_calc_ceph_pg() may fail, so add a check for its return value.

Signed-off-by: Jianpeng Ma <majia...@gmail.com>
Reviewed-by: Sage Weil <sa...@inktank.com>
Signed-off-by: Sage Weil <sa...@inktank.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ceph/ioctl.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c
index a5ce62e..669622f 100644
--- a/fs/ceph/ioctl.c
+++ b/fs/ceph/ioctl.c
@@ -211,8 +211,12 @@ static long ceph_ioctl_get_dataloc(struct file *file, void __user *arg)
snprintf(dl.object_name, sizeof(dl.object_name), "%llx.%08llx",
ceph_ino(inode), dl.object_no);

- ceph_calc_ceph_pg(&pgid, dl.object_name, osdc->osdmap,
- ceph_file_layout_pg_pool(ci->i_layout));
+ r = ceph_calc_ceph_pg(&pgid, dl.object_name, osdc->osdmap,
+ ceph_file_layout_pg_pool(ci->i_layout));
+ if (r < 0) {
+ up_read(&osdc->map_sem);
+ return r;
+ }

dl.osd = ceph_calc_pg_primary(osdc->osdmap, pgid);
if (dl.osd >= 0) {
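
The pattern in this hunk, checking a return value and releasing the read lock before bailing out, can be sketched in user space. Everything here is a hypothetical stand-in: the lock is reduced to a reader counter, and calc_pg_sketch() with its error code only mimics a helper that can fail.

```c
/* Toy read lock: just counts active readers. */
struct sketch_sem {
    int readers;
};

static void down_read_sketch(struct sketch_sem *s) { s->readers++; }
static void up_read_sketch(struct sketch_sem *s)   { s->readers--; }

/* Hypothetical helper that can fail; a negative pool is treated as unknown. */
static int calc_pg_sketch(int pool, int *pgid)
{
    if (pool < 0)
        return -2;          /* illustrative error code */
    *pgid = pool * 10;
    return 0;
}

/* Looks up a placement id under the read lock. The lock is released on
 * every return path, including the early error return. */
static int get_dataloc_sketch(struct sketch_sem *sem, int pool, int *pgid)
{
    int r;

    down_read_sketch(sem);
    r = calc_pg_sketch(pool, pgid);
    if (r < 0) {
        up_read_sketch(sem);    /* do not leak the read lock */
        return r;
    }
    /* ... further lookups would happen here, still under the lock ... */
    up_read_sketch(sem);
    return 0;
}
```

Ignoring the return value, as the code did before the patch, would let the later lookups run on an output that was never filled in.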

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgo...@suse.de>

commit 1667918b6483b12a6496bf54151b827b8235d7b1 upstream.

On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.

Signed-off-by: Mel Gorman <mgo...@suse.de>
Reviewed-by: Rik van Riel <ri...@redhat.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
mm/huge_memory.c | 2 ++
mm/mprotect.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0db1517..1385bf9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1495,6 +1495,8 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
pmd_t entry;
if (!prot_numa) {
entry = pmdp_get_and_clear(mm, addr, pmd);
+ if (pmd_numa(entry))
+ entry = pmd_mknonnuma(entry);
entry = pmd_modify(entry, newprot);
BUG_ON(pmd_write(entry));
set_pmd_at(mm, addr, pmd, entry);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 3277121..00edb75 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -56,6 +56,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,

if (!prot_numa) {
ptent = ptep_modify_prot_start(mm, addr, pte);
+ if (pte_numa(ptent))
+ ptent = pte_mknonnuma(ptent);
ptent = pte_modify(ptent, newprot);
updated = true;
} else {
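
As a rough illustration of clearing the hinting bit before applying new protections, consider the sketch below. The bit layout is invented for this example (SK_PTE_NUMA and SK_PROT_MASK are hypothetical) and does not match any real page table format.

```c
/* Invented layout: low three bits are protections, one bit marks a
 * pending NUMA hinting fault. */
#define SK_PROT_MASK  0x7UL
#define SK_PTE_NUMA   0x100UL

/* Apply new protection bits to an entry, dropping any NUMA hinting mark
 * first so the result reflects only the requested protections. */
static unsigned long pte_change_prot_sketch(unsigned long pte,
                                            unsigned long newprot)
{
    if (pte & SK_PTE_NUMA)
        pte &= ~SK_PTE_NUMA;
    return (pte & ~SK_PROT_MASK) | (newprot & SK_PROT_MASK);
}
```

Bits outside the protection mask and the hinting mark are carried over unchanged.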

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Paul Turner <p...@google.com>

commit 0ac9b1c21874d2490331233b3242085f8151e166 upstream.

Currently, group entity load-weights are initialized to zero. This
admits some races with respect to the first time they are re-weighted in
early use. ( Let g[x] denote the se for "g" on cpu "x". )

Suppose that we have root->a and that a enters a throttled state,
immediately followed by a[0]->t1 (the only task running on cpu[0])
blocking:

put_prev_task(group_cfs_rq(a[0]), t1)
put_prev_entity(..., t1)
check_cfs_rq_runtime(group_cfs_rq(a[0]))
throttle_cfs_rq(group_cfs_rq(a[0]))

Then, before unthrottling occurs, let a[0]->b[0]->t2 wake for the first
time:

enqueue_task_fair(rq[0], t2)
enqueue_entity(group_cfs_rq(b[0]), t2)
enqueue_entity_load_avg(group_cfs_rq(b[0]), t2)
account_entity_enqueue(group_cfs_rq(b[0]), t2)
update_cfs_shares(group_cfs_rq(b[0]))
< skipped because b is part of a throttled hierarchy >
enqueue_entity(group_cfs_rq(a[0]), b[0])
...

We now have b[0] enqueued, yet group_cfs_rq(a[0])->load.weight == 0
which violates invariants in several code-paths. Eliminate the
possibility of this by initializing group entity weight.

Signed-off-by: Paul Turner <p...@google.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/20131016181627.2...@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Cc: Chris J Arges <chris....@canonical.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5c83396..b051edd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6131,7 +6131,8 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
se->cfs_rq = parent->my_q;

se->my_q = cfs_rq;
- update_load_set(&se->load, 0);
+ /* guarantee group entities always have weight */
+ update_load_set(&se->load, NICE_0_LOAD);
se->parent = parent;

Luis Henriques

Jan 13, 2014, 11:30:05 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Segall <bse...@google.com>

commit db06e78cc13d70f10877e0557becc88ab3ad2be8 upstream.

hrtimer_expires_remaining does not take internal hrtimer locks and thus
must be guarded against concurrent __hrtimer_start_range_ns (but
returning HRTIMER_RESTART is safe). Use cfs_b->lock to make it safe.

Signed-off-by: Ben Segall <bse...@google.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Cc: p...@google.com
Link: http://lkml.kernel.org/r/20131016181617.2...@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Cc: Chris J Arges <chris....@canonical.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
kernel/sched/fair.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 78a904d..260f98ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2519,7 +2519,13 @@ static const u64 min_bandwidth_expiration = 2 * NSEC_PER_MSEC;
/* how long we wait to gather additional slack before distributing */
static const u64 cfs_bandwidth_slack_period = 5 * NSEC_PER_MSEC;

-/* are we near the end of the current quota period? */
+/*
+ * Are we near the end of the current quota period?
+ *
+ * Requires cfs_b->lock for hrtimer_expires_remaining to be safe against the
+ * hrtimer base being cleared by __hrtimer_start_range_ns. In the case of
+ * migrate_hrtimers, base is never cleared, so we are fine.
+ */
static int runtime_refresh_within(struct cfs_bandwidth *cfs_b, u64 min_expire)
{
struct hrtimer *refresh_timer = &cfs_b->period_timer;
@@ -2595,10 +2601,12 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b)
u64 expires;

/* confirm we're still not at a refresh boundary */
- if (runtime_refresh_within(cfs_b, min_bandwidth_expiration))
+ raw_spin_lock(&cfs_b->lock);
+ if (runtime_refresh_within(cfs_b, min_bandwidth_expiration)) {
+ raw_spin_unlock(&cfs_b->lock);
return;
+ }

- raw_spin_lock(&cfs_b->lock);
if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice) {
runtime = cfs_b->runtime;
cfs_b->runtime = 0;

Luis Henriques

Jan 13, 2014, 11:30:06 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Jiang Liu <jian...@huawei.com>

commit 6db83cea1c975b9a102e17def7d2795814e1ae2b upstream.

If a context switch happens while fpsimd_flush_thread() is executing,
stale values in the FPSIMD registers will be saved into the current
thread's fpsimd_state by fpsimd_thread_switch(). That may leave the new
process with an invalid initial state, so disable preemption while
executing fpsimd_flush_thread().

Signed-off-by: Jiang Liu <jian...@huawei.com>
Cc: Jiang Liu <liu...@gmail.com>
Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Mark Brown <bro...@kernel.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/arm64/kernel/fpsimd.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index e8b8357..2fa308e 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -79,8 +79,10 @@ void fpsimd_thread_switch(struct task_struct *next)

void fpsimd_flush_thread(void)
{
+ preempt_disable();
memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
fpsimd_load_state(&current->thread.fpsimd_state);
+ preempt_enable();
}

/*

Luis Henriques

Jan 13, 2014, 11:30:07 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Nobuhiro Iwamatsu <nobuhiro.i...@renesas.com>

commit ad70b029d2c678386384bd72c7fa2705c449b518 upstream.

min_low_pfn and max_low_pfn are used by the pfn_valid macro when
CONFIG_FLATMEM is defined. When a function that uses pfn_valid is used in
a driver module, max_low_pfn and min_low_pfn are undefined and the build
fails.

ERROR: "min_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
ERROR: "max_low_pfn" [drivers/block/aoe/aoe.ko] undefined!
make[2]: *** [__modpost] Error 1
make[1]: *** [modules] Error 2

This patch fixes the problem by exporting both symbols.

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.i...@renesas.com>
Cc: Kuninori Morimoto <kuninori.m...@gmail.com>
Cc: Paul Mundt <let...@linux-sh.org>
Cc: Geert Uytterhoeven <ge...@linux-m68k.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/sh/kernel/sh_ksyms_32.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/sh/kernel/sh_ksyms_32.c b/arch/sh/kernel/sh_ksyms_32.c
index 2a0a596..d77f2f6 100644
--- a/arch/sh/kernel/sh_ksyms_32.c
+++ b/arch/sh/kernel/sh_ksyms_32.c
@@ -20,6 +20,11 @@ EXPORT_SYMBOL(csum_partial_copy_generic);
EXPORT_SYMBOL(copy_page);
EXPORT_SYMBOL(__clear_user);
EXPORT_SYMBOL(empty_zero_page);
+#ifdef CONFIG_FLATMEM
+/* need in pfn_valid macro */
+EXPORT_SYMBOL(min_low_pfn);
+EXPORT_SYMBOL(max_low_pfn);
+#endif

#define DECLARE_EXPORT(name) \
extern void name(void);EXPORT_SYMBOL(name)

Luis Henriques

Jan 13, 2014, 11:30:06 AM1/13/14
to
(adding back missing CC:)

On Mon, Jan 13, 2014 at 06:10:22PM +0200, Timo Teras wrote:
> On Mon, 13 Jan 2014 16:00:21 +0000
> Luis Henriques <luis.he...@canonical.com> wrote:
>
> > 3.11.10.3 -stable review patch. If anyone has any objections, please
> > let me know.
>
> Does it build? Seems you have missed the hunk to the headers that
> implement skb_pop_mac_header().

Well, it does build because patch:

[PATCH 3.11 178/208] ipv6: fix illegal mac_header comparison on 32bit

actually added that hunk. This is a fix that is stable-specific (not in
mainline). Also, I've used David's stable patches for 3.10 as a base for
this patch.

Maybe this was a mistake as skb_pop_mac_header() doesn't seem to be
required by patch 178 in this series...?

Cheers,
--
Luis

> >
> > ------------------
> >
> > From: Timo Teräs <timo....@iki.fi>
> >
> > commit 0e3da5bb8da45890b1dc413404e0f978ab71173e upstream.
> >
> > ipgre_header_parse() needs to parse the tunnel's ip header and it
> > uses mac_header to locate the iphdr. This got broken when gre
> > tunneling was refactored as mac_header is no longer updated to point
> > to iphdr. Introduce skb_pop_mac_header() helper to do the mac_header
> > assignment and use it in ipgre_rcv() to fix msg_name parsing.
> >
> > Bug introduced in commit c54419321455 (GRE: Refactor GRE tunneling
> > code.)
> >
> > Cc: Pravin B Shelar <psh...@nicira.com>
> > Signed-off-by: Timo Teräs <timo....@iki.fi>
> > Signed-off-by: David S. Miller <da...@davemloft.net>
> > Signed-off-by: Luis Henriques <luis.he...@canonical.com>
> > ---
> > net/ipv4/ip_gre.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> > index 8d6939e..2977400 100644
> > --- a/net/ipv4/ip_gre.c
> > +++ b/net/ipv4/ip_gre.c
> > @@ -217,6 +217,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi)
> > iph->saddr, iph->daddr, tpi->key);
> >
> > if (tunnel) {
> > + skb_pop_mac_header(skb);
> > ip_tunnel_rcv(tunnel, skb, tpi, log_ecn_error);
> > return PACKET_RCVD;

Luis Henriques

Jan 13, 2014, 11:30:07 AM1/13/14
to
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 6962d914f317b119e0db7189199b21ec77a4b3e0 upstream.

We've got regression reports that my previous fix for spurious wakeups
after S5 on HP Haswell machines leads to an automatic reboot at
shutdown on some machines. It turned out that the fix for one set of
machines triggers another BIOS bug on others, so the two cases are
mutually exclusive.

Since the original S5 wakeups have been confirmed only on HP machines,
it'd be safer to apply it only to limited machines. As a wild guess,
limiting to machines with HP PCI SSID should suffice.

This patch should be backported to kernels as old as 3.12, that
contain the commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 "xhci: Fix
spurious wakeups after S5 on Haswell".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Sarah Sharp <sarah....@linux.intel.com>
Tested-by: <dashin...@gmail.com>
Reported-by: Niklas Schnelle <nik...@komani.de>
Reported-by: Giorgos <ganast...@gmail.com>
Reported-by: <ar...@vhex.net>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/usb/host/xhci-pci.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index c79e3b9..32ef829 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -124,7 +124,12 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
* any other sleep) on Haswell machines with LPT and LPT-LP
* with the new Intel BIOS
*/
- xhci->quirks |= XHCI_SPURIOUS_WAKEUP;
+ /* Limit the quirk to only known vendors, as this triggers
+ * yet another BIOS bug on some other machines
+ * https://bugzilla.kernel.org/show_bug.cgi?id=66171
+ */
+ if (pdev->subsystem_vendor == PCI_VENDOR_ID_HP)
+ xhci->quirks |= XHCI_SPURIOUS_WAKEUP;
}
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
pdev->device == PCI_DEVICE_ID_ASROCK_P67) {
--
1.8.3.2
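
The gating described in the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions, not the driver code: the struct, the is_haswell_lpt flag, the helper name and the quirk bit value are hypothetical. 0x103c is the PCI vendor ID assigned to Hewlett-Packard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_ID_HP          0x103cu    /* PCI vendor ID of Hewlett-Packard */
#define QUIRK_SPURIOUS_WAKEUP (1u << 0)  /* illustrative bit, not the kernel's value */

/* Hypothetical stand-in for the few struct pci_dev fields used here. */
struct fake_pci_dev {
    uint16_t subsystem_vendor;
    bool     is_haswell_lpt;   /* assumed result of the controller match */
};

/* Return the quirk bits to set: the wakeup quirk only on HP-branded boards. */
static uint32_t wakeup_quirks(const struct fake_pci_dev *pdev)
{
    assert(pdev != NULL);
    if (pdev->is_haswell_lpt && pdev->subsystem_vendor == VENDOR_ID_HP)
        return QUIRK_SPURIOUS_WAKEUP;
    return 0;
}
```

Matching on the subsystem vendor rather than the controller's own vendor ID is what narrows the quirk to one board maker, since the controller itself is the same Intel part on every affected machine.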

Luis Henriques

Jan 13, 2014, 11:30:07 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Josh Durgin <josh....@inktank.com>

commit 20e0af67ce88c657d0601977b9941a2256afbdaa upstream.

The only user of rbd_obj_notify_ack() is rbd_watch_cb(). It is used
asynchronously with no tracking of when the notify ack completes, so
it may still be in progress when the osd_client is shut down. This
results in a BUG() since the osd client assumes no requests are in
flight when it stops. Since all notifies are flushed before the
osd_client is stopped, waiting for the notify ack to complete before
returning from the watch callback ensures there are no notify acks in
flight during shutdown.

Rename rbd_obj_notify_ack() to rbd_obj_notify_ack_sync() to reflect
its new synchronous nature.

Signed-off-by: Josh Durgin <josh....@inktank.com>
Reviewed-by: Alex Elder <el...@linaro.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/block/rbd.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 16f6966..d88a27d 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2808,7 +2808,7 @@ out_err:
obj_request_done_set(obj_request);
}

-static int rbd_obj_notify_ack(struct rbd_device *rbd_dev, u64 notify_id)
+static int rbd_obj_notify_ack_sync(struct rbd_device *rbd_dev, u64 notify_id)
{
struct rbd_obj_request *obj_request;
struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
@@ -2823,16 +2823,17 @@ static int rbd_obj_notify_ack(struct rbd_device *rbd_dev, u64 notify_id)
obj_request->osd_req = rbd_osd_req_create(rbd_dev, false, obj_request);
if (!obj_request->osd_req)
goto out;
- obj_request->callback = rbd_obj_request_put;

osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_NOTIFY_ACK,
notify_id, 0, 0);
rbd_osd_req_format_read(obj_request);

ret = rbd_obj_request_submit(osdc, obj_request);
-out:
if (ret)
- rbd_obj_request_put(obj_request);
+ goto out;
+ ret = rbd_obj_request_wait(obj_request);
+out:
+ rbd_obj_request_put(obj_request);

return ret;
}
@@ -2852,7 +2853,7 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data)
if (ret)
rbd_warn(rbd_dev, "header refresh error (%d)\n", ret);

- rbd_obj_notify_ack(rbd_dev, notify_id);
+ rbd_obj_notify_ack_sync(rbd_dev, notify_id);
}

/*
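
A toy, single-threaded model of why the synchronous ack matters at shutdown. Everything here is hypothetical (the counter-only client, the function names), and the "wait" is simulated by completing the request inline; it is meant only to show the invariant that nothing may be in flight when the client stops.

```c
#include <assert.h>

/* Hypothetical stand-in for the OSD client: it only counts open requests. */
struct fake_osdc { int in_flight; };

static void submit(struct fake_osdc *c)
{
    c->in_flight++;
}

static void complete(struct fake_osdc *c)
{
    assert(c->in_flight > 0);
    c->in_flight--;
}

/* Fire-and-forget ack: returns while the request is still outstanding. */
static void notify_ack_async(struct fake_osdc *c)
{
    submit(c);
}

/* Synchronous ack: does not return until the request has completed. */
static void notify_ack_sync(struct fake_osdc *c)
{
    submit(c);
    complete(c);   /* stands in for blocking until the reply arrives */
}

/* Shutdown is only safe when nothing is in flight. */
static int can_stop(const struct fake_osdc *c)
{
    return c->in_flight == 0;
}
```

With the asynchronous variant, can_stop() may be false right after the callback returns; with the synchronous one it cannot, which is the property the patch relies on.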

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Li Wang <liw...@ubuntukylin.com>

commit 56f91aad69444d650237295f68c195b74d888d95 upstream.

If the length of data to be read in readpage() is exactly
PAGE_CACHE_SIZE, the original code does not flush d-cache
for data consistency after finishing reading. This patch fixes
this.

Signed-off-by: Li Wang <liw...@ubuntukylin.com>
Signed-off-by: Sage Weil <sa...@inktank.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ceph/addr.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 5318a3b..0d2e0f1 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -214,9 +214,13 @@ static int readpage_nounlock(struct file *filp, struct page *page)
if (err < 0) {
SetPageError(page);
goto out;
- } else if (err < PAGE_CACHE_SIZE) {
+ } else {
+ if (err < PAGE_CACHE_SIZE) {
/* zero fill remainder of page */
- zero_user_segment(page, err, PAGE_CACHE_SIZE);
+ zero_user_segment(page, err, PAGE_CACHE_SIZE);
+ } else {
+ flush_dcache_page(page);
+ }
}
SetPageUptodate(page);
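
The three-way decision the patch introduces can be modelled in user space like this. It is a sketch under assumptions: the 4096-byte page size, the enum and the function name are made up for illustration, and the cache flush is represented only by a return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096   /* assumed page size for the sketch */

enum page_action { PAGE_ERROR, PAGE_ZERO_TAIL, PAGE_FLUSH };

/*
 * Decide what to do after a read that returned `err` bytes (or a negative
 * error) into a page buffer. A short read gets its tail zeroed; a read of
 * exactly one page is reported as needing a cache flush.
 */
static enum page_action finish_read(unsigned char *page, int err)
{
    assert(page != NULL && err <= PAGE_SIZE_BYTES);
    if (err < 0)
        return PAGE_ERROR;
    if (err < PAGE_SIZE_BYTES) {
        memset(page + err, 0, (size_t)(PAGE_SIZE_BYTES - err));
        return PAGE_ZERO_TAIL;
    }
    return PAGE_FLUSH;   /* the kernel calls flush_dcache_page() here */
}
```

The full-page case is the one the old `else if` chain fell through without handling.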

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Catalin Marinas <catalin...@arm.com>

commit f3a1d7d53dccf51959aec16b574617cc6bfeca09 upstream.

This string has been moved to arch/arm64/kernel/cputable.c.

Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Mark Brown <bro...@kernel.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/arm64/mm/proc.S | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index a82ae88..f84fcf7 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -95,10 +95,6 @@ ENTRY(cpu_do_switch_mm)
ret
ENDPROC(cpu_do_switch_mm)

-cpu_name:
- .ascii "AArch64 Processor"
- .align
-
.section ".text.init", #alloc, #execinstr

/*

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Rob Herring <ro...@kernel.org>

commit 13fcca8f25f4e9ce7f55da9cd353bb743236e212 upstream.

This reverts commit e38c0a1fbc5803cbacdaac0557c70ac8ca5152e7.

Nikita Yushchenko reports:
While trying to make freescale p2020ds and mpc8572ds boards working
with mainline kernel, I faced that commit e38c0a1f (Handle

Both these boards have uli1575 chip.
Corresponding part in device tree is something like

uli1575@0 {
reg = <0x0 0x0 0x0 0x0 0x0>;
#size-cells = <2>;
#address-cells = <3>;
ranges = <0x2000000 0x0 0x80000000
0x2000000 0x0 0x80000000
0x0 0x20000000

0x1000000 0x0 0x0
0x1000000 0x0 0x0
0x0 0x10000>;
isa@1e {
...

I.e. it has #address-cells = <3>

With commit e38c0a1f reverted, devices under uli1575 are registered
correctly, e.g. for rtc

OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 **
OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e
OF: translating address: 00000001 00000070
OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0
OF: walking ranges...
OF: ISA map, cp=0, s=1000, da=70
OF: parent translation for: 01000000 00000000 00000000
OF: with offset: 70
OF: one level translation: 00000000 00000000 00000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0
OF: walking ranges...
OF: default map, cp=a0000000, s=20000000, da=70
OF: default map, cp=0, s=10000, da=70
OF: parent translation for: 01000000 00000000 00000000
OF: with offset: 70
OF: one level translation: 01000000 00000000 00000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000
OF: walking ranges...
OF: PCI map, cp=0, s=10000, da=70
OF: parent translation for: 01000000 00000000 00000000
OF: with offset: 70
OF: one level translation: 01000000 00000000 00000070
OF: parent bus is default (na=2, ns=2) on /
OF: walking ranges...
OF: PCI map, cp=0, s=10000, da=70
OF: parent translation for: 00000000 ffc10000
OF: with offset: 70
OF: one level translation: 00000000 ffc10070
OF: reached root node

With commit e38c0a1f in place, address translation fails:

OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 **
OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e
OF: translating address: 00000001 00000070
OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0
OF: walking ranges...
OF: ISA map, cp=0, s=1000, da=70
OF: parent translation for: 01000000 00000000 00000000
OF: with offset: 70
OF: one level translation: 00000000 00000000 00000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0
OF: walking ranges...
OF: default map, cp=a0000000, s=20000000, da=70
OF: default map, cp=0, s=10000, da=70
OF: not found !

Thierry Reding confirmed this commit was not needed after all:
"We ended up merging a different address representation for Tegra PCIe
and I've confirmed that reverting this commit doesn't cause any obvious
regressions. I think all other drivers in drivers/pci/host ended up
copying what we did on Tegra, so I wouldn't expect any other breakage
either."

There doesn't appear to be a simple way to support both behaviours, so
reverting this as nothing should be depending on the new behaviour.

Signed-off-by: Rob Herring <ro...@kernel.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/of/address.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index b55c218..3c4b2af 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -69,14 +69,6 @@ static u64 of_bus_default_map(__be32 *addr, const __be32 *range,
(unsigned long long)cp, (unsigned long long)s,
(unsigned long long)da);

- /*
- * If the number of address cells is larger than 2 we assume the
- * mapping doesn't specify a physical address. Rather, the address
- * specifies an identifier that must match exactly.
- */
- if (na > 2 && memcmp(range, addr, na * 4) != 0)
- return OF_BAD_ADDR;
-
if (da < cp || da >= (cp + s))
return OF_BAD_ADDR;
return da - cp;
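
What remains in the default map after the revert is a plain range check. A standalone sketch, assuming the addresses have already been folded into 64-bit values (the function name and BAD_ADDR are stand-ins, not the kernel's symbols):

```c
#include <assert.h>
#include <stdint.h>

#define BAD_ADDR UINT64_MAX   /* stand-in for the kernel's OF_BAD_ADDR */

/*
 * Translate child address `da` through one range that starts at `cp` and
 * spans `s` bytes; return the offset into the range, or BAD_ADDR when `da`
 * lies outside it.
 */
static uint64_t default_map(uint64_t da, uint64_t cp, uint64_t s)
{
    assert(cp <= UINT64_MAX - s);   /* keep cp + s from wrapping */
    if (da < cp || da >= cp + s)
        return BAD_ADDR;
    return da - cp;
}
```

Fed the values from the log above, da=0x70 misses the range cp=0xa0000000, s=0x20000000 and matches cp=0, s=0x10000 at offset 0x70, which is the behaviour the reverted memcmp() check had been short-circuiting.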

Luis Henriques

Jan 13, 2014, 11:40:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Torokhov <dmitry....@gmail.com>

commit 28a2a2e1aedbe2d8b2301e6e0e4e63f6e4177aca upstream.

We need to make sure we allocate absinfo data when we are setting one of
EV_ABS/ABS_XXX capabilities, otherwise we may bomb when we try to emit this
event.

Tested-by: Paul Cercueil <pcer...@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/input/input.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/input/input.c b/drivers/input/input.c
index c044699..66984e2 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1866,6 +1866,10 @@ void input_set_capability(struct input_dev *dev, unsigned int type, unsigned int
break;

case EV_ABS:
+ input_alloc_absinfo(dev);
+ if (!dev->absinfo)
+ return;
+
__set_bit(code, dev->absbit);
break;
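
The allocate-before-advertise pattern from this patch, as a user-space sketch. The struct is a hypothetical, pared-down stand-in for struct input_dev, ABS_COUNT is an assumed size, and calloc() takes the place of the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ABS_COUNT 64   /* assumed number of absolute axes */

struct axis_info { int value, minimum, maximum; };

/* Hypothetical, pared-down stand-in for struct input_dev. */
struct fake_input_dev {
    bool              absbit[ABS_COUNT];
    struct axis_info *absinfo;   /* allocated lazily, NULL until needed */
};

/* Allocate the per-axis table once; leaves absinfo NULL on failure. */
static void alloc_absinfo(struct fake_input_dev *dev)
{
    if (!dev->absinfo)
        dev->absinfo = calloc(ABS_COUNT, sizeof(*dev->absinfo));
}

/* Advertise an absolute axis only if its backing storage exists. */
static void set_abs_capability(struct fake_input_dev *dev, unsigned int code)
{
    assert(code < ABS_COUNT);
    alloc_absinfo(dev);
    if (!dev->absinfo)
        return;
    dev->absbit[code] = true;
}
```

Because the capability bit is set only after the table exists, any later code that sees the bit can index absinfo without a NULL check.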

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathy Vanhoef <vanh...@gmail.com>

commit 657eb17d87852c42b55c4b06d5425baa08b2ddb3 upstream.

Pick the MAC address of the first virtual interface as the new hardware MAC
address. Set BSSID mask according to this MAC address. This fixes CVE-2013-4579.

Signed-off-by: Mathy Vanhoef <vanh...@gmail.com>
Signed-off-by: John W. Linville <linv...@tuxdriver.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/wireless/ath/ath9k/htc_drv_main.c | 25 +++++++++++++++++--------
drivers/net/wireless/ath/ath9k/main.c | 5 +++--
2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_main.c b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
index 5c1bec1..de02522 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_main.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
@@ -147,21 +147,26 @@ static void ath9k_htc_bssid_iter(void *data, u8 *mac, struct ieee80211_vif *vif)
struct ath9k_vif_iter_data *iter_data = data;
int i;

- for (i = 0; i < ETH_ALEN; i++)
- iter_data->mask[i] &= ~(iter_data->hw_macaddr[i] ^ mac[i]);
+ if (iter_data->hw_macaddr != NULL) {
+ for (i = 0; i < ETH_ALEN; i++)
+ iter_data->mask[i] &= ~(iter_data->hw_macaddr[i] ^ mac[i]);
+ } else {
+ iter_data->hw_macaddr = mac;
+ }
}

-static void ath9k_htc_set_bssid_mask(struct ath9k_htc_priv *priv,
+static void ath9k_htc_set_mac_bssid_mask(struct ath9k_htc_priv *priv,
struct ieee80211_vif *vif)
{
struct ath_common *common = ath9k_hw_common(priv->ah);
struct ath9k_vif_iter_data iter_data;

/*
- * Use the hardware MAC address as reference, the hardware uses it
- * together with the BSSID mask when matching addresses.
+ * Pick the MAC address of the first interface as the new hardware
+ * MAC address. The hardware will use it together with the BSSID mask
+ * when matching addresses.
*/
- iter_data.hw_macaddr = common->macaddr;
+ iter_data.hw_macaddr = NULL;
memset(&iter_data.mask, 0xff, ETH_ALEN);

if (vif)
@@ -173,6 +178,10 @@ static void ath9k_htc_set_bssid_mask(struct ath9k_htc_priv *priv,
ath9k_htc_bssid_iter, &iter_data);

memcpy(common->bssidmask, iter_data.mask, ETH_ALEN);
+
+ if (iter_data.hw_macaddr)
+ memcpy(common->macaddr, iter_data.hw_macaddr, ETH_ALEN);
+
ath_hw_setbssidmask(common);
}

@@ -1083,7 +1092,7 @@ static int ath9k_htc_add_interface(struct ieee80211_hw *hw,
goto out;
}

- ath9k_htc_set_bssid_mask(priv, vif);
+ ath9k_htc_set_mac_bssid_mask(priv, vif);

priv->vif_slot |= (1 << avp->index);
priv->nvifs++;
@@ -1148,7 +1157,7 @@ static void ath9k_htc_remove_interface(struct ieee80211_hw *hw,

ath9k_htc_set_opmode(priv);

- ath9k_htc_set_bssid_mask(priv, vif);
+ ath9k_htc_set_mac_bssid_mask(priv, vif);

/*
* Stop ANI only if there are no associated station interfaces.
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 5697c7a..76e6cd4 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -888,8 +888,9 @@ void ath9k_calculate_iter_data(struct ieee80211_hw *hw,
struct ath_common *common = ath9k_hw_common(ah);

/*
- * Use the hardware MAC address as reference, the hardware uses it
- * together with the BSSID mask when matching addresses.
+ * Pick the MAC address of the first interface as the new hardware
+ * MAC address. The hardware will use it together with the BSSID mask
+ * when matching addresses.
*/
memset(iter_data, 0, sizeof(*iter_data));
memset(&iter_data->mask, 0xff, ETH_ALEN);
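
The mask computation that the iterator performs can be sketched outside the driver as below. The function name and the array-based interface are assumptions made for the example; the driver walks its interfaces through a mac80211 iterator instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/*
 * Compute a BSSID mask over `n` interface addresses. The first address is
 * taken as the reference (and copied to `hw_mac`); every bit in which a
 * later address differs from it is cleared in `mask`.
 */
static void calc_bssid_mask(const uint8_t macs[][MAC_LEN], size_t n,
                            uint8_t hw_mac[MAC_LEN], uint8_t mask[MAC_LEN])
{
    assert(n > 0);
    memset(mask, 0xff, MAC_LEN);
    memcpy(hw_mac, macs[0], MAC_LEN);
    for (size_t k = 1; k < n; k++)
        for (size_t i = 0; i < MAC_LEN; i++)
            mask[i] &= (uint8_t)~(hw_mac[i] ^ macs[k][i]);
}
```

For two addresses that differ only in the last byte (0x55 versus 0x57), the last mask byte comes out as 0xfd: the single differing bit is ignored when matching and all other bits must equal the reference address.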

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Oleg Nesterov <ol...@redhat.com>

commit c0c1439541f5305b57a83d599af32b74182933fe upstream.

selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace.
This looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong: spin_lock()->preempt_disable()
doesn't necessarily imply the rcu_read_lock() we need to access
->parent.

Reported-by: Evan McNabb <emc...@redhat.com>
Signed-off-by: Oleg Nesterov <ol...@redhat.com>
Signed-off-by: Paul Moore <pmo...@redhat.com>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
security/selinux/hooks.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 122b956..56f7621 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5558,11 +5558,11 @@ static int selinux_setprocattr(struct task_struct *p,
/* Check for ptracing, and update the task SID if ok.
Otherwise, leave SID unchanged and fail. */
ptsid = 0;
- task_lock(p);
+ rcu_read_lock();
tracer = ptrace_parent(p);
if (tracer)
ptsid = task_sid(tracer);
- task_unlock(p);
+ rcu_read_unlock();

if (tracer) {
error = avc_has_perm(ptsid, sid, SECCLASS_PROCESS,

Luis Henriques

Jan 13, 2014, 11:40:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Anton Blanchard <an...@samba.org>

commit 286e4f90a72c0b0621dde0294af6ed4b0baddabb upstream.

p_end is an 8 byte value embedded in the text section. This means it
is only 4 byte aligned when it should be 8 byte aligned. Fix this
by adding an explicit alignment.

This fixes an issue where POWER7 little endian builds with
CONFIG_RELOCATABLE=y fail to boot.

Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
arch/powerpc/kernel/head_64.S | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index b61363d..192a3f5 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -467,6 +467,7 @@ _STATIC(__after_prom_start)
mtctr r8
bctr

+.balign 8
p_end: .llong _end - _stext

4: /* Now copy the rest of the kernel up to _end */
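
The effect of the added .balign 8 is ordinary round-up arithmetic: PowerPC instructions are 4 bytes wide, so a label placed after code is only guaranteed a 4-byte boundary. A small sketch of the padding calculation (align_up is a hypothetical helper, not an assembler or kernel function):

```c
#include <assert.h>
#include <stdint.h>

/* Round `addr` up to the next multiple of `align` (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

An offset such as 0x1004 is moved to 0x1008, while an offset that is already a multiple of 8 is left where it is.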

Luis Henriques

Jan 13, 2014, 11:40:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Marc Kleine-Budde <m...@pengutronix.de>

commit 20fb4eb96fb0350d28fc4d7cbfd5506711079592 upstream.

This patch fixes a memory leak in pcan_usb_pro_init(). In patch

f14e224 net: can: peak_usb: Do not do dma on the stack

the struct pcan_usb_pro_fwinfo *fi and struct pcan_usb_pro_blinfo *bi were
converted from stack to dynamic allocation via kmalloc(). However the
corresponding kfree() was not introduced.

This patch adds the missing kfree().

Reported-by: Stephane Grosjean <s.gro...@peak-system.com>
Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
index 8ee9d15..263dd92 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
@@ -927,6 +927,9 @@ static int pcan_usb_pro_init(struct peak_usb_device *dev)
/* set LED in default state (end of init phase) */
pcan_usb_pro_set_led(dev, 0, 1);

+ kfree(bi);
+ kfree(fi);
+
return 0;

err_out:
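
The leak pattern and its fix, modelled with malloc()/free() in user space. All names are hypothetical and the allocation counter exists only so the sketch can be checked; the point is that temporary buffers must be released on the success path as well as on the error path.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* counts outstanding buffers, for the sketch only */

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p)
        live_allocs--;
    free(p);
}

/* Hypothetical init routine that uses two temporary heap buffers. */
static int device_init(int fail_query)
{
    char *fi = tracked_alloc(64);
    char *bi = tracked_alloc(64);
    int err = -1;

    if (!fi || !bi)
        goto err_out;
    if (fail_query)          /* simulated failure while querying the device */
        goto err_out;

    tracked_free(bi);        /* the frees that were missing before the fix */
    tracked_free(fi);
    return 0;

err_out:
    tracked_free(bi);
    tracked_free(fi);
    return err;
}
```

Without the two frees before `return 0`, every successful call would leave live_allocs at 2.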

Luis Henriques

Jan 13, 2014, 11:40:01 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <ty...@mit.edu>

commit f6c07cad081ba222d63623d913aafba5586c1d2c upstream.

If a handle runs out of space, we currently stop the kernel with a BUG
in jbd2_journal_dirty_metadata(). This makes it hard to figure out
what might be going on. So return an error of ENOSPC, so we can let
the file system layer figure out what is going on, to make it more
likely we can get useful debugging information. This should make it
easier to debug problems such as the one which was reported by:

https://bugzilla.kernel.org/show_bug.cgi?id=44731

The only two callers of this function are ext4_handle_dirty_metadata()
and ocfs2_journal_dirty(). The ocfs2 function will trigger a
BUG_ON(), which means there will be no change in behavior. The ext4
function will call ext4_error_inode() which will print the useful
debugging information and then handle the situation using ext4's error
handling mechanisms (i.e., which might mean halting the kernel or
remounting the file system read-only).

Also, since both file systems already call WARN_ON(), drop the WARN_ON
from jbd2_journal_dirty_metadata() to avoid two stack traces from
being displayed.

Signed-off-by: "Theodore Ts'o" <ty...@mit.edu>
Cc: ocfs2...@oss.oracle.com
Acked-by: Joel Becker <jl...@evilplan.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/jbd2/transaction.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 7aa9a32..b0b74e5 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1290,7 +1290,10 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
* once a transaction -bzzz
*/
jh->b_modified = 1;
- J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
+ if (handle->h_buffer_credits <= 0) {
+ ret = -ENOSPC;
+ goto out_unlock_bh;
+ }
handle->h_buffer_credits--;
}

@@ -1373,7 +1376,6 @@ out_unlock_bh:
jbd2_journal_put_journal_head(jh);
out:
JBUFFER_TRACE(jh, "exit");
- WARN_ON(ret); /* All errors are bugs, so dump the stack */
return ret;
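
The change from asserting to returning an error can be sketched as follows. The handle struct and function name are hypothetical simplifications; only the -ENOSPC convention is taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, pared-down journal handle. */
struct fake_handle { int buffer_credits; };

/*
 * Charge one credit for dirtying a buffer. When the handle has none left,
 * report -ENOSPC to the caller instead of stopping the program.
 */
static int charge_credit(struct fake_handle *h)
{
    if (h->buffer_credits <= 0)
        return -ENOSPC;
    h->buffer_credits--;
    return 0;
}
```

Returning the error leaves the policy decision (halt, remount read-only, log and continue) with the caller, which has the context to print something useful.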

Luis Henriques

Jan 13, 2014, 11:40:02 AM
3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit 688bac461ba3e9d221a879ab40b687f5d7b5b19c upstream.

We pass in a u64 value for "len" and then immediately truncate away the
upper 32 bits.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Reviewed-by: Sage Weil <sa...@inktank.com>
Reviewed-by: Alex Elder <alex....@linaro.org>
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
---
fs/ceph/file.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 2ddf061..93e53c8 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -313,9 +313,9 @@ static int striped_read(struct inode *inode,
{
struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
struct ceph_inode_info *ci = ceph_inode(inode);
- u64 pos, this_len;
+ u64 pos, this_len, left;
int io_align, page_align;
- int left, pages_left;
+ int pages_left;
int read;
struct page **page_pos;
int ret;
@@ -346,7 +346,7 @@ more:
ret = 0;
hit_stripe = this_len < left;
was_short = ret >= 0 && ret < this_len;
- dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read,
+ dout("striped_read %llu~%llu (read %u) got %d%s%s\n", pos, left, read,
ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : "");

if (ret > 0) {
@@ -378,7 +378,7 @@ more:
if (pos + left > inode->i_size)
left = inode->i_size - pos;

- dout("zero tail %d\n", left);
+ dout("zero tail %llu\n", left);
ceph_zero_page_vector_range(page_align + read, left,
pages);
read += left;
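
The truncation this patch removes is easy to demonstrate in isolation. The sketch below uses an unsigned 32-bit type so the result is well defined in standard C (the original variable was a signed int); both helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits, as storing a u64 length in a 32-bit field does. */
static uint32_t low32(uint64_t len)
{
    return (uint32_t)(len & 0xffffffffu);
}

/* True when `len` would not survive a round trip through a 32-bit field. */
static int loses_bits(uint64_t len)
{
    return (uint64_t)low32(len) != len;
}
```

A length of 4 GiB plus 4096 bytes collapses to 4096 once the upper half is dropped, so a large read would silently be treated as a small one.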