[ 143/143] floppy: dont write kernel-only members to FDRAWCMD ioctl output

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Matthew Daley <ma...@bugfuzz.com>

Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley <ma...@bugfuzz.com>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit 2145e15e0557a01b9195d1c7199a1b92cb9be81f)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/block/floppy.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 19d45e6..f959aad 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3162,7 +3162,12 @@ static inline int raw_cmd_copyout(int cmd, char __user *param,
int ret;

while (ptr) {
- COPYOUT(*ptr);
+ struct floppy_raw_cmd cmd = *ptr;
+ cmd.next = NULL;
+ cmd.kernel_data = NULL;
+ ret = copy_to_user((void __user *)param, &cmd, sizeof(cmd));
+ if (ret)
+ return -EFAULT;
param += sizeof(struct floppy_raw_cmd);
if ((ptr->flags & FD_RAW_READ) && ptr->buffer_length) {
if (ptr->length >= 0
--
1.7.12.2.21.g234cd45.dirty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Peter Hurley <pe...@hurleysoftware.com>

The tty atomic_write_lock does not provide an exclusion guarantee for
the tty driver if the termios settings are LECHO & !OPOST. And since
it is unexpected and not allowed to call TTY buffer helpers like
tty_insert_flip_string concurrently, this may lead to crashes when
concurrect writers call pty_write. In that case the following two
writers:
* the ECHOing from a workqueue and
* pty_write from the process
race and can overflow the corresponding TTY buffer like follows.

If we look into tty_insert_flip_string_fixed_flag, there is:
int space = __tty_buffer_request_room(port, goal, flags);
struct tty_buffer *tb = port->buf.tail;
...
memcpy(char_buf_ptr(tb, tb->used), chars, space);
...
tb->used += space;

so the race of the two can result in something like this:
A B
__tty_buffer_request_room
__tty_buffer_request_room
memcpy(buf(tb->used), ...)
tb->used += space;
memcpy(buf(tb->used), ...) ->BOOM

B's memcpy is past the tty_buffer due to the previous A's tb->used
increment.

Since the N_TTY line discipline input processing can output
concurrently with a tty write, obtain the N_TTY ldisc output_lock to
serialize echo output with normal tty writes. This ensures the tty
buffer helper tty_insert_flip_string is not called concurrently and
everything is fine.

Note that this is nicely reproducible by an ordinary user using
forkpty and some setup around that (raw termios + ECHO). And it is
present in kernels at least after commit
d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to
use the normal buffering logic) in 2.6.31-rc3.

js: add more info to the commit log
js: switch to bool
js: lock unconditionally
js: lock only the tty->ops->write call

References: CVE-2014-0196
Reported-and-tested-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Alan Cox <al...@lxorguk.ukuu.org.uk>
Cc: <sta...@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
(cherry picked from commit 4291086b1f081b869c6d79e5b7441633dc3ace00)
[wt: 2.6.32 has no n_tty_data, so output_lock is in tty, not tty->disc_data]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/char/n_tty.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index 2e50f4d..5269fa0 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1969,7 +1969,9 @@ static ssize_t n_tty_write(struct tty_struct *tty, struct file *file,
tty->ops->flush_chars(tty);
} else {
while (nr > 0) {
+ mutex_lock(&tty->output_lock);
c = tty->ops->write(tty, b, nr);
+ mutex_unlock(&tty->output_lock);
if (c < 0) {
retval = c;
goto break_out;

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Neil Horman <nho...@tuxdriver.com>

[ Upstream commit 5a0068deb611109c5ba77358be533f763f395ee4 ]

Recently grabbed this report:
https://bugzilla.redhat.com/show_bug.cgi?id=1005567

Of an issue in which the bonding driver, with an attached vlan encountered the
following errors when bond0 was taken down and back up:

dummy1: promiscuity touches roof, set promiscuity failed. promiscuity feature of
device might be broken.

The error occurs because, during __bond_release_one, if we release our last
slave, we take on a random mac address and issue a NETDEV_CHANGEADDR
notification. With an attached vlan, the vlan may see that the vlan and bond
mac address were in sync, but no longer are. This triggers a call to dev_uc_add
and dev_set_rx_mode, which enables IFF_PROMISC on the bond device. Then, when
we complete __bond_release_one, we use the current state of the bond flags to
determine if we should decrement the promiscuity of the releasing slave. But
since the bond changed promiscuity state during the release operation, we
incorrectly decrement the slave promisc count when it wasn't in promiscuous mode
to begin with, causing the above error

Fix is pretty simple, just cache the bonding flags at the start of the function
and use those when determining the need to set promiscuity.

This is also needed for the ALLMULTI flag

CC: Jay Vosburgh <fu...@us.ibm.com>
CC: Andy Gospodarek <an...@greyhouse.net>
CC: Mark Wu <wu...@linux.vnet.ibm.com>
CC: "David S. Miller" <da...@davemloft.net>
Reported-by: Mark Wu <wu...@linux.vnet.ibm.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/bonding/bond_main.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6ffbfb7..4f52101 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1794,6 +1794,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave, *oldcurrent;
struct sockaddr addr;
+ int old_flags = bond_dev->flags;

/* slave is not a slave or master is not master of this slave */
if (!(slave_dev->flags & IFF_SLAVE) ||
@@ -1929,12 +1930,18 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
* already taken care of above when we detached the slave
*/
if (!USES_PRIMARY(bond->params.mode)) {
- /* unset promiscuity level from slave */
- if (bond_dev->flags & IFF_PROMISC)
+ /* unset promiscuity level from slave
+ * NOTE: The NETDEV_CHANGEADDR call above may change the value
+ * of the IFF_PROMISC flag in the bond_dev, but we need the
+ * value of that flag before that change, as that was the value
+ * when this slave was attached, so we cache at the start of the
+ * function and use it here. Same goes for ALLMULTI below
+ */
+ if (old_flags & IFF_PROMISC)
dev_set_promiscuity(slave_dev, -1);

/* unset allmulti level from slave */
- if (bond_dev->flags & IFF_ALLMULTI)
+ if (old_flags & IFF_ALLMULTI)
dev_set_allmulti(slave_dev, -1);

/* flush master's mc_list from slave */

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torv...@linux-foundation.org>

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.

Reported-by: halfdog <m...@halfdog.net>
Tested-by: Borislav Petkov <b...@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy...@mail.gmail.com
Signed-off-by: H. Peter Anvin <h...@zytor.com>
(cherry picked from commit 26bef1318adc1b3a530ecc807ef99346db2aa8b0)
[wt: in 2.6.32, patch applies to arch/x86/include/asm/i387.h. There's
no static_cpu_has() so we use boot_cpu_has() like other kernels do
with gcc3.
]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/include/asm/i387.h | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 0b20bbb..cb42fad 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -242,12 +242,13 @@ clear_state:
/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
is pending. Clear the x87 state here by setting it to fixed
values. safe_address is a random variable that should be in L1 */
- alternative_input(
- GENERIC_NOP8 GENERIC_NOP2,
- "emms\n\t" /* clear stack tags */
- "fildl %[addr]", /* set F?P to defined value */
- X86_FEATURE_FXSAVE_LEAK,
- [addr] "m" (safe_address));
+ if (unlikely(boot_cpu_has(X86_FEATURE_FXSAVE_LEAK))) {
+ asm volatile(
+ "fnclex\n\t"
+ "emms\n\t"
+ "fildl %[addr]" /* set F?P to defined value */
+ : : [addr] "m" (safe_address));
+ }
end:
task_thread_info(tsk)->status &= ~TS_USEDFPU;

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

capable

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit ec0223ec48a90cb605244b45f7c62de856403729 ]

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
<------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
<------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
------------------ AUTH; COOKIE-ECHO ---------------->
<-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

The receiver MUST use the HMAC algorithm indicated in
the HMAC Identifier field. If this algorithm was not
specified by the receiver in the HMAC-ALGO parameter in
the INIT or INIT-ACK chunk during association setup, the
AUTH chunk and all the chunks after it MUST be discarded
and an ERROR chunk SHOULD be sent with the error cause
defined in Section 4.1. [...] If no endpoint pair shared
key has been configured for that Shared Key Identifier,
all authenticated chunks MUST be silently discarded. [...]

When an endpoint requires COOKIE-ECHO chunks to be
authenticated, some special procedures have to be followed
because the reception of a COOKIE-ECHO chunk might result
in the creation of an SCTP association. If a packet arrives
containing an AUTH chunk as a first chunk, a COOKIE-ECHO
chunk as the second chunk, and possibly more chunks after
them, and the receiver does not have an STCB for that
packet, then authentication is based on the contents of
the COOKIE-ECHO chunk. In this situation, the receiver MUST
authenticate the chunks in the packet by using the RANDOM
parameters, CHUNKS parameters and HMAC_ALGO parameters
obtained from the COOKIE-ECHO chunk, and possibly a local
shared secret as inputs to the authentication procedure
specified in Section 6.3. If authentication fails, then
the packet is discarded. If the authentication is successful,
the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
MUST be processed. If the receiver has an STCB, it MUST
process the AUTH chunk as described above using the STCB
from the existing association to authenticate the
COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809f9 introduced the possibility to receive
and verification of AUTH chunk, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key that is
NULL as condition key_id == asoc->active_key_id is true if
the AUTH chunk was injected correctly from remote. This
happens no matter what net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809f9 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Cc: Vlad Yasevich <yase...@gmail.com>
Cc: Neil Horman <nho...@tuxdriver.com>
Acked-by: Vlad Yasevich <vyas...@gmail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sctp/sm_statefuns.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 486df56..d43002b 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -745,6 +745,13 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep,
struct sctp_chunk auth;
sctp_ierror_t ret;

+ /* Make sure that we and the peer are AUTH capable */
+ if (!sctp_auth_enable || !new_asoc->peer.auth_capable) {
+ kfree_skb(chunk->auth_chunk);
+ sctp_association_free(new_asoc);
+ return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
+ }
+
/* set-up our fake chunk so that we can process it */
auth.skb = chunk->auth_chunk;
auth.asoc = chunk->asoc;

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Ding Tianhong <dingti...@huawei.com>

[ Upstream commit f873042093c0b418d2351fe142222b625c740149 ]

When the following commands are executed:

brctl addbr br0
ifconfig br0 hw ether <addr>
rmmod bridge

The calltrace will occur:

[ 563.312114] device eth1 left promiscuous mode
[ 563.312188] br0: port 1(eth1) entered disabled state
[ 563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
[ 563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G O 3.12.0-0.7-default+ #9
[ 563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 563.468200] 0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
[ 563.468204] ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
[ 563.468206] ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
[ 563.468209] Call Trace:
[ 563.468218] [<ffffffff814d1c92>] dump_stack+0x6a/0x78
[ 563.468234] [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
[ 563.468242] [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
[ 563.468247] [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
[ 563.468254] [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
[ 563.468259] [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
[ 570.377958] Bridge firewalling registered

--------------------------- cut here -------------------------------

The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.

v2: according to the Toshiaki Makita and Vlad's suggestion, I only
delete the vlan0 entry, it still have a leak here if the vlan id
is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
to flush all entries whose dst is NULL for the bridge.

Suggested-by: Toshiaki Makita <toshiaki...@gmail.com>
Suggested-by: Vlad Yasevich <vyas...@gmail.com>
Signed-off-by: Ding Tianhong <dingti...@huawei.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/bridge/br_if.c | 2 ++

1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 4a9f527..c01e65d 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -162,6 +162,8 @@ static void del_br(struct net_bridge *br)
del_nbp(p);
}

+ br_fdb_delete_by_port(br, NULL, 1);
+
del_timer_sync(&br->gc_timer);

br_sysfs_delbr(br->dev);

Willy Tarreau

unread,

May 11, 2014, 9:50:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mahesh Rajashekhara <Mahesh.Ra...@pmcs.com>

It appears that driver runs into a problem here if fibsize is too small
because we allocate user_srbcmd with fibsize size only but later we
access it until user_srbcmd->sg.count to copy it over to srbcmd.

It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this
structure already includes one sg element and this is not needed for
commands without data. So, we would recommend to add the following
(instead of test for fibsize == 0).

Signed-off-by: Mahesh Rajashekhara <Mahesh.Ra...@pmcs.com>
Reported-by: Nico Golde <ni...@ngolde.de>
Reported-by: Fabian Yamaguchi <fa...@goesec.de>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit b4789b8e6be3151a955ade74872822f30e8cd914)

CVE-2013-6380

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/scsi/aacraid/commctrl.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/aacraid/commctrl.c b/drivers/scsi/aacraid/commctrl.c
index a5b8e7b..c895174 100644
--- a/drivers/scsi/aacraid/commctrl.c
+++ b/drivers/scsi/aacraid/commctrl.c
@@ -507,7 +507,8 @@ static int aac_send_raw_srb(struct aac_dev* dev, void __user * arg)
goto cleanup;
}

- if (fibsize > (dev->max_fib_size - sizeof(struct aac_fibhdr))) {
+ if ((fibsize < (sizeof(struct user_aac_srb) - sizeof(struct user_sgentry))) ||
+ (fibsize > (dev->max_fib_size - sizeof(struct aac_fibhdr)))) {
rcode = -EINVAL;
goto cleanup;

Willy Tarreau

unread,

May 11, 2014, 9:50:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ]

We need to cap ->msg_namelen or it leads to a buffer overflow when we
to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to
exploit this bug.

The call tree is:
___sys_recvmsg()
move_addr_to_user()
audit_sockaddr()
__audit_sockaddr()

Reported-by: J�ri Aedla <juri....@gmail.com>
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

[wt: 2.6.32: msg_sys is a struct, not a pointer]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/compat.c | 2 ++
net/socket.c | 24 ++++++++++++++++++++----
2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/net/compat.c b/net/compat.c
index 9559afc..da3d0fc 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -69,6 +69,8 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
__get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
__get_user(kmsg->msg_flags, &umsg->msg_flags))
return -EFAULT;
+ if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+ return -EINVAL;
kmsg->msg_name = compat_ptr(tmp1);
kmsg->msg_iov = compat_ptr(tmp2);
kmsg->msg_control = compat_ptr(tmp3);
diff --git a/net/socket.c b/net/socket.c
index bf9fc68..9f8cd74 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1863,6 +1863,16 @@ SYSCALL_DEFINE2(shutdown, int, fd, int, how)
#define COMPAT_NAMELEN(msg) COMPAT_MSG(msg, msg_namelen)
#define COMPAT_FLAGS(msg) COMPAT_MSG(msg, msg_flags)

+static int copy_msghdr_from_user(struct msghdr *kmsg,
+ struct msghdr __user *umsg)
+{
+ if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
+ return -EFAULT;
+ if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+ return -EINVAL;
+ return 0;
+}
+
/*
* BSD sendmsg interface
*/
@@ -1887,8 +1897,11 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned, flags)
if (get_compat_msghdr(&msg_sys, msg_compat))
return -EFAULT;
}
- else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
- return -EFAULT;
+ else {
+ err = copy_msghdr_from_user(&msg_sys, msg);
+ if (err)
+ return err;
+ }

sock = sockfd_lookup_light(fd, &err, &fput_needed);
if (!sock)
@@ -1997,8 +2010,11 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
if (get_compat_msghdr(&msg_sys, msg_compat))
return -EFAULT;
}
- else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
- return -EFAULT;
+ else {
+ err = copy_msghdr_from_user(&msg_sys, msg);
+ if (err)
+ return err;
+ }

sock = sockfd_lookup_light(fd, &err, &fput_needed);
if (!sock)

Willy Tarreau

unread,

May 11, 2014, 9:50:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpat...@redhat.com>

commit d82b922a4acc1781d368aceac2f9da43b038cab2 upstream

The powernow-k6 driver used to read the initial multiplier from the
powernow register. However, there is a problem with this:

* If there was a frequency transition before, the multiplier read from the
register corresponds to the current multiplier.
* If there was no frequency transition since reset, the field in the
register always reads as zero, regardless of the current multiplier that
is set using switches on the mainboard and that the CPU is running at.

The zero value corresponds to multiplier 4.5, so as a consequence, the
powernow-k6 driver always assumes multiplier 4.5.

For example, if we have 550MHz CPU with bus frequency 100MHz and
multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5
and bus frequency is 122MHz. The powernow-k6 driver then sets the
multiplier to 4.5, underclocking the CPU to 450MHz, but reports the
current frequency as 550MHz.

There is no reliable way how to read the initial multiplier. I modified
the driver so that it contains a table of known frequencies (based on
parameters of existing CPUs and some common overclocking schemes) and sets
the multiplier according to the frequency. If the frequency is unknown
(because of unusual overclocking or underclocking), the user must supply
the bus speed and maximum multiplier as module parameters.

This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.

Signed-off-by: Mikulas Patocka <mpat...@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 76 +++++++++++++++++++++++++++++--
1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index 9f80a34..2d9161e 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -26,6 +26,14 @@
static unsigned int busfreq; /* FSB, in 10 kHz */
static unsigned int max_multiplier;

+static unsigned int param_busfreq = 0;
+static unsigned int param_max_multiplier = 0;
+
+module_param_named(max_multiplier, param_max_multiplier, uint, S_IRUGO);
+MODULE_PARM_DESC(max_multiplier, "Maximum multiplier (allowed values: 20 30 35 40 45 50 55 60)");
+
+module_param_named(bus_frequency, param_busfreq, uint, S_IRUGO);
+MODULE_PARM_DESC(bus_frequency, "Bus frequency in kHz");

/* Clock ratio multiplied by 10 - see table 27 in AMD#23446 */
static struct cpufreq_frequency_table clock_ratio[] = {
@@ -40,6 +48,27 @@ static struct cpufreq_frequency_table clock_ratio[] = {
{0, CPUFREQ_TABLE_END}
};

+static const struct {
+ unsigned freq;
+ unsigned mult;
+} usual_frequency_table[] = {
+ { 400000, 40 }, // 100 * 4
+ { 450000, 45 }, // 100 * 4.5
+ { 475000, 50 }, // 95 * 5
+ { 500000, 50 }, // 100 * 5
+ { 506250, 45 }, // 112.5 * 4.5
+ { 533500, 55 }, // 97 * 5.5
+ { 550000, 55 }, // 100 * 5.5
+ { 562500, 50 }, // 112.5 * 5
+ { 570000, 60 }, // 95 * 6
+ { 600000, 60 }, // 100 * 6
+ { 618750, 55 }, // 112.5 * 5.5
+ { 660000, 55 }, // 120 * 5.5
+ { 675000, 60 }, // 112.5 * 6
+ { 720000, 60 }, // 120 * 6
+};
+
+#define FREQ_RANGE 3000

/**
* powernow_k6_get_cpu_multiplier - returns the current FSB multiplier
@@ -163,18 +192,57 @@ static int powernow_k6_target(struct cpufreq_policy *policy,
return 0;
}

-
static int powernow_k6_cpu_init(struct cpufreq_policy *policy)
{
unsigned int i, f;
int result;
+ unsigned khz;

if (policy->cpu != 0)
return -ENODEV;

- /* get frequencies */
- max_multiplier = powernow_k6_get_cpu_multiplier();
- busfreq = cpu_khz / max_multiplier;
+ max_multiplier = 0;
+ khz = cpu_khz;
+ for (i = 0; i < ARRAY_SIZE(usual_frequency_table); i++) {
+ if (khz >= usual_frequency_table[i].freq - FREQ_RANGE &&
+ khz <= usual_frequency_table[i].freq + FREQ_RANGE) {
+ khz = usual_frequency_table[i].freq;
+ max_multiplier = usual_frequency_table[i].mult;
+ break;
+ }
+ }
+ if (param_max_multiplier) {
+ for (i = 0; (clock_ratio[i].frequency != CPUFREQ_TABLE_END); i++) {
+ if (clock_ratio[i].index == param_max_multiplier) {
+ max_multiplier = param_max_multiplier;
+ goto have_max_multiplier;
+ }
+ }
+ printk(KERN_ERR "powernow-k6: invalid max_multiplier parameter, valid parameters 20, 30, 35, 40, 45, 50, 55, 60\n");
+ return -EINVAL;
+ }
+
+ if (!max_multiplier) {
+ printk(KERN_WARNING "powernow-k6: unknown frequency %u, cannot determine current multiplier\n", khz);
+ printk(KERN_WARNING "powernow-k6: use module parameters max_multiplier and bus_frequency\n");
+ return -EOPNOTSUPP;
+ }
+
+have_max_multiplier:
+ param_max_multiplier = max_multiplier;
+
+ if (param_busfreq) {
+ if (param_busfreq >= 50000 && param_busfreq <= 150000) {
+ busfreq = param_busfreq / 10;
+ goto have_busfreq;
+ }
+ printk(KERN_ERR "powernow-k6: invalid bus_frequency parameter, allowed range 50000 - 150000 kHz\n");
+ return -EINVAL;
+ }
+
+ busfreq = khz / max_multiplier;
+have_busfreq:
+ param_busfreq = busfreq * 10;

/* table init */
for (i = 0; (clock_ratio[i].frequency != CPUFREQ_TABLE_END); i++) {

Willy Tarreau

unread,

May 11, 2014, 9:50:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ]

In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left
another places where IP_INC_STATS_BH() were improperly used.

udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from
process context, not from softirq context.

This was detected by lockdep seqlock support.

Reported-by: jongman heo <jongm...@samsung.com>
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Hannes Frederic Sowa <han...@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/tcp_ipv4.c | 2 +-
net/ipv4/udp.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d746d3b3..e60f0fd 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -174,7 +174,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
inet->sport, usin->sin_port, sk, 1);
if (tmp < 0) {
if (tmp == -ENETUNREACH)
- IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
+ IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
return tmp;
}

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index a9aed7e..dba3c01 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -723,7 +723,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
err = ip_route_output_flow(net, &rt, &fl, sk, 1);
if (err) {
if (err == -ENETUNREACH)
- IP_INC_STATS_BH(net, IPSTATS_MIB_OUTNOROUTES);
+ IP_INC_STATS(net, IPSTATS_MIB_OUTNOROUTES);
goto out;

Willy Tarreau

unread,

May 11, 2014, 9:50:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Andy Honig <aho...@google.com>

commit b963a22e6d1a266a67e9eecc88134713fd54775c upstream

Under guest controllable circumstances apic_get_tmcct will execute a
divide by zero and cause a crash. If the guest cpuid support
tsc deadline timers and performs the following sequence of requests
the host will crash.
- Set the mode to periodic
- Set the TMICT to 0
- Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
- Set the TMICT to non-zero.
Then the lapic_timer.period will be 0, but the TMICT will not be. If the
guest then reads from the TMCCT then the host will perform a divide by 0.

This patch ensures that if the lapic_timer.period is 0, then the division
does not occur.

Reported-by: Andrew Honig <aho...@google.com>
Cc: sta...@vger.kernel.org
Signed-off-by: Andrew Honig <aho...@google.com>
Signed-off-by: Paolo Bonzini <pbon...@redhat.com>
[dannf: backported to Debian's 2.6.32]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/kvm/lapic.c | 3 ++-

1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8dfeaaa..b77857f 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -519,7 +519,8 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
ASSERT(apic != NULL);

/* if initial count is 0, current count should also be 0 */
- if (apic_get_reg(apic, APIC_TMICT) == 0)
+ if (apic_get_reg(apic, APIC_TMICT) == 0 ||
+ apic->lapic_timer.period == 0)
return 0;

remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);

Willy Tarreau

unread,

May 11, 2014, 9:50:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit ff862a4668dd6dba962b1d2d8bd344afa6375683 ]

This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages". There are some struct members which don't get initialized
and could disclose small amounts of private information.

Acked-by: Mathias Krause <min...@googlemail.com>
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Acked-by: Steffen Klassert <steffen....@secunet.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/key/af_key.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 9d22e46..3f55faa 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2079,6 +2079,7 @@ static int pfkey_xfrm_policy2msg(struct sk_buff *skb, struct xfrm_policy *xp, in
pol->sadb_x_policy_type = IPSEC_POLICY_NONE;
}
pol->sadb_x_policy_dir = dir+1;
+ pol->sadb_x_policy_reserved = 0;
pol->sadb_x_policy_id = xp->index;
pol->sadb_x_policy_priority = xp->priority;

@@ -3111,7 +3112,9 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct
pol->sadb_x_policy_exttype = SADB_X_EXT_POLICY;
pol->sadb_x_policy_type = IPSEC_POLICY_IPSEC;
pol->sadb_x_policy_dir = dir+1;
+ pol->sadb_x_policy_reserved = 0;
pol->sadb_x_policy_id = xp->index;
+ pol->sadb_x_policy_priority = xp->priority;

/* Set sadb_comb's. */
if (x->id.proto == IPPROTO_AH)
@@ -3499,6 +3502,7 @@ static int pfkey_send_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
pol->sadb_x_policy_exttype = SADB_X_EXT_POLICY;
pol->sadb_x_policy_type = IPSEC_POLICY_IPSEC;
pol->sadb_x_policy_dir = dir + 1;
+ pol->sadb_x_policy_reserved = 0;
pol->sadb_x_policy_id = 0;
pol->sadb_x_policy_priority = 0;

Willy Tarreau

unread,

May 11, 2014, 9:50:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torv...@linux-foundation.org>

Various drivers end up replicating the code to mmap() their memory
buffers into user space, and our core memory remapping function may be
very flexible but it is unnecessarily complicated for the common cases
to use.

Our internal VM uses pfn's ("page frame numbers") which simplifies
things for the VM, and allows us to pass physical addresses around in a
denser and more efficient format than passing a "phys_addr_t" around,
and having to shift it up and down by the page size. But it just means
that drivers end up doing that shifting instead at the interface level.

It also means that drivers end up mucking around with internal VM things
like the vma details (vm_pgoff, vm_start/end) way more than they really
need to.

So this just exports a function to map a certain physical memory range
into user space (using a phys_addr_t based interface that is much more
natural for a driver) and hides all the complexity from the driver.
Some drivers will still end up tweaking the vm_page_prot details for
things like prefetching or cacheability etc, but that's actually
relevant to the driver, rather than caring about what the page offset of
the mapping is into the particular IO memory region.

Acked-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit b4cbb197c7e7a68dbad0d491242e3ca67420c13e)
[WT: only needed in 2.6.32 for next commit]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/linux/mm.h | 2 ++
mm/memory.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 49 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 11e5be6..5ef50c1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1243,6 +1243,8 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
+

struct page *follow_page(struct vm_area_struct *, unsigned long address,
unsigned int foll_flags);
diff --git a/mm/memory.c b/mm/memory.c
index 6c836d3..085b068 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1811,6 +1811,53 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(remap_pfn_range);

+/**
+ * vm_iomap_memory - remap memory to userspace
+ * @vma: user vma to map to
+ * @start: start of area
+ * @len: size of area
+ *
+ * This is a simplified io_remap_pfn_range() for common driver use. The
+ * driver just needs to give us the physical memory range to be mapped,
+ * we'll figure out the rest from the vma information.
+ *
+ * NOTE! Some drivers might want to tweak vma->vm_page_prot first to get
+ * whatever write-combining details or similar.
+ */
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len)
+{
+ unsigned long vm_len, pfn, pages;
+
+ /* Check that the physical memory area passed in looks valid */
+ if (start + len < start)
+ return -EINVAL;
+ /*
+ * You *really* shouldn't map things that aren't page-aligned,
+ * but we've historically allowed it because IO memory might
+ * just have smaller alignment.
+ */
+ len += start & ~PAGE_MASK;
+ pfn = start >> PAGE_SHIFT;
+ pages = (len + ~PAGE_MASK) >> PAGE_SHIFT;
+ if (pfn + pages < pfn)
+ return -EINVAL;
+
+ /* We start the mapping 'vm_pgoff' pages into the area */
+ if (vma->vm_pgoff > pages)
+ return -EINVAL;
+ pfn += vma->vm_pgoff;
+ pages -= vma->vm_pgoff;
+
+ /* Can we fit all of the mapping? */
+ vm_len = vma->vm_end - vma->vm_start;
+ if (vm_len >> PAGE_SHIFT > pages)
+ return -EINVAL;
+
+ /* Ok, let it rip */
+ return io_remap_pfn_range(vma, vma->vm_start, pfn, vm_len, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vm_iomap_memory);
+
static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
unsigned long addr, unsigned long end,
pte_fn_t fn, void *data)

Willy Tarreau

unread,

May 11, 2014, 9:50:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torv...@linux-foundation.org>

Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that
really should use the vm_iomap_memory() helper. This trivially converts
two of them to the helper, and comments about why the third one really
needs to continue to use remap_pfn_range(), and adds the missing size
check.

Reported-by: Nico Golde <ni...@ngolde.de>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit 7314e613d5ff9f0934f7a0f74ed7973b903315d1)
[wt: vm_flags were absent in mainline, Ben removed them in 3.2, but
I kept them to minimize changes and avoid any side effect]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/uio/uio.c | 16 +++++++++++++++-
drivers/video/au1100fb.c | 26 +-------------------------
drivers/video/au1200fb.c | 26 +-------------------------
3 files changed, 17 insertions(+), 51 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index e941367..e3804d3 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -669,16 +669,30 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
{
struct uio_device *idev = vma->vm_private_data;
int mi = uio_find_mem_index(vma);
+ struct uio_mem *mem;
if (mi < 0)
return -EINVAL;
+ mem = idev->info->mem + mi;
+
+ if (vma->vm_end - vma->vm_start > mem->size)
+ return -EINVAL;

vma->vm_flags |= VM_IO | VM_RESERVED;

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

+ /*
+ * We cannot use the vm_iomap_memory() helper here,
+ * because vma->vm_pgoff is the map index we looked
+ * up above in uio_find_mem_index(), rather than an
+ * actual page offset into the mmap.
+ *
+ * So we just do the physical mmap without a page
+ * offset.
+ */
return remap_pfn_range(vma,
vma->vm_start,
- idev->info->mem[mi].addr >> PAGE_SHIFT,
+ mem->addr >> PAGE_SHIFT,
vma->vm_end - vma->vm_start,
vma->vm_page_prot);
}
diff --git a/drivers/video/au1100fb.c b/drivers/video/au1100fb.c
index a699aab..745e5b3 100644
--- a/drivers/video/au1100fb.c
+++ b/drivers/video/au1100fb.c
@@ -392,39 +392,15 @@ void au1100fb_fb_rotate(struct fb_info *fbi, int angle)
int au1100fb_fb_mmap(struct fb_info *fbi, struct vm_area_struct *vma)
{
struct au1100fb_device *fbdev;
- unsigned int len;
- unsigned long start=0, off;

fbdev = to_au1100fb_device(fbi);

- if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
- return -EINVAL;
- }
-
- start = fbdev->fb_phys & PAGE_MASK;
- len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
- off = vma->vm_pgoff << PAGE_SHIFT;
-
- if ((vma->vm_end - vma->vm_start + off) > len) {
- return -EINVAL;
- }
-
- off += start;
- vma->vm_pgoff = off >> PAGE_SHIFT;
-
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
pgprot_val(vma->vm_page_prot) |= (6 << 9); //CCA=6

vma->vm_flags |= VM_IO;

- if (io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot)) {
- return -EAGAIN;
- }
-
- return 0;
+ return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
}

/* fb_cursor
diff --git a/drivers/video/au1200fb.c b/drivers/video/au1200fb.c
index 0d96f1d..5d6e509 100644
--- a/drivers/video/au1200fb.c
+++ b/drivers/video/au1200fb.c
@@ -1241,42 +1241,18 @@ static int au1200fb_fb_blank(int blank_mode, struct fb_info *fbi)
* method mainly to allow the use of the TLB streaming flag (CCA=6)
*/
static int au1200fb_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
-
{
- unsigned int len;
- unsigned long start=0, off;
struct au1200fb_device *fbdev = (struct au1200fb_device *) info;

#ifdef CONFIG_PM
au1xxx_pm_access(LCD_pm_dev);
#endif
-
- if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
- return -EINVAL;
- }
-
- start = fbdev->fb_phys & PAGE_MASK;
- len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
- off = vma->vm_pgoff << PAGE_SHIFT;
-
- if ((vma->vm_end - vma->vm_start + off) > len) {
- return -EINVAL;
- }
-
- off += start;
- vma->vm_pgoff = off >> PAGE_SHIFT;
-
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
pgprot_val(vma->vm_page_prot) |= _CACHE_MASK; /* CCA=7 */

vma->vm_flags |= VM_IO;

- return io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot);
-
- return 0;
+ return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
}

static void set_global(u_int cmd, struct au1200_lcd_global_regs_t *pdata)

Willy Tarreau

unread,

May 11, 2014, 9:50:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit 4d231b76eef6c4a6bd9c96769e191517765942cb ]

While commit 30a584d944fb fixes datagram interface in LLC, a use
after free bug has been introduced for SOCK_STREAM sockets that do
not make use of MSG_PEEK.

The flow is as follow ...

if (!(flags & MSG_PEEK)) {
...
sk_eat_skb(sk, skb, false);
...
}
...
if (used + offset < skb->len)
continue;

... where sk_eat_skb() calls __kfree_skb(). Therefore, cache
original length and work on skb_len to check partial reads.

Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Cc: Stephen Hemminger <ste...@networkplumber.org>
Cc: Arnaldo Carvalho de Melo <ac...@ghostprotocols.net>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/llc/af_llc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 606b6ad..f62b63e 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -669,7 +669,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
struct llc_sock *llc = llc_sk(sk);
size_t copied = 0;
u32 peek_seq = 0;
- u32 *seq;
+ u32 *seq, skb_len;
unsigned long used;
int target; /* Read at least this many bytes */
long timeo;
@@ -767,6 +767,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
}
continue;
found_ok_skb:
+ skb_len = skb->len;
/* Ok so how much can we use? */
used = skb->len - offset;
if (len < used)
@@ -797,7 +798,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
goto copy_uaddr;

/* Partial read */
- if (used + offset < skb->len)
+ if (used + offset < skb_len)
continue;
} while (len > 0);

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Changli Gao <xia...@gmail.com>

[ Upstream commit b1a5a34bd0b8767ea689e68f8ea513e9710b671e ]

Ver and type in pppoe_hdr should be swapped as defined by RFC2516
section-4.

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/linux/if_pppox.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h
index 90b5fae..1750054 100644
--- a/include/linux/if_pppox.h
+++ b/include/linux/if_pppox.h
@@ -108,11 +108,11 @@ struct pppoe_tag {

struct pppoe_hdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
- __u8 ver : 4;
__u8 type : 4;
+ __u8 ver : 4;
#elif defined(__BIG_ENDIAN_BITFIELD)
- __u8 type : 4;
__u8 ver : 4;
+ __u8 type : 4;
#else
#error "Please fix <asm/byteorder.h>"
#endif

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Ben Greear <gre...@candelatech.com>

The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq. The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.

To work around this, re-add the max_restart counter in __do_irq and stop
processing irqs after 10 restarts.

Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.

This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce
latencies").

It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.

The hang stack traces look something like this:

------------[ cut here ]------------
WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
Watchdog detected hard LOCKUP on cpu 2
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11
Call Trace:
<NMI> warn_slowpath_common+0x85/0x9f
warn_slowpath_fmt+0x46/0x48
watchdog_overflow_callback+0x9c/0xa7
__perf_event_overflow+0x137/0x1cb
perf_event_overflow+0x14/0x16
intel_pmu_handle_irq+0x2dc/0x359
perf_event_nmi_handler+0x19/0x1b
nmi_handle+0x7f/0xc2
do_nmi+0xbc/0x304
end_repeat_nmi+0x1e/0x2e
<<EOE>>
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
---[ end trace 4947dfa9b0a4cec3 ]---
BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
irq event stamp: 835637905
hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257
hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257
softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
CPU 1
Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: tasklet_hi_action+0xf0/0xf0
Process migration/1
Call Trace:
<IRQ>
__do_softirq+0x117/0x257
irq_exit+0x5f/0xbb
smp_apic_timer_interrupt+0x8a/0x98
apic_timer_interrupt+0x72/0x80
<EOI>
printk+0x4d/0x4f
stop_machine_cpu_stop+0x22c/0x274
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0

Signed-off-by: Ben Greear <gre...@candelatech.com>
Acked-by: Tejun Heo <t...@kernel.org>
Acked-by: Pekka Riikonen <prii...@iki.fi>
Cc: Eric Dumazet <eric.d...@gmail.com>
Cc: sta...@kernel.org
Cc: Ben Hutchings <b...@decadent.org.uk>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit 34376a50fb1fa095b9d0636fa41ed2e73125f214)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

kernel/softirq.c | 13 ++++++++++---

1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index d75c136..e4d5d8c 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -194,8 +194,12 @@ void local_bh_enable_ip(unsigned long ip)
EXPORT_SYMBOL(local_bh_enable_ip);

/*
- * We restart softirq processing for at most 2 ms,
- * and if need_resched() is not set.
+ * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
+ * but break the loop if need_resched() is set or after 2 ms.
+ * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
+ * certain cases, such as stop_machine(), jiffies may cease to
+ * increment and so we need the MAX_SOFTIRQ_RESTART limit as
+ * well to make sure we eventually return from this method.
*
* These limits have been established via experimentation.
* The two things to balance is latency against fairness -
@@ -203,6 +207,7 @@ EXPORT_SYMBOL(local_bh_enable_ip);
* should not be able to lock up the box.
*/
#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
+#define MAX_SOFTIRQ_RESTART 10

asmlinkage void __do_softirq(void)
{
@@ -210,6 +215,7 @@ asmlinkage void __do_softirq(void)
__u32 pending;
unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
int cpu;
+ int max_restart = MAX_SOFTIRQ_RESTART;

pending = local_softirq_pending();
account_system_vtime(current);
@@ -254,7 +260,8 @@ restart:

pending = local_softirq_pending();
if (pending) {
- if (time_before(jiffies, end) && !need_resched())
+ if (time_before(jiffies, end) && !need_resched() &&
+ --max_restart)
goto restart;

wakeup_softirqd();

Willy Tarreau

unread,

May 11, 2014, 9:50:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Matthew Daley <ma...@bugfuzz.com>

Always clear out these floppy_raw_cmd struct members after copying the
entire structure from userspace so that the in-kernel version is always
valid and never left in an interdeterminate state.

Signed-off-by: Matthew Daley <ma...@bugfuzz.com>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
(cherry picked from commit ef87dbe7614341c2e7bfe8d32fcb7028cc97442c)
[wt: be careful in 2.6.32 we still have the ugly macros everywhere]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/block/floppy.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 5c01f74..19d45e6 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3209,9 +3209,12 @@ static inline int raw_cmd_copyin(int cmd, char __user *param,
if (!ptr)
return -ENOMEM;
*rcmd = ptr;
- COPYIN(*ptr);
+ ret = copy_from_user(ptr, (void __user *)param, sizeof(*ptr));
ptr->next = NULL;
ptr->buffer_length = 0;
+ ptr->kernel_data = NULL;

+ if (ret)
+ return -EFAULT;
param += sizeof(struct floppy_raw_cmd);

if (ptr->cmd_count > 33)
/* the command may now also take up the space

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <w...@1wt.eu>

syscall_trace_enter() and syscall_trace_leave() are only called from
within asm code and do not need to be declared in the .c at all.
Removing their reference fixes the build issue that was happening
with gcc 4.7.

Both Sven-Haegar Koch and Christoph Biedl confirmed this patch
addresses their respective build issues.

Cc: Sven-Haegar Koch <hae...@sdinet.de>
Cc: Christoph Biedl <linux-ke...@manchmal.in-ulm.de>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/include/asm/ptrace.h | 3 ---
1 file changed, 3 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 0f0d908..1ec926d 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -142,9 +142,6 @@ extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
int error_code, int si_code);
void signal_fault(struct pt_regs *regs, void __user *frame, char *where);

-extern long syscall_trace_enter(struct pt_regs *);
-extern void syscall_trace_leave(struct pt_regs *);
-
static inline unsigned long regs_return_value(struct pt_regs *regs)
{
return regs->ax;

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.d...@gmail.com>

[ Upstream commit c9ab4d85de222f3390c67aedc9c18a50e767531e ]

There is a race in neighbour code, because neigh_destroy() uses
skb_queue_purge(&neigh->arp_queue) without holding neighbour lock,
while other parts of the code assume neighbour rwlock is what
protects arp_queue

Convert all skb_queue_purge() calls to the __skb_queue_purge() variant

Use __skb_queue_head_init() instead of skb_queue_head_init()
to make clear we do not use arp_queue.lock

And hold neigh->lock in neigh_destroy() to close the race.

Reported-by: Joe Jin <joe...@oracle.com>
Signed-off-by: Eric Dumazet <edum...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/core/neighbour.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e696250..fc9feaa 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -222,7 +222,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
we must kill timers etc. and move
it to safe state.
*/
- skb_queue_purge(&n->arp_queue);
+ __skb_queue_purge(&n->arp_queue);
n->output = neigh_blackhole;
if (n->nud_state & NUD_VALID)
n->nud_state = NUD_NOARP;
@@ -276,7 +276,7 @@ static struct neighbour *neigh_alloc(struct neigh_table *tbl)
if (!n)
goto out_entries;

- skb_queue_head_init(&n->arp_queue);
+ __skb_queue_head_init(&n->arp_queue);
rwlock_init(&n->lock);
n->updated = n->used = now;
n->nud_state = NUD_NONE;
@@ -646,7 +646,9 @@ void neigh_destroy(struct neighbour *neigh)
kfree(hh);
}

- skb_queue_purge(&neigh->arp_queue);
+ write_lock_bh(&neigh->lock);
+ __skb_queue_purge(&neigh->arp_queue);
+ write_unlock_bh(&neigh->lock);

dev_put(neigh->dev);
neigh_parms_put(neigh->parms);
@@ -789,7 +791,7 @@ static void neigh_invalidate(struct neighbour *neigh)
neigh->ops->error_report(neigh, skb);
write_lock(&neigh->lock);
}
- skb_queue_purge(&neigh->arp_queue);
+ __skb_queue_purge(&neigh->arp_queue);
}

/* Called when a timer expires for a neighbour entry. */
@@ -1105,7 +1107,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
n1->output(skb);
write_lock_bh(&neigh->lock);
}
- skb_queue_purge(&neigh->arp_queue);
+ __skb_queue_purge(&neigh->arp_queue);
}
out:
if (update_isrouter) {

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: stephen hemminger <ste...@networkplumber.org>

[ Upstream commit cbd375567f7e4811b1c721f75ec519828ac6583f ]

When userspace passes a large priority value
the assignment of the unsigned value hopt->prio
to signed int cl->prio causes cl->prio to become negative and the
comparison is with TC_HTB_NUMPRIO is always false.

The result is that HTB crashes by referencing outside
the array when processing packets. With this patch the large value
wraps around like other values outside the normal range.

See: https://bugzilla.kernel.org/show_bug.cgi?id=60669

Signed-off-by: Stephen Hemminger <ste...@networkplumber.org>
Acked-by: Eric Dumazet <edum...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sched/sch_htb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 2f074d6..9ce5963 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -85,7 +85,7 @@ struct htb_class {
unsigned int children;
struct htb_class *parent; /* parent class */

- int prio; /* these two are used only by leaves... */
+ u32 prio; /* these two are used only by leaves... */
int quantum; /* but stored for parent-to-leaf return */

union {

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Marc Kleine-Budde <m...@pengutronix.de>

[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ]

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Cc: Patrick McHardy <ka...@trash.net>
Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/8021q/vlan_netlink.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index a915048..1f13bcf 100644
--- a/net/8021q/vlan_netlink.c
+++ b/net/8021q/vlan_netlink.c
@@ -169,7 +169,7 @@ static size_t vlan_get_size(const struct net_device *dev)
struct vlan_dev_info *vlan = vlan_dev_info(dev);

return nla_total_size(2) + /* IFLA_VLAN_ID */
- sizeof(struct ifla_vlan_flags) + /* IFLA_VLAN_FLAGS */
+ nla_total_size(sizeof(struct ifla_vlan_flags)) + /* IFLA_VLAN_FLAGS */
vlan_qos_map_size(vlan->nr_ingress_mappings) +
vlan_qos_map_size(vlan->nr_egress_mappings);

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit 087d273caf4f7d3f2159256f255f1f432bc84a5b ]

This patch doesn't change the compiled code because ARC_HDR_SIZE is 4
and sizeof(int) is 4, but the intent was to use the header size and not
the sizeof the header size.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/arcnet/arcnet.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index 75a5725..e29940d 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -1008,7 +1008,7 @@ static void arcnet_rx(struct net_device *dev, int bufnum)

soft = &pkt.soft.rfc1201;

- lp->hw.copy_from_card(dev, bufnum, 0, &pkt, sizeof(ARC_HDR_SIZE));
+ lp->hw.copy_from_card(dev, bufnum, 0, &pkt, ARC_HDR_SIZE);
if (pkt.hard.offset[0]) {
ofs = pkt.hard.offset[0];
length = 256 - ofs;

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Duan Jiong <duanj...@cn.fujitsu.com>

[ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ]

As the rfc 4191 said, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, the rt6_get_route_info() always return
NULL, that means that overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().

In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.

Signed-off-by: Duan Jiong <duanj...@cn.fujitsu.com>

Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/route.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e307517..5af0d1e 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -495,8 +495,11 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len,
prefix = &prefix_buf;
}

- rt = rt6_get_route_info(net, prefix, rinfo->prefix_len, gwaddr,
- dev->ifindex);
+ if (rinfo->prefix_len == 0)
+ rt = rt6_get_dflt_router(gwaddr, dev);
+ else
+ rt = rt6_get_route_info(net, prefix, rinfo->prefix_len,
+ gwaddr, dev->ifindex);

if (rt && !lifetime) {
ip6_del_rt(rt);

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit 412f30105ec6735224535791eed5cdc02888ecb4 upstream

A HID device could send a malicious output report that would cause the
pantherlord HID driver to write beyond the output report allocation
during initialization, causing a heap overflow:

[ 310.939483] usb 1-1: New USB device found, idVendor=0e8f, idProduct=0003
...
[ 315.980774] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten

CVE-2013-2892

Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: sta...@kernel.org
Signed-off-by: Jiri Kosina <jko...@suse.cz>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/hid/hid-pl.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-pl.c b/drivers/hid/hid-pl.c
index c6d7dbc..8cdf7b8 100644
--- a/drivers/hid/hid-pl.c
+++ b/drivers/hid/hid-pl.c
@@ -128,8 +128,14 @@ static int plff_init(struct hid_device *hid)
strong = &report->field[0]->value[2];
weak = &report->field[0]->value[3];
debug("detected single-field device");
- } else if (report->maxfield >= 4 && report->field[0]->maxusage == 1 &&
- report->field[0]->usage[0].hid == (HID_UP_LED | 0x43)) {
+ } else if (report->field[0]->maxusage == 1 &&
+ report->field[0]->usage[0].hid ==
+ (HID_UP_LED | 0x43) &&
+ report->maxfield >= 4 &&
+ report->field[0]->report_count >= 1 &&
+ report->field[1]->report_count >= 1 &&
+ report->field[2]->report_count >= 1 &&
+ report->field[3]->report_count >= 1) {
report->field[0]->value[0] = 0x00;
report->field[1]->value[0] = 0x00;
strong = &report->field[2]->value[0];

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpat...@redhat.com>

CVE-2013-4299

BugLink: http://bugs.launchpad.net/bugs/1241769

This patch fixes a particular type of data corruption that has been
encountered when loading a snapshot's metadata from disk.

When we allocate a new chunk in persistent_prepare, we increment
ps->next_free and we make sure that it doesn't point to a metadata area
by further incrementing it if necessary.

When we load metadata from disk on device activation, ps->next_free is
positioned after the last used data chunk. However, if this last used
data chunk is followed by a metadata area, ps->next_free is positioned
erroneously to the metadata area. A newly-allocated chunk is placed at
the same location as the metadata area, resulting in data or metadata
corruption.

This patch changes the code so that ps->next_free skips the metadata
area when metadata are loaded in function read_exceptions.

The patch also moves a piece of code from persistent_prepare_exception
to a separate function skip_metadata to avoid code duplication.

CVE-2013-4299

Signed-off-by: Mikulas Patocka <mpat...@redhat.com>
Cc: sta...@vger.kernel.org
Cc: Mike Snitzer <sni...@redhat.com>
Signed-off-by: Alasdair G Kergon <a...@redhat.com>
(back ported from commit e9c6a182649f4259db704ae15a91ac820e63b0ca)
Signed-off-by: Luis Henriques <luis.he...@canonical.com>
Acked-by: Stefan Bader <stefan...@canonical.com>
Signed-off-by: Tim Gardner <tim.g...@canonical.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/md/dm-snap-persistent.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 0c74642..97c3f06 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -252,6 +252,14 @@ static chunk_t area_location(struct pstore *ps, chunk_t area)
return 1 + ((ps->exceptions_per_area + 1) * area);
}

+static void skip_metadata(struct pstore *ps)
+{
+ uint32_t stride = ps->exceptions_per_area + 1;
+ chunk_t next_free = ps->next_free;
+ if (sector_div(next_free, stride) == 1)
+ ps->next_free++;
+}
+
/*
* Read or write a metadata area. Remembering to skip the first
* chunk which holds the header.
@@ -481,6 +489,8 @@ static int read_exceptions(struct pstore *ps,

ps->current_area--;

+ skip_metadata(ps);
+
return 0;
}

@@ -587,8 +597,6 @@ static int persistent_prepare_exception(struct dm_exception_store *store,
struct dm_snap_exception *e)
{
struct pstore *ps = get_info(store);
- uint32_t stride;
- chunk_t next_free;
sector_t size = get_dev_size(store->cow->bdev);

/* Is there enough room ? */
@@ -601,10 +609,8 @@ static int persistent_prepare_exception(struct dm_exception_store *store,
* Move onto the next free pending, making sure to take
* into account the location of the metadata chunks.
*/
- stride = (ps->exceptions_per_area + 1);
- next_free = ++ps->next_free;
- if (sector_div(next_free, stride) == 1)
- ps->next_free++;
+ ps->next_free++;
+ skip_metadata(ps);

atomic_inc(&ps->pending_count);
return 0;

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream

In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we
added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the
check as well.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Cc: sta...@kernel.org
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/scsi/aacraid/linit.c | 2 ++

1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 9b97c3e..387872c 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -754,6 +754,8 @@ static long aac_compat_do_ioctl(struct aac_dev *dev, unsigned cmd, unsigned long
static int aac_compat_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
{
struct aac_dev *dev = (struct aac_dev *)sdev->host->hostdata;
+ if (!capable(CAP_SYS_RAWIO))
+ return -EPERM;
return aac_compat_do_ioctl(dev, cmd, (unsigned long)arg);

Willy Tarreau

unread,

May 11, 2014, 9:50:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit 3e3aac497513c669e1c62c71e1d552ea85c1d974 ]

egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.

We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.

Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Patrick McHardy <ka...@trash.net>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/8021q/vlan_dev.c | 7 +++++++

1 file changed, 7 insertions(+)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 4198ec5..9796ea4 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -220,6 +220,8 @@ vlan_dev_get_egress_qos_mask(struct net_device *dev, struct sk_buff *skb)
{
struct vlan_priority_tci_mapping *mp;

+ smp_rmb(); /* coupled with smp_wmb() in vlan_dev_set_egress_priority() */
+
mp = vlan_dev_info(dev)->egress_priority_map[(skb->priority & 0xF)];
while (mp) {
if (mp->priority == skb->priority) {
@@ -418,6 +420,11 @@ int vlan_dev_set_egress_priority(const struct net_device *dev,
np->next = mp;
np->priority = skb_prio;
np->vlan_qos = vlan_qos;
+ /* Before inserting this element in hash table, make sure all its fields
+ * are committed to memory.
+ * coupled with smp_rmb() in vlan_dev_get_egress_qos_mask()
+ */
+ smp_wmb();
vlan->egress_priority_map[skb_prio & 0xF] = np;
if (vlan_qos)
vlan->nr_egress_mappings++;

Willy Tarreau

unread,

May 11, 2014, 9:50:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

sockaddr_storage)

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ]

In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.

The BUG_ON may be removed in future if we are sure all protocols are
conformant.

Suggested-by: Eric Dumazet <eric.d...@gmail.com>
Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/socket.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 2b7be6d..712f977 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -216,12 +216,13 @@ int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uaddr,
int err;
int len;

+ BUG_ON(klen > sizeof(struct sockaddr_storage));
err = get_user(len, ulen);
if (err)
return err;
if (len > klen)
len = klen;
- if (len < 0 || len > sizeof(struct sockaddr_storage))
+ if (len < 0)
return -EINVAL;
if (len) {
if (audit_sockaddr(klen, kaddr))

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: =?latin1?q?Salva=20Peir=F3?= <spe...@ai2.upv.es>

[ Upstream commit 8e3fbf870481eb53b2d3a322d1fc395ad8b367ed ]

The yam_ioctl() code fails to initialise the cmd field
of the struct yamdrv_ioctl_cfg. Add an explicit memset(0)
before filling the structure to avoid the 4-byte info leak.

Signed-off-by: Salva Peir� <spe...@ai2.upv.es>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/hamradio/yam.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c
index 694132e..1a1002d 100644
--- a/drivers/net/hamradio/yam.c
+++ b/drivers/net/hamradio/yam.c
@@ -1060,6 +1060,7 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
break;

case SIOCYAMGCFG:
+ memset(&yi, 0, sizeof(yi));
yi.cfg.mask = 0xffffffff;
yi.cfg.iobase = yp->iobase;
yi.cfg.irq = yp->irq;

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Nicolas Dichtel <nicolas...@6wind.com>

commit 85dfb745ee40232876663ae206cba35f24ab2a40 upstream

This field was left uninitialized. Some user daemons perform check against this
field.

Signed-off-by: Nicolas Dichtel <nicolas...@6wind.com>
Signed-off-by: Steffen Klassert <steffen....@secunet.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/key/af_key.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 03d626f..9d22e46 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2694,6 +2694,7 @@ static int key_notify_policy_flush(struct km_event *c)
hdr->sadb_msg_pid = c->pid;
hdr->sadb_msg_version = PF_KEY_V2;
hdr->sadb_msg_errno = (uint8_t) 0;
+ hdr->sadb_msg_satype = SADB_SATYPE_UNSPEC;
hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
hdr->sadb_msg_reserved = 0;
pfkey_broadcast(skb_out, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit 51c37a70aaa3f95773af560e6db3073520513912 ]

For properly initialising the Tausworthe generator [1], we have
a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.

Commit 697f8d0348 ("random32: seeding improvement") introduced
a __seed() function that imposes boundary checks proposed by the
errata paper [2] to properly ensure above conditions.

However, we're off by one, as the function is implemented as:
"return (x < m) ? x + m : x;", and called with __seed(X, 1),
__seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
would be possible, whereas the lower boundary should actually
be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise
an initialization with an unwanted seed could have the effect
that Tausworthe's PRNG properties cannot not be ensured.

Note that this PRNG is *not* used for cryptography in the kernel.

[1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
[2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps

Joint work with Hannes Frederic Sowa.

Fixes: 697f8d0348a6 ("random32: seeding improvement")
Cc: Stephen Hemminger <ste...@networkplumber.org>
Cc: Florian Weimer <fwe...@redhat.com>
Cc: Theodore Ts'o <ty...@mit.edu>
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

lib/random32.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index 217d5c4..b9275d2 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -96,7 +96,7 @@ void srandom32(u32 entropy)
*/
for_each_possible_cpu (i) {
struct rnd_state *state = &per_cpu(net_rand_state, i);
- state->s1 = __seed(state->s1 ^ entropy, 1);
+ state->s1 = __seed(state->s1 ^ entropy, 2);
}
}
EXPORT_SYMBOL(srandom32);
@@ -113,9 +113,9 @@ static int __init random32_init(void)
struct rnd_state *state = &per_cpu(net_rand_state,i);

#define LCG(x) ((x) * 69069) /* super-duper LCG */
- state->s1 = __seed(LCG(i + jiffies), 1);
- state->s2 = __seed(LCG(state->s1), 7);
- state->s3 = __seed(LCG(state->s2), 15);
+ state->s1 = __seed(LCG(i + jiffies), 2);
+ state->s2 = __seed(LCG(state->s1), 8);
+ state->s3 = __seed(LCG(state->s2), 16);

/* "warm it up" */
__random32(state);
@@ -142,9 +142,9 @@ static int __init random32_reseed(void)
u32 seeds[3];

get_random_bytes(&seeds, sizeof(seeds));
- state->s1 = __seed(seeds[0], 1);
- state->s2 = __seed(seeds[1], 7);
- state->s3 = __seed(seeds[2], 15);
+ state->s1 = __seed(seeds[0], 2);
+ state->s2 = __seed(seeds[1], 8);
+ state->s3 = __seed(seeds[2], 16);

/* mix it in */
__random32(state);

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Nicolas Dichtel <nicolas...@6wind.com>

The bug was introduced in 2.6.32.61 by commit b8710128e201 ("inet: add RCU
protection to inet->opt") (it's a backport of upstream commit f6d8bd051c39).

In SCTP case, packet is already routed, hence we jump to the label
'packet_routed', but without rcu_read_lock(). After this label,
rcu_read_unlock() is called unconditionally.

Spotted-by: Guo Fengtian <fengti...@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas...@6wind.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/ip_output.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7dde039..2cd69e3 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -320,13 +320,13 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
/* Skip all of this if the packet is already routed,
* f.e. by something like SCTP.
*/
+ rcu_read_lock();
rt = skb_rtable(skb);
if (rt != NULL)
goto packet_routed;

/* Make sure we can route this packet. */
rt = (struct rtable *)__sk_dst_check(sk, 0);
- rcu_read_lock();
inet_opt = rcu_dereference(inet->inet_opt);
if (rt == NULL) {
__be32 daddr;

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dave Kleikamp <dave.k...@oracle.com>

[ Upstream commit aabb9875d02559ab9b928cd6f259a5cc4c21a589 ]

The missing call to unregister_netdev() leaves the interface active
after the driver is unloaded by rmmod.

Signed-off-by: Dave Kleikamp <dave.k...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/sunvnet.c | 2 ++

1 file changed, 2 insertions(+)

diff --git a/drivers/net/sunvnet.c b/drivers/net/sunvnet.c
index bc74db0..b6d0348 100644
--- a/drivers/net/sunvnet.c
+++ b/drivers/net/sunvnet.c
@@ -1260,6 +1260,8 @@ static int vnet_port_remove(struct vio_dev *vdev)
dev_set_drvdata(&vdev->dev, NULL);

kfree(port);
+
+ unregister_netdev(vp->dev);
}
return 0;

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit 54d27fcb338bd9c42d1dfc5a39e18f6f9d373c2e ]

TCP md5 communications fail [1] for some devices, because sg/crypto code
assume page offsets are below PAGE_SIZE.

This was discovered using mlx4 driver [2], but I suspect loopback
might trigger the same bug now we use order-3 pages in tcp_sendmsg()

[1] Failure is giving following messages.

huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100,
exited with 00000101?

[2] mlx4 driver uses order-2 pages to allocate RX frags

Reported-by: Matt Schnall <misc...@google.com>
Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Bernhard Beck <bb...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/tcp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 6232462..fc18410 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2826,7 +2826,11 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,

for (i = 0; i < shi->nr_frags; ++i) {
const struct skb_frag_struct *f = &shi->frags[i];
- sg_set_page(&sg, f->page, f->size, f->page_offset);
+ unsigned int offset = f->page_offset;
+ struct page *page = f->page + (offset >> PAGE_SHIFT);
+
+ sg_set_page(&sg, page, f->size,
+ offset_in_page(offset));
if (crypto_hash_update(desc, &sg, f->size))
return 1;

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit 8cb3b9c3642c0263d48f31d525bcee7170eedc20 ]

The "pvc" struct has a hole after pvc.sap_family which is not cleared.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Reviewed-by: Jiri Pirko <ji...@resnulli.us>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sched/sch_atm.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index ab82f14..b022c59 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -628,6 +628,7 @@ static int atm_tc_dump_class(struct Qdisc *sch, unsigned long cl,
struct sockaddr_atmpvc pvc;
int state;

+ memset(&pvc, 0, sizeof(pvc));
pvc.sap_family = AF_ATMPVC;
pvc.sap_addr.itf = flow->vcc->dev ? flow->vcc->dev->number : -1;
pvc.sap_addr.vpi = flow->vcc->vpi;

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Bork <t...@eisfair.net>

Thomas Bork reported that commit c6203cd ("scsi: use __uX
types for headers exported to user space") caused a regression
as now he's getting this warning :

> /usr/src/linux-2.6.32-eisfair-1/usr/include/scsi/scsi_netlink.h:108:
> found __[us]{8,16,32,64} type without #include <linux/types.h>

This issue was addressed later by commit 10db4e1 ("headers:
include linux/types.h where appropriate"), so let's just pick the
relevant part from it.

Cc: Thomas Bork <t...@eisfair.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/scsi/scsi_netlink.h | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/scsi/scsi_netlink.h b/include/scsi/scsi_netlink.h
index 58ce8fe..5cb20cc 100644
--- a/include/scsi/scsi_netlink.h
+++ b/include/scsi/scsi_netlink.h
@@ -23,7 +23,7 @@
#define SCSI_NETLINK_H

#include <linux/netlink.h>
-
+#include <linux/types.h>

/*
* This file intended to be included by both kernel and user space

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

syscalls

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ]

Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.

If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.

Reported-by: mpb <mpb....@gmail.com>
Suggested-by: Eric Dumazet <eric.d...@gmail.com>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>

[wt: no ieee802154, ping nor l2tp in 2.6.32]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/raw.c | 4 +---
net/ipv4/udp.c | 7 +------
net/ipv6/raw.c | 4 +---
net/ipv6/udp.c | 5 +----
net/phonet/datagram.c | 9 ++++-----
5 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 07ab583..c50344b 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -681,9 +681,6 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
if (flags & MSG_OOB)
goto out;

- if (addr_len)
- *addr_len = sizeof(*sin);
-
if (flags & MSG_ERRQUEUE) {
err = ip_recv_error(sk, msg, len);
goto out;
@@ -711,6 +708,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
sin->sin_port = 0;
memset(&sin->sin_zero, 0, sizeof(sin->sin_zero));
+ *addr_len = sizeof(*sin);
}
if (inet->cmsg_flags)
ip_cmsg_recv(msg, skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index af559e0..80487ee 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -941,12 +941,6 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
int err;
int is_udplite = IS_UDPLITE(sk);

- /*
- * Check any passed addresses
- */
- if (addr_len)
- *addr_len = sizeof(*sin);
-
if (flags & MSG_ERRQUEUE)
return ip_recv_error(sk, msg, len);

@@ -1001,6 +995,7 @@ try_again:
sin->sin_port = udp_hdr(skb)->source;
sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
+ *addr_len = sizeof(*sin);
}
if (inet->cmsg_flags)
ip_cmsg_recv(msg, skb);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 4f24570..df75f94 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -456,9 +456,6 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
if (flags & MSG_OOB)
return -EOPNOTSUPP;

- if (addr_len)
- *addr_len=sizeof(*sin6);
-
if (flags & MSG_ERRQUEUE)
return ipv6_recv_error(sk, msg, len);

@@ -495,6 +492,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
sin6->sin6_scope_id = 0;
if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL)
sin6->sin6_scope_id = IP6CB(skb)->iif;
+ *addr_len = sizeof(*sin6);
}

sock_recv_timestamp(msg, sk, skb);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d8c0374..ce291af 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -200,9 +200,6 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
int is_udplite = IS_UDPLITE(sk);
int is_udp4;

- if (addr_len)
- *addr_len=sizeof(struct sockaddr_in6);
-
if (flags & MSG_ERRQUEUE)
return ipv6_recv_error(sk, msg, len);

@@ -273,7 +270,7 @@ try_again:
if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL)
sin6->sin6_scope_id = IP6CB(skb)->iif;
}
-
+ *addr_len = sizeof(*sin6);
}
if (is_udp4) {
if (inet->cmsg_flags)
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index ef5c75c..c88da73 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -122,9 +122,6 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
if (flags & MSG_OOB)
goto out_nofree;

- if (addr_len)
- *addr_len = sizeof(sa);
-
skb = skb_recv_datagram(sk, flags, noblock, &rval);
if (skb == NULL)
goto out_nofree;
@@ -145,8 +142,10 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,

rval = (flags & MSG_TRUNC) ? skb->len : copylen;

- if (msg->msg_name != NULL)
- memcpy(msg->msg_name, &sa, sizeof(struct sockaddr_pn));
+ if (msg->msg_name != NULL) {
+ memcpy(msg->msg_name, &sa, sizeof(sa));
+ *addr_len = sizeof(sa);
+ }

out:
skb_free_datagram(sk, skb);

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ This is a simplified -stable version of a set of upstream commits. ]

This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:

"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)

Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled. If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.

This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.

The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).

Cc: Jiri Pirko <ji...@resnulli.us>
Cc: Eric Dumazet <eric.d...@gmail.com>
Cc: David Miller <da...@davemloft.net>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
(cherry picked from commit 5124ae99ac8a8f63d0fca9b75adaef40b20678ff)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/ip_output.c | 2 +-
net/ipv6/ip6_output.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2cd69e3..faa6623 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -875,7 +875,7 @@ int ip_append_data(struct sock *sk,
skb = skb_peek_tail(&sk->sk_write_queue);

inet->cork.length += length;
- if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+ if (((length > mtu) || (skb && skb_has_frags(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->u.dst.dev->features & NETIF_F_UFO)) {
err = ip_ufo_append_data(sk, getfrag, from, length, hh_len,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5a1b5bc..6dff3d7 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1257,7 +1257,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
inet->cork.length += length;
skb = skb_peek_tail(&sk->sk_write_queue);
if (((length > mtu) ||
- (skb && skb_is_gso(skb))) &&
+ (skb && skb_has_frags(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->u.dst.dev->features & NETIF_F_UFO)) {
err = ip6_ufo_append_data(sk, getfrag, from, length,

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Liu Yu <allan...@tencent.com>

commit b9f47a3aaeab (tcp_cubic: limit delayed_ack ratio to prevent
divide error) try to prevent divide error, but there is still a little
chance that delayed_ack can reach zero. In case the param cnt get
negative value, then ratio+cnt would overflow and may happen to be zero.
As a result, min(ratio, ACK_RATIO_LIMIT) will calculate to be zero.

In some old kernels, such as 2.6.32, there is a bug that would
pass negative param, which then ultimately leads to this divide error.

commit 5b35e1e6e9c (tcp: fix tcp_trim_head() to adjust segment count
with skb MSS) fixed the negative param issue. However,
it's safe that we fix the range of delayed_ack as well,
to make sure we do not hit a divide by zero.

CC: Stephen Hemminger <shemm...@vyatta.com>
Signed-off-by: Liu Yu <allan...@tencent.com>
Signed-off-by: Eric Dumazet <edum...@google.com>
Acked-by: Neal Cardwell <ncar...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

(cherry picked from commit 0cda345d1b2201dd15591b163e3c92bad5191745)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/tcp_cubic.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 1b6b8c2..db41113 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -385,7 +385,7 @@ static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
ratio -= ca->delayed_ack >> ACK_RATIO_SHIFT;
ratio += cnt;

- ca->delayed_ack = min(ratio, ACK_RATIO_LIMIT);
+ ca->delayed_ack = clamp(ratio, 1U, ACK_RATIO_LIMIT);
}

/* Some calls are for duplicates without timetamps */

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: =?latin1?q?Salva=20Peir=F3?= <spe...@ai2.upv.es>

[ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ]

The wanxl_ioctl() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Salva Peir� <spe...@ai2.upv.es>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/wan/wanxl.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/drivers/net/wan/wanxl.c b/drivers/net/wan/wanxl.c
index daee8a0..b52b378 100644
--- a/drivers/net/wan/wanxl.c
+++ b/drivers/net/wan/wanxl.c
@@ -354,6 +354,7 @@ static int wanxl_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
ifr->ifr_settings.size = size; /* data size wanted */
return -ENOBUFS;
}
+ memset(&line, 0, sizeof(line));
line.clock_type = get_status(port)->clocking;
line.clock_rate = 0;
line.loopback = 0;

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ]

It is not allowed for an ipv6 packet to contain multiple fragmentation
headers. So discard packets which were already reassembled by
fragmentation logic and send back a parameter problem icmp.

The updates for RFC 6980 will come in later, I have to do a bit more
research here.

Cc: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/linux/ipv6.h | 1 +
net/ipv6/reassembly.c | 5 +++++
2 files changed, 6 insertions(+)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index c662efa..5bf3324 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -248,6 +248,7 @@ struct inet6_skb_parm {

#define IP6SKB_XFRM_TRANSFORMED 1
#define IP6SKB_FORWARDED 2
+#define IP6SKB_FRAGMENTED 16
};

#define IP6CB(skb) ((struct inet6_skb_parm*)((skb)->cb))
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 105de22..0c09d8e 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -503,6 +503,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
head->tstamp = fq->q.stamp;
ipv6_hdr(head)->payload_len = htons(payload_len);
IP6CB(head)->nhoff = nhoff;
+ IP6CB(head)->flags |= IP6SKB_FRAGMENTED;

/* Yes, and fold redundant checksum back. 8) */
if (head->ip_summed == CHECKSUM_COMPLETE)
@@ -537,6 +538,9 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
struct ipv6hdr *hdr = ipv6_hdr(skb);
struct net *net = dev_net(skb_dst(skb)->dev);

+ if (IP6CB(skb)->flags & IP6SKB_FRAGMENTED)
+ goto fail_hdr;
+
IP6_INC_STATS_BH(net, ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_REASMREQDS);

/* Jumbo payload inhibits frag. header */
@@ -557,6 +561,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_REASMOKS);

IP6CB(skb)->nhoff = (u8 *)fhdr - skb_network_header(skb);
+ IP6CB(skb)->flags |= IP6SKB_FRAGMENTED;
return 1;

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit 331415ff16a12147d57d5c953f3a961b7ede348b upstream

Many drivers need to validate the characteristics of their HID report
during initialization to avoid misusing the reports. This adds a common
helper to perform validation of the report exisitng, the field existing,
and the expected number of values within the field.

Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: sta...@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin....@redhat.com>
Signed-off-by: Jiri Kosina <jko...@suse.cz>

[jmm: backported to 2.6.32]
[wt: dev_err() in 2.6.32 instead of hid_err()]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/hid/hid-core.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/hid.h | 4 ++++
2 files changed, 62 insertions(+)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index a222cbb..e7e28b5 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -808,6 +808,64 @@ static __inline__ int search(__s32 *array, __s32 value, unsigned n)
return -1;
}

+static const char * const hid_report_names[] = {
+ "HID_INPUT_REPORT",
+ "HID_OUTPUT_REPORT",
+ "HID_FEATURE_REPORT",
+};
+/**
+ * hid_validate_values - validate existing device report's value indexes
+ *
+ * @device: hid device
+ * @type: which report type to examine
+ * @id: which report ID to examine (0 for first)
+ * @field_index: which report field to examine
+ * @report_counts: expected number of values
+ *
+ * Validate the number of values in a given field of a given report, after
+ * parsing.
+ */
+struct hid_report *hid_validate_values(struct hid_device *hid,
+ unsigned int type, unsigned int id,
+ unsigned int field_index,
+ unsigned int report_counts)
+{
+ struct hid_report *report;
+
+ if (type > HID_FEATURE_REPORT) {
+ dev_err(&hid->dev, "invalid HID report type %u\n", type);
+ return NULL;
+ }
+
+ if (id >= HID_MAX_IDS) {
+ dev_err(&hid->dev, "invalid HID report id %u\n", id);
+ return NULL;
+ }
+
+ /*
+ * Explicitly not using hid_get_report() here since it depends on
+ * ->numbered being checked, which may not always be the case when
+ * drivers go to access report values.
+ */
+ report = hid->report_enum[type].report_id_hash[id];
+ if (!report) {
+ dev_err(&hid->dev, "missing %s %u\n", hid_report_names[type], id);
+ return NULL;
+ }
+ if (report->maxfield <= field_index) {
+ dev_err(&hid->dev, "not enough fields in %s %u\n",
+ hid_report_names[type], id);
+ return NULL;
+ }
+ if (report->field[field_index]->report_count < report_counts) {
+ dev_err(&hid->dev, "not enough values in %s %u field %u\n",
+ hid_report_names[type], id, field_index);
+ return NULL;
+ }
+ return report;
+}
+EXPORT_SYMBOL_GPL(hid_validate_values);
+
/**
* hid_match_report - check if driver's raw_event should be called
*
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 481080d..e5db8e5 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -693,6 +693,10 @@ int hidinput_find_field(struct hid_device *hid, unsigned int type, unsigned int
void hid_output_report(struct hid_report *report, __u8 *data);
struct hid_device *hid_allocate_device(void);
int hid_parse_report(struct hid_device *hid, __u8 *start, unsigned size);
+struct hid_report *hid_validate_values(struct hid_device *hid,
+ unsigned int type, unsigned int id,
+ unsigned int field_index,
+ unsigned int report_counts);
int hid_check_keys_pressed(struct hid_device *hid);
int hid_connect(struct hid_device *hid, unsigned int connect_mask);
void hid_disconnect(struct hid_device *hid);

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

tranformation

From: "fan.du" <fan...@windriver.com>

[ Upstream commit 3868204d6b89ea373a273e760609cb08020beb1a ]

commit a553e4a6317b2cfc7659542c10fe43184ffe53da ("[PKTGEN]: IPSEC support")
tried to support IPsec ESP transport transformation for pktgen, but acctually
this doesn't work at all for two reasons(The orignal transformed packet has
bad IPv4 checksum value, as well as wrong auth value, reported by wireshark)

- After transpormation, IPv4 header total length needs update,
because encrypted payload's length is NOT same as that of plain text.

- After transformation, IPv4 checksum needs re-caculate because of payload
has been changed.

With this patch, armmed pktgen with below cofiguration, Wireshark is able to
decrypted ESP packet generated by pktgen without any IPv4 checksum error or
auth value error.

pgset "flag IPSEC"
pgset "flows 1"

Signed-off-by: Fan Du <fan...@windriver.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/core/pktgen.c | 7 +++++++

1 file changed, 7 insertions(+)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 6a993b1..f776b99 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2495,6 +2495,8 @@ static int process_ipsec(struct pktgen_dev *pkt_dev,
if (x) {
int ret;
__u8 *eth;
+ struct iphdr *iph;
+
nhead = x->props.header_len - skb_headroom(skb);
if (nhead > 0) {
ret = pskb_expand_head(skb, nhead, 0, GFP_ATOMIC);
@@ -2517,6 +2519,11 @@ static int process_ipsec(struct pktgen_dev *pkt_dev,
eth = (__u8 *) skb_push(skb, ETH_HLEN);
memcpy(eth, pkt_dev->hh, 12);
*(u16 *) &eth[12] = protocol;
+
+ /* Update IPv4 header len as well as checksum value */
+ iph = ip_hdr(skb);
+ iph->tot_len = htons(skb->len - ETH_HLEN);
+ ip_send_check(iph);
}
}
return 1;

Willy Tarreau

unread,

May 11, 2014, 9:50:05 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Ying Xue <ying...@windriver.com>

[ Upstream commit b5de4a22f157ca345cdb3575207bf46402414bc1 ]

init_card() calls dev_get_by_name() to get a network deceive. But it
doesn't decrease network device reference count after the device is
used.

Signed-off-by: Ying Xue <ying...@windriver.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/atm/idt77252.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index e33ae00..adbaed5 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -3557,6 +3557,7 @@ init_card(struct atm_dev *dev)
if (tmp) {
memcpy(card->atmdev->esi, tmp->dev_addr, 6);

+ dev_put(tmp);
printk("%s: ESI %02x:%02x:%02x:%02x:%02x:%02x\n",
card->name, card->atmdev->esi[0], card->atmdev->esi[1],
card->atmdev->esi[2], card->atmdev->esi[3],

Willy Tarreau

unread,

May 11, 2014, 9:50:06 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Zhu Yanjun <Yanju...@windriver.com>

2.6.x kernels require a similar logic change as commit e1653c3e
[gianfar: do vlan cleanup] and commit 51b8cbfc
[gianfar: fix bug caused by e1653c3e] introduces for newer kernels.

Since there is something wrong with tx vlan of gianfar nic driver,
in kernel(3.1+), tx vlan is disabled. But in kernel 2.6.x, tx vlan
is still enabled. Thus,gianfar nic driver can not support vlan
packets and non-vlan packets at the same time.

Signed-off-by: Zhu Yanjun <Yanju...@windriver.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/gianfar.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 934a28f..8aa2cf6 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -365,7 +365,7 @@ static int gfar_probe(struct of_device *ofdev,
priv->vlgrp = NULL;

if (priv->device_flags & FSL_GIANFAR_DEV_HAS_VLAN)
- dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
+ dev->features |= NETIF_F_HW_VLAN_RX;

if (priv->device_flags & FSL_GIANFAR_DEV_HAS_EXTENDED_HASH) {
priv->extended_hash = 1;
@@ -1451,12 +1451,6 @@ static void gfar_vlan_rx_register(struct net_device *dev,
priv->vlgrp = grp;

if (grp) {
- /* Enable VLAN tag insertion */
- tempval = gfar_read(&priv->regs->tctrl);
- tempval |= TCTRL_VLINS;
-
- gfar_write(&priv->regs->tctrl, tempval);
-
/* Enable VLAN tag extraction */
tempval = gfar_read(&priv->regs->rctrl);
tempval |= (RCTRL_VLEX | RCTRL_PRSDEP_INIT);

Willy Tarreau

unread,

May 11, 2014, 10:00:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingti...@huawei.com>

[ Upstream commit 2c8a01894a12665d8059fad8f0a293c98a264121 ]

We rename the dummy in modprobe.conf like this:

install dummy0 /sbin/modprobe -o dummy0 --ignore-install dummy
install dummy1 /sbin/modprobe -o dummy1 --ignore-install dummy

We got oops when we run the command:

modprobe dummy0
modprobe dummy1

------------[ cut here ]------------

[ 3302.187584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 3302.195411] IP: [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.201844] PGD 85c94a067 PUD 8517bd067 PMD 0
[ 3302.206305] Oops: 0002 [#1] SMP
[ 3302.299737] task: ffff88105ccea300 ti: ffff880eba4a0000 task.ti: ffff880eba4a0000
[ 3302.307186] RIP: 0010:[<ffffffff813fe62a>] [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.316044] RSP: 0018:ffff880eba4a1dd8 EFLAGS: 00010246
[ 3302.321332] RAX: 0000000000000000 RBX: ffffffff81a9d738 RCX: 0000000000000002
[ 3302.328436] RDX: 0000000000000000 RSI: ffffffffa04d602c RDI: ffff880eba4a1dd8
[ 3302.335541] RBP: ffff880eba4a1e18 R08: dead000000200200 R09: dead000000100100
[ 3302.342644] R10: 0000000000000080 R11: 0000000000000003 R12: ffffffff81a9d788
[ 3302.349748] R13: ffffffffa04d7020 R14: ffffffff81a9d670 R15: ffff880eba4a1dd8
[ 3302.364910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3302.370630] CR2: 0000000000000008 CR3: 000000085e15e000 CR4: 00000000000427e0
[ 3302.377734] DR0: 0000000000000003 DR1: 00000000000000b0 DR2: 0000000000000001
[ 3302.384838] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3302.391940] Stack:
[ 3302.393944] ffff880eba4a1dd8 ffff880eba4a1dd8 ffff880eba4a1e18 ffffffffa04d70c0
[ 3302.401350] 00000000ffffffef ffffffffa01a8000 0000000000000000 ffffffff816111c8
[ 3302.408758] ffff880eba4a1e48 ffffffffa01a80be ffff880eba4a1e48 ffffffffa04d70c0
[ 3302.416164] Call Trace:
[ 3302.418605] [<ffffffffa01a8000>] ? 0xffffffffa01a7fff
[ 3302.423727] [<ffffffffa01a80be>] dummy_init_module+0xbe/0x1000 [dummy0]
[ 3302.430405] [<ffffffffa01a8000>] ? 0xffffffffa01a7fff
[ 3302.435535] [<ffffffff81000322>] do_one_initcall+0x152/0x1b0
[ 3302.441263] [<ffffffff810ab24b>] do_init_module+0x7b/0x200
[ 3302.446824] [<ffffffff810ad3d2>] load_module+0x4e2/0x530
[ 3302.452215] [<ffffffff8127ae40>] ? ddebug_dyndbg_boot_param_cb+0x60/0x60
[ 3302.458979] [<ffffffff810ad5f1>] SyS_init_module+0xd1/0x130
[ 3302.464627] [<ffffffff814b9652>] system_call_fastpath+0x16/0x1b
[ 3302.490090] RIP [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.496607] RSP <ffff880eba4a1dd8>
[ 3302.500084] CR2: 0000000000000008
[ 3302.503466] ---[ end trace 8342d49cd49f78ed ]---

The reason is that when loading dummy, if __rtnl_link_register() return failed,
the init_module should return and avoid take the wrong path.

Signed-off-by: Tan Xiaojun <tanxi...@huawei.com>
Signed-off-by: Ding Tianhong <dingti...@huawei.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/dummy.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 37dcfdc..9d9de18 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -137,11 +137,15 @@ static int __init dummy_init_module(void)

rtnl_lock();
err = __rtnl_link_register(&dummy_link_ops);
+ if (err < 0)
+ goto out;

for (i = 0; i < numdummies && !err; i++)
err = dummy_init_one();
if (err < 0)
__rtnl_link_unregister(&dummy_link_ops);
+
+out:
rtnl_unlock();

return err;

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ]

Because of the max_addresses check attackers were able to disable privacy
extensions on an interface by creating enough autoconfigured addresses:

<http://seclists.org/oss-sec/2012/q4/292>

But the check is not actually needed: max_addresses protects the
kernel to install too many ipv6 addresses on an interface and guards
addrconf_prefix_rcv to install further addresses as soon as this limit
is reached. We only generate temporary addresses in direct response of
a new address showing up. As soon as we filled up the maximum number of
addresses of an interface, we stop installing more addresses and thus
also stop generating more temp addresses.

Even if the attacker tries to generate a lot of temporary addresses
by announcing a prefix and removing it again (lifetime == 0) we won't
install more temp addresses, because the temporary addresses do count
to the maximum number of addresses, thus we would stop installing new
autoconfigured addresses when the limit is reached.

This patch fixes CVE-2013-0343 (but other layer-2 attacks are still
possible).

Thanks to Ding Tianhong to bring this topic up again.

Cc: Ding Tianhong <dingti...@huawei.com>
Cc: George Kargiotakis <kar...@void.gr>
Cc: P J P <ppa...@redhat.com>

Cc: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Acked-by: Ding Tianhong <dingti...@huawei.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/addrconf.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8ac3d09..e8c4fd9 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -920,12 +920,10 @@ retry:
if (ifp->flags & IFA_F_OPTIMISTIC)
addr_flags |= IFA_F_OPTIMISTIC;

- ift = !max_addresses ||
- ipv6_count_addresses(idev) < max_addresses ?
- ipv6_add_addr(idev, &addr, tmp_plen,
- ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
- addr_flags) : NULL;
- if (!ift || IS_ERR(ift)) {
+ ift = ipv6_add_addr(idev, &addr, tmp_plen,
+ ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
+ addr_flags);
+ if (IS_ERR(ift)) {
in6_ifa_put(ifp);
in6_dev_put(idev);
printk(KERN_INFO

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ]

ip4_datagram_connect() being called from process context,
it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
otherwise we can deadlock on 32bit arches, or get corruptions of
SNMP counters.

Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Signed-off-by: Eric Dumazet <edum...@google.com>
Reported-by: Dave Jones <da...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/datagram.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index 5e6c5a0..30aeb26 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -52,7 +52,7 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
inet->sport, usin->sin_port, sk, 1);
if (err) {
if (err == -ENETUNREACH)
- IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
+ IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
return err;

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha...@oracle.com>

[ Upstream commit c2349758acf1874e4c2b93fe41d072336f1a31d0 ]

Binding might result in a NULL device, which is dereferenced
causing this BUG:

[ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097
4
[ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0
[ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1317.264179] Dumping ftrace buffer:
[ 1317.264774] (ftrace buffer empty)
[ 1317.265220] Modules linked in:
[ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4-
next-20131218-sasha-00013-g2cebb9b-dirty #4159
[ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000
[ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+
0x82/0x110
[ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246
[ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000
[ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286
[ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000
[ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031
[ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000
0000
[ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0
[ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
[ 1317.270230] Stack:
[ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000
[ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160
[ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280
[ 1317.270230] Call Trace:
[ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0
[ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0
[ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0
[ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0
[ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0
[ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10
[ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290
[ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10
[ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2
[ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00
89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7
4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02
[ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.270230] RSP <ffff8803cd31bdf8>
[ 1317.270230] CR2: 0000000000000974

Signed-off-by: Sasha Levin <sasha...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/rds/ib.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index 536ebe5..5018f3d 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -235,7 +235,8 @@ static int rds_ib_laddr_check(__be32 addr)
ret = rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
/* due to this, we will claim to support iWARP devices unless we
check node_type. */
- if (ret || cm_id->device->node_type != RDMA_NODE_IB_CA)
+ if (ret || !cm_id->device ||
+ cm_id->device->node_type != RDMA_NODE_IB_CA)
ret = -EADDRNOTAVAIL;

rdsdebug("addr %pI4 ret %d node type %d\n",

Willy Tarreau

unread,

May 11, 2014, 10:00:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingti...@huawei.com>

[ Upstream commit f2966cd5691058b8674a20766525bedeaea9cbcf ]

If __rtnl_link_register() return faild when loading the ifb, it will
take the wrong path and get oops, so fix it just like dummy.

Signed-off-by: Ding Tianhong <dingti...@huawei.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/ifb.c | 4 ++++

1 file changed, 4 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cfd5510..509c6f5 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -269,6 +269,8 @@ static int __init ifb_init_module(void)

rtnl_lock();
err = __rtnl_link_register(&ifb_link_ops);

+ if (err < 0)
+ goto out;

for (i = 0; i < numifbs && !err; i++) {
err = ifb_init_one(i);
@@ -276,6 +278,8 @@ static int __init ifb_init_module(void)
}
if (err)
__rtnl_link_unregister(&ifb_link_ops);

+
+out:
rtnl_unlock();

return err;

Willy Tarreau

unread,

May 11, 2014, 10:00:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Maciej Zenczykowski <ma...@google.com>

[ Upstream commit 946c032e5a53992ea45e062ecb08670ba39b99e3 ]

ip rules with iif/oif references do not update:
(detach/attach) across interface renames.

Signed-off-by: Maciej Zenczykowski <ma...@google.com>
CC: Willem de Bruijn <wil...@google.com>
CC: Eric Dumazet <edum...@google.com>
CC: Chris Davis <chr...@google.com>
CC: Carlo Contavalli <ccont...@google.com>

Google-Bug-Id: 12936021
Acked-by: Eric Dumazet <edum...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/core/fib_rules.c | 7 +++++++

1 file changed, 7 insertions(+)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index de9eac9..06bdee7 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -633,6 +633,13 @@ static int fib_rules_event(struct notifier_block *this, unsigned long event,
attach_rules(&ops->rules_list, dev);
break;

+ case NETDEV_CHANGENAME:
+ list_for_each_entry(ops, &net->rules_ops, list) {
+ detach_rules(&ops->rules_list, dev);
+ attach_rules(&ops->rules_list, dev);
+ }
+ break;
+
case NETDEV_UNREGISTER:
list_for_each_entry(ops, &net->rules_ops, list)
detach_rules(&ops->rules_list, dev);

Willy Tarreau

unread,

May 11, 2014, 10:00:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ]

TCP stack should make sure it owns skbs before mangling them.

We had various crashes using bnx2x, and it turned out gso_size
was cleared right before bnx2x driver was populating TC descriptor
of the _previous_ packet send. TCP stack can sometime retransmit
packets that are still in Qdisc.

Of course we could make bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.

We have identified two points where skb_unclone() was needed.

This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.

Kudos to Neal for finding the root cause of this bug. Its visible
using small MSS.

Signed-off-by: Eric Dumazet <edum...@google.com>
Signed-off-by: Neal Cardwell <ncar...@google.com>
Cc: Yuchung Cheng <ych...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/linux/skbuff.h | 10 ++++++++++
net/ipv4/tcp_output.c | 9 ++++++---
2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4e647bb..ae77862 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -641,6 +641,16 @@ static inline int skb_cloned(const struct sk_buff *skb)
(atomic_read(&skb_shinfo(skb)->dataref) & SKB_DATAREF_MASK) != 1;
}

+static inline int skb_unclone(struct sk_buff *skb, gfp_t pri)
+{
+ might_sleep_if(pri & __GFP_WAIT);
+
+ if (skb_cloned(skb))
+ return pskb_expand_head(skb, 0, 0, pri);
+
+ return 0;
+}
+
/**
* skb_header_cloned - is the header a clone
* @skb: buffer to check
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 38a23e4..49da29e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -744,6 +744,9 @@ static void tcp_queue_skb(struct sock *sk, struct sk_buff *skb)
static void tcp_set_skb_tso_segs(struct sock *sk, struct sk_buff *skb,
unsigned int mss_now)
{
+ /* Make sure we own this skb before messing gso_size/gso_segs */
+ WARN_ON_ONCE(skb_cloned(skb));
+
if (skb->len <= mss_now || !sk_can_gso(sk) ||
skb->ip_summed == CHECKSUM_NONE) {
/* Avoid the costly divide in the normal
@@ -824,9 +827,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
if (nsize < 0)
nsize = 0;

- if (skb_cloned(skb) &&
- skb_is_nonlinear(skb) &&
- pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
+ if (skb_unclone(skb, GFP_ATOMIC))
return -ENOMEM;

/* Get a new skb... force flag on. */
@@ -1932,6 +1933,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
int oldpcount = tcp_skb_pcount(skb);

if (unlikely(oldpcount > 1)) {
+ if (skb_unclone(skb, GFP_ATOMIC))
+ return -ENOMEM;
tcp_init_tso_segs(sk, skb, cur_mss);
tcp_adjust_pcount(sk, skb, oldpcount - tcp_skb_pcount(skb));

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Michael Chan <mc...@broadcom.com>

[ Upstream commit d7b95315cc7f441418845a165ee56df723941487 ]

Redefine the RXD_ERR_MASK to include only relevant error bits. This fixes
a customer reported issue of randomly dropping packets on the 5719.

Signed-off-by: Michael Chan <mc...@broadcom.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/tg3.c | 3 +--
drivers/net/tg3.h | 6 +++++-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 56648b4..17e8abe 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4557,8 +4557,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)

work_mask |= opaque_key;

- if ((desc->err_vlan & RXD_ERR_MASK) != 0 &&
- (desc->err_vlan != RXD_ERR_ODD_NIBBLE_RCVD_MII)) {
+ if (desc->err_vlan & RXD_ERR_MASK) {
drop_it:
tg3_recycle_rx(tnapi, opaque_key,
desc_idx, *post_ptr);
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 529f55a..593f8c6 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2219,7 +2219,11 @@ struct tg3_rx_buffer_desc {
#define RXD_ERR_TOO_SMALL 0x00400000
#define RXD_ERR_NO_RESOURCES 0x00800000
#define RXD_ERR_HUGE_FRAME 0x01000000
-#define RXD_ERR_MASK 0xffff0000
+
+#define RXD_ERR_MASK (RXD_ERR_BAD_CRC | RXD_ERR_COLLISION | \
+ RXD_ERR_LINK_LOST | RXD_ERR_PHY_DECODE | \
+ RXD_ERR_MAC_ABRT | RXD_ERR_TOO_SMALL | \
+ RXD_ERR_NO_RESOURCES | RXD_ERR_HUGE_FRAME)

u32 reserved;
u32 opaque;

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit be67b68d52fa28b9b721c47bb42068f0c1214855 upstream

Defensively check that the field to be worked on is not NULL.

Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: sta...@kernel.org
Signed-off-by: Jiri Kosina <jko...@suse.cz>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/hid/hid-core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index e40e3c4..a222cbb 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -983,7 +983,12 @@ EXPORT_SYMBOL_GPL(hid_output_report);

int hid_set_field(struct hid_field *field, unsigned offset, __s32 value)
{
- unsigned size = field->report_size;
+ unsigned size;
+
+ if (!field)
+ return -1;
+
+ size = field->report_size;

hid_dump_input(field->report->device, field->usage + offset, value);

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Jason Wang <jaso...@redhat.com>

[ Upstream commit 0e7ede80d929ff0f830c44a543daa1acd590c749 ]

We should alloc big buffers also when guest can receive UFO
packets to let the big packets fit into guest rx buffer.

Fixes 5c5167515d80f78f6bb538492c423adcae31ad65
(virtio-net: Allow UFO feature to be set and advertised.)

Cc: Rusty Russell <ru...@rustcorp.com.au>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Sridhar Samudrala <s...@us.ibm.com>
Signed-off-by: Jason Wang <jaso...@redhat.com>
Acked-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Rusty Russell <ru...@rustcorp.com.au>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/virtio_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bf6d850..97a56f0 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -904,7 +904,8 @@ static int virtnet_probe(struct virtio_device *vdev)
/* If we can receive ANY GSO packets, we must allocate large ones. */
if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4)
|| virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)
- || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN))
+ || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN)
+ || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UFO))
vi->big_packets = true;

if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <ncar...@google.com>

This commit fixes tcp_trim_head() to recalculate the number of
segments in the skb with the skb's existing MSS, so trimming the head
causes the skb segment count to be monotonically non-increasing - it
should stay the same or go down, but not increase.

Previously tcp_trim_head() used the current MSS of the connection. But
if there was a decrease in MSS between original transmission and ACK
(e.g. due to PMTUD), this could cause tcp_trim_head() to
counter-intuitively increase the segment count when trimming bytes off
the head of an skb. This violated assumptions in tcp_tso_acked() that
tcp_trim_head() only decreases the packet count, so that packets_acked
in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
pass u32 pkts_acked values as large as 0xffffffff to
ca_ops->pkts_acked().

As an aside, if tcp_trim_head() had really wanted the skb to reflect
the current MSS, it should have called tcp_set_skb_tso_segs()
unconditionally, since a decrease in MSS would mean that a
single-packet skb should now be sliced into multiple segments.

Signed-off-by: Neal Cardwell <ncar...@google.com>
Acked-by: Nandita Dukkipati <nand...@google.com>
Acked-by: Ilpo J�rvinen <ilpo.j...@helsinki.fi>

Signed-off-by: David S. Miller <da...@davemloft.net>

(cherry picked from commit 5b35e1e6e9ca651e6b291c96d1106043c9af314a)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/tcp_output.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 49da29e..0fc0a73 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -949,11 +949,9 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
sk_mem_uncharge(sk, len);
sock_set_flag(sk, SOCK_QUEUE_SHRUNK);

- /* Any change of skb->len requires recalculation of tso
- * factor and mss.
- */
+ /* Any change of skb->len requires recalculation of tso factor. */
if (tcp_skb_pcount(skb) > 1)
- tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
+ tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));

return 0;

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <nik...@redhat.com>

[ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ]

This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to division by zero as miimon can
be set to 0 while either updelay/downdelay are being set and thus miss the
zero check in the beginning, the zero div happens because updelay/downdelay
are stored as new_value / bond->params.miimon. Use rtnl to synchronize with
miimon setting.

CC: Jay Vosburgh <fu...@us.ibm.com>
CC: Andy Gospodarek <an...@greyhouse.net>
CC: Veaceslav Falico <vfa...@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nik...@redhat.com>
Acked-by: Veaceslav Falico <vfa...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/bonding/bond_sysfs.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 8762a27..3666a9a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -755,6 +755,8 @@ static ssize_t bonding_store_downdelay(struct device *d,
int new_value, ret = count;
struct bonding *bond = to_bond(d);

+ if (!rtnl_trylock())
+ return restart_syscall();
if (!(bond->params.miimon)) {
pr_err(DRV_NAME
": %s: Unable to set down delay as MII monitoring is disabled\n",
@@ -795,6 +797,7 @@ static ssize_t bonding_store_downdelay(struct device *d,
}

out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(downdelay, S_IRUGO | S_IWUSR,
@@ -817,6 +820,8 @@ static ssize_t bonding_store_updelay(struct device *d,
int new_value, ret = count;
struct bonding *bond = to_bond(d);

+ if (!rtnl_trylock())
+ return restart_syscall();
if (!(bond->params.miimon)) {
pr_err(DRV_NAME
": %s: Unable to set up delay as MII monitoring is disabled\n",
@@ -856,6 +861,7 @@ static ssize_t bonding_store_updelay(struct device *d,
}

out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(updelay, S_IRUGO | S_IWUSR,

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpat...@redhat.com>

commit e20e1d0ac02308e2211306fc67abcd0b2668fb8b upstream

I found out that a system with k6-3+ processor is unstable during network
server load. The system locks up or the network card stops receiving. The
reason for the instability is the CPU frequency scaling.

During frequency transition the processor is in "EPM Stop Grant" state.
The documentation says that the processor doesn't respond to inquiry
requests in this state. Consequently, coherency of processor caches and
bus master devices is not maintained, causing the system instability.

This patch flushes the cache during frequency transition. It fixes the
instability.

Other minor changes:
* u64 invalue changed to unsigned long because the variable is 32-bit
* move the logic to set the multiplier to a separate function
powernow_k6_set_cpu_multiplier
* preserve lower 5 bits of the powernow port instead of 4 (the voltage
field has 5 bits)
* mask interrupts when reading the multiplier, so that the port is not
open during other activity (running other kernel code with the port open
shouldn't cause any misbehavior, but we should better be safe and keep
the port closed)

This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.

Signed-off-by: Mikulas Patocka <mpat...@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 56 +++++++++++++++++++++----------
1 file changed, 39 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index cb01dac..9f80a34 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -44,23 +44,58 @@ static struct cpufreq_frequency_table clock_ratio[] = {
/**
* powernow_k6_get_cpu_multiplier - returns the current FSB multiplier
*
- * Returns the current setting of the frequency multiplier. Core clock
+ * Returns the current setting of the frequency multiplier. Core clock
* speed is frequency of the Front-Side Bus multiplied with this value.
*/
static int powernow_k6_get_cpu_multiplier(void)
{
- u64 invalue = 0;
+ unsigned long invalue = 0;
u32 msrval;

+ local_irq_disable();
+
msrval = POWERNOW_IOPORT + 0x1;
wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
invalue = inl(POWERNOW_IOPORT + 0x8);
msrval = POWERNOW_IOPORT + 0x0;
wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */

+ local_irq_enable();
+
return clock_ratio[(invalue >> 5)&7].index;
}

+static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
+{
+ unsigned long outvalue, invalue;
+ unsigned long msrval;
+ unsigned long cr0;
+
+ /* we now need to transform best_i to the BVC format, see AMD#23446 */
+
+ /*
+ * The processor doesn't respond to inquiry cycles while changing the
+ * frequency, so we must disable cache.
+ */
+ local_irq_disable();
+ cr0 = read_cr0();
+ write_cr0(cr0 | X86_CR0_CD);
+ wbinvd();
+
+ outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
+
+ msrval = POWERNOW_IOPORT + 0x1;
+ wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
+ invalue = inl(POWERNOW_IOPORT + 0x8);
+ invalue = invalue & 0x1f;
+ outvalue = outvalue | invalue;
+ outl(outvalue, (POWERNOW_IOPORT + 0x8));
+ msrval = POWERNOW_IOPORT + 0x0;
+ wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
+
+ write_cr0(cr0);
+ local_irq_enable();
+}

/**
* powernow_k6_set_state - set the PowerNow! multiplier
@@ -70,8 +105,6 @@ static int powernow_k6_get_cpu_multiplier(void)
*/
static void powernow_k6_set_state(unsigned int best_i)
{
- unsigned long outvalue = 0, invalue = 0;
- unsigned long msrval;
struct cpufreq_freqs freqs;

if (clock_ratio[best_i].index > max_multiplier) {
@@ -85,18 +118,7 @@ static void powernow_k6_set_state(unsigned int best_i)

cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);

- /* we now need to transform best_i to the BVC format, see AMD#23446 */
-
- outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
-
- msrval = POWERNOW_IOPORT + 0x1;
- wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
- invalue = inl(POWERNOW_IOPORT + 0x8);
- invalue = invalue & 0xf;
- outvalue = outvalue | invalue;
- outl(outvalue , (POWERNOW_IOPORT + 0x8));
- msrval = POWERNOW_IOPORT + 0x0;
- wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
+ powernow_k6_set_cpu_multiplier(best_i);

cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);

@@ -164,7 +186,7 @@ static int powernow_k6_cpu_init(struct cpufreq_policy *policy)
}

/* cpuinfo and default policy values */
- policy->cpuinfo.transition_latency = 200000;
+ policy->cpuinfo.transition_latency = 500000;
policy->cur = busfreq * max_multiplier;

result = cpufreq_frequency_table_cpuinfo(policy, clock_ratio);

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Andreas Henriksson <and...@fatal.se>

[ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]

When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.

Originally reported at: http://bugs.debian.org/724783

Reported-by: Nicolas HICHER <nhi...@avencall.com>
Signed-off-by: Andreas Henriksson <and...@fatal.se>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/core/fib_rules.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index bd30938..de9eac9 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -381,7 +381,8 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
if (frh->action && (frh->action != rule->action))
continue;

- if (frh->table && (frh_get_table(frh, tb) != rule->table))
+ if (frh_get_table(frh, tb) &&
+ (frh_get_table(frh, tb) != rule->table))
continue;

if (tb[FRA_PRIORITY] &&

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyas...@redhat.com>

[ Upstream commit d2615bf450694c1302d86b9cc8a8958edfe4c3a4 ]

The following commit:
b6c40d68ff6498b7f63ddf97cf0aa818d748dee7
net: only invoke dev->change_rx_flags when device is UP

tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.

A later commit:
deede2fabe24e00bd7e246eb81cd5767dc6fcfc7
vlan: Don't propagate flag changes on down interfaces

fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.

The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices. A simple examle of the scenario is the
following:

eth0----> bond0 ----> bridge0 ---> vlan50

If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge. As a
result, packets with vlan50 will be dropped by the interface.

The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation. As a result we can remove the generic solution
introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave
it to the individual devices to decide whether they will block
flag propagation or not.

Reported-by: Stefan Priebe <s.pr...@profihost.ag>
Suggested-by: Veaceslav Falico <vfa...@redhat.com>
Signed-off-by: Vlad Yasevich <vyas...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/core/dev.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d775563..d250444 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3388,7 +3388,7 @@ static void dev_change_rx_flags(struct net_device *dev, int flags)
{
const struct net_device_ops *ops = dev->netdev_ops;

- if ((dev->flags & IFF_UP) && ops->ndo_change_rx_flags)
+ if (ops->ndo_change_rx_flags)
ops->ndo_change_rx_flags(dev, flags);

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Krause <min...@googlemail.com>

[ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ]

The current code tests the length of the whole netlink message to be
at least as long to fit a cn_msg. This is wrong as nlmsg_len includes
the length of the netlink message header. Use nlmsg_len() instead to
fix this "off-by-NLMSG_HDRLEN" size check.

Cc: sta...@vger.kernel.org
Signed-off-by: Mathias Krause <min...@googlemail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/connector/connector.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 537c29a..980412b 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -177,17 +177,18 @@ static int cn_call_callback(struct sk_buff *skb)
static void cn_rx_skb(struct sk_buff *__skb)
{
struct nlmsghdr *nlh;
- int err;
struct sk_buff *skb;
+ int len, err;

skb = skb_get(__skb);

if (skb->len >= NLMSG_SPACE(0)) {
nlh = nlmsg_hdr(skb);
+ len = nlmsg_len(nlh);

- if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
+ if (len < (int)sizeof(struct cn_msg) ||
skb->len < nlh->nlmsg_len ||
- nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
+ len > CONNECTOR_MAX_MSG_SIZE) {
kfree_skb(skb);
return;

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>

[ Upstream commit 77bc6bed7121936bb2e019a8c336075f4c8eef62 ]

Return -EINVAL unless all of user-given strings are correctly
NUL-terminated.

Signed-off-by: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/isdn/isdnloop/isdnloop.c | 6 ++++++

1 file changed, 6 insertions(+)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index bf4168b..4267d48 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -1071,6 +1071,12 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
return -EBUSY;
if (copy_from_user((char *) &sdef, (char *) sdefp, sizeof(sdef)))
return -EFAULT;
+
+ for (i = 0; i < 3; i++) {
+ if (!memchr(sdef.num[i], 0, sizeof(sdef.num[i])))
+ return -EINVAL;
+ }
+
spin_lock_irqsave(&card->isdnloop_lock, flags);
switch (sdef.ptype) {
case ISDN_PTYPE_EURO:

Willy Tarreau

unread,

May 11, 2014, 10:00:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Andy Honig <aho...@google.com>

commit 338c7dbadd2671189cec7faf64c84d01071b3f96 upstream

In multiple functions the vcpu_id is used as an offset into a bitfield. Ag
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory. This could be used to elevate priveges in the
kernel. This patch verifies that the vcpu_id provided is less than 255.
The api documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.

Reported-by: Andrew Honig <aho...@google.com>
Cc: sta...@vger.kernel.org
Signed-off-by: Andrew Honig <aho...@google.com>
Signed-off-by: Paolo Bonzini <pbon...@redhat.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

virt/kvm/kvm_main.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 82b6fdc..3b9443b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1846,6 +1846,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
int r;
struct kvm_vcpu *vcpu, *v;

+ if (id >= KVM_MAX_VCPUS)
+ return -EINVAL;
+
vcpu = kvm_arch_vcpu_create(kvm, id);
if (IS_ERR(vcpu))
return PTR_ERR(vcpu);

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit 2dc85bf323515e59e15dfa858d1472bb25cad0fe ]

uaddr->sa_data is exactly of size 14, which is hard-coded here and
passed as a size argument to strncpy(). A device name can be of size
IFNAMSIZ (== 16), meaning we might leave the destination string
unterminated. Thus, use strlcpy() and also sizeof() while we're
at it. We need to memset the data area beforehand, since strlcpy
does not padd the remaining buffer with zeroes for user space, so
that we do not possibly leak anything.

Signed-off-by: Daniel Borkmann <dbor...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/packet/af_packet.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 728c080..f084e01 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1525,12 +1525,12 @@ static int packet_getname_spkt(struct socket *sock, struct sockaddr *uaddr,
return -EOPNOTSUPP;

uaddr->sa_family = AF_PACKET;
+ memset(uaddr->sa_data, 0, sizeof(uaddr->sa_data));
dev = dev_get_by_index(sock_net(sk), pkt_sk(sk)->ifindex);
if (dev) {
- strncpy(uaddr->sa_data, dev->name, 14);
+ strlcpy(uaddr->sa_data, dev->name, sizeof(uaddr->sa_data));
dev_put(dev);
- } else
- memset(uaddr->sa_data, 0, 14);
+ }
*uaddr_len = sizeof(*uaddr);

return 0;

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Jonathan Salwan <jonatha...@gmail.com>

commit 542db01579fbb7ea7d1f7bb9ddcef1559df660b2 upstream

In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory
area with kmalloc in line 2885.

2885 cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
2886 if (cgc->buffer == NULL)
2887 return -ENOMEM;

In line 2908 we can find the copy_to_user function:

2908 if (!ret && copy_to_user(arg, cgc->buffer, blocksize))

The cgc->buffer is never cleaned and initialized before this function.
If ret = 0 with the previous basic block, it's possible to display some
memory bytes in kernel space from userspace.

When we read a block from the disk it normally fills the ->buffer but if
the drive is malfunctioning there is a chance that it would only be
partially filled. The result is an leak information to userspace.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Cc: Jens Axboe <ax...@kernel.dk>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/cdrom/cdrom.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index a4592ec..71a78dc 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2822,7 +2822,7 @@ static noinline int mmc_ioctl_cdrom_read_data(struct cdrom_device_info *cdi,
if (lba < 0)
return -EINVAL;

- cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
+ cgc->buffer = kzalloc(blocksize, GFP_KERNEL);
if (cgc->buffer == NULL)
return -ENOMEM;

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Salam Noureddine <noure...@aristanetworks.com>

[ Upstream commit e2401654dd0f5f3fb7a8d80dad9554d73d7ca394 ]

It is possible for the timer handlers to run after the call to
ip_mc_down so use in_dev_put instead of __in_dev_put in the handler
function in order to do proper cleanup when the refcnt reaches 0.
Otherwise, the refcnt can reach zero without the in_device being
destroyed and we end up leaking a reference to the net_device and
see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.

Signed-off-by: Salam Noureddine <noure...@aristanetworks.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/igmp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 169da93..c07be7c 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -697,7 +697,7 @@ static void igmp_gq_timer_expire(unsigned long data)

in_dev->mr_gq_running = 0;
igmpv3_send_report(in_dev, NULL);
- __in_dev_put(in_dev);
+ in_dev_put(in_dev);
}

static void igmp_ifc_timer_expire(unsigned long data)
@@ -709,7 +709,7 @@ static void igmp_ifc_timer_expire(unsigned long data)
in_dev->mr_ifc_count--;
igmp_ifc_start_timer(in_dev, IGMP_Unsolicited_Report_Interval);
}
- __in_dev_put(in_dev);
+ in_dev_put(in_dev);
}

static void igmp_ifc_event(struct in_device *in_dev)

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingti...@huawei.com>

[ Upstream commit 440d57bc5ff55ec1efb3efc9cbe9420b4bbdfefa ]

According to the commit 16b0dc29c1af9df341428f4c49ada4f626258082
(dummy: fix rcu_sched self-detected stalls)

Eric Dumazet fix the problem in dummy, but the ifb will occur the
same problem like the dummy modules.

Trying to "modprobe ifb numifbs=30000" triggers :

INFO: rcu_sched self-detected stall on CPU

After this splat, RTNL is locked and reboot is needed.

We must call cond_resched() to avoid this, even holding RTNL.

Signed-off-by: Ding Tianhong <dingti...@huawei.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

[wt: 2.6.32: cond_resched() needs linux/sched.h]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/ifb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 030913f..cfd5510 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -33,6 +33,7 @@
#include <linux/etherdevice.h>
#include <linux/init.h>
#include <linux/moduleparam.h>
+#include <linux/sched.h>
#include <net/pkt_sched.h>
#include <net/net_namespace.h>

@@ -269,8 +270,10 @@ static int __init ifb_init_module(void)
rtnl_lock();
err = __rtnl_link_register(&ifb_link_ops);

- for (i = 0; i < numifbs && !err; i++)
+ for (i = 0; i < numifbs && !err; i++) {
err = ifb_init_one(i);
+ cond_resched();
+ }
if (err)
__rtnl_link_unregister(&ifb_link_ops);
rtnl_unlock();

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit 201f99f170df14ba52ea4c52847779042b7a623b upstream

We don't cap the size of buffer from the user so we could write past the
end of the array here. Only root can write to this file.

Reported-by: Nico Golde <ni...@ngolde.de>
Reported-by: Fabian Yamaguchi <fa...@goesec.de>
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Cc: sta...@kernel.org
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/um/kernel/exitcode.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/um/kernel/exitcode.c b/arch/um/kernel/exitcode.c
index 6540d2c..ce057af 100644
--- a/arch/um/kernel/exitcode.c
+++ b/arch/um/kernel/exitcode.c
@@ -42,9 +42,11 @@ static int write_proc_exitcode(struct file *file, const char __user *buffer,
unsigned long count, void *data)
{
char *end, buf[sizeof("nnnnn\0")];
+ size_t size;
int tmp;

- if (copy_from_user(buf, buffer, count))
+ size = min(count, sizeof(buf));
+ if (copy_from_user(buf, buffer, size))
return -EFAULT;

tmp = simple_strtol(buf, &end, 0);

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

match

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ]

In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield in invalid routing
table lookups when using subtrees.

Instead continue to backtrack until a valid subtree or node is found
and return this match.

Also remove unneeded NULL check.

Reported-by: Teco Boot <te...@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>
Cc: David Lamparter <equ...@diac24.net>
Cc: <bou...@pps.univ-paris-diderot.fr>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/ip6_fib.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0e93ca5..0a36d8d 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -846,14 +846,22 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root,

if (ipv6_prefix_equal(&key->addr, args->addr, key->plen)) {
#ifdef CONFIG_IPV6_SUBTREES
- if (fn->subtree)
- fn = fib6_lookup_1(fn->subtree, args + 1);
+ if (fn->subtree) {
+ struct fib6_node *sfn;
+ sfn = fib6_lookup_1(fn->subtree,
+ args + 1);
+ if (!sfn)
+ goto backtrack;
+ fn = sfn;
+ }
#endif
- if (!fn || fn->fn_flags & RTN_RTINFO)
+ if (fn->fn_flags & RTN_RTINFO)
return fn;
}
}
-
+#ifdef CONFIG_IPV6_SUBTREES
+backtrack:
+#endif
if (fn->fn_flags & RTN_ROOT)
break;

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit 627aad1c01da6f881e7f98d71fd928ca0c316b1a upstream

The pciinfo struct has a two byte hole after ->dev_fn so stack
information could be leaked to the user.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Acked-by: Mike Miller <mike....@hp.com>

Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/block/cpqarray.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/block/cpqarray.c b/drivers/block/cpqarray.c
index 6422651..f9caa45 100644
--- a/drivers/block/cpqarray.c
+++ b/drivers/block/cpqarray.c
@@ -1181,6 +1181,7 @@ out_passthru:
ida_pci_info_struct pciinfo;

if (!arg) return -EINVAL;
+ memset(&pciinfo, 0, sizeof(pciinfo));
pciinfo.bus = host->pci_dev->bus->number;
pciinfo.dev_fn = host->pci_dev->devfn;
pciinfo.board_id = host->board_id;

Willy Tarreau

unread,

May 11, 2014, 10:00:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <da...@davemloft.net>
Suggested-by: Eric Dumazet <eric.d...@gmail.com>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>

[wt: 2.6.32: msg_sys is a struct not a pointer]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/isdn/mISDN/socket.c | 13 ++++---------
include/linux/net.h | 8 ++++++++
net/appletalk/ddp.c | 16 +++++++---------
net/atm/common.c | 2 --
net/ax25/af_ax25.c | 4 ++--
net/bluetooth/af_bluetooth.c | 2 --
net/bluetooth/hci_sock.c | 2 --
net/bluetooth/rfcomm/sock.c | 3 ---
net/compat.c | 3 ++-
net/core/iovec.c | 3 ++-
net/ipx/af_ipx.c | 3 +--
net/irda/af_irda.c | 4 ----
net/iucv/af_iucv.c | 2 --
net/key/af_key.c | 1 -
net/llc/af_llc.c | 2 --
net/netlink/af_netlink.c | 2 --
net/netrom/af_netrom.c | 3 +--
net/packet/af_packet.c | 32 +++++++++++++++-----------------
net/rds/recv.c | 2 --
net/rose/af_rose.c | 8 +++++---
net/rxrpc/ar-recvmsg.c | 8 +++++---
net/socket.c | 9 +++++++--
net/tipc/socket.c | 6 ------
net/unix/af_unix.c | 5 -----
net/x25/af_x25.c | 3 +--
25 files changed, 60 insertions(+), 86 deletions(-)

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index feb0fa4..db69cb4 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -115,7 +115,6 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
{
struct sk_buff *skb;
struct sock *sk = sock->sk;
- struct sockaddr_mISDN *maddr;

int copied, err;

@@ -133,9 +132,9 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
if (!skb)
return err;

- if (msg->msg_namelen >= sizeof(struct sockaddr_mISDN)) {
- msg->msg_namelen = sizeof(struct sockaddr_mISDN);
- maddr = (struct sockaddr_mISDN *)msg->msg_name;
+ if (msg->msg_name) {
+ struct sockaddr_mISDN *maddr = msg->msg_name;
+
maddr->family = AF_ISDN;
maddr->dev = _pms(sk)->dev->id;
if ((sk->sk_protocol == ISDN_P_LAPD_TE) ||
@@ -148,11 +147,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
maddr->sapi = _pms(sk)->ch.addr & 0xFF;
maddr->tei = (_pms(sk)->ch.addr >> 8) & 0xFF;
}
- } else {
- if (msg->msg_namelen)
- printk(KERN_WARNING "%s: too small namelen %d\n",
- __func__, msg->msg_namelen);
- msg->msg_namelen = 0;
+ msg->msg_namelen = sizeof(*maddr);
}

copied = skb->len + MISDN_HEADER_LEN;
diff --git a/include/linux/net.h b/include/linux/net.h
index 529a093..e40cbcc 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -187,6 +187,14 @@ struct proto_ops {
int optname, char __user *optval, int __user *optlen);
int (*sendmsg) (struct kiocb *iocb, struct socket *sock,
struct msghdr *m, size_t total_len);
+ /* Notes for implementing recvmsg:
+ * ===============================
+ * msg->msg_namelen should get updated by the recvmsg handlers
+ * iff msg_name != NULL. It is by default 0 to prevent
+ * returning uninitialized memory to user space. The recvfrom
+ * handlers can assume that msg.msg_name is either NULL or has
+ * a minimum size of sizeof(struct sockaddr_storage).
+ */
int (*recvmsg) (struct kiocb *iocb, struct socket *sock,
struct msghdr *m, size_t total_len,
int flags);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index b1a4290..5eae360 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1703,7 +1703,6 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
size_t size, int flags)
{
struct sock *sk = sock->sk;
- struct sockaddr_at *sat = (struct sockaddr_at *)msg->msg_name;
struct ddpehdr *ddp;
int copied = 0;
int offset = 0;
@@ -1728,14 +1727,13 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
}
err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);

- if (!err) {
- if (sat) {
- sat->sat_family = AF_APPLETALK;
- sat->sat_port = ddp->deh_sport;
- sat->sat_addr.s_node = ddp->deh_snode;
- sat->sat_addr.s_net = ddp->deh_snet;
- }
- msg->msg_namelen = sizeof(*sat);
+ if (!err && msg->msg_name) {
+ struct sockaddr_at *sat = msg->msg_name;
+ sat->sat_family = AF_APPLETALK;
+ sat->sat_port = ddp->deh_sport;
+ sat->sat_addr.s_node = ddp->deh_snode;
+ sat->sat_addr.s_net = ddp->deh_snet;
+ msg->msg_namelen = sizeof(*sat);
}

skb_free_datagram(sk, skb); /* Free the datagram. */
diff --git a/net/atm/common.c b/net/atm/common.c
index 65737b8..0baf05e 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -473,8 +473,6 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
struct sk_buff *skb;
int copied, error = -EINVAL;

- msg->msg_namelen = 0;
-
if (sock->state != SS_CONNECTED)
return -ENOTCONN;
if (flags & ~MSG_DONTWAIT) /* only handle MSG_DONTWAIT */
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 8613bd1..6b9d62b 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1648,11 +1648,11 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,

skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);

- if (msg->msg_namelen != 0) {
- struct sockaddr_ax25 *sax = (struct sockaddr_ax25 *)msg->msg_name;
+ if (msg->msg_name) {
ax25_digi digi;
ax25_address src;
const unsigned char *mac = skb_mac_header(skb);
+ struct sockaddr_ax25 *sax = msg->msg_name;

memset(sax, 0, sizeof(struct full_sockaddr_ax25));
ax25_addr_parse(mac + 1, skb->data - mac - 1, &src, NULL,
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index d7239dd..143b8a7 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -240,8 +240,6 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
if (flags & (MSG_OOB))
return -EOPNOTSUPP;

- msg->msg_namelen = 0;
-
if (!(skb = skb_recv_datagram(sk, flags, noblock, &err))) {
if (sk->sk_shutdown & RCV_SHUTDOWN)
return 0;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 45caaaa..0e0f517 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -370,8 +370,6 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
if (!(skb = skb_recv_datagram(sk, flags, noblock, &err)))
return err;

- msg->msg_namelen = 0;
-
copied = skb->len;
if (len < copied) {
msg->msg_flags |= MSG_TRUNC;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 1db0132..3fabaad 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -652,15 +652,12 @@ static int rfcomm_sock_recvmsg(struct kiocb *iocb, struct socket *sock,

if (test_and_clear_bit(RFCOMM_DEFER_SETUP, &d->flags)) {
rfcomm_dlc_accept(d);
- msg->msg_namelen = 0;
return 0;
}

if (flags & MSG_OOB)
return -EOPNOTSUPP;

- msg->msg_namelen = 0;
-
BT_DBG("sk %p size %zu", sk, size);

lock_sock(sk);
diff --git a/net/compat.c b/net/compat.c
index da3d0fc..d325d16 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -91,7 +91,8 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
if (err < 0)
return err;
}
- kern_msg->msg_name = kern_address;
+ if (kern_msg->msg_name)
+ kern_msg->msg_name = kern_address;
} else
kern_msg->msg_name = NULL;

diff --git a/net/core/iovec.c b/net/core/iovec.c
index f911e66..39369e9 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -47,7 +47,8 @@ int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr *address,
if (err < 0)
return err;
}
- m->msg_name = address;
+ if (m->msg_name)
+ m->msg_name = address;
} else {
m->msg_name = NULL;
}
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 66c7a20..25931b3 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1808,8 +1808,6 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
if (skb->tstamp.tv64)
sk->sk_stamp = skb->tstamp;

- msg->msg_namelen = sizeof(*sipx);
-
if (sipx) {
sipx->sipx_family = AF_IPX;
sipx->sipx_port = ipx->ipx_source.sock;
@@ -1817,6 +1815,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
sipx->sipx_network = IPX_SKB_CB(skb)->ipx_source_net;
sipx->sipx_type = ipx->ipx_type;
sipx->sipx_zero = 0;
+ msg->msg_namelen = sizeof(*sipx);
}
rc = copied;

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index bfb325d..7cb7613 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1338,8 +1338,6 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
if ((err = sock_error(sk)) < 0)
return err;

- msg->msg_namelen = 0;
-
skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
flags & MSG_DONTWAIT, &err);
if (!skb)
@@ -1402,8 +1400,6 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
target = sock_rcvlowat(sk, flags & MSG_WAITALL, size);
timeo = sock_rcvtimeo(sk, noblock);

- msg->msg_namelen = 0;
-
do {
int chunk;
struct sk_buff *skb = skb_dequeue(&sk->sk_receive_queue);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index f605b23..bada1b9 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1160,8 +1160,6 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
struct sk_buff *skb, *rskb, *cskb;
int err = 0;

- msg->msg_namelen = 0;
-
if ((sk->sk_state == IUCV_DISCONN || sk->sk_state == IUCV_SEVERED) &&
skb_queue_empty(&iucv->backlog_skb_q) &&
skb_queue_empty(&sk->sk_receive_queue) &&
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 3f55faa..3e5d0dc 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3597,7 +3597,6 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
goto out;

- msg->msg_namelen = 0;
skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
if (skb == NULL)
goto out;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 8a814a5..606b6ad 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -674,8 +674,6 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
int target; /* Read at least this many bytes */
long timeo;

- msg->msg_namelen = 0;
-
lock_sock(sk);
copied = -ENOTCONN;
if (unlikely(sk->sk_type == SOCK_STREAM && sk->sk_state == TCP_LISTEN))
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index fc91ff6..39a6d5d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1400,8 +1400,6 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
}
#endif

- msg->msg_namelen = 0;
-
copied = data_skb->len;
if (len < copied) {
msg->msg_flags |= MSG_TRUNC;
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 7a83495..ad1ec1b 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1184,10 +1184,9 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
sax->sax25_family = AF_NETROM;
skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
AX25_ADDR_LEN);
+ msg->msg_namelen = sizeof(*sax);
}

- msg->msg_namelen = sizeof(*sax);
-
skb_free_datagram(sk, skb);

release_sock(sk);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f084e01..06707d0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1423,7 +1423,6 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
struct sock *sk = sock->sk;
struct sk_buff *skb;
int copied, err;
- struct sockaddr_ll *sll;

err = -EINVAL;
if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
@@ -1455,22 +1454,10 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
if (skb == NULL)
goto out;

- /*
- * If the address length field is there to be filled in, we fill
- * it in now.
+ /* You lose any data beyond the buffer you gave. If it worries
+ * a user program they can ask the device for its MTU
+ * anyway.
*/
-
- sll = &PACKET_SKB_CB(skb)->sa.ll;
- if (sock->type == SOCK_PACKET)
- msg->msg_namelen = sizeof(struct sockaddr_pkt);
- else
- msg->msg_namelen = sll->sll_halen + offsetof(struct sockaddr_ll, sll_addr);
-
- /*
- * You lose any data beyond the buffer you gave. If it worries a
- * user program they can ask the device for its MTU anyway.
- */
-
copied = skb->len;
if (copied > len) {
copied = len;
@@ -1483,9 +1470,20 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,

sock_recv_timestamp(msg, sk, skb);

- if (msg->msg_name)
+ if (msg->msg_name) {
+ /* If the address length field is there to be filled
+ * in, we fill it in now.
+ */
+ if (sock->type == SOCK_PACKET) {
+ msg->msg_namelen = sizeof(struct sockaddr_pkt);
+ } else {
+ struct sockaddr_ll *sll = &PACKET_SKB_CB(skb)->sa.ll;
+ msg->msg_namelen = sll->sll_halen +
+ offsetof(struct sockaddr_ll, sll_addr);
+ }
memcpy(msg->msg_name, &PACKET_SKB_CB(skb)->sa,
msg->msg_namelen);
+ }

if (pkt_sk(sk)->auxdata) {
struct tpacket_auxdata aux;
diff --git a/net/rds/recv.c b/net/rds/recv.c
index c45a881c..a11cab9 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -410,8 +410,6 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,

rdsdebug("size %zu flags 0x%x timeo %ld\n", size, msg_flags, timeo);

- msg->msg_namelen = 0;
-
if (msg_flags & MSG_OOB)
goto out;

diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 2984999..08a86f6 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1238,7 +1238,6 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
{
struct sock *sk = sock->sk;
struct rose_sock *rose = rose_sk(sk);
- struct sockaddr_rose *srose = (struct sockaddr_rose *)msg->msg_name;
size_t copied;
unsigned char *asmptr;
struct sk_buff *skb;
@@ -1274,8 +1273,11 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,

skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);

- if (srose != NULL) {
- memset(srose, 0, msg->msg_namelen);
+ if (msg->msg_name) {
+ struct sockaddr_rose *srose;
+
+ memset(msg->msg_name, 0, sizeof(struct full_sockaddr_rose));
+ srose = msg->msg_name;
srose->srose_family = AF_ROSE;
srose->srose_addr = rose->dest_addr;
srose->srose_call = rose->dest_call;
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index a39bf97..f779fc3 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -142,10 +142,12 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,

/* copy the peer address and timestamp */
if (!continue_call) {
- if (msg->msg_name && msg->msg_namelen > 0)
+ if (msg->msg_name) {
+ size_t len =
+ sizeof(call->conn->trans->peer->srx);
memcpy(msg->msg_name,
- &call->conn->trans->peer->srx,
- sizeof(call->conn->trans->peer->srx));
+ &call->conn->trans->peer->srx, len);
+ }
sock_recv_timestamp(msg, &rx->sk, skb);
}

diff --git a/net/socket.c b/net/socket.c
index 9f8cd74..2b7be6d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1744,8 +1744,10 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, ubuf, size_t, size,
msg.msg_iov = &iov;
iov.iov_len = size;
iov.iov_base = ubuf;
- msg.msg_name = (struct sockaddr *)&address;
- msg.msg_namelen = sizeof(address);
+ /* Save some cycles and don't copy the address if not needed */
+ msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
+ /* We assume all kernel code knows the size of sockaddr_storage */
+ msg.msg_namelen = 0;
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
err = sock_recvmsg(sock, &msg, size, flags);
@@ -2055,6 +2057,9 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
cmsg_ptr = (unsigned long)msg_sys.msg_control;
msg_sys.msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);

+ /* We assume all kernel code knows the size of sockaddr_storage */
+ msg_sys.msg_namelen = 0;
+
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
err = sock_recvmsg(sock, &msg_sys, total_len, flags);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index eccb86b..124f1a2 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -917,9 +917,6 @@ static int recv_msg(struct kiocb *iocb, struct socket *sock,
goto exit;
}

- /* will be updated in set_orig_addr() if needed */
- m->msg_namelen = 0;
-
restart:

/* Look for a message in receive queue; wait if necessary */
@@ -1053,9 +1050,6 @@ static int recv_stream(struct kiocb *iocb, struct socket *sock,
goto exit;
}

- /* will be updated in set_orig_addr() if needed */
- m->msg_namelen = 0;
-
restart:

/* Look for a message in receive queue; wait if necessary */
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index d146b76..bb0b008 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1682,7 +1682,6 @@ static void unix_copy_addr(struct msghdr *msg, struct sock *sk)
{
struct unix_sock *u = unix_sk(sk);

- msg->msg_namelen = 0;
if (u->addr) {
msg->msg_namelen = u->addr->len;
memcpy(msg->msg_name, u->addr->name, u->addr->len);
@@ -1705,8 +1704,6 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
if (flags&MSG_OOB)
goto out;

- msg->msg_namelen = 0;
-
mutex_lock(&u->readlock);

skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -1832,8 +1829,6 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
target = sock_rcvlowat(sk, flags&MSG_WAITALL, size);
timeo = sock_rcvtimeo(sk, flags&MSG_DONTWAIT);

- msg->msg_namelen = 0;
-
/* Lock the socket to prevent queue disordering
* while sleeps in memcpy_tomsg
*/
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 2e9e300..40c447f 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1294,10 +1294,9 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
if (sx25) {
sx25->sx25_family = AF_X25;
sx25->sx25_addr = x25->dest_addr;
+ msg->msg_namelen = sizeof(*sx25);
}

- msg->msg_namelen = sizeof(struct sockaddr_x25);
-
lock_sock(sk);
x25_check_rbuf(sk);
release_sock(sk);

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <han...@stressinduktion.org>

[ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ]

Offenders don't have port numbers, so set it to 0.

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/datagram.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ef6436d..5da306b 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -342,6 +342,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) {
sin->sin6_family = AF_INET6;
sin->sin6_flowinfo = 0;
+ sin->sin6_port = 0;
sin->sin6_scope_id = 0;
if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6) {
ipv6_addr_copy(&sin->sin6_addr, &ipv6_hdr(skb)->saddr);

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit e0e29b683d6784ef59bbc914eac85a04b650e63c upstream

The module parameter "fwpostfix" is userspace controllable, unfiltered,
and is used to define the firmware filename. b43_do_request_fw() populates
ctx->errors[] on error, containing the firmware filename. b43err()
parses its arguments as a format string. For systems with b43 hardware,
this could lead to a uid-0 to ring-0 escalation.

CVE-2013-2852

Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: sta...@vger.kernel.org
Signed-off-by: John W. Linville <linv...@tuxdriver.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/wireless/b43/main.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
index 94dae56..3cf2472 100644
--- a/drivers/net/wireless/b43/main.c
+++ b/drivers/net/wireless/b43/main.c
@@ -2257,7 +2257,7 @@ static int b43_request_firmware(struct b43_wldev *dev)
for (i = 0; i < B43_NR_FWTYPES; i++) {
errmsg = ctx->errors[i];
if (strlen(errmsg))
- b43err(dev->wl, errmsg);
+ b43err(dev->wl, "%s", errmsg);
}
b43_print_fw_helptext(dev->wl, 1);
err = -ENOENT;

Willy Tarreau

unread,

May 11, 2014, 10:00:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Jitendra Bhivare <jitendra...@gmail.com>

Backported Alex Williamson's commit to 2.6.32.y
http://git.kernel.org/linus/7b668357810ecb5fdda4418689d50f5d95aea6a8

It resolves the following assert when module is immediately reloaded.

kernel BUG at drivers/pci/iova.c:155!
<snip>
Call Trace:
[<ffffffff812645c5>] intel_alloc_iova+0xb5/0xe0
[<ffffffff8126725e>] __intel_map_single+0xbe/0x210
[<ffffffff812674ae>] intel_alloc_coherent+0xae/0x120
[<ffffffffa035f909>] be_queue_alloc+0xb9/0x140 [be2net]
[<ffffffffa035fa5a>] be_rx_qs_create+0xca/0x370 [be2net]
<snip>

The issue is reproducible in 2.6.32.60 and also gets resolved
by passing intel-iommu=strict to kernel.

Signed-off-by: Jitendra Bhivare <jitendra...@gmail.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/pci/intel-iommu.c | 4 ++++

1 file changed, 4 insertions(+)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 5b680df..c1a7b01 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -1434,6 +1434,10 @@ static void domain_exit(struct dmar_domain *domain)
if (!domain)
return;

+ /* Flush any lazy unmaps that may reference this domain */
+ if (!intel_iommu_strict)
+ flush_unmaps_timeout(0);
+
domain_remove_dev_info(domain);
/* destroy iovas */
put_iova_domain(&domain->iovad);

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Neil Horman <nho...@tuxdriver.com>

[ Upstream commit c5c7774d7eb4397891edca9ebdf750ba90977a69 ]

In commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86
(refactor sctp_outq_teardown to insure proper re-initalization)
we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the
outq structure. Steve West recently asked me why I removed the q->error = 0
initalization from sctp_outq_teardown. I did so because I was operating under
the impression that sctp_outq_init would properly initalize that value for us,
but it doesn't. sctp_outq_init operates under the assumption that the outq
struct is all 0's (as it is when called from sctp_association_init), but using
it in __sctp_outq_teardown violates that assumption. We should do a memset in
sctp_outq_init to ensure that the entire structure is in a known state there
instead.

Signed-off-by: Neil Horman <nho...@tuxdriver.com>
Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve...@nsn.com>
CC: Vlad Yasevich <vyas...@gmail.com>
CC: net...@vger.kernel.org
CC: da...@davemloft.net
Acked-by: Vlad Yasevich <vyas...@gmail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sctp/outqueue.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 23e5e97..bc423b4 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -203,6 +203,8 @@ static inline int sctp_cacc_skip(struct sctp_transport *primary,
*/
void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
{
+ memset(q, 0, sizeof(struct sctp_outq));
+
q->asoc = asoc;
INIT_LIST_HEAD(&q->out_chunk_list);
INIT_LIST_HEAD(&q->control_chunk_list);
@@ -210,13 +212,7 @@ void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
INIT_LIST_HEAD(&q->sacked);
INIT_LIST_HEAD(&q->abandoned);

- q->fast_rtx = 0;
- q->outstanding_bytes = 0;
q->empty = 1;
- q->cork = 0;
-
- q->malloced = 0;
- q->out_qlen = 0;
}

/* Free the outqueue structure and any related pending chunks.

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: =?latin1?q?Salva=20Peir=F3?= <spe...@ai2.upv.es>

[ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ]

The fst_get_iface() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/wan/farsync.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/drivers/net/wan/farsync.c b/drivers/net/wan/farsync.c
index beda387..433bf99 100644
--- a/drivers/net/wan/farsync.c
+++ b/drivers/net/wan/farsync.c
@@ -1971,6 +1971,7 @@ fst_get_iface(struct fst_card_info *card, struct fst_port_info *port,
}

i = port->index;
+ memset(&sync, 0, sizeof(sync));
sync.clock_rate = FST_RDL(card, portConfig[i].lineSpeed);
/* Lucky card and linux use same encoding here */
sync.clock_type = FST_RDB(card, portConfig[i].internalClock) ==

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Pablo Neira <pa...@netfilter.org>

[ Upstream commit 8b7b932434f5eee495b91a2804f5b64ebb2bc835 ]

nla_strcmp compares the string length plus one, so it's implicitly
including the nul-termination in the comparison.

int nla_strcmp(const struct nlattr *nla, const char *str)
{
int len = strlen(str) + 1;
...
d = memcmp(nla_data(nla), str, len);

However, if NLA_STRING is used, userspace can send us a string without
the nul-termination. This is a problem since the string
comparison will not match as the last byte may be not the
nul-termination.

Fix this by skipping the comparison of the nul-termination if the
attribute data is nul-terminated. Suggested by Thomas Graf.

Cc: Florian Westphal <f...@strlen.de>
Cc: Thomas Graf <tg...@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

lib/nlattr.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/nlattr.c b/lib/nlattr.c
index 109d4fe..51b84de 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -299,9 +299,15 @@ int nla_memcmp(const struct nlattr *nla, const void *data,
*/
int nla_strcmp(const struct nlattr *nla, const char *str)
{
- int len = strlen(str) + 1;
- int d = nla_len(nla) - len;
+ int len = strlen(str);
+ char *buf = nla_data(nla);
+ int attrlen = nla_len(nla);
+ int d;

+ if (attrlen > 0 && buf[attrlen - 1] == '\0')
+ attrlen--;
+
+ d = attrlen - len;
if (d == 0)
d = memcmp(nla_data(nla), str, len);

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Jiri Bohac <ji...@boha.cz>

[ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ]

aggregator_identifier is used to assign unique aggregator identifiers
to aggregators of a bond during device enslaving.

aggregator_identifier is currently a global variable that is zeroed in
bond_3ad_initialize().

This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:

create bond0
change bond0 mode to 802.3ad
enslave eth0 to bond0 //eth0 gets agg id 1
enslave eth1 to bond0 //eth1 gets agg id 2
create bond1
change bond1 mode to 802.3ad
enslave eth2 to bond1 //aggregator_identifier is reset to 0
//eth2 gets agg id 1
enslave eth3 to bond0 //eth3 gets agg id 2

Fix this by making aggregator_identifier private to the bond.

Signed-off-by: Jiri Bohac <jbo...@suse.cz>
Acked-by: Veaceslav Falico <vfa...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/bonding/bond_3ad.c | 6 ++----
drivers/net/bonding/bond_3ad.h | 1 +
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 05308e6..ec2bf8c 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1846,8 +1846,6 @@ void bond_3ad_initiate_agg_selection(struct bonding *bond, int timeout)
BOND_AD_INFO(bond).agg_select_mode = bond->params.ad_select;
}

-static u16 aggregator_identifier;
-
/**
* bond_3ad_initialize - initialize a bond's 802.3ad parameters and structures
* @bond: bonding struct to work on
@@ -1862,7 +1860,7 @@ void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution, int lacp_fas
if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr),
bond->dev->dev_addr)) {

- aggregator_identifier = 0;
+ BOND_AD_INFO(bond).aggregator_identifier = 0;

BOND_AD_INFO(bond).lacp_fast = lacp_fast;
BOND_AD_INFO(bond).system.sys_priority = 0xFFFF;
@@ -1937,7 +1935,7 @@ int bond_3ad_bind_slave(struct slave *slave)
ad_initialize_agg(aggregator);

aggregator->aggregator_mac_address = *((struct mac_addr *)bond->dev->dev_addr);
- aggregator->aggregator_identifier = (++aggregator_identifier);
+ aggregator->aggregator_identifier = ++BOND_AD_INFO(bond).aggregator_identifier;
aggregator->slave = slave;
aggregator->is_active = 0;
aggregator->num_of_ports = 0;
diff --git a/drivers/net/bonding/bond_3ad.h b/drivers/net/bonding/bond_3ad.h
index 2c46a154..f04f465 100644
--- a/drivers/net/bonding/bond_3ad.h
+++ b/drivers/net/bonding/bond_3ad.h
@@ -253,6 +253,7 @@ struct ad_system {
struct ad_bond_info {
struct ad_system system; /* 802.3ad system structure */
u32 agg_select_timer; // Timer to select aggregator after all adapter's hand shakes
+ u16 aggregator_identifier;
u32 agg_select_mode; // Mode of selection of active aggregator(bandwidth/count)
int lacp_fast; /* whether fast periodic tx should be
* requested

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

governor can be used

From: Krzysztof Helt <krzysz...@wp.pl>

Set the transition latency to value smaller than CPUFREQ_ETERNAL so
governors other than "performance" work (like the "ondemand" one).

The value is found in "AMD PowerNow! Technology Platform Design Guide for
Embedded Processors" dated December 2000 (AMD doc #24267A). There is the
answer to one of FAQs on page 40 which states that suggested complete transition
period is 200 us.

Tested on K6-2+ CPU with K6-3 core (model 13, stepping 4).

Signed-off-by: Krzysztof Helt <krzysz...@wp.pl>
Signed-off-by: Dave Jones <da...@redhat.com>
(cherry picked from commit db2820dd5445a44b4726f15a2bc89b9ded2503eb)
[wt: in 2.6.32, we only need this one so that next series applies cleanly]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index f10dea4..cb01dac 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -164,7 +164,7 @@ static int powernow_k6_cpu_init(struct cpufreq_policy *policy)

}

/* cpuinfo and default policy values */

- policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+ policy->cpuinfo.transition_latency = 200000;

policy->cur = busfreq * max_multiplier;

result = cpufreq_frequency_table_cpuinfo(policy, clock_ratio);

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

handled by UFO

From: Hannes Frederic Sowa <han...@stressinduktion.org>

In the following scenario the socket is corked:
If the first UDP packet is larger then the mtu we try to append it to the
write queue via ip6_ufo_append_data. A following packet, which is smaller
than the mtu would be appended to the already queued up gso-skb via
plain ip6_append_data. This causes random memory corruptions.

In ip6_ufo_append_data we also have to be careful to not queue up the
same skb multiple times. So setup the gso frame only when no first skb
is available.

This also fixes a shortcoming where we add the current packet's length to
cork->length but return early because of a packet > mtu with dontfrag set
(instead of sutracting it again).

Found with trinity.

Cc: YOSHIFUJI Hideaki <yosh...@linux-ipv6.org>

Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Reported-by: Dmitry Vyukov <dvy...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

(cherry picked from commit 2811ebac2521ceac84f2bdae402455baa6a7fb47)
[wt: 2.6.32 doesn't have dontfrag so remove the optimization]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/ip6_output.c | 31 ++++++++++++-------------------
1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6ff4d07..5a1b5bc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1086,6 +1086,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
* udp datagram
*/
if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) {
+ struct frag_hdr fhdr;
+
skb = sock_alloc_send_skb(sk,
hh_len + fragheaderlen + transhdrlen + 20,
(flags & MSG_DONTWAIT), &err);
@@ -1107,12 +1109,6 @@ static inline int ip6_ufo_append_data(struct sock *sk,
skb->ip_summed = CHECKSUM_PARTIAL;
skb->csum = 0;
sk->sk_sndmsg_off = 0;
- }
-
- err = skb_append_datato_frags(sk,skb, getfrag, from,
- (length - transhdrlen));
- if (!err) {
- struct frag_hdr fhdr;

/* Specify the length of each IPv6 datagram fragment.
* It has to be a multiple of 8.
@@ -1123,15 +1119,10 @@ static inline int ip6_ufo_append_data(struct sock *sk,
ipv6_select_ident(&fhdr, rt);
skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
__skb_queue_tail(&sk->sk_write_queue, skb);
-
- return 0;
}
- /* There is not enough support do UPD LSO,
- * so follow normal path
- */
- kfree_skb(skb);

- return err;
+ return skb_append_datato_frags(sk, skb, getfrag, from,
+ (length - transhdrlen));
}

static inline struct ipv6_opt_hdr *ip6_opt_dup(struct ipv6_opt_hdr *src,
@@ -1264,18 +1255,20 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
*/

inet->cork.length += length;
- if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
+ skb = skb_peek_tail(&sk->sk_write_queue);
+ if (((length > mtu) ||
+ (skb && skb_is_gso(skb))) &&
+ (sk->sk_protocol == IPPROTO_UDP) &&
(rt->u.dst.dev->features & NETIF_F_UFO)) {
-
- err = ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
- fragheaderlen, transhdrlen, mtu,
- flags, rt);
+ err = ip6_ufo_append_data(sk, getfrag, from, length,
+ hh_len, fragheaderlen,
+ transhdrlen, mtu, flags, rt);
if (err)
goto error;
return 0;
}

- if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL)
+ if (!skb)
goto alloc_new_skb;

while (length > 0) {

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <da...@davemloft.net>

[ Upstream commit a0db856a95a29efb1c23db55c02d9f0ff4f0db48 ]

Make sure the reserved fields, and padding (if any), are
fully initialized.

Based upon a patch by Dan Carpenter and feedback from
Joe Perches.

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sched/sch_cbq.c | 1 +

1 file changed, 1 insertion(+)

diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 5b132c4..8b6f05d 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1458,6 +1458,7 @@ static __inline__ int cbq_dump_wrr(struct sk_buff *skb, struct cbq_class *cl)
unsigned char *b = skb_tail_pointer(skb);
struct tc_cbq_wrropt opt;

+ memset(&opt, 0, sizeof(opt));
opt.flags = 0;
opt.allot = cl->allot;
opt.priority = cl->priority+1;

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit ffd5939381c609056b33b7585fb05a77b4c695f3 ]

SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.

Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then sucessively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.

Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Acked-by: Neil Horman <nho...@tuxdriver.com>
Acked-by: Vlad Yasevich <vyas...@gmail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sctp/socket.c | 41 ++++++++++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 44d8eab..c26d905 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -67,6 +67,7 @@
#include <linux/poll.h>
#include <linux/init.h>
#include <linux/crypto.h>
+#include <linux/compat.h>

#include <net/ip.h>
#include <net/icmp.h>
@@ -1284,11 +1285,19 @@ SCTP_STATIC int sctp_setsockopt_connectx(struct sock* sk,
/*
* New (hopefully final) interface for the API.
* We use the sctp_getaddrs_old structure so that use-space library
- * can avoid any unnecessary allocations. The only defferent part
+ * can avoid any unnecessary allocations. The only different part
* is that we store the actual length of the address buffer into the
- * addrs_num structure member. That way we can re-use the existing
+ * addrs_num structure member. That way we can re-use the existing
* code.
*/
+#ifdef CONFIG_COMPAT
+struct compat_sctp_getaddrs_old {
+ sctp_assoc_t assoc_id;
+ s32 addr_num;
+ compat_uptr_t addrs; /* struct sockaddr * */
+};
+#endif
+
SCTP_STATIC int sctp_getsockopt_connectx3(struct sock* sk, int len,

char __user *optval,
int __user *optlen)

@@ -1297,16 +1306,30 @@ SCTP_STATIC int sctp_getsockopt_connectx3(struct sock* sk, int len,
sctp_assoc_t assoc_id = 0;
int err = 0;

- if (len < sizeof(param))
- return -EINVAL;
+#ifdef CONFIG_COMPAT
+ if (is_compat_task()) {
+ struct compat_sctp_getaddrs_old param32;

- if (copy_from_user(&param, optval, sizeof(param)))
- return -EFAULT;
+ if (len < sizeof(param32))
+ return -EINVAL;
+ if (copy_from_user(&param32, optval, sizeof(param32)))
+ return -EFAULT;

- err = __sctp_setsockopt_connectx(sk,
- (struct sockaddr __user *)param.addrs,
- param.addr_num, &assoc_id);
+ param.assoc_id = param32.assoc_id;
+ param.addr_num = param32.addr_num;
+ param.addrs = compat_ptr(param32.addrs);
+ } else
+#endif
+ {
+ if (len < sizeof(param))
+ return -EINVAL;
+ if (copy_from_user(&param, optval, sizeof(param)))
+ return -EFAULT;
+ }

+ err = __sctp_setsockopt_connectx(sk, (struct sockaddr __user *)
+ param.addrs, param.addr_num,
+ &assoc_id);
if (err == 0 || err == -EINPROGRESS) {
if (copy_to_user(optval, &assoc_id, sizeof(assoc_id)))
return -EFAULT;

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit ffc8b30866879ed9ba62bd0a86fecdbd51cd3d19 upstream

Disk names may contain arbitrary strings, so they must not be
interpreted as format strings. It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.

CVE-2013-2851

Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: Jens Axboe <ax...@kernel.dk>
Cc: <sta...@vger.kernel.org>

Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

[jmm: Backport to 2.6.32]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/block/nbd.c | 4 +++-
fs/partitions/check.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 26ada47..90550ba 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -655,7 +655,9 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *lo,

mutex_unlock(&lo->tx_lock);

- thread = kthread_create(nbd_thread, lo, lo->disk->disk_name);
+ thread = kthread_create(nbd_thread, lo, "%s",
+ lo->disk->disk_name);
+
if (IS_ERR(thread)) {
mutex_lock(&lo->tx_lock);
return PTR_ERR(thread);
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 7b685e1..aa90d88 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -476,7 +476,7 @@ void register_disk(struct gendisk *disk)

ddev->parent = disk->driverfs_dev;

- dev_set_name(ddev, disk->disk_name);
+ dev_set_name(ddev, "%s", disk->disk_name);

/* delay uevents, until we scanned partition table */
dev_set_uevent_suppress(ddev, 1);

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

If we do a zero size allocation then it will oops. Also we can't be
sure the user passes us a NUL terminated string so I've added a
terminator.

This code can only be triggered by root.

Reported-by: Nico Golde <ni...@ngolde.de>
Reported-by: Fabian Yamaguchi <fa...@goesec.de>

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Acked-by: Dan Williams <dc...@redhat.com>

Signed-off-by: John W. Linville <linv...@tuxdriver.com>

(cherry picked from commit a497e47d4aec37aaf8f13509f3ef3d1f6a717d88)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/wireless/libertas/debugfs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/libertas/debugfs.c b/drivers/net/wireless/libertas/debugfs.c
index 893a55c..89532a6 100644
--- a/drivers/net/wireless/libertas/debugfs.c
+++ b/drivers/net/wireless/libertas/debugfs.c
@@ -925,7 +925,10 @@ static ssize_t lbs_debugfs_write(struct file *f, const char __user *buf,
char *p2;
struct debug_data *d = (struct debug_data *)f->private_data;

- pdata = kmalloc(cnt, GFP_KERNEL);
+ if (cnt == 0)
+ return 0;
+
+ pdata = kmalloc(cnt + 1, GFP_KERNEL);
if (pdata == NULL)
return 0;

@@ -934,6 +937,7 @@ static ssize_t lbs_debugfs_write(struct file *f, const char __user *buf,
kfree(pdata);
return 0;
}
+ pdata[cnt] = '\0';

p0 = pdata;
for (i = 0; i < num_of_items; i++) {

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

[ Upstream commit 1abd165ed757db1afdefaac0a4bc8a70f97d258c ]

While stress testing sctp sockets, I hit the following panic:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD 7cead067 PUD 7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000
RIP: 0010:[<ffffffffa0490c4e>] [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:ffff88007b569e08 EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200
RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000
RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00
FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
[<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
[<ffffffff8145b60e>] sk_common_release+0x1e/0xf0
[<ffffffff814df36e>] inet_create+0x2ae/0x350
[<ffffffff81455a6f>] __sock_create+0x11f/0x240
[<ffffffff81455bf0>] sock_create+0x30/0x40
[<ffffffff8145696c>] SyS_socket+0x4c/0xc0
[<ffffffff815403be>] ? do_page_fault+0xe/0x10
[<ffffffff8153cb32>] ? page_fault+0x22/0x30
[<ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP <ffff88007b569e08>
CR2: 0000000000000020
---[ end trace e0d71ec1108c1dd9 ]---

I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.

This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.

Signed-off-by: Daniel Borkmann <dbor...@redhat.com>
Acked-by: Neil Horman <nho...@tuxdriver.com>
Acked-by: Vlad Yasevich <vyas...@gmail.com>
Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/sctp/socket.c | 6 ++++++

1 file changed, 6 insertions(+)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 26ffae2..44d8eab 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3743,6 +3743,12 @@ SCTP_STATIC void sctp_destroy_sock(struct sock *sk)

/* Release our hold on the endpoint. */
ep = sctp_sk(sk)->ep;
+ /* This could happen during socket init, thus we bail out
+ * early, since the rest of the below is not setup either.
+ */
+ if (ep == NULL)
+ return;
+
sctp_endpoint_free(ep);
percpu_counter_dec(&sctp_sockets_allocated);
local_bh_disable();

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha...@oracle.com>

[ Upstream commit 37ab4fa7844a044dc21fde45e2a0fc2f3c3b6490 ]

This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.

This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.

Signed-off-by: Sasha Levin <sasha...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/unix/af_unix.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index bb0b008..79c1dce 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -674,7 +674,9 @@ static int unix_autobind(struct socket *sock)
int err;
unsigned int retries = 0;

- mutex_lock(&u->readlock);
+ err = mutex_lock_interruptible(&u->readlock);
+ if (err)
+ return err;

err = 0;
if (u->addr)
@@ -806,7 +808,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
goto out;
addr_len = err;

- mutex_lock(&u->readlock);
+ err = mutex_lock_interruptible(&u->readlock);
+ if (err)
+ goto out;

err = -EINVAL;
if (u->addr)

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Max Matveev <ma...@redhat.com>

commit d5ccd496601b8776a516d167a6485754575dc38f upstream

Attempt to reduce the number of IP packets emitted in response to single
SCTP packet (2e3216cd) introduced a complication - if a packet contains
two COOKIE_ECHO chunks and nothing else then SCTP state machine corks the
socket while processing first COOKIE_ECHO and then loses the association
and forgets to uncork the socket. To deal with the issue add new SCTP
command which can be used to set association explictly. Use this new
command when processing second COOKIE_ECHO chunk to restore the context
for SCTP state machine.

Signed-off-by: Max Matveev <ma...@redhat.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

[dannf: backported to Debian's 2.6.32]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/net/sctp/command.h | 1 +
net/sctp/sm_sideeffect.c | 5 +++++
net/sctp/sm_statefuns.c | 6 ++++++
3 files changed, 12 insertions(+)

diff --git a/include/net/sctp/command.h b/include/net/sctp/command.h
index 2c55a7e..0edc14d 100644
--- a/include/net/sctp/command.h
+++ b/include/net/sctp/command.h
@@ -108,6 +108,7 @@ typedef enum {
SCTP_CMD_UPDATE_INITTAG, /* Update peer inittag */
SCTP_CMD_SEND_MSG, /* Send the whole use message */
SCTP_CMD_SEND_NEXT_ASCONF, /* Send the next ASCONF after ACK */
+ SCTP_CMD_SET_ASOC, /* Restore association context */
SCTP_CMD_LAST
} sctp_verb_t;

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index ed742bf..9005d83 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1676,6 +1676,11 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
case SCTP_CMD_SEND_NEXT_ASCONF:
sctp_cmd_send_asconf(asoc);
break;
+
+ case SCTP_CMD_SET_ASOC:
+ asoc = cmd->obj.asoc;
+ break;
+
default:
printk(KERN_WARNING "Impossible command: %u, %p\n",
cmd->verb, cmd->obj.ptr);
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 2f8e1c8..9e4e846 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -2048,6 +2048,12 @@ sctp_disposition_t sctp_sf_do_5_2_4_dupcook(const struct sctp_endpoint *ep,
sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc));
sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL());

+ /* Restore association pointer to provide SCTP command interpeter
+ * with a valid context in case it needs to manipulate
+ * the queues */
+ sctp_add_cmd_sf(commands, SCTP_CMD_SET_ASOC,
+ SCTP_ASOC((struct sctp_association *)asoc));
+
return retval;

nomem:

Willy Tarreau

unread,

May 11, 2014, 10:10:01 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit 7563487cbf865284dcd35e9ef5a95380da046737 ]

There are three buffer overflows addressed in this patch.

1) In isdnloop_fake_err() we add an 'E' to a 60 character string and
then copy it into a 60 character buffer. I have made the destination
buffer 64 characters and I'm changed the sprintf() to a snprintf().

2) In isdnloop_parse_cmd(), p points to a 6 characters into a 60
character buffer so we have 54 characters. The ->eazlist[] is 11
characters long. I have modified the code to return if the source
buffer is too long.

3) In isdnloop_command() the cbuf[] array was 60 characters long but the
max length of the string then can be up to 79 characters. I made the
cbuf array 80 characters long and changed the sprintf() to snprintf().
I also removed the temporary "dial" buffer and changed it to use "p"
directly.

Unfortunately, we pass the "cbuf" string from isdnloop_command() to
isdnloop_writecmd() which truncates anything over 60 characters to make
it fit in card->omsg[]. (It can accept values up to 255 characters so
long as there is a '\n' character every 60 characters). For now I have
just fixed the memory corruption bug and left the other problems in this
driver alone.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/isdn/isdnloop/isdnloop.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index 92d895f..bf4168b 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -517,9 +517,9 @@ static isdnloop_stat isdnloop_cmd_table[] =
static void
isdnloop_fake_err(isdnloop_card * card)
{
- char buf[60];
+ char buf[64];

- sprintf(buf, "E%s", card->omsg);
+ snprintf(buf, sizeof(buf), "E%s", card->omsg);
isdnloop_fake(card, buf, -1);
isdnloop_fake(card, "NAK", -1);
}
@@ -902,6 +902,8 @@ isdnloop_parse_cmd(isdnloop_card * card)
case 7:
/* 0x;EAZ */
p += 3;
+ if (strlen(p) >= sizeof(card->eazlist[0]))
+ break;
strcpy(card->eazlist[ch - 1], p);
break;
case 8:
@@ -1126,7 +1128,7 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
{
ulong a;
int i;
- char cbuf[60];
+ char cbuf[80];
isdn_ctrl cmd;
isdnloop_cdef cdef;

@@ -1191,7 +1193,6 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
break;
if ((c->arg & 255) < ISDNLOOP_BCH) {
char *p;
- char dial[50];
char dcode[4];

a = c->arg;
@@ -1203,10 +1204,10 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
} else
/* Normal Dial */
strcpy(dcode, "CAL");
- strcpy(dial, p);
- sprintf(cbuf, "%02d;D%s_R%s,%02d,%02d,%s\n", (int) (a + 1),
- dcode, dial, c->parm.setup.si1,
- c->parm.setup.si2, c->parm.setup.eazmsn);
+ snprintf(cbuf, sizeof(cbuf),
+ "%02d;D%s_R%s,%02d,%02d,%s\n", (int) (a + 1),
+ dcode, p, c->parm.setup.si1,
+ c->parm.setup.si2, c->parm.setup.eazmsn);
i = isdnloop_writecmd(cbuf, strlen(cbuf), 0, card);
}
break;

Willy Tarreau

unread,

May 11, 2014, 10:10:02 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpat...@redhat.com>

commit 22c73795b101597051924556dce019385a1e2fa0 upstream

This patch reorders reported frequencies from the highest to the lowest,
just like in other frequency drivers.

Signed-off-by: Mikulas Patocka <mpat...@redhat.com>
Acked-by: Viresh Kumar <viresh...@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index 2d9161e..eb890f1 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -37,17 +37,20 @@ MODULE_PARM_DESC(bus_frequency, "Bus frequency in kHz");

/* Clock ratio multiplied by 10 - see table 27 in AMD#23446 */
static struct cpufreq_frequency_table clock_ratio[] = {
- {45, /* 000 -> 4.5x */ 0},
+ {60, /* 110 -> 6.0x */ 0},
+ {55, /* 011 -> 5.5x */ 0},
{50, /* 001 -> 5.0x */ 0},
+ {45, /* 000 -> 4.5x */ 0},
{40, /* 010 -> 4.0x */ 0},
- {55, /* 011 -> 5.5x */ 0},
- {20, /* 100 -> 2.0x */ 0},
- {30, /* 101 -> 3.0x */ 0},
- {60, /* 110 -> 6.0x */ 0},
{35, /* 111 -> 3.5x */ 0},
+ {30, /* 101 -> 3.0x */ 0},
+ {20, /* 100 -> 2.0x */ 0},
{0, CPUFREQ_TABLE_END}
};

+static const u8 index_to_register[8] = { 6, 3, 1, 0, 2, 7, 5, 4 };
+static const u8 register_to_index[8] = { 3, 2, 4, 1, 7, 6, 0, 5 };
+
static const struct {
unsigned freq;
unsigned mult;
@@ -91,7 +94,7 @@ static int powernow_k6_get_cpu_multiplier(void)

local_irq_enable();

- return clock_ratio[(invalue >> 5)&7].index;
+ return clock_ratio[register_to_index[(invalue >> 5)&7]].index;
}

static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
@@ -111,7 +114,7 @@ static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
write_cr0(cr0 | X86_CR0_CD);
wbinvd();

- outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
+ outvalue = (1<<12) | (1<<10) | (1<<9) | (index_to_register[best_i]<<5);

msrval = POWERNOW_IOPORT + 0x1;
wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Ricardo Ribalda <ricardo...@gmail.com>

[ Upstream commit 7167cf0e8cd10287b7912b9ffcccd9616f382922 ]

The dma descriptors indexes are only initialized on the probe function.

If a packet is on the buffer when temac_stop is called, the dma
descriptors indexes can be left on a incorrect state where no other
package can be sent.

So an interface could be left in an usable state after ifdow/ifup.

This patch makes sure that the descriptors indexes are in a proper
status when the device is open.

Signed-off-by: Ricardo Ribalda Delgado <ricardo...@gmail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/net/ll_temac_main.c | 6 ++++++

1 file changed, 6 insertions(+)

diff --git a/drivers/net/ll_temac_main.c b/drivers/net/ll_temac_main.c
index f2a197f..d2516dd 100644
--- a/drivers/net/ll_temac_main.c
+++ b/drivers/net/ll_temac_main.c
@@ -190,6 +190,12 @@ static int temac_dma_bd_init(struct net_device *ndev)
lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
temac_dma_out32(lp, TX_CURDESC_PTR, lp->tx_bd_p);

+ /* Init descriptor indexes */
+ lp->tx_bd_ci = 0;
+ lp->tx_bd_next = 0;
+ lp->tx_bd_tail = 0;
+ lp->rx_bd_ci = 0;
+
return 0;

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Neil Horman <nho...@tuxdriver.com>

commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream

Stephan Mueller reported to me recently a error in random number generation in
the ansi cprng. If several small requests are made that are less than the
instances block size, the remainder for loop code doesn't increment
rand_data_valid in the last iteration, meaning that the last bytes in the
rand_data buffer gets reused on the subsequent smaller-than-a-block request for
random data.

The fix is pretty easy, just re-code the for loop to make sure that
rand_data_valid gets incremented appropriately

Signed-off-by: Neil Horman <nho...@tuxdriver.com>
Reported-by: Stephan Mueller <stephan...@atsec.com>
CC: Stephan Mueller <stephan...@atsec.com>
CC: Petr Matousek <pmat...@redhat.com>
CC: Herbert Xu <her...@gondor.apana.org.au>
CC: "David S. Miller" <da...@davemloft.net>
Signed-off-by: Herbert Xu <her...@gondor.apana.org.au>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

crypto/ansi_cprng.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 3aa6e38..0ffd5995 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -232,11 +232,11 @@ remainder:
*/
if (byte_count < DEFAULT_BLK_SZ) {
empty_rbuf:
- for (; ctx->rand_data_valid < DEFAULT_BLK_SZ;
- ctx->rand_data_valid++) {
+ while (ctx->rand_data_valid < DEFAULT_BLK_SZ) {
*ptr = ctx->rand_data[ctx->rand_data_valid];
ptr++;
byte_count--;
+ ctx->rand_data_valid++;
if (byte_count == 0)
goto done;

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.d...@gmail.com>

commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog
hint) added a bug allowing inet6_synq_hash() to return an out of bound
array index, because of u16 overflow.

Bug can happen if system admins set net.core.somaxconn &
net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536

Signed-off-by: Eric Dumazet <eric.d...@gmail.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

(cherry picked from commit c16a98ed91597b40b22b540c6517103497ef8e74)

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/inet6_connection_sock.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index cc4797d..59f4063 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -57,7 +57,7 @@ EXPORT_SYMBOL_GPL(inet6_csk_bind_conflict);
* request_sock (formerly open request) hash tables.
*/
static u32 inet6_synq_hash(const struct in6_addr *raddr, const __be16 rport,
- const u32 rnd, const u16 synq_hsize)
+ const u32 rnd, const u32 synq_hsize)
{
u32 a = (__force u32)raddr->s6_addr32[0];
u32 b = (__force u32)raddr->s6_addr32[1];

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <kees...@chromium.org>

commit d049f74f2dbe71354d43d393ac3a188947811348 upstream

The get_dumpable() return value is not boolean. Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0). The SUID_DUMP_ROOT(2) is also considered a
protected state. Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
or
if (dumpable == 0) { /* be protective */ }
or
if (!dumpable) { /* be protective */ }

Correct logic:
if (dumpable != SUID_DUMP_USER) { /* be protective */ }
or
if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user. (This may have been partially mitigated if Yama was enabled.)

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov <seg...@openwall.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
Cc: Tony Luck <tony...@intel.com>
Cc: Oleg Nesterov <ol...@redhat.com>
Cc: "Eric W. Biederman" <ebie...@xmission.com>

Cc: <sta...@vger.kernel.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>

[dannf: backported to Debian's 2.6.32]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

arch/ia64/include/asm/processor.h | 2 +-
fs/exec.c | 6 ++++++
include/linux/binfmts.h | 3 ---
include/linux/sched.h | 4 ++++
kernel/ptrace.c | 2 +-
5 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 3eaeedf..d77b342 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -361,7 +361,7 @@ struct thread_struct {
regs->loadrs = 0; \
regs->r8 = get_dumpable(current->mm); /* set "don't zap registers" flag */ \
regs->r12 = new_sp - 16; /* allocate 16 byte scratch area */ \
- if (unlikely(!get_dumpable(current->mm))) { \
+ if (unlikely(get_dumpable(current->mm) != SUID_DUMP_USER)) { \
/* \
* Zap scratch regs to avoid leaking bits between processes with different \
* uid/privileges. \
diff --git a/fs/exec.c b/fs/exec.c
index feb2435..c32ae34 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1793,6 +1793,12 @@ void set_dumpable(struct mm_struct *mm, int value)
}
}

+/*
+ * This returns the actual value of the suid_dumpable flag. For things
+ * that are using this for checking for privilege transitions, it must
+ * test against SUID_DUMP_USER rather than treating it as a boolean
+ * value.
+ */
int get_dumpable(struct mm_struct *mm)
{
int ret;
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 9ffffec..8eab628 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -107,9 +107,6 @@ extern int flush_old_exec(struct linux_binprm * bprm);
extern void setup_new_exec(struct linux_binprm * bprm);

extern int suid_dumpable;
-#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
-#define SUID_DUMP_USER 1 /* Dump as user of process */
-#define SUID_DUMP_ROOT 2 /* Dump as root */

/* Stack area protections */
#define EXSTACK_DEFAULT 0 /* Whatever the arch defaults to */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 73c3b9b..56e1771 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -442,6 +442,10 @@ static inline unsigned long get_mm_hiwater_vm(struct mm_struct *mm)
extern void set_dumpable(struct mm_struct *mm, int value);
extern int get_dumpable(struct mm_struct *mm);

+#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
+#define SUID_DUMP_USER 1 /* Dump as user of process */
+#define SUID_DUMP_ROOT 2 /* Dump as root */
+
/* mm flags */
/* dumpable bits */
#define MMF_DUMPABLE 0 /* core dump is permitted */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d9c8c47..4185220 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -187,7 +187,7 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
smp_rmb();
if (task->mm)
dumpable = get_dumpable(task->mm);
- if (!dumpable && !capable(CAP_SYS_PTRACE))
+ if (dumpable != SUID_DUMP_USER && !capable(CAP_SYS_PTRACE))
return -EPERM;

return security_ptrace_access_check(task, mode);

Willy Tarreau

unread,

May 11, 2014, 10:10:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

[ Upstream commit f9a23c84486ed350cce7bb1b2828abd1f6658796 ]

These strings come from a copy_from_user() and there is no way to be
sure they are NUL terminated.

Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

drivers/isdn/isdnloop/isdnloop.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index 22446f7..92d895f 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -1082,8 +1082,10 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
spin_unlock_irqrestore(&card->isdnloop_lock, flags);
return -ENOMEM;
}
- for (i = 0; i < 3; i++)
- strcpy(card->s0num[i], sdef.num[i]);
+ for (i = 0; i < 3; i++) {
+ strlcpy(card->s0num[i], sdef.num[i],
+ sizeof(card->s0num[0]));
+ }
break;
case ISDN_PTYPE_1TR6:
if (isdnloop_fake(card, "DRV1.04TC-1TR6-CAPI-CNS-BASIS-29.11.95",
@@ -1096,7 +1098,7 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
spin_unlock_irqrestore(&card->isdnloop_lock, flags);
return -ENOMEM;
}
- strcpy(card->s0num[0], sdef.num[0]);
+ strlcpy(card->s0num[0], sdef.num[0], sizeof(card->s0num[0]));
card->s0num[1][0] = '\0';
card->s0num[2][0] = '\0';
break;

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

commit a963a37d384d71ad43b3e9e79d68d42fbe0901f3 upstream

It's possible to use AF_INET6 sockets and to connect to an IPv4
destination. After this, socket dst cache is a pointer to a rtable,
not rt6_info.

ip6_sk_dst_check() should check the socket dst cache is IPv6, or else
various corruptions/crashes can happen.

Dave Jones can reproduce immediate crash with
trinity -q -l off -n -c sendmsg -c connect

With help from Hannes Frederic Sowa

Reported-by: Dave Jones <da...@redhat.com>
Reported-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: Eric Dumazet <edum...@google.com>
Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv6/ip6_output.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6ba0fe2..bba91a1 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -920,11 +920,17 @@ static struct dst_entry *ip6_sk_dst_check(struct sock *sk,
struct flowi *fl)
{
struct ipv6_pinfo *np = inet6_sk(sk);
- struct rt6_info *rt = (struct rt6_info *)dst;
+ struct rt6_info *rt;

if (!dst)
goto out;

+ if (dst->ops->family != AF_INET6) {
+ dst_release(dst);
+ return NULL;
+ }
+
+ rt = (struct rt6_info *)dst;
/* Yes, checking route validity in not connected
* case is not very simple. Take into account,
* that we do not support routing by source, TOS,

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dbor...@redhat.com>

commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream

Some occurences in the netfilter tree use skb_header_pointer() in
the following way ...

struct dccp_hdr _dh, *dh;
...
skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);

... where dh itself is a pointer that is being passed as the copy
buffer. Instead, we need to use &_dh as the forth argument so that
we're copying the data into an actual buffer that sits on the stack.

Currently, we probably could overwrite memory on the stack (e.g.
with a possibly mal-formed DCCP packet), but unintentionally, as
we only want the buffer to be placed into _dh variable.

Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support")
Signed-off-by: Daniel Borkmann <dbor...@redhat.com>

Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/netfilter/nf_conntrack_proto_dccp.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
index 1b816a2..274e8a7 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -430,7 +430,7 @@ static bool dccp_new(struct nf_conn *ct, const struct sk_buff *skb,
const char *msg;
u_int8_t state;

- dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+ dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
BUG_ON(dh == NULL);

state = dccp_state_table[CT_DCCP_ROLE_CLIENT][dh->dccph_type][CT_DCCP_NONE];
@@ -479,7 +479,7 @@ static int dccp_packet(struct nf_conn *ct, const struct sk_buff *skb,
u_int8_t type, old_state, new_state;
enum ct_dccp_roles role;

- dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+ dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
BUG_ON(dh == NULL);
type = dh->dccph_type;

@@ -570,7 +570,7 @@ static int dccp_error(struct net *net, struct sk_buff *skb,
unsigned int cscov;
const char *msg;

- dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+ dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
if (dh == NULL) {
msg = "nf_ct_dccp: short packet ";
goto out_invalid;

Willy Tarreau

unread,

May 11, 2014, 10:10:03 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Jiri Bohac <jbo...@suse.cz>

[ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ]

RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination
unreachable) messages:
5 - Source address failed ingress/egress policy
6 - Reject route to destination

Now they are treated as protocol error and icmpv6_err_convert() converts them
to EPROTO.

RFC 4443 says:
"Codes 5 and 6 are more informative subsets of code 1."

Treat codes 5 and 6 as code 1 (EACCES)

Btw, connect() returning -EPROTO confuses firefox, so that fallback to
other/IPv4 addresses does not work:
https://bugzilla.mozilla.org/show_bug.cgi?id=910773

Signed-off-by: Jiri Bohac <jbo...@suse.cz>

Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>
Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

include/linux/icmpv6.h | 2 ++
net/ipv6/icmp.c | 10 +++++++++-
2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h
index c0d8357..2e3d33c 100644
--- a/include/linux/icmpv6.h
+++ b/include/linux/icmpv6.h
@@ -123,6 +123,8 @@ static inline struct icmp6hdr *icmp6_hdr(const struct sk_buff *skb)
#define ICMPV6_NOT_NEIGHBOUR 2
#define ICMPV6_ADDR_UNREACH 3
#define ICMPV6_PORT_UNREACH 4
+#define ICMPV6_POLICY_FAIL 5
+#define ICMPV6_REJECT_ROUTE 6

/*
* Codes for Time Exceeded
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index f23ebbe..376a4b6 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -903,6 +903,14 @@ static const struct icmp6_err {
.err = ECONNREFUSED,
.fatal = 1,
},
+ { /* POLICY_FAIL */
+ .err = EACCES,
+ .fatal = 1,
+ },
+ { /* REJECT_ROUTE */
+ .err = EACCES,
+ .fatal = 1,
+ },
};

int icmpv6_err_convert(u8 type, u8 code, int *err)
@@ -914,7 +922,7 @@ int icmpv6_err_convert(u8 type, u8 code, int *err)
switch (type) {
case ICMPV6_DEST_UNREACH:
fatal = 1;
- if (code <= ICMPV6_PORT_UNREACH) {
+ if (code < ARRAY_SIZE(tab_unreach)) {
*err = tab_unreach[code].err;
fatal = tab_unreach[code].fatal;

Willy Tarreau

unread,

May 11, 2014, 10:10:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: stephen hemminger <shemm...@vyatta.com>

TCP Cubic keeps a metric that estimates the amount of delayed
acknowledgements to use in adjusting the window. If an abnormally
large number of packets are acknowledged at once, then the update
could wrap and reach zero. This kind of ACK could only
happen when there was a large window and huge number of
ACK's were lost.

This patch limits the value of delayed ack ratio. The choice of 32
is just a conservative value since normally it should be range of
1 to 4 packets.

Signed-off-by: Stephen Hemminger <shemm...@vyatta.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

(cherry picked from commit b9f47a3aaeabdce3b42829bbb27765fa340f76ba)
[wt: in 2.6.32, this fix is needed for the next one]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/tcp_cubic.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 0d41c26..1b6b8c2 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -90,6 +90,7 @@ struct bictcp {
u32 ack_cnt; /* number of acks */
u32 tcp_cwnd; /* estimated tcp cwnd */
#define ACK_RATIO_SHIFT 4
+#define ACK_RATIO_LIMIT (32u << ACK_RATIO_SHIFT)
u16 delayed_ack; /* estimate the ratio of Packets/ACKs << 4 */
u8 sample_cnt; /* number of samples to decide curr_rtt */
u8 found; /* the exit point is found? */
@@ -379,8 +380,12 @@ static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
u32 delay;

if (icsk->icsk_ca_state == TCP_CA_Open) {
- cnt -= ca->delayed_ack >> ACK_RATIO_SHIFT;
- ca->delayed_ack += cnt;
+ u32 ratio = ca->delayed_ack;
+
+ ratio -= ca->delayed_ack >> ACK_RATIO_SHIFT;
+ ratio += cnt;
+
+ ca->delayed_ack = min(ratio, ACK_RATIO_LIMIT);
}

/* Some calls are for duplicates without timetamps */

Willy Tarreau

unread,

May 11, 2014, 10:10:04 PM5/11/14

to

2.6.32-longterm review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

[ Upstream commit a622260254ee481747cceaaa8609985b29a31565 ]

Daniel Petre reported crashes in icmp_dst_unreach() with following call
graph:

Daniel found a similar problem mentioned in
http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html

And indeed this is the root cause : skb->cb[] contains data fooling IP
stack.

We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure()
is called. Or else skb->cb[] might contain garbage from GSO segmentation
layer.

A similar fix was tested on linux-3.9, but gre code was refactored in
linux-3.10. I'll send patches for stable kernels as well.

Many thanks to Daniel for providing reports, patches and testing !

Reported-by: Daniel Petre <daniel...@rcs-rds.ro>
Signed-off-by: Eric Dumazet <edum...@google.com>

Signed-off-by: David S. Miller <da...@davemloft.net>

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---

net/ipv4/ipip.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 860b5c5..49aa1ad 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -408,6 +408,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
if (tos&1)
tos = old_iph->tos;

+ memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
if (!dst) {
/* NBMA tunnel */
if ((rt = skb_rtable(skb)) == NULL) {
@@ -494,7 +495,6 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
skb->transport_header = skb->network_header;
skb_push(skb, sizeof(struct iphdr));
skb_reset_network_header(skb);
- memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
IPSKB_REROUTED);
skb_dst_drop(skb);