Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH 4.4 022/137] Btrfs: fix fitrim discarding device area reserved for boot loaders use

159 views
Skip to first unread message

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:10:08 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdma...@suse.com>

commit 8cdc7c5b00d945a3c823fc4277af304abb9cb43d upstream.

As of the 4.3 kernel release, the fitrim ioctl can now discard any region
of a disk that is not allocated to any chunk/block group, including the
first megabyte which is used for our primary superblock and by the boot
loader (grub for example).

Fix this by not allowing to trim/discard any region in the device starting
with an offset not greater than min(alloc_start_mount_option, 1Mb), just
as it was not possible before 4.3.

A reproducer test case for xfstests follows.

seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
tmp=/tmp/$$
status=1 # failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15

_cleanup()
{
cd /
rm -f $tmp.*
}

# get standard environment, filters and checks
. ./common/rc
. ./common/filter

# real QA test starts here
_need_to_be_root
_supported_fs btrfs
_supported_os Linux
_require_scratch

rm -f $seqres.full

_scratch_mkfs >>$seqres.full 2>&1

# Write to the [0, 64Kb[ and [68Kb, 1Mb[ ranges of the device. These ranges are
# reserved for a boot loader to use (GRUB for example) and btrfs should never
# use them - neither for allocating metadata/data nor should trim/discard them.
# The range [64Kb, 68Kb[ is used for the primary superblock of the filesystem.
$XFS_IO_PROG -c "pwrite -S 0xfd 0 64K" $SCRATCH_DEV | _filter_xfs_io
$XFS_IO_PROG -c "pwrite -S 0xfd 68K 956K" $SCRATCH_DEV | _filter_xfs_io

# Now mount the filesystem and perform a fitrim against it.
_scratch_mount
_require_batched_discard $SCRATCH_MNT
$FSTRIM_PROG $SCRATCH_MNT

# Now unmount the filesystem and verify the content of the ranges was not
# modified (no trim/discard happened on them).
_scratch_unmount
echo "Content of the ranges [0, 64Kb] and [68Kb, 1Mb[ after fitrim:"
od -t x1 -N $((64 * 1024)) $SCRATCH_DEV
od -t x1 -j $((68 * 1024)) -N $((956 * 1024)) $SCRATCH_DEV

status=0
exit

Reported-by: Vincent Petry <PVin...@yahoo.fr>
Reported-by: Andrei Borzenkov <arvi...@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109341
Fixes: 499f377f49f0 (btrfs: iterate over unused chunk space in FITRIM)
Signed-off-by: Filipe Manana <fdma...@suse.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/btrfs/volumes.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1257,6 +1257,15 @@ int find_free_dev_extent_start(struct bt
int ret;
int slot;
struct extent_buffer *l;
+ u64 min_search_start;
+
+ /*
+ * We don't want to overwrite the superblock on the drive nor any area
+ * used by the boot loader (grub for example), so we make sure to start
+ * at an offset of at least 1MB.
+ */
+ min_search_start = max(root->fs_info->alloc_start, 1024ull * 1024);
+ search_start = max(search_start, min_search_start);

path = btrfs_alloc_path();
if (!path)
@@ -1397,18 +1406,9 @@ int find_free_dev_extent(struct btrfs_tr
struct btrfs_device *device, u64 num_bytes,
u64 *start, u64 *len)
{
- struct btrfs_root *root = device->dev_root;
- u64 search_start;
-
/* FIXME use last free of some kind */
-
- /*
- * we don't want to overwrite the superblock on the drive,
- * so we make sure to start at an offset of at least 1MB
- */
- search_start = max(root->fs_info->alloc_start, 1024ull * 1024);
return find_free_dev_extent_start(trans->transaction, device,
- num_bytes, search_start, start, len);
+ num_bytes, 0, start, len);
}

static int btrfs_free_dev_extent(struct btrfs_trans_handle *trans,

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:10:09 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdma...@suse.com>

commit 313140023026ae542ad76e7e268c56a1eaa2c28e upstream.

In the extent_same ioctl, we were grabbing the pages (locked) and
attempting to read them without bothering about any concurrent IO
against them. That is, we were not checking for any ongoing ordered
extents nor waiting for them to complete, which leads to a race where
the extent_same() code gets a checksum verification error when it
reads the pages, producing a message like the following in dmesg
and making the operation fail to user space with -ENOMEM:

[18990.161265] BTRFS warning (device sdc): csum failed ino 259 off 495616 csum 685204116 expected csum 1515870868

Fix this by using btrfs_readpage() for reading the pages instead of
extent_read_full_page_nolock(), which waits for any concurrent ordered
extents to complete and locks the io range. Also do better error handling
and don't treat all failures as -ENOMEM, as that's clearly misleasing,
becoming identical to the checks and operation of prepare_uptodate_page().

The use of extent_read_full_page_nolock() was required before
commit f441460202cb ("btrfs: fix deadlock with extent-same and readpage"),
as we had the range locked in an inode's io tree before attempting to
read the pages.

Fixes: f441460202cb ("btrfs: fix deadlock with extent-same and readpage")
Signed-off-by: Filipe Manana <fdma...@suse.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/btrfs/ioctl.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2782,21 +2782,27 @@ out:
static struct page *extent_same_get_page(struct inode *inode, pgoff_t index)
{
struct page *page;
- struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;

page = grab_cache_page(inode->i_mapping, index);
if (!page)
- return NULL;
+ return ERR_PTR(-ENOMEM);

if (!PageUptodate(page)) {
- if (extent_read_full_page_nolock(tree, page, btrfs_get_extent,
- 0))
- return NULL;
+ int ret;
+
+ ret = btrfs_readpage(NULL, page);
+ if (ret)
+ return ERR_PTR(ret);
lock_page(page);
if (!PageUptodate(page)) {
unlock_page(page);
page_cache_release(page);
- return NULL;
+ return ERR_PTR(-EIO);
+ }
+ if (page->mapping != inode->i_mapping) {
+ unlock_page(page);
+ page_cache_release(page);
+ return ERR_PTR(-EAGAIN);
}
}

@@ -2810,9 +2816,16 @@ static int gather_extent_pages(struct in
pgoff_t index = off >> PAGE_CACHE_SHIFT;

for (i = 0; i < num_pages; i++) {
+again:
pages[i] = extent_same_get_page(inode, index + i);
- if (!pages[i])
- return -ENOMEM;
+ if (IS_ERR(pages[i])) {
+ int err = PTR_ERR(pages[i]);
+
+ if (err == -EAGAIN)
+ goto again;
+ pages[i] = NULL;
+ return err;
+ }
}
return 0;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:10:11 PM2/23/16
to
This is the start of the stable review cycle for the 4.4.3 release.
There are 137 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri Feb 26 03:33:58 UTC 2016.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.3-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gre...@linuxfoundation.org>
Linux 4.4.3-rc1

Luis R. Rodriguez <mcg...@suse.com>
modules: fix modparam async_probe request

Rusty Russell <ru...@rustcorp.com.au>
module: wrapper for symbol name.

Thomas Gleixner <tg...@linutronix.de>
itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper

Thomas Gleixner <tg...@linutronix.de>
posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper

Thomas Gleixner <tg...@linutronix.de>
timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper

Mateusz Guzik <mgu...@redhat.com>
prctl: take mmap sem for writing to protect against others

Dave Chinner <dchi...@redhat.com>
xfs: log mount failures don't wait for buffers to be released

Dave Chinner <da...@fromorbit.com>
Revert "xfs: clear PF_NOFREEZE for xfsaild kthread"

Dave Chinner <dchi...@redhat.com>
xfs: inode recovery readahead can race with inode buffer creation

Darrick J. Wong <darric...@oracle.com>
libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct

Miklos Szeredi <mik...@szeredi.hu>
ovl: setattr: check permissions before copy-up

Miklos Szeredi <mik...@szeredi.hu>
ovl: root: copy attr

Konstantin Khlebnikov <khleb...@yandex-team.ru>
ovl: check dentry positiveness in ovl_cleanup_whiteouts()

Vito Caputo <vito....@coreos.com>
ovl: use a minimal buffer in ovl_copy_xattr

Miklos Szeredi <mik...@szeredi.hu>
ovl: allow zero size xattr

Thomas Gleixner <tg...@linutronix.de>
futex: Drop refcount if requeue_pi() acquired the rtmutex

Toshi Kani <toshi...@hpe.com>
devm_memremap_release(): fix memremap'd addr handling

Kirill A. Shutemov <kirill....@linux.intel.com>
ipc/shm: handle removed segments gracefully in shm_mmap()

Dan Carpenter <dan.ca...@oracle.com>
intel_scu_ipcutil: underflow in scu_reg_access()

Vineet Gupta <Vineet...@synopsys.com>
mm,thp: khugepaged: call pte flush at the time of collapse

Eric Dumazet <edum...@google.com>
dump_stack: avoid potential deadlocks

Konstantin Khlebnikov <koc...@gmail.com>
radix-tree: fix oops after radix_tree_iter_retry

Matthew Wilcox <wi...@linux.intel.com>
drivers/hwspinlock: fix race between radix tree insertion and lookup

Matthew Wilcox <wi...@linux.intel.com>
radix-tree: fix race in gang lookup

Rich Felker <dal...@libc.org>
MAINTAINERS: return arch/sh to maintained state, with new maintainers

Martijn Coenen <ma...@google.com>
memcg: only free spare array when readers are done

Michael Holzheu <hol...@linux.vnet.ibm.com>
numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390

Mike Kravetz <mike.k...@oracle.com>
fs/hugetlbfs/inode.c: fix bugs in hugetlb_vmtruncate_list()

Sergey Senozhatsky <sergey.seno...@gmail.com>
scripts/bloat-o-meter: fix python3 syntax error

Laura Abbott <lab...@fedoraproject.org>
dma-debug: switch check from _text to _stext

Sudip Mukherjee <sudipm.m...@gmail.com>
m32r: fix m32104ut_defconfig build fail

Mathias Nyman <mathia...@linux.intel.com>
xhci: Fix list corruption in urb dequeue at host removal

Mathias Nyman <mathia...@linux.intel.com>
Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"

David Woodhouse <David.W...@intel.com>
iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts

CQ Tang <cq....@intel.com>
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG

David Woodhouse <David.W...@intel.com>
iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users

Baoquan He <b...@redhat.com>
iommu/amd: Correct the wrong setting of alias DTE in do_attach

Jeremy McNicoll <jmcn...@redhat.com>
iommu/vt-d: Don't skip PCI devices when disabling IOTLB

Dmitry Torokhov <dmitry....@gmail.com>
Input: vmmouse - fix absolute device registration

James Bottomley <JBott...@Odin.com>
string_helpers: fix precision loss for some inputs

Aurélien Francillon <aure...@francillon.net>
Input: i8042 - add Fujitsu Lifebook U745 to the nomux list

Benjamin Tissoires <benjamin....@redhat.com>
Input: elantech - mark protocols v2 and v3 as semi-mt

Kirill A. Shutemov <kirill....@linux.intel.com>
mm: fix regression in remap_file_pages() emulation

Konstantin Khlebnikov <koc...@gmail.com>
mm: replace vma_lock_anon_vma with anon_vma_lock_read/write

Kirill A. Shutemov <kirill....@linux.intel.com>
mm: fix mlock accouting

Dan Williams <dan.j.w...@intel.com>
libnvdimm: fix namespace object confusion in is_uuid_busy()

Naoya Horiguchi <n-hor...@ah.jp.nec.com>
mm: soft-offline: check return value in second __get_any_page() call

Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data

Greg Kurz <gk...@linux.vnet.ibm.com>
KVM: PPC: Fix ONE_REG AltiVec support

Thomas Huth <th...@redhat.com>
KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

Andre Przywara <andre.p...@arm.com>
KVM: arm/arm64: Fix reference to uninitialised VGIC

Marek Szyprowski <m.szyp...@samsung.com>
arm64: dma-mapping: fix handling of devices registered before arch_initcall

Tony Lindgren <to...@atomide.com>
ARM: OMAP2+: Fix ppa_zero_params and ppa_por_params for rodata

Tony Lindgren <to...@atomide.com>
ARM: OMAP2+: Fix save_secure_ram_context for rodata

Tony Lindgren <to...@atomide.com>
ARM: OMAP2+: Fix l2dis_3630 for rodata

Tony Lindgren <to...@atomide.com>
ARM: OMAP2+: Fix l2_inv_api_params for rodata

Tony Lindgren <to...@atomide.com>
ARM: OMAP2+: Fix wait_dll_lock_timed for rodata

Wenyou Yang <wenyo...@atmel.com>
ARM: dts: at91: sama5d4ek: add phy address and IRQ for macb0

Nicolas Ferre <nicola...@atmel.com>
ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type

Mohamed Jamsheeth Hajanajubudeen <mohamedjamsheet...@atmel.com>
ARM: dts: at91: sama5d4: fix instance id of DBGU

Alexandre Belloni <alexandr...@free-electrons.com>
ARM: dts: at91: sama5d4 xplained: properly mux phy interrupt

H. Nikolaus Schaller <h...@goldelico.com>
ARM: dts: omap5-board-common: enable rtc and charging of backup battery

Tony Lindgren <to...@atomide.com>
ARM: dts: Fix omap5 PMIC control lines for RTC writes

Adam Ford <afor...@gmail.com>
ARM: dts: Fix wl12xx missing clocks that cause hangs

Linus Walleij <linus....@linaro.org>
ARM: nomadik: fix up SD/MMC DT settings

Linus Walleij <linus....@linaro.org>
ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz()

Linus Walleij <linus....@linaro.org>
ARM: 8519/1: ICST: try other dividends than 1

Mika Penttilä <mika.p...@nextfour.com>
arm64: mm: avoid calling apply_to_page_range on empty range

Thomas Petazzoni <thomas.p...@free-electrons.com>
ARM: mvebu: remove duplicated regulator definition in Armada 388 GP

Alexey Kardashevskiy <a...@ozlabs.ru>
powerpc/ioda: Set "read" permission when "write" is set

Gavin Shan <gws...@linux.vnet.ibm.com>
powerpc/powernv: Fix stale PE primary bus

Gavin Shan <gws...@linux.vnet.ibm.com>
powerpc/eeh: Fix stale cached primary bus

Andreas Schwab <sch...@linux-m68k.org>
powerpc: Fix dedotify for binutils >= 2.26

Alan Modra <amo...@gmail.com>
powerpc: Simplify module TOC handling

Gavin Shan <gws...@linux.vnet.ibm.com>
powerpc/eeh: Fix PE location code

Trond Myklebust <trond.m...@primarydata.com>
SUNRPC: Fixup socket wait for memory

Andrew Gabbasov <andrew_...@mentor.com>
udf: Check output buffer length when converting name to CS0

Andrew Gabbasov <andrew_...@mentor.com>
udf: Prevent buffer overrun with multi-byte characters

Vegard Nossum <vegard...@oracle.com>
udf: limit the maximum number of indirect extents in a row

Trond Myklebust <trond.m...@primarydata.com>
pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn

Andrew Elble <awe...@rit.edu>
nfs: Fix race in __update_open_stateid()

Trond Myklebust <trond.m...@primarydata.com>
pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()

Trond Myklebust <trond.m...@primarydata.com>
NFS: Fix attribute cache revalidation

Anton Protopopov <a.s.pro...@gmail.com>
cifs: fix erroneous return value

Vasily Averin <v...@virtuozzo.com>
cifs_dbg() outputs an uninitialized buffer in cifs_readdir()

Rabin Vincent <rabin....@axis.com>
cifs: fix race between call_async() and reconnect()

Jamie Bainbridge <jamie.ba...@gmail.com>
cifs: Ratelimit kernel log messages

Dan Carpenter <dan.ca...@oracle.com>
iio: inkern: fix a NULL dereference on error

Akinobu Mita <akinob...@gmail.com>
iio: pressure: mpl115: fix temperature offset sign

Gabriele Mazzotta <gabrie...@gmail.com>
iio: light: acpi-als: Report data as processed

Yong Li <sdli...@gmail.com>
iio: dac: mcp4725: set iio name property in sysfs

Vegard Nossum <vegard...@oracle.com>
iio: add IIO_TRIGGER dependency to STK8BA50

Vegard Nossum <vegard...@oracle.com>
iio: add HAS_IOMEM dependency to VF610_ADC

Markus Elfring <elf...@users.sourceforge.net>
iio-light: Use a signed return type for ltr501_match_samp_freq()

Jonathan Cameron <ji...@kernel.org>
iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.

Lars-Peter Clausen <la...@metafoo.de>
iio: adis_buffer: Fix out-of-bounds memory access

James Bottomley <James.B...@HansenPartnership.com>
scsi: fix soft lockup in scsi_remove_target() on module removal

Mika Westerberg <mika.we...@linux.intel.com>
SCSI: Add Marvell Console to VPD blacklist

Hannes Reinecke <ha...@suse.de>
scsi_dh_rdac: always retry MODE SELECT on command lock violation

Kirill A. Shutemov <kirill....@linux.intel.com>
drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration

Alan Stern <st...@rowland.harvard.edu>
SCSI: fix crashes in sd and sr runtime PM

Nicholas Bellinger <n...@linux-iscsi.org>
iscsi-target: Fix potential dead-lock during node acl delete

Mike Christie <mchr...@redhat.com>
scsi: add Synology to 1024 sector blacklist

James Bottomley <James.B...@HansenPartnership.com>
klist: fix starting point removed bug in klist iterators

Steven Rostedt (Red Hat) <ros...@goodmis.org>
tracepoints: Do not trace when cpu is offline

Arnd Bergmann <ar...@arndb.de>
tracing: Fix freak link error caused by branch tracer

Adrian Hunter <adrian...@intel.com>
perf tools: tracepoint_error() can receive e=NULL, robustify it

Steven Rostedt <ros...@goodmis.org>
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines

Jann Horn <ja...@thejh.net>
ptrace: use fsuid, fsgid, effective creds for fs access checks

Filipe Manana <fdma...@suse.com>
Btrfs: fix direct IO requests not reporting IO error to user space

Filipe Manana <fdma...@suse.com>
Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl

Filipe Manana <fdma...@suse.com>
Btrfs: fix page reading in extent_same ioctl leading to csum errors

Filipe Manana <fdma...@suse.com>
Btrfs: fix invalid page accesses in extent_same (dedup) ioctl

David Sterba <dst...@suse.com>
btrfs: properly set the termination value of ctx->pos in readdir

David Sterba <dst...@suse.com>
Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"

Filipe Manana <fdma...@suse.com>
Btrfs: fix fitrim discarding device area reserved for boot loader's use

David Sterba <dst...@suse.com>
btrfs: handle invalid num_stripes in sys_array

Eryu Guan <guan...@gmail.com>
ext4: don't read blocks from disk after extents being swapped

Insu Yun <wun...@gmail.com>
ext4: fix potential integer overflow

Jan Kara <ja...@suse.cz>
ext4: fix scheduling in atomic on group checksum failure

Peter Hurley <pe...@hurleysoftware.com>
serial: omap: Prevent DoS using unprivileged ioctl(TIOCSRS485)

Mika Westerberg <mika.we...@linux.intel.com>
serial: 8250_pci: Add Intel Broadwell ports

Jeremy McNicoll <jmcn...@redhat.com>
tty: Add support for PCIe WCH382 2S multi-IO card

Herton R. Krzesinski <her...@redhat.com>
pty: make sure super_block is still valid in final /dev/tty close

Herton R. Krzesinski <her...@redhat.com>
pty: fix possible use after free of tty->driver_data

Peter Hurley <pe...@hurleysoftware.com>
staging/speakup: Use tty_ldisc_ref() for paste kworker

Tony Lindgren <to...@atomide.com>
phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload

Tony Lindgren <to...@atomide.com>
phy: twl4030-usb: Relase usb phy on unload

Takashi Iwai <ti...@suse.de>
ALSA: seq: Fix double port list deletion

Takashi Iwai <ti...@suse.de>
ALSA: seq: Fix leak of pool buffer at concurrent writes

Takashi Iwai <ti...@suse.de>
ALSA: pcm: Fix rwsem deadlock for non-atomic PCM stream

Takashi Iwai <ti...@suse.de>
ALSA: hda - Cancel probe work instead of flush at remove

Toshi Kani <toshi...@hpe.com>
x86/mm: Fix vmalloc_fault() to handle large pages properly

Toshi Kani <toshi...@hpe.com>
x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()

Toshi Kani <toshi...@hpe.com>
x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable

Matt Fleming <ma...@codeblueprint.co.uk>
x86/mm/pat: Avoid truncation when converting cpa->numpages to address

Jan Beulich <JBeu...@suse.com>
x86/mm: Fix types used in pgprot cacheability flags translations


-------------

Diffstat:

MAINTAINERS | 4 +-
Makefile | 4 +-
arch/arm/boot/dts/armada-388-gp.dts | 10 --
arch/arm/boot/dts/at91-sama5d4_xplained.dts | 8 +-
arch/arm/boot/dts/at91-sama5d4ek.dts | 11 +++
arch/arm/boot/dts/logicpd-torpedo-som.dtsi | 1 +
arch/arm/boot/dts/omap5-board-common.dtsi | 33 +++++++
arch/arm/boot/dts/sama5d4.dtsi | 2 +-
arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 37 +++----
arch/arm/common/icst.c | 9 +-
arch/arm/mach-omap2/sleep34xx.S | 61 ++++++------
arch/arm/mach-omap2/sleep44xx.S | 25 +++--
arch/arm64/mm/dma-mapping.c | 4 +
arch/arm64/mm/pageattr.c | 3 +
arch/m32r/kernel/setup.c | 3 +
arch/powerpc/include/asm/eeh.h | 1 +
arch/powerpc/kernel/eeh_driver.c | 3 +
arch/powerpc/kernel/eeh_pe.c | 35 +++----
arch/powerpc/kernel/misc_64.S | 28 ------
arch/powerpc/kernel/module_64.c | 14 ++-
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
arch/powerpc/kvm/powerpc.c | 20 ++--
arch/powerpc/platforms/powernv/eeh-powernv.c | 5 +-
arch/powerpc/platforms/powernv/pci-ioda.c | 1 +
arch/powerpc/platforms/powernv/pci.c | 26 +++++
arch/powerpc/platforms/powernv/pci.h | 1 +
arch/x86/include/asm/pgtable_types.h | 6 +-
arch/x86/lib/copy_user_64.S | 142 +++++++++++++++++++--------
arch/x86/mm/fault.c | 15 ++-
arch/x86/mm/pageattr.c | 4 +-
drivers/hwspinlock/hwspinlock_core.c | 4 +
drivers/iio/accel/Kconfig | 1 +
drivers/iio/adc/Kconfig | 1 +
drivers/iio/adc/ti_am335x_adc.c | 2 +-
drivers/iio/dac/mcp4725.c | 1 +
drivers/iio/imu/adis_buffer.c | 2 +-
drivers/iio/inkern.c | 2 +
drivers/iio/light/acpi-als.c | 6 +-
drivers/iio/light/ltr501.c | 2 +-
drivers/iio/pressure/mpl115.c | 2 +-
drivers/input/mouse/elantech.c | 2 +-
drivers/input/mouse/vmmouse.c | 13 +--
drivers/input/serio/i8042-x86ia64io.h | 7 ++
drivers/iommu/amd_iommu.c | 2 +-
drivers/iommu/dmar.c | 2 +-
drivers/iommu/intel-iommu.c | 2 +-
drivers/iommu/intel-svm.c | 37 +++++--
drivers/iommu/intel_irq_remapping.c | 2 +-
drivers/nvdimm/namespace_devs.c | 53 ++++++++++
drivers/nvdimm/region_devs.c | 56 -----------
drivers/phy/phy-twl4030-usb.c | 14 ++-
drivers/platform/x86/intel_scu_ipcutil.c | 2 +-
drivers/scsi/device_handler/scsi_dh_rdac.c | 4 +-
drivers/scsi/scsi_devinfo.c | 2 +
drivers/scsi/scsi_sysfs.c | 6 +-
drivers/scsi/sd.c | 7 +-
drivers/scsi/sg.c | 2 +-
drivers/scsi/sr.c | 4 +
drivers/staging/speakup/selection.c | 5 +-
drivers/target/iscsi/iscsi_target_configfs.c | 16 ++-
drivers/tty/pty.c | 21 +++-
drivers/tty/serial/8250/8250_pci.c | 50 ++++++++++
drivers/tty/serial/omap-serial.c | 8 +-
drivers/usb/host/xhci-ring.c | 10 --
drivers/usb/host/xhci.c | 4 +-
fs/btrfs/backref.c | 10 +-
fs/btrfs/delayed-inode.c | 3 +-
fs/btrfs/delayed-inode.h | 2 +-
fs/btrfs/disk-io.c | 1 -
fs/btrfs/inode.c | 16 ++-
fs/btrfs/ioctl.c | 119 +++++++++++++++++-----
fs/btrfs/volumes.c | 28 ++++--
fs/cifs/cifs_debug.c | 2 +-
fs/cifs/cifs_debug.h | 9 +-
fs/cifs/cifsencrypt.c | 2 +-
fs/cifs/connect.c | 2 +-
fs/cifs/readdir.c | 1 +
fs/cifs/transport.c | 6 +-
fs/devpts/inode.c | 20 ++++
fs/ext4/balloc.c | 7 +-
fs/ext4/ialloc.c | 6 +-
fs/ext4/move_extent.c | 15 ++-
fs/ext4/resize.c | 2 +-
fs/hugetlbfs/inode.c | 19 ++--
fs/nfs/flexfilelayout/flexfilelayout.c | 8 +-
fs/nfs/inode.c | 54 +++++++---
fs/nfs/nfs4proc.c | 2 +-
fs/overlayfs/copy_up.c | 41 +++++---
fs/overlayfs/inode.c | 13 +++
fs/overlayfs/readdir.c | 3 +-
fs/overlayfs/super.c | 5 +
fs/proc/array.c | 2 +-
fs/proc/base.c | 21 ++--
fs/proc/namespaces.c | 4 +-
fs/proc/task_mmu.c | 7 +-
fs/timerfd.c | 2 +-
fs/udf/inode.c | 15 +++
fs/udf/unicode.c | 21 +++-
fs/xfs/libxfs/xfs_format.h | 2 +-
fs/xfs/libxfs/xfs_inode_buf.c | 12 ++-
fs/xfs/xfs_buf.c | 17 ++++
fs/xfs/xfs_trans_ail.c | 1 -
include/linux/compiler.h | 2 +-
include/linux/devpts_fs.h | 4 +
include/linux/intel-iommu.h | 3 +
include/linux/ptrace.h | 24 ++++-
include/linux/radix-tree.h | 22 ++++-
include/linux/rmap.h | 14 ---
include/linux/tracepoint.h | 5 +
ipc/shm.c | 53 ++++++++--
kernel/events/core.c | 2 +-
kernel/futex.c | 7 +-
kernel/futex_compat.c | 2 +-
kernel/kcmp.c | 4 +-
kernel/memremap.c | 2 +-
kernel/module.c | 28 +++---
kernel/ptrace.c | 39 ++++++--
kernel/sys.c | 20 ++--
kernel/time/itimer.c | 2 +-
kernel/time/posix-timers.c | 2 +-
lib/dma-debug.c | 2 +-
lib/dump_stack.c | 7 +-
lib/klist.c | 6 +-
lib/radix-tree.c | 12 ++-
lib/string_helpers.c | 63 ++++++++----
mm/memcontrol.c | 11 ++-
mm/memory-failure.c | 2 +-
mm/mlock.c | 2 +-
mm/mmap.c | 89 ++++++++++-------
mm/pgtable-generic.c | 4 +-
mm/process_vm_access.c | 2 +-
net/sunrpc/xprtsock.c | 49 ++++-----
scripts/bloat-o-meter | 8 +-
scripts/mod/modpost.c | 3 +-
security/commoncap.c | 7 +-
sound/core/pcm_native.c | 16 ++-
sound/core/seq/seq_memory.c | 13 ++-
sound/core/seq/seq_ports.c | 13 ++-
sound/pci/hda/hda_intel.c | 4 +-
tools/lib/traceevent/event-parse.c | 5 +-
tools/perf/util/parse-events.c | 3 +
tools/perf/util/session.c | 2 +-
virt/kvm/arm/arch_timer.c | 9 +-
143 files changed, 1320 insertions(+), 619 deletions(-)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:10:17 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jann Horn <ja...@thejh.net>

commit caaee6234d05a58c5b4d05e7bf766131b810a657 upstream.

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g. by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

/proc/$pid/stat - uses the check for determining whether pointers
should be visible, useful for bypassing ASLR
/proc/$pid/maps - also useful for bypassing ASLR
/proc/$pid/cwd - useful for gaining access to restricted
directories that contain files with lax permissions, e.g. in
this scenario:
lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
drwx------ root root /root
drwxr-xr-x root root /root/foobar
-rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[ak...@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn <ja...@thejh.net>
Acked-by: Kees Cook <kees...@chromium.org>
Cc: Casey Schaufler <ca...@schaufler-ca.com>
Cc: Oleg Nesterov <ol...@redhat.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: James Morris <james.l...@oracle.com>
Cc: "Serge E. Hallyn" <serge....@ubuntu.com>
Cc: Andy Shevchenko <andriy.s...@linux.intel.com>
Cc: Andy Lutomirski <lu...@kernel.org>
Cc: Al Viro <vi...@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebie...@xmission.com>
Cc: Willy Tarreau <w...@1wt.eu>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/proc/array.c | 2 +-
fs/proc/base.c | 21 +++++++++++----------
fs/proc/namespaces.c | 4 ++--
include/linux/ptrace.h | 24 +++++++++++++++++++++++-
kernel/events/core.c | 2 +-
kernel/futex.c | 2 +-
kernel/futex_compat.c | 2 +-
kernel/kcmp.c | 4 ++--
kernel/ptrace.c | 39 +++++++++++++++++++++++++++++++--------
mm/process_vm_access.c | 2 +-
security/commoncap.c | 7 ++++++-
11 files changed, 80 insertions(+), 29 deletions(-)

--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -395,7 +395,7 @@ static int do_task_stat(struct seq_file

state = *get_task_state(task);
vsize = eip = esp = 0;
- permitted = ptrace_may_access(task, PTRACE_MODE_READ | PTRACE_MODE_NOAUDIT);
+ permitted = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS | PTRACE_MODE_NOAUDIT);
mm = get_task_mm(task);
if (mm) {
vsize = task_vsize(mm);
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -403,7 +403,7 @@ static const struct file_operations proc
static int proc_pid_auxv(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
- struct mm_struct *mm = mm_access(task, PTRACE_MODE_READ);
+ struct mm_struct *mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
if (mm && !IS_ERR(mm)) {
unsigned int nwords = 0;
do {
@@ -430,7 +430,8 @@ static int proc_pid_wchan(struct seq_fil

wchan = get_wchan(task);

- if (wchan && ptrace_may_access(task, PTRACE_MODE_READ) && !lookup_symbol_name(wchan, symname))
+ if (wchan && ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)
+ && !lookup_symbol_name(wchan, symname))
seq_printf(m, "%s", symname);
else
seq_putc(m, '0');
@@ -444,7 +445,7 @@ static int lock_trace(struct task_struct
int err = mutex_lock_killable(&task->signal->cred_guard_mutex);
if (err)
return err;
- if (!ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
+ if (!ptrace_may_access(task, PTRACE_MODE_ATTACH_FSCREDS)) {
mutex_unlock(&task->signal->cred_guard_mutex);
return -EPERM;
}
@@ -697,7 +698,7 @@ static int proc_fd_access_allowed(struct
*/
task = get_proc_task(inode);
if (task) {
- allowed = ptrace_may_access(task, PTRACE_MODE_READ);
+ allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
put_task_struct(task);
}
return allowed;
@@ -732,7 +733,7 @@ static bool has_pid_permissions(struct p
return true;
if (in_group_p(pid->pid_gid))
return true;
- return ptrace_may_access(task, PTRACE_MODE_READ);
+ return ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
}


@@ -809,7 +810,7 @@ struct mm_struct *proc_mem_open(struct i
struct mm_struct *mm = ERR_PTR(-ESRCH);

if (task) {
- mm = mm_access(task, mode);
+ mm = mm_access(task, mode | PTRACE_MODE_FSCREDS);
put_task_struct(task);

if (!IS_ERR_OR_NULL(mm)) {
@@ -1856,7 +1857,7 @@ static int map_files_d_revalidate(struct
if (!task)
goto out_notask;

- mm = mm_access(task, PTRACE_MODE_READ);
+ mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
if (IS_ERR_OR_NULL(mm))
goto out;

@@ -2007,7 +2008,7 @@ static struct dentry *proc_map_files_loo
goto out;

result = -EACCES;
- if (!ptrace_may_access(task, PTRACE_MODE_READ))
+ if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
goto out_put_task;

result = -ENOENT;
@@ -2060,7 +2061,7 @@ proc_map_files_readdir(struct file *file
goto out;

ret = -EACCES;
- if (!ptrace_may_access(task, PTRACE_MODE_READ))
+ if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
goto out_put_task;

ret = 0;
@@ -2530,7 +2531,7 @@ static int do_io_accounting(struct task_
if (result)
return result;

- if (!ptrace_may_access(task, PTRACE_MODE_READ)) {
+ if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) {
result = -EACCES;
goto out_unlock;
}
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -42,7 +42,7 @@ static const char *proc_ns_follow_link(s
if (!task)
return error;

- if (ptrace_may_access(task, PTRACE_MODE_READ)) {
+ if (ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) {
error = ns_get_path(&ns_path, task, ns_ops);
if (!error)
nd_jump_link(&ns_path);
@@ -63,7 +63,7 @@ static int proc_ns_readlink(struct dentr
if (!task)
return res;

- if (ptrace_may_access(task, PTRACE_MODE_READ)) {
+ if (ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) {
res = ns_get_name(name, sizeof(name), task, ns_ops);
if (res >= 0)
res = readlink_copy(buffer, buflen, name);
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -57,7 +57,29 @@ extern void exit_ptrace(struct task_stru
#define PTRACE_MODE_READ 0x01
#define PTRACE_MODE_ATTACH 0x02
#define PTRACE_MODE_NOAUDIT 0x04
-/* Returns true on success, false on denial. */
+#define PTRACE_MODE_FSCREDS 0x08
+#define PTRACE_MODE_REALCREDS 0x10
+
+/* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
+#define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
+#define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
+#define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+
+/**
+ * ptrace_may_access - check whether the caller is permitted to access
+ * a target task.
+ * @task: target task
+ * @mode: selects type of access and caller credentials
+ *
+ * Returns true on success, false on denial.
+ *
+ * One of the flags PTRACE_MODE_FSCREDS and PTRACE_MODE_REALCREDS must
+ * be set in @mode to specify whether the access was requested through
+ * a filesystem syscall (should use effective capabilities and fsuid
+ * of the caller) or through an explicit syscall such as
+ * process_vm_writev or ptrace (and should use the real credentials).
+ */
extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);

static inline int ptrace_reparented(struct task_struct *child)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3434,7 +3434,7 @@ find_lively_task_by_vpid(pid_t vpid)

/* Reuse ptrace permission checks for now. */
err = -EACCES;
- if (!ptrace_may_access(task, PTRACE_MODE_READ))
+ if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
goto errout;

return task;
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2881,7 +2881,7 @@ SYSCALL_DEFINE3(get_robust_list, int, pi
}

ret = -EPERM;
- if (!ptrace_may_access(p, PTRACE_MODE_READ))
+ if (!ptrace_may_access(p, PTRACE_MODE_READ_REALCREDS))
goto err_unlock;

head = p->robust_list;
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -155,7 +155,7 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list,
}

ret = -EPERM;
- if (!ptrace_may_access(p, PTRACE_MODE_READ))
+ if (!ptrace_may_access(p, PTRACE_MODE_READ_REALCREDS))
goto err_unlock;

head = p->compat_robust_list;
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -122,8 +122,8 @@ SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t
&task2->signal->cred_guard_mutex);
if (ret)
goto err;
- if (!ptrace_may_access(task1, PTRACE_MODE_READ) ||
- !ptrace_may_access(task2, PTRACE_MODE_READ)) {
+ if (!ptrace_may_access(task1, PTRACE_MODE_READ_REALCREDS) ||
+ !ptrace_may_access(task2, PTRACE_MODE_READ_REALCREDS)) {
ret = -EPERM;
goto err_unlock;
}
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -219,6 +219,14 @@ static int ptrace_has_cap(struct user_na
static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
{
const struct cred *cred = current_cred(), *tcred;
+ int dumpable = 0;
+ kuid_t caller_uid;
+ kgid_t caller_gid;
+
+ if (!(mode & PTRACE_MODE_FSCREDS) == !(mode & PTRACE_MODE_REALCREDS)) {
+ WARN(1, "denying ptrace access check without PTRACE_MODE_*CREDS\n");
+ return -EPERM;
+ }

/* May we inspect the given task?
* This check is used both for attaching with ptrace
@@ -228,18 +236,33 @@ static int __ptrace_may_access(struct ta
* because setting up the necessary parent/child relationship
* or halting the specified task is impossible.
*/
- int dumpable = 0;
+
/* Don't let security modules deny introspection */
if (same_thread_group(task, current))
return 0;
rcu_read_lock();
+ if (mode & PTRACE_MODE_FSCREDS) {
+ caller_uid = cred->fsuid;
+ caller_gid = cred->fsgid;
+ } else {
+ /*
+ * Using the euid would make more sense here, but something
+ * in userland might rely on the old behavior, and this
+ * shouldn't be a security problem since
+ * PTRACE_MODE_REALCREDS implies that the caller explicitly
+ * used a syscall that requests access to another process
+ * (and not a filesystem syscall to procfs).
+ */
+ caller_uid = cred->uid;
+ caller_gid = cred->gid;
+ }
tcred = __task_cred(task);
- if (uid_eq(cred->uid, tcred->euid) &&
- uid_eq(cred->uid, tcred->suid) &&
- uid_eq(cred->uid, tcred->uid) &&
- gid_eq(cred->gid, tcred->egid) &&
- gid_eq(cred->gid, tcred->sgid) &&
- gid_eq(cred->gid, tcred->gid))
+ if (uid_eq(caller_uid, tcred->euid) &&
+ uid_eq(caller_uid, tcred->suid) &&
+ uid_eq(caller_uid, tcred->uid) &&
+ gid_eq(caller_gid, tcred->egid) &&
+ gid_eq(caller_gid, tcred->sgid) &&
+ gid_eq(caller_gid, tcred->gid))
goto ok;
if (ptrace_has_cap(tcred->user_ns, mode))
goto ok;
@@ -306,7 +329,7 @@ static int ptrace_attach(struct task_str
goto out;

task_lock(task);
- retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH);
+ retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
task_unlock(task);
if (retval)
goto unlock_creds;
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -194,7 +194,7 @@ static ssize_t process_vm_rw_core(pid_t
goto free_proc_pages;
}

- mm = mm_access(task, PTRACE_MODE_ATTACH);
+ mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);
if (!mm || IS_ERR(mm)) {
rc = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
/*
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -137,12 +137,17 @@ int cap_ptrace_access_check(struct task_
{
int ret = 0;
const struct cred *cred, *child_cred;
+ const kernel_cap_t *caller_caps;

rcu_read_lock();
cred = current_cred();
child_cred = __task_cred(child);
+ if (mode & PTRACE_MODE_FSCREDS)
+ caller_caps = &cred->cap_effective;
+ else
+ caller_caps = &cred->cap_permitted;
if (cred->user_ns == child_cred->user_ns &&
- cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
+ cap_issubset(child_cred->cap_permitted, *caller_caps))
goto out;
if (ns_capable(child_cred->user_ns, CAP_SYS_PTRACE))
goto out;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:11:12 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdma...@suse.com>

commit 1636d1d77ef4e01e57f706a4cae3371463896136 upstream.

If a bio for a direct IO request fails, we were not setting the error in
the parent bio (the main DIO bio), making us not return the error to
user space in btrfs_direct_IO(), that is, it made __blockdev_direct_IO()
return the number of bytes issued for IO and not the error a bio created
and submitted by btrfs_submit_direct() got from the block layer.
This essentially happens because when we call:

dio_end_io(dio_bio, bio->bi_error);

It does not set dio_bio->bi_error to the value of the second argument.
So just add this missing assignment in endio callbacks, just as we do in
the error path at btrfs_submit_direct() when we fail to clone the dio bio
or allocate its private object. This follows the convention of what is
done with other similar APIs such as bio_endio() where the caller is
responsible for setting the bi_error field in the bio it passes as an
argument to bio_endio().

This was detected by the new generic test cases in xfstests: 271, 272,
276 and 278. Which essentially setup a dm error target, then load the
error table, do a direct IO write and unload the error table. They
expect the write to fail with -EIO, which was not getting reported
when testing against btrfs.

Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio")
Signed-off-by: Filipe Manana <fdma...@suse.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/btrfs/inode.c | 2 ++
1 file changed, 2 insertions(+)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7997,6 +7997,7 @@ static void btrfs_endio_direct_read(stru

kfree(dip);

+ dio_bio->bi_error = bio->bi_error;
dio_end_io(dio_bio, bio->bi_error);

if (io_bio->end_io)
@@ -8042,6 +8043,7 @@ out_test:

kfree(dip);

+ dio_bio->bi_error = bio->bi_error;
dio_end_io(dio_bio, bio->bi_error);
bio_put(bio);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:11:13 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Sterba <dst...@suse.com>

commit 80ad623edd2d0ccb47d85357ee31c97e6c684e82 upstream.

This reverts commit 696249132158014d594896df3a81390616069c5c. The
cleaner thread can block freezing when there's a snapshot cleaning in
progress and the other threads get suspended first. From the logs
provided by Martin we're waiting for reading extent pages:

kernel: PM: Syncing filesystems ... done.
kernel: Freezing user space processes ... (elapsed 0.015 seconds) done.
kernel: Freezing remaining freezable tasks ...
kernel: Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
kernel: btrfs-cleaner D ffff88033dd13bc0 0 152 2 0x00000000
kernel: ffff88032ebc2e00 ffff88032e750000 ffff88032e74fa50 7fffffffffffffff
kernel: ffffffff814a58df 0000000000000002 ffffea000934d580 ffffffff814a5451
kernel: 7fffffffffffffff ffffffff814a6e8f 0000000000000000 0000000000000020
kernel: Call Trace:
kernel: [<ffffffff814a58df>] ? bit_wait+0x2c/0x2c
kernel: [<ffffffff814a5451>] ? schedule+0x6f/0x7c
kernel: [<ffffffff814a6e8f>] ? schedule_timeout+0x2f/0xd8
kernel: [<ffffffff81076f94>] ? timekeeping_get_ns+0xa/0x2e
kernel: [<ffffffff81077603>] ? ktime_get+0x36/0x44
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a590b>] ? bit_wait_io+0x2c/0x30
kernel: [<ffffffff814a5694>] ? __wait_on_bit+0x41/0x73
kernel: [<ffffffff8109eba8>] ? wait_on_page_bit+0x6d/0x72
kernel: [<ffffffff8105d718>] ? autoremove_wake_function+0x2a/0x2a
kernel: [<ffffffff811a02d7>] ? read_extent_buffer_pages+0x1bd/0x203
kernel: [<ffffffff8117d9e9>] ? free_root_pointers+0x4c/0x4c
kernel: [<ffffffff8117e831>] ? btree_read_extent_buffer_pages.constprop.57+0x5a/0xe9
kernel: [<ffffffff8117f4f3>] ? read_tree_block+0x2d/0x45
kernel: [<ffffffff8116782a>] ? read_block_for_search.isra.34+0x22a/0x26b
kernel: [<ffffffff811656c3>] ? btrfs_set_path_blocking+0x1e/0x4a
kernel: [<ffffffff8116919b>] ? btrfs_search_slot+0x648/0x736
kernel: [<ffffffff81170559>] ? btrfs_lookup_extent_info+0xb7/0x2c7
kernel: [<ffffffff81170ee5>] ? walk_down_proc+0x9c/0x1ae
kernel: [<ffffffff81171c9d>] ? walk_down_tree+0x40/0xa4
kernel: [<ffffffff8117375f>] ? btrfs_drop_snapshot+0x2da/0x664
kernel: [<ffffffff8104ff21>] ? finish_task_switch+0x126/0x167
kernel: [<ffffffff811850f8>] ? btrfs_clean_one_deleted_snapshot+0xa6/0xb0
kernel: [<ffffffff8117eaba>] ? cleaner_kthread+0x13e/0x17b
kernel: [<ffffffff8117e97c>] ? btrfs_item_end+0x33/0x33
kernel: [<ffffffff8104d256>] ? kthread+0x95/0x9d
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16
kernel: [<ffffffff814a7b5f>] ? ret_from_fork+0x3f/0x70
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16

As this affects a released kernel (4.4) we need a minimal fix for
stable kernels.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108361
Reported-by: Martin Ziegler <zie...@uni-freiburg.de>
CC: Jiri Kosina <jko...@suse.cz>
Signed-off-by: David Sterba <dst...@suse.com>
Signed-off-by: Chris Mason <c...@fb.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/btrfs/disk-io.c | 1 -
1 file changed, 1 deletion(-)

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1762,7 +1762,6 @@ static int cleaner_kthread(void *arg)
int again;
struct btrfs_trans_handle *trans;

- set_freezable();
do {
again = 0;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Laura Abbott <lab...@fedoraproject.org>

commit ea535e418c01837d07b6c94e817540f50bfdadb0 upstream.

In include/asm-generic/sections.h:

/*
* Usage guidelines:
* _text, _data: architecture specific, don't use them in
* arch-independent code
* [_stext, _etext]: contains .text.* sections, may also contain
* .rodata.*
* and/or .init.* sections

_text is not guaranteed across architectures. Architectures such as ARM
may reuse parts which are not actually text and erroneously trigger a bug.
Switch to using _stext which is guaranteed to contain text sections.

Came out of https://lkml.kernel.org/g/<567B1176...@redhat.com>

Signed-off-by: Laura Abbott <lab...@fedoraproject.org>
Reviewed-by: Kees Cook <kees...@chromium.org>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Arnd Bergmann <ar...@arndb.de>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
lib/dma-debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -1181,7 +1181,7 @@ static inline bool overlap(void *addr, u

static void check_for_illegal_area(struct device *dev, void *addr, unsigned long len)
{
- if (overlap(addr, len, _text, _etext) ||
+ if (overlap(addr, len, _stext, _etext) ||
overlap(addr, len, __start_rodata, __end_rodata))
err_printk(dev, NULL, "DMA-API: device driver maps memory from kernel text or rodata [addr=%p] [len=%lu]\n", addr, len);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <David.W...@intel.com>

commit 46924008273ed03bd11dbb32136e3da4cfe056e1 upstream.

According to the VT-d specification we need to clear the PPR bit in
the Page Request Status register when handling page requests, or the
hardware won't generate any more interrupts.

This wasn't actually necessary on SKL/KBL (which may well be the
subject of a hardware erratum, although it's harmless enough). But
other implementations do appear to get it right, and we only ever get
one interrupt unless we clear the PPR bit.

Reported-by: CQ Tang <cq....@intel.com>
Signed-off-by: David Woodhouse <David.W...@intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iommu/intel-svm.c | 4 ++++
include/linux/intel-iommu.h | 3 +++
2 files changed, 7 insertions(+)

--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -524,6 +524,10 @@ static irqreturn_t prq_event_thread(int
struct intel_svm *svm = NULL;
int head, tail, handled = 0;

+ /* Clear PPR bit before reading head/tail registers, to
+ * ensure that we get a new interrupt if needed. */
+ writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
+
tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
while (head != tail) {
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -235,6 +235,9 @@ static inline void dmar_writeq(void __io
/* low 64 bit */
#define dma_frcd_page_addr(d) (d & (((u64)-1) << PAGE_SHIFT))

+/* PRS_REG */
+#define DMA_PRS_PPR ((u32)1)
+
#define IOMMU_WAIT_OP(iommu, offset, op, cond, sts) \
do { \
cycles_t start_time = get_cycles(); \

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <mik...@szeredi.hu>

commit cf9a6784f7c1b5ee2b9159a1246e327c331c5697 upstream.

Without this copy-up of a file can be forced, even without actually being
allowed to do anything on the file.

[Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by
MAX_LFS_FILESIZE definition).

Signed-off-by: Miklos Szeredi <mik...@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/overlayfs/inode.c | 13 +++++++++++++
fs/overlayfs/super.c | 2 ++
2 files changed, 15 insertions(+)

--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -42,6 +42,19 @@ int ovl_setattr(struct dentry *dentry, s
int err;
struct dentry *upperdentry;

+ /*
+ * Check for permissions before trying to copy-up. This is redundant
+ * since it will be rechecked later by ->setattr() on upper dentry. But
+ * without this, copy-up can be triggered by just about anybody.
+ *
+ * We don't initialize inode->size, which just means that
+ * inode_newsize_ok() will always check against MAX_LFS_FILESIZE and not
+ * check for a swapfile (which this won't be anyway).
+ */
+ err = inode_change_ok(dentry->d_inode, attr);
+ if (err)
+ return err;
+
err = ovl_want_write(dentry);
if (err)
goto out;
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -9,6 +9,7 @@

#include <linux/fs.h>
#include <linux/namei.h>
+#include <linux/pagemap.h>
#include <linux/xattr.h>
#include <linux/security.h>
#include <linux/mount.h>
@@ -910,6 +911,7 @@ static int ovl_fill_super(struct super_b
}

sb->s_stack_depth = 0;
+ sb->s_maxbytes = MAX_LFS_FILESIZE;
if (ufs->config.upperdir) {
if (!ufs->config.workdir) {
pr_err("overlayfs: missing 'workdir'\n");

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <dan.j.w...@intel.com>

commit e07ecd76d4db7bda1e9495395b2110a3fe28845a upstream.

When btt devices were re-worked to be child devices of regions this
routine was overlooked. It mistakenly attempts to_nd_namespace_pmem()
or to_nd_namespace_blk() conversions on btt and pfn devices. By luck to
date we have happened to be hitting valid memory leading to a uuid
miscompare, but a recent change to struct nd_namespace_common causes:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffff814610dc>] memcmp+0xc/0x40
[..]
Call Trace:
[<ffffffffa0028631>] is_uuid_busy+0xc1/0x2a0 [libnvdimm]
[<ffffffffa0028570>] ? to_nd_blk_region+0x50/0x50 [libnvdimm]
[<ffffffff8158c9c0>] device_for_each_child+0x50/0x90

Signed-off-by: Dan Williams <dan.j.w...@intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/nvdimm/namespace_devs.c | 53 +++++++++++++++++++++++++++++++++++++
drivers/nvdimm/region_devs.c | 56 ----------------------------------------
2 files changed, 53 insertions(+), 56 deletions(-)

--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -77,6 +77,59 @@ static bool is_namespace_io(struct devic
return dev ? dev->type == &namespace_io_device_type : false;
}

+static int is_uuid_busy(struct device *dev, void *data)
+{
+ u8 *uuid1 = data, *uuid2 = NULL;
+
+ if (is_namespace_pmem(dev)) {
+ struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
+
+ uuid2 = nspm->uuid;
+ } else if (is_namespace_blk(dev)) {
+ struct nd_namespace_blk *nsblk = to_nd_namespace_blk(dev);
+
+ uuid2 = nsblk->uuid;
+ } else if (is_nd_btt(dev)) {
+ struct nd_btt *nd_btt = to_nd_btt(dev);
+
+ uuid2 = nd_btt->uuid;
+ } else if (is_nd_pfn(dev)) {
+ struct nd_pfn *nd_pfn = to_nd_pfn(dev);
+
+ uuid2 = nd_pfn->uuid;
+ }
+
+ if (uuid2 && memcmp(uuid1, uuid2, NSLABEL_UUID_LEN) == 0)
+ return -EBUSY;
+
+ return 0;
+}
+
+static int is_namespace_uuid_busy(struct device *dev, void *data)
+{
+ if (is_nd_pmem(dev) || is_nd_blk(dev))
+ return device_for_each_child(dev, data, is_uuid_busy);
+ return 0;
+}
+
+/**
+ * nd_is_uuid_unique - verify that no other namespace has @uuid
+ * @dev: any device on a nvdimm_bus
+ * @uuid: uuid to check
+ */
+bool nd_is_uuid_unique(struct device *dev, u8 *uuid)
+{
+ struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
+
+ if (!nvdimm_bus)
+ return false;
+ WARN_ON_ONCE(!is_nvdimm_bus_locked(&nvdimm_bus->dev));
+ if (device_for_each_child(&nvdimm_bus->dev, uuid,
+ is_namespace_uuid_busy) != 0)
+ return false;
+ return true;
+}
+
bool pmem_should_map_pages(struct device *dev)
{
struct nd_region *nd_region = to_nd_region(dev->parent);
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -134,62 +134,6 @@ int nd_region_to_nstype(struct nd_region
}
EXPORT_SYMBOL(nd_region_to_nstype);

-static int is_uuid_busy(struct device *dev, void *data)
-{
- struct nd_region *nd_region = to_nd_region(dev->parent);
- u8 *uuid = data;
-
- switch (nd_region_to_nstype(nd_region)) {
- case ND_DEVICE_NAMESPACE_PMEM: {
- struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
-
- if (!nspm->uuid)
- break;
- if (memcmp(uuid, nspm->uuid, NSLABEL_UUID_LEN) == 0)
- return -EBUSY;
- break;
- }
- case ND_DEVICE_NAMESPACE_BLK: {
- struct nd_namespace_blk *nsblk = to_nd_namespace_blk(dev);
-
- if (!nsblk->uuid)
- break;
- if (memcmp(uuid, nsblk->uuid, NSLABEL_UUID_LEN) == 0)
- return -EBUSY;
- break;
- }
- default:
- break;
- }
-
- return 0;
-}
-
-static int is_namespace_uuid_busy(struct device *dev, void *data)
-{
- if (is_nd_pmem(dev) || is_nd_blk(dev))
- return device_for_each_child(dev, data, is_uuid_busy);
- return 0;
-}
-
-/**
- * nd_is_uuid_unique - verify that no other namespace has @uuid
- * @dev: any device on a nvdimm_bus
- * @uuid: uuid to check
- */
-bool nd_is_uuid_unique(struct device *dev, u8 *uuid)
-{
- struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
-
- if (!nvdimm_bus)
- return false;
- WARN_ON_ONCE(!is_nvdimm_bus_locked(&nvdimm_bus->dev));
- if (device_for_each_child(&nvdimm_bus->dev, uuid,
- is_namespace_uuid_busy) != 0)
- return false;
- return true;
-}
-
static ssize_t size_show(struct device *dev,
struct device_attribute *attr, char *buf)
{

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Darrick J. Wong <darric...@oracle.com>

commit 96f859d52bcb1c6ea6f3388d39862bf7143e2f30 upstream.

Because struct xfs_agfl is 36 bytes long and has a 64-bit integer
inside it, gcc will quietly round the structure size up to the nearest
64 bits -- in this case, 40 bytes. This results in the XFS_AGFL_SIZE
macro returning incorrect results for v5 filesystems on 64-bit
machines (118 items instead of 119). As a result, a 32-bit xfs_repair
will see garbage in AGFL item 119 and complain.

Therefore, tell gcc not to pad the structure so that the AGFL size
calculation is correct.

Signed-off-by: Darrick J. Wong <darric...@oracle.com>
Reviewed-by: Dave Chinner <dchi...@redhat.com>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/xfs/libxfs/xfs_format.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -786,7 +786,7 @@ typedef struct xfs_agfl {
__be64 agfl_lsn;
__be32 agfl_crc;
__be32 agfl_bno[]; /* actually XFS_AGFL_SIZE(mp) */
-} xfs_agfl_t;
+} __attribute__((packed)) xfs_agfl_t;

#define XFS_AGFL_CRC_OFF offsetof(struct xfs_agfl, agfl_crc)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <a...@ozlabs.ru>

commit 6ecad912a0073c768db1491c27ca55ad2d0ee68f upstream.

Quite often drivers set only "write" permission assuming that this
includes "read" permission as well and this works on plenty of
platforms. However IODA2 is strict about this and produces an EEH when
"read" permission is not set and reading happens.

This adds a workaround in the IODA code to always add the "read" bit
when the "write" bit is set.

Fixes: 10b35b2b7485 ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE")
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Tested-by: Douglas Miller <doug...@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/platforms/powernv/pci.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tb
u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
long i;

+ if (proto_tce & TCE_PCI_WRITE)
+ proto_tce |= TCE_PCI_READ;
+
for (i = 0; i < npages; i++) {
unsigned long newtce = proto_tce |
((rpn + i) << tbl->it_page_shift);
@@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl

BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));

+ if (newtce & TCE_PCI_WRITE)
+ newtce |= TCE_PCI_READ;
+
oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
*hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE);
*direction = iommu_tce_direction(oldtce);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <mik...@szeredi.hu>

commit ed06e069775ad9236087594a1c1667367e983fb5 upstream.

We copy i_uid and i_gid of underlying inode into overlayfs inode. Except
for the root inode.

Fix this omission.

Signed-off-by: Miklos Szeredi <mik...@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/overlayfs/super.c | 3 +++
1 file changed, 3 insertions(+)

--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1053,6 +1053,9 @@ static int ovl_fill_super(struct super_b

root_dentry->d_fsdata = oe;

+ ovl_copyattr(ovl_dentry_real(root_dentry)->d_inode,
+ root_dentry->d_inode);
+
sb->s_magic = OVERLAYFS_SUPER_MAGIC;
sb->s_op = &ovl_super_operations;
sb->s_root = root_dentry;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill....@linux.intel.com>

commit 48f7df329474b49d83d0dffec1b6186647f11976 upstream.

Grazvydas Ignotas has reported a regression in remap_file_pages()
emulation.

Testcase:
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>

#define SIZE (4096 * 3)

int main(int argc, char **argv)
{
unsigned long *p;
long i;

p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (p == MAP_FAILED) {
perror("mmap");
return -1;
}

for (i = 0; i < SIZE / 4096; i++)
p[i * 4096 / sizeof(*p)] = i;

if (remap_file_pages(p, 4096, 0, 1, 0)) {
perror("remap_file_pages");
return -1;
}

if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) {
perror("remap_file_pages");
return -1;
}

assert(p[0] == 1);

munmap(p, SIZE);

return 0;
}

The second remap_file_pages() fails with -EINVAL.

The reason is that remap_file_pages() emulation assumes that the target
vma covers whole area we want to over map. That assumption is broken by
first remap_file_pages() call: it split the area into two vma.

The solution is to check next adjacent vmas, if they map the same file
with the same flags.

Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Reported-by: Grazvydas Ignotas <not...@gmail.com>
Tested-by: Grazvydas Ignotas <not...@gmail.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
mm/mmap.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2668,12 +2668,29 @@ SYSCALL_DEFINE5(remap_file_pages, unsign
if (!vma || !(vma->vm_flags & VM_SHARED))
goto out;

- if (start < vma->vm_start || start + size > vma->vm_end)
+ if (start < vma->vm_start)
goto out;

- if (pgoff == linear_page_index(vma, start)) {
- ret = 0;
- goto out;
+ if (start + size > vma->vm_end) {
+ struct vm_area_struct *next;
+
+ for (next = vma->vm_next; next; next = next->vm_next) {
+ /* hole between vmas ? */
+ if (next->vm_start != next->vm_prev->vm_end)
+ goto out;
+
+ if (next->vm_file != vma->vm_file)
+ goto out;
+
+ if (next->vm_flags != vma->vm_flags)
+ goto out;
+
+ if (start + size <= next->vm_end)
+ break;
+ }
+
+ if (!next)
+ goto out;
}

prot |= vma->vm_flags & VM_READ ? PROT_READ : 0;
@@ -2683,9 +2700,16 @@ SYSCALL_DEFINE5(remap_file_pages, unsign
flags &= MAP_NONBLOCK;
flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
if (vma->vm_flags & VM_LOCKED) {
+ struct vm_area_struct *tmp;
flags |= MAP_LOCKED;
+
/* drop PG_Mlocked flag for over-mapped range */
- munlock_vma_pages_range(vma, start, start + size);
+ for (tmp = vma; tmp->vm_start >= start + size;
+ tmp = tmp->vm_next) {
+ munlock_vma_pages_range(tmp,
+ max(tmp->vm_start, start),
+ min(tmp->vm_end, start + size));
+ }
}

file = get_file(vma->vm_file);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <mik...@szeredi.hu>

commit 97daf8b97ad6f913a34c82515be64dc9ac08d63e upstream.

When ovl_copy_xattr() encountered a zero size xattr no more xattrs were
copied and the function returned success. This is clearly not the desired
behavior.

Signed-off-by: Miklos Szeredi <mik...@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/overlayfs/copy_up.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -54,7 +54,7 @@ int ovl_copy_xattr(struct dentry *old, s

for (name = buf; name < (buf + list_size); name += strlen(name) + 1) {
size = vfs_getxattr(old, name, value, XATTR_SIZE_MAX);
- if (size <= 0) {
+ if (size < 0) {
error = size;
goto out_free_value;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Toshi Kani <toshi...@hpe.com>

commit a82eee7424525e34e98d821dd059ce14560a1e35 upstream.

Data corruption issues were observed in tests which initiated
a system crash/reset while accessing BTT devices. This problem
is reproducible.

The BTT driver calls pmem_rw_bytes() to update data in pmem
devices. This interface calls __copy_user_nocache(), which
uses non-temporal stores so that the stores to pmem are
persistent.

__copy_user_nocache() uses non-temporal stores when a request
size is 8 bytes or larger (and is aligned by 8 bytes). The
BTT driver updates the BTT map table, which entry size is
4 bytes. Therefore, updates to the map table entries remain
cached, and are not written to pmem after a crash.

Change __copy_user_nocache() to use non-temporal store when
a request size is 4 bytes. The change extends the current
byte-copy path for a less-than-8-bytes request, and does not
add any overhead to the regular path.

Reported-and-tested-by: Micah Parrish <micah....@hpe.com>
Reported-and-tested-by: Brian Boylston <brian.b...@hpe.com>
Signed-off-by: Toshi Kani <toshi...@hpe.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Andy Lutomirski <lu...@amacapital.net>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brg...@gmail.com>
Cc: Dan Williams <dan.j.w...@intel.com>
Cc: Denys Vlasenko <dvla...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Luis R. Rodriguez <mcg...@suse.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ross Zwisler <ross.z...@linux.intel.com>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: Toshi Kani <toshi...@hp.com>
Cc: Vishal Verma <vishal....@intel.com>
Cc: linux-...@lists.01.org
Link: http://lkml.kernel.org/r/1455225857-12039-3-git...@hpe.com
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/x86/lib/copy_user_64.S | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)

--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -237,13 +237,14 @@ ENDPROC(copy_user_enhanced_fast_string)
* Note: Cached memory copy is used when destination or size is not
* naturally aligned. That is:
* - Require 8-byte alignment when size is 8 bytes or larger.
+ * - Require 4-byte alignment when size is 4 bytes.
*/
ENTRY(__copy_user_nocache)
ASM_STAC

- /* If size is less than 8 bytes, go to byte copy */
+ /* If size is less than 8 bytes, go to 4-byte copy */
cmpl $8,%edx
- jb .L_1b_cache_copy_entry
+ jb .L_4b_nocache_copy_entry

/* If destination is not 8-byte aligned, "cache" copy to align it */
ALIGN_DESTINATION
@@ -282,7 +283,7 @@ ENTRY(__copy_user_nocache)
movl %edx,%ecx
andl $7,%edx
shrl $3,%ecx
- jz .L_1b_cache_copy_entry /* jump if count is 0 */
+ jz .L_4b_nocache_copy_entry /* jump if count is 0 */

/* Perform 8-byte nocache loop-copy */
.L_8b_nocache_copy_loop:
@@ -294,11 +295,33 @@ ENTRY(__copy_user_nocache)
jnz .L_8b_nocache_copy_loop

/* If no byte left, we're done */
-.L_1b_cache_copy_entry:
+.L_4b_nocache_copy_entry:
+ andl %edx,%edx
+ jz .L_finish_copy
+
+ /* If destination is not 4-byte aligned, go to byte copy: */
+ movl %edi,%ecx
+ andl $3,%ecx
+ jnz .L_1b_cache_copy_entry
+
+ /* Set 4-byte copy count (1 or 0) and remainder */
+ movl %edx,%ecx
+ andl $3,%edx
+ shrl $2,%ecx
+ jz .L_1b_cache_copy_entry /* jump if count is 0 */
+
+ /* Perform 4-byte nocache copy: */
+30: movl (%rsi),%r8d
+31: movnti %r8d,(%rdi)
+ leaq 4(%rsi),%rsi
+ leaq 4(%rdi),%rdi
+
+ /* If no bytes left, we're done: */
andl %edx,%edx
jz .L_finish_copy

/* Perform byte "cache" loop-copy for the remainder */
+.L_1b_cache_copy_entry:
movl %edx,%ecx
.L_1b_cache_copy_loop:
40: movb (%rsi),%al
@@ -323,6 +346,9 @@ ENTRY(__copy_user_nocache)
.L_fixup_8b_copy:
lea (%rdx,%rcx,8),%rdx
jmp .L_fixup_handle_tail
+.L_fixup_4b_copy:
+ lea (%rdx,%rcx,4),%rdx
+ jmp .L_fixup_handle_tail
.L_fixup_1b_copy:
movl %ecx,%edx
.L_fixup_handle_tail:
@@ -348,6 +374,8 @@ ENTRY(__copy_user_nocache)
_ASM_EXTABLE(16b,.L_fixup_4x8b_copy)
_ASM_EXTABLE(20b,.L_fixup_8b_copy)
_ASM_EXTABLE(21b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(30b,.L_fixup_4b_copy)
+ _ASM_EXTABLE(31b,.L_fixup_4b_copy)
_ASM_EXTABLE(40b,.L_fixup_1b_copy)
_ASM_EXTABLE(41b,.L_fixup_1b_copy)
ENDPROC(__copy_user_nocache)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 0b8c82190c12e530eb6003720dac103bf63e146e upstream.

The commit [991f86d7ae4e: ALSA: hda - Flush the pending probe work at
remove] introduced the sync of async probe work at remove for fixing
the race. However, this may lead to another hangup when the module
removal is performed quickly before starting the probe work, because
it issues flush_work() and it's blocked forever.

The workaround is to use cancel_work_sync() instead of flush_work()
there.

Fixes: 991f86d7ae4e ('ALSA: hda - Flush the pending probe work at remove')
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
sound/pci/hda/hda_intel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -2143,10 +2143,10 @@ static void azx_remove(struct pci_dev *p
struct hda_intel *hda;

if (card) {
- /* flush the pending probing work */
+ /* cancel the pending probing work */
chip = card->private_data;
hda = container_of(chip, struct hda_intel, chip);
- flush_work(&hda->probe_work);
+ cancel_work_sync(&hda->probe_work);

snd_card_free(card);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:08 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vegard Nossum <vegard...@oracle.com>

commit b0918d9f476a8434b055e362b83fa4fd1d462c3f upstream.

udf_next_aext() just follows extent pointers while extents are marked as
indirect. This can loop forever for corrupted filesystem. Limit number
the of indirect extents we are willing to follow in a row.

[JK: Updated changelog, limit, style]

Signed-off-by: Vegard Nossum <vegard...@oracle.com>
Cc: Jan Kara <ja...@suse.com>
Cc: Quentin Casasnovas <quentin.c...@oracle.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Jan Kara <ja...@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/udf/inode.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -2047,14 +2047,29 @@ void udf_write_aext(struct inode *inode,
epos->offset += adsize;
}

+/*
+ * Only 1 indirect extent in a row really makes sense but allow upto 16 in case
+ * someone does some weird stuff.
+ */
+#define UDF_MAX_INDIR_EXTS 16
+
int8_t udf_next_aext(struct inode *inode, struct extent_position *epos,
struct kernel_lb_addr *eloc, uint32_t *elen, int inc)
{
int8_t etype;
+ unsigned int indirections = 0;

while ((etype = udf_current_aext(inode, epos, eloc, elen, inc)) ==
(EXT_NEXT_EXTENT_ALLOCDECS >> 30)) {
int block;
+
+ if (++indirections > UDF_MAX_INDIR_EXTS) {
+ udf_err(inode->i_sb,
+ "too many indirect extents in inode %lu\n",
+ inode->i_ino);
+ return -1;
+ }
+
epos->block = *eloc;
epos->offset = sizeof(struct allocExtDesc);
brelse(epos->bh);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:08 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 67ec1072b053c15564e6090ab30127895dc77a89 upstream.

A non-atomic PCM stream may take snd_pcm_link_rwsem rw semaphore twice
in the same code path, e.g. one in snd_pcm_action_nonatomic() and
another in snd_pcm_stream_lock(). Usually this is OK, but when a
write lock is issued between these two read locks, the problem
happens: the write lock is blocked due to the first reade lock, and
the second read lock is also blocked by the write lock. This
eventually deadlocks.

The reason is the way rwsem manages waiters; it's queued like FIFO, so
even if the writer itself doesn't take the lock yet, it blocks all the
waiters (including reads) queued after it.

As a workaround, in this patch, we replace the standard down_write()
with an spinning loop. This is far from optimal, but it's good
enough, as the spinning time is supposed to be relatively short for
normal PCM operations, and the code paths requiring the write lock
aren't called so often.

Reported-by: Vinod Koul <vinod...@intel.com>
Tested-by: Ramesh Babu <rames...@intel.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
sound/core/pcm_native.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -74,6 +74,18 @@ static int snd_pcm_open(struct file *fil
static DEFINE_RWLOCK(snd_pcm_link_rwlock);
static DECLARE_RWSEM(snd_pcm_link_rwsem);

+/* Writer in rwsem may block readers even during its waiting in queue,
+ * and this may lead to a deadlock when the code path takes read sem
+ * twice (e.g. one in snd_pcm_action_nonatomic() and another in
+ * snd_pcm_stream_lock()). As a (suboptimal) workaround, let writer to
+ * spin until it gets the lock.
+ */
+static inline void down_write_nonblock(struct rw_semaphore *lock)
+{
+ while (!down_write_trylock(lock))
+ cond_resched();
+}
+
/**
* snd_pcm_stream_lock - Lock the PCM stream
* @substream: PCM substream
@@ -1813,7 +1825,7 @@ static int snd_pcm_link(struct snd_pcm_s
res = -ENOMEM;
goto _nolock;
}
- down_write(&snd_pcm_link_rwsem);
+ down_write_nonblock(&snd_pcm_link_rwsem);
write_lock_irq(&snd_pcm_link_rwlock);
if (substream->runtime->status->state == SNDRV_PCM_STATE_OPEN ||
substream->runtime->status->state != substream1->runtime->status->state ||
@@ -1860,7 +1872,7 @@ static int snd_pcm_unlink(struct snd_pcm
struct snd_pcm_substream *s;
int res = 0;

- down_write(&snd_pcm_link_rwsem);
+ down_write_nonblock(&snd_pcm_link_rwsem);
write_lock_irq(&snd_pcm_link_rwlock);
if (!snd_pcm_stream_linked(substream)) {
res = -EALREADY;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:09 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Toshi Kani <toshi...@hpe.com>

commit ee9737c924706aaa72c2ead93e3ad5644681dc1c upstream.

Add comments to __copy_user_nocache() to clarify its procedures
and alignment requirements.

Also change numeric branch target labels to named local labels.

No code changed:

arch/x86/lib/copy_user_64.o:

text data bss dec hex filename
1239 0 0 1239 4d7 copy_user_64.o.before
1239 0 0 1239 4d7 copy_user_64.o.after

md5:
58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.before.asm
58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.after.asm

Signed-off-by: Toshi Kani <toshi...@hpe.com>
Cc: Andrew Morton <ak...@linux-foundation.org>
Cc: Andy Lutomirski <lu...@amacapital.net>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brg...@gmail.com>
Cc: Denys Vlasenko <dvla...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Linus Torvalds <torv...@linux-foundation.org>
Cc: Luis R. Rodriguez <mcg...@suse.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: Toshi Kani <toshi...@hp.com>
Cc: brian.b...@hpe.com
Cc: dan.j.w...@intel.com
Cc: linux-...@lists.01.org
Cc: micah....@hpe.com
Cc: ross.z...@linux.intel.com
Cc: vishal....@intel.com
Link: http://lkml.kernel.org/r/1455225857-12039-2-git...@hpe.com
[ Small readability edits and added object file comparison. ]
Signed-off-by: Ingo Molnar <mi...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/x86/lib/copy_user_64.S | 114 ++++++++++++++++++++++++++++----------------
1 file changed, 73 insertions(+), 41 deletions(-)

--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -232,17 +232,30 @@ ENDPROC(copy_user_enhanced_fast_string)

/*
* copy_user_nocache - Uncached memory copy with exception handling
- * This will force destination/source out of cache for more performance.
+ * This will force destination out of cache for more performance.
+ *
+ * Note: Cached memory copy is used when destination or size is not
+ * naturally aligned. That is:
+ * - Require 8-byte alignment when size is 8 bytes or larger.
*/
ENTRY(__copy_user_nocache)
ASM_STAC
+
+ /* If size is less than 8 bytes, go to byte copy */
cmpl $8,%edx
- jb 20f /* less then 8 bytes, go to byte copy loop */
+ jb .L_1b_cache_copy_entry
+
+ /* If destination is not 8-byte aligned, "cache" copy to align it */
ALIGN_DESTINATION
+
+ /* Set 4x8-byte copy count and remainder */
movl %edx,%ecx
andl $63,%edx
shrl $6,%ecx
- jz 17f
+ jz .L_8b_nocache_copy_entry /* jump if count is 0 */
+
+ /* Perform 4x8-byte nocache loop-copy */
+.L_4x8b_nocache_copy_loop:
1: movq (%rsi),%r8
2: movq 1*8(%rsi),%r9
3: movq 2*8(%rsi),%r10
@@ -262,60 +275,79 @@ ENTRY(__copy_user_nocache)
leaq 64(%rsi),%rsi
leaq 64(%rdi),%rdi
decl %ecx
- jnz 1b
-17: movl %edx,%ecx
+ jnz .L_4x8b_nocache_copy_loop
+
+ /* Set 8-byte copy count and remainder */
+.L_8b_nocache_copy_entry:
+ movl %edx,%ecx
andl $7,%edx
shrl $3,%ecx
- jz 20f
-18: movq (%rsi),%r8
-19: movnti %r8,(%rdi)
+ jz .L_1b_cache_copy_entry /* jump if count is 0 */
+
+ /* Perform 8-byte nocache loop-copy */
+.L_8b_nocache_copy_loop:
+20: movq (%rsi),%r8
+21: movnti %r8,(%rdi)
leaq 8(%rsi),%rsi
leaq 8(%rdi),%rdi
decl %ecx
- jnz 18b
-20: andl %edx,%edx
- jz 23f
+ jnz .L_8b_nocache_copy_loop
+
+ /* If no byte left, we're done */
+.L_1b_cache_copy_entry:
+ andl %edx,%edx
+ jz .L_finish_copy
+
+ /* Perform byte "cache" loop-copy for the remainder */
movl %edx,%ecx
-21: movb (%rsi),%al
-22: movb %al,(%rdi)
+.L_1b_cache_copy_loop:
+40: movb (%rsi),%al
+41: movb %al,(%rdi)
incq %rsi
incq %rdi
decl %ecx
- jnz 21b
-23: xorl %eax,%eax
+ jnz .L_1b_cache_copy_loop
+
+ /* Finished copying; fence the prior stores */
+.L_finish_copy:
+ xorl %eax,%eax
ASM_CLAC
sfence
ret

.section .fixup,"ax"
-30: shll $6,%ecx
+.L_fixup_4x8b_copy:
+ shll $6,%ecx
addl %ecx,%edx
- jmp 60f
-40: lea (%rdx,%rcx,8),%rdx
- jmp 60f
-50: movl %ecx,%edx
-60: sfence
+ jmp .L_fixup_handle_tail
+.L_fixup_8b_copy:
+ lea (%rdx,%rcx,8),%rdx
+ jmp .L_fixup_handle_tail
+.L_fixup_1b_copy:
+ movl %ecx,%edx
+.L_fixup_handle_tail:
+ sfence
jmp copy_user_handle_tail
.previous

- _ASM_EXTABLE(1b,30b)
- _ASM_EXTABLE(2b,30b)
- _ASM_EXTABLE(3b,30b)
- _ASM_EXTABLE(4b,30b)
- _ASM_EXTABLE(5b,30b)
- _ASM_EXTABLE(6b,30b)
- _ASM_EXTABLE(7b,30b)
- _ASM_EXTABLE(8b,30b)
- _ASM_EXTABLE(9b,30b)
- _ASM_EXTABLE(10b,30b)
- _ASM_EXTABLE(11b,30b)
- _ASM_EXTABLE(12b,30b)
- _ASM_EXTABLE(13b,30b)
- _ASM_EXTABLE(14b,30b)
- _ASM_EXTABLE(15b,30b)
- _ASM_EXTABLE(16b,30b)
- _ASM_EXTABLE(18b,40b)
- _ASM_EXTABLE(19b,40b)
- _ASM_EXTABLE(21b,50b)
- _ASM_EXTABLE(22b,50b)
+ _ASM_EXTABLE(1b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(2b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(3b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(4b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(5b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(6b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(7b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(8b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(9b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(10b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(11b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(12b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(13b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(14b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(15b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(16b,.L_fixup_4x8b_copy)
+ _ASM_EXTABLE(20b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(21b,.L_fixup_8b_copy)
+ _ASM_EXTABLE(40b,.L_fixup_1b_copy)
+ _ASM_EXTABLE(41b,.L_fixup_1b_copy)
ENDPROC(__copy_user_nocache)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:09 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tg...@linutronix.de>

commit fb75a4282d0d9a3c7c44d940582c2d226cf3acfb upstream.

If the proxy lock in the requeue loop acquires the rtmutex for a
waiter then it acquired also refcount on the pi_state related to the
futex, but the waiter side does not drop the reference count.

Add the missing free_pi_state() call.

Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Darren Hart <dar...@dvhart.com>
Cc: Davidlohr Bueso <da...@stgolabs.net>
Cc: Bhuvanesh...@mentor.com
Cc: Andy Lowe <Andy...@mentor.com>
Link: http://lkml.kernel.org/r/201512192006...@linutronix.de
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/futex.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2755,6 +2755,11 @@ static int futex_wait_requeue_pi(u32 __u
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
+ /*
+ * Drop the reference to the pi state which
+ * the requeue_pi() code acquired for us.
+ */
+ free_pi_state(q.pi_state);
spin_unlock(q.lock_ptr);
}
} else {

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit 58a66dba1beac2121d931cda4682ae4d40816af5 upstream.

If we reload phy-twl4030-usb, we get a warning about unbalanced
pm_runtime_enable. Let's fix the issue and also fix idling of the
device on unload before we attempt to shut it down.

If we don't properly idle the PHY before shutting it down on removal,
the twl4030 ends up consuming about 62mW of extra power compared to
running idle with the module loaded.

Cc: Bin Liu <b-...@ti.com>
Cc: Felipe Balbi <ba...@ti.com>
Cc: Kishon Vijay Abraham I <kis...@ti.com>
Cc: NeilBrown <ne...@brown.name>
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Kishon Vijay Abraham I <kis...@ti.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/phy/phy-twl4030-usb.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/phy/phy-twl4030-usb.c
+++ b/drivers/phy/phy-twl4030-usb.c
@@ -715,6 +715,7 @@ static int twl4030_usb_probe(struct plat
pm_runtime_use_autosuspend(&pdev->dev);
pm_runtime_set_autosuspend_delay(&pdev->dev, 2000);
pm_runtime_enable(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);

/* Our job is to use irqs and status from the power module
* to keep the transceiver disabled when nothing's connected.
@@ -758,6 +759,13 @@ static int twl4030_usb_remove(struct pla
/* set transceiver mode to power on defaults */
twl4030_usb_set_mode(twl, -1);

+ /* idle ulpi before powering off */
+ if (cable_present(twl->linkstat))
+ pm_runtime_put_noidle(twl->dev);
+ pm_runtime_mark_last_busy(twl->dev);
+ pm_runtime_put_sync_suspend(twl->dev);
+ pm_runtime_disable(twl->dev);
+
/* autogate 60MHz ULPI clock,
* clear dpll clock request for i2c access,
* disable 32KHz
@@ -772,11 +780,6 @@ static int twl4030_usb_remove(struct pla
/* disable complete OTG block */
twl4030_usb_clear_bits(twl, POWER_CTRL, POWER_CTRL_OTG_ENAB);

- if (cable_present(twl->linkstat))
- pm_runtime_put_noidle(twl->dev);
- pm_runtime_mark_last_busy(twl->dev);
- pm_runtime_put(twl->dev);
-
return 0;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit d99a36f4728fcbcc501b78447f625bdcce15b842 upstream.

When multiple concurrent writes happen on the ALSA sequencer device
right after the open, it may try to allocate vmalloc buffer for each
write and leak some of them. It's because the presence check and the
assignment of the buffer is done outside the spinlock for the pool.

The fix is to move the check and the assignment into the spinlock.

(The current implementation is suboptimal, as there can be multiple
unnecessary vmallocs because the allocation is done before the check
in the spinlock. But the pool size is already checked beforehand, so
this isn't a big problem; that is, the only possible path is the
multiple writes before any pool assignment, and practically seen, the
current coverage should be "good enough".)

The issue was triggered by syzkaller fuzzer.

BugLink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=au...@mail.gmail.com
Reported-by: Dmitry Vyukov <dvy...@google.com>
Tested-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
sound/core/seq/seq_memory.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

--- a/sound/core/seq/seq_memory.c
+++ b/sound/core/seq/seq_memory.c
@@ -383,15 +383,20 @@ int snd_seq_pool_init(struct snd_seq_poo

if (snd_BUG_ON(!pool))
return -EINVAL;
- if (pool->ptr) /* should be atomic? */
- return 0;

- pool->ptr = vmalloc(sizeof(struct snd_seq_event_cell) * pool->size);
- if (!pool->ptr)
+ cellptr = vmalloc(sizeof(struct snd_seq_event_cell) * pool->size);
+ if (!cellptr)
return -ENOMEM;

/* add new cells to the free cell list */
spin_lock_irqsave(&pool->lock, flags);
+ if (pool->ptr) {
+ spin_unlock_irqrestore(&pool->lock, flags);
+ vfree(cellptr);
+ return 0;
+ }
+
+ pool->ptr = cellptr;
pool->free = NULL;

for (cell = 0; cell < pool->size; cell++) {

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edum...@google.com>

commit d7ce36924344ace0dbdc855b1206cacc46b36d45 upstream.

Some servers experienced fatal deadlocks because of a combination of
bugs, leading to multiple cpus calling dump_stack().

The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
skb->csum when CE mark is propagated").

The second problem is a faulty locking in dump_stack()

CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
call dump_stack() from netdev_rx_csum_fault().

dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
dump_lock is owned by CPU1

While dumping its stack, CPU1 is interrupted by a softirq, and happens
to process a packet for the TCP socket locked by CPU2.

CPU1 spins forever in spin_lock() : deadlock

Stack trace on CPU1 looked like :

NMI backtrace for cpu 1
RIP: _raw_spin_lock+0x25/0x30
...
Call Trace:
<IRQ>
tcp_v6_rcv+0x243/0x620
ip6_input_finish+0x11f/0x330
ip6_input+0x38/0x40
ip6_rcv_finish+0x3c/0x90
ipv6_rcv+0x2a9/0x500
process_backlog+0x461/0xaa0
net_rx_action+0x147/0x430
__do_softirq+0x167/0x2d0
call_softirq+0x1c/0x30
do_softirq+0x3f/0x80
irq_exit+0x6e/0xc0
smp_call_function_single_interrupt+0x35/0x40
call_function_single_interrupt+0x6a/0x70
<EOI>
printk+0x4d/0x4f
printk_address+0x31/0x33
print_trace_address+0x33/0x3c
print_context_stack+0x7f/0x119
dump_trace+0x26b/0x28e
show_trace_log_lvl+0x4f/0x5c
show_stack_log_lvl+0x104/0x113
show_stack+0x42/0x44
dump_stack+0x46/0x58
netdev_rx_csum_fault+0x38/0x3c
__skb_checksum_complete_head+0x6e/0x80
__skb_checksum_complete+0x11/0x20
tcp_rcv_established+0x2bd5/0x2fd0
tcp_v6_do_rcv+0x13c/0x620
sk_backlog_rcv+0x15/0x30
release_sock+0xd2/0x150
tcp_recvmsg+0x1c1/0xfc0
inet_recvmsg+0x7d/0x90
sock_recvmsg+0xaf/0xe0
___sys_recvmsg+0x111/0x3b0
SyS_recvmsg+0x5c/0xb0
system_call_fastpath+0x16/0x1b

Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Eric Dumazet <edum...@google.com>
Cc: Alex Thorlton <atho...@sgi.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
lib/dump_stack.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -25,6 +25,7 @@ static atomic_t dump_lock = ATOMIC_INIT(

asmlinkage __visible void dump_stack(void)
{
+ unsigned long flags;
int was_locked;
int old;
int cpu;
@@ -33,9 +34,8 @@ asmlinkage __visible void dump_stack(voi
* Permit this cpu to perform nested stack dumps while serialising
* against other CPUs
*/
- preempt_disable();
-
retry:
+ local_irq_save(flags);
cpu = smp_processor_id();
old = atomic_cmpxchg(&dump_lock, -1, cpu);
if (old == -1) {
@@ -43,6 +43,7 @@ retry:
} else if (old == cpu) {
was_locked = 1;
} else {
+ local_irq_restore(flags);
cpu_relax();
goto retry;
}
@@ -52,7 +53,7 @@ retry:
if (!was_locked)
atomic_set(&dump_lock, -1);

- preempt_enable();
+ local_irq_restore(flags);
}
#else
asmlinkage __visible void dump_stack(void)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: CQ Tang <cq....@intel.com>

commit fda3bec12d0979aae3f02ee645913d66fbc8a26e upstream.

This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.

Signed-off-by: CQ Tang <cq....@intel.com>
Signed-off-by: David Woodhouse <David.W...@intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iommu/dmar.c | 2 +-
drivers/iommu/intel_irq_remapping.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1347,7 +1347,7 @@ void dmar_disable_qi(struct intel_iommu

raw_spin_lock_irqsave(&iommu->register_lock, flags);

- sts = dmar_readq(iommu->reg + DMAR_GSTS_REG);
+ sts = readl(iommu->reg + DMAR_GSTS_REG);
if (!(sts & DMA_GSTS_QIES))
goto end;

--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -629,7 +629,7 @@ static void iommu_disable_irq_remapping(

raw_spin_lock_irqsave(&iommu->register_lock, flags);

- sts = dmar_readq(iommu->reg + DMAR_GSTS_REG);
+ sts = readl(iommu->reg + DMAR_GSTS_REG);
if (!(sts & DMA_GSTS_IRES))
goto end;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Chinner <dchi...@redhat.com>

commit b79f4a1c68bb99152d0785ee4ea3ab4396cdacc6 upstream.

When we do inode readahead in log recovery, we do can do the
readahead before we've replayed the icreate transaction that stamps
the buffer with inode cores. The inode readahead verifier catches
this and marks the buffer as !done to indicate that it doesn't yet
contain valid inodes.

In adding buffer error notification (i.e. setting b_error = -EIO at
the same time as as we clear the done flag) to such a readahead
verifier failure, we can then get subsequent inode recovery failing
with this error:

XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32

This occurs when readahead completion races with icreate item replay
such as:

inode readahead
find buffer
lock buffer
submit RA io
....
icreate recovery
xfs_trans_get_buffer
find buffer
lock buffer
<blocks on RA completion>
.....
<ra completion>
fails verifier
clear XBF_DONE
set bp->b_error = -EIO
release and unlock buffer
<icreate gains lock>
icreate initialises buffer
marks buffer as done
adds buffer to delayed write queue
releases buffer

At this point, we have an initialised inode buffer that is up to
date but has an -EIO state registered against it. When we finally
get to recovering an inode in that buffer:

inode item recovery
xfs_trans_read_buffer
find buffer
lock buffer
sees XBF_DONE is set, returns buffer
sees bp->b_error is set
fail log recovery!

Essentially, we need xfs_trans_get_buf_map() to clear the error status of
the buffer when doing a lookup. This function returns uninitialised
buffers, so the buffer returned can not be in an error state and
none of the code that uses this function expects b_error to be set
on return. Indeed, there is an ASSERT(!bp->b_error); in the
transaction case in xfs_trans_get_buf_map() that would have caught
this if log recovery used transactions....

This patch firstly changes the inode readahead failure to set -EIO
on the buffer, and secondly changes xfs_buf_get_map() to never
return a buffer with an error state set so this first change doesn't
cause unexpected log recovery failures.

Signed-off-by: Dave Chinner <dchi...@redhat.com>
Reviewed-by: Brian Foster <bfo...@redhat.com>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/xfs/libxfs/xfs_inode_buf.c | 12 +++++++-----
fs/xfs/xfs_buf.c | 7 +++++++
2 files changed, 14 insertions(+), 5 deletions(-)

--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -62,11 +62,12 @@ xfs_inobp_check(
* has not had the inode cores stamped into it. Hence for readahead, the buffer
* may be potentially invalid.
*
- * If the readahead buffer is invalid, we don't want to mark it with an error,
- * but we do want to clear the DONE status of the buffer so that a followup read
- * will re-read it from disk. This will ensure that we don't get an unnecessary
- * warnings during log recovery and we don't get unnecssary panics on debug
- * kernels.
+ * If the readahead buffer is invalid, we need to mark it with an error and
+ * clear the DONE status of the buffer so that a followup read will re-read it
+ * from disk. We don't report the error otherwise to avoid warnings during log
+ * recovery and we don't get unnecssary panics on debug kernels. We use EIO here
+ * because all we want to do is say readahead failed; there is no-one to report
+ * the error to, so this will distinguish it from a non-ra verifier failure.
*/
static void
xfs_inode_buf_verify(
@@ -93,6 +94,7 @@ xfs_inode_buf_verify(
XFS_RANDOM_ITOBP_INOTOBP))) {
if (readahead) {
bp->b_flags &= ~XBF_DONE;
+ xfs_buf_ioerror(bp, -EIO);
return;
}

--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -604,6 +604,13 @@ found:
}
}

+ /*
+ * Clear b_error if this is a lookup from a caller that doesn't expect
+ * valid data to be found in the buffer.
+ */
+ if (!(flags & XBF_READ))
+ xfs_buf_ioerror(bp, 0);
+
XFS_STATS_INC(target->bt_mount, xb_get);
trace_xfs_buf_get(bp, flags, _RET_IP_);
return bp;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vito Caputo <vito....@coreos.com>

commit e4ad29fa0d224d05e08b2858e65f112fd8edd4fe upstream.

Rather than always allocating the high-order XATTR_SIZE_MAX buffer
which is costly and prone to failure, only allocate what is needed and
realloc if necessary.

Fixes https://github.com/coreos/bugs/issues/489

Signed-off-by: Miklos Szeredi <mik...@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/overlayfs/copy_up.c | 39 +++++++++++++++++++++++++--------------
1 file changed, 25 insertions(+), 14 deletions(-)

--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -22,9 +22,9 @@

int ovl_copy_xattr(struct dentry *old, struct dentry *new)
{
- ssize_t list_size, size;
- char *buf, *name, *value;
- int error;
+ ssize_t list_size, size, value_size = 0;
+ char *buf, *name, *value = NULL;
+ int uninitialized_var(error);

if (!old->d_inode->i_op->getxattr ||
!new->d_inode->i_op->getxattr)
@@ -41,29 +41,40 @@ int ovl_copy_xattr(struct dentry *old, s
if (!buf)
return -ENOMEM;

- error = -ENOMEM;
- value = kmalloc(XATTR_SIZE_MAX, GFP_KERNEL);
- if (!value)
- goto out;
-
list_size = vfs_listxattr(old, buf, list_size);
if (list_size <= 0) {
error = list_size;
- goto out_free_value;
+ goto out;
}

for (name = buf; name < (buf + list_size); name += strlen(name) + 1) {
- size = vfs_getxattr(old, name, value, XATTR_SIZE_MAX);
+retry:
+ size = vfs_getxattr(old, name, value, value_size);
+ if (size == -ERANGE)
+ size = vfs_getxattr(old, name, NULL, 0);
+
if (size < 0) {
error = size;
- goto out_free_value;
+ break;
}
+
+ if (size > value_size) {
+ void *new;
+
+ new = krealloc(value, size, GFP_KERNEL);
+ if (!new) {
+ error = -ENOMEM;
+ break;
+ }
+ value = new;
+ value_size = size;
+ goto retry;
+ }
+
error = vfs_setxattr(new, name, value, size, 0);
if (error)
- goto out_free_value;
+ break;
}
-
-out_free_value:
kfree(value);
out:
kfree(buf);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vineet Gupta <Vineet...@synopsys.com>

commit 6a6ac72fd6ea32594b316513e1826c3f6db4cc93 upstream.

This showed up on ARC when running LMBench bw_mem tests as Overlapping
TLB Machine Check Exception triggered due to STLB entry (2M pages)
overlapping some NTLB entry (regular 8K page).

bw_mem 2m touches a large chunk of vaddr creating NTLB entries. In the
interim khugepaged kicks in, collapsing the contiguous ptes into a
single pmd. pmdp_collapse_flush()->flush_pmd_tlb_range() is called to
flush out NTLB entries for the ptes. This for ARC (by design) can only
shootdown STLB entries (for pmd). The stray NTLB entries cause the
overlap with the subsequent STLB entry for collapsed page. So make
pmdp_collapse_flush() call pte flush interface not pmd flush.

Note that originally all thp flush call sites in generic code called
flush_tlb_range() leaving it to architecture to implement the flush for
pte and/or pmd. Commit 12ebc1581ad11454 changed this by calling a new
opt-in API flush_pmd_tlb_range() which made the semantics more explicit
but failed to distinguish the pte vs pmd flush in generic code, which is
what this patch fixes.

Note that ARC can fixed w/o touching the generic pmdp_collapse_flush()
by defining a ARC version, but that defeats the purpose of generic
version, plus sementically this is the right thing to do.

Fixes STAR 9000961194: LMBench on AXS103 triggering duplicate TLB
exceptions with super pages

Fixes: 12ebc1581ad11454 ("mm,thp: introduce flush_pmd_tlb_range")
Signed-off-by: Vineet Gupta <vgu...@synopsys.com>
Reviewed-by: Aneesh Kumar K.V <aneesh...@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Cc: Andrea Arcangeli <aarc...@redhat.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
mm/pgtable-generic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -210,7 +210,9 @@ pmd_t pmdp_collapse_flush(struct vm_area
VM_BUG_ON(address & ~HPAGE_PMD_MASK);
VM_BUG_ON(pmd_trans_huge(*pmdp));
pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
- flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+
+ /* collapse entails shooting down ptes not pmd */
+ flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
return pmd;
}
#endif

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:12 PM2/23/16
to
3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: CQ Tang <cq....@intel.com>

commit fda3bec12d0979aae3f02ee645913d66fbc8a26e upstream.

This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.

Signed-off-by: CQ Tang <cq....@intel.com>
Signed-off-by: David Woodhouse <David.W...@intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iommu/dmar.c | 2 +-
drivers/iommu/intel_irq_remapping.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -986,7 +986,7 @@ void dmar_disable_qi(struct intel_iommu

raw_spin_lock_irqsave(&iommu->register_lock, flags);

- sts = dmar_readq(iommu->reg + DMAR_GSTS_REG);
+ sts = readl(iommu->reg + DMAR_GSTS_REG);
if (!(sts & DMA_GSTS_QIES))
goto end;

--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -489,7 +489,7 @@ static void iommu_disable_irq_remapping(

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:14 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill....@linux.intel.com>

commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream.

remap_file_pages(2) emulation can reach file which represents removed
IPC ID as long as a memory segment is mapped. It breaks expectations of
IPC subsystem.

Test case (rewritten to be more human readable, originally autogenerated
by syzkaller[1]):

#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

#define PAGE_SIZE 4096

int main()
{
int id;
void *p;

id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
p = shmat(id, NULL, 0);
shmctl(id, IPC_RMID, NULL);
remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

return 0;
}

The patch changes shm_mmap() and code around shm_lock() to propagate
locking error back to caller of shm_mmap().

[1] http://github.com/google/syzkaller

Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
Cc: Davidlohr Bueso <da...@stgolabs.net>
Cc: Manfred Spraul <man...@colorfullife.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
ipc/shm.c | 53 +++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 43 insertions(+), 10 deletions(-)

--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -156,11 +156,12 @@ static inline struct shmid_kernel *shm_l
struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);

/*
- * We raced in the idr lookup or with shm_destroy(). Either way, the
- * ID is busted.
+ * Callers of shm_lock() must validate the status of the returned ipc
+ * object pointer (as returned by ipc_lock()), and error out as
+ * appropriate.
*/
- WARN_ON(IS_ERR(ipcp));
-
+ if (IS_ERR(ipcp))
+ return (void *)ipcp;
return container_of(ipcp, struct shmid_kernel, shm_perm);
}

@@ -186,18 +187,33 @@ static inline void shm_rmid(struct ipc_n
}


-/* This is called by fork, once for every shm attach. */
-static void shm_open(struct vm_area_struct *vma)
+static int __shm_open(struct vm_area_struct *vma)
{
struct file *file = vma->vm_file;
struct shm_file_data *sfd = shm_file_data(file);
struct shmid_kernel *shp;

shp = shm_lock(sfd->ns, sfd->id);
+
+ if (IS_ERR(shp))
+ return PTR_ERR(shp);
+
shp->shm_atim = get_seconds();
shp->shm_lprid = task_tgid_vnr(current);
shp->shm_nattch++;
shm_unlock(shp);
+ return 0;
+}
+
+/* This is called by fork, once for every shm attach. */
+static void shm_open(struct vm_area_struct *vma)
+{
+ int err = __shm_open(vma);
+ /*
+ * We raced in the idr lookup or with shm_destroy().
+ * Either way, the ID is busted.
+ */
+ WARN_ON_ONCE(err);
}

/*
@@ -260,6 +276,14 @@ static void shm_close(struct vm_area_str
down_write(&shm_ids(ns).rwsem);
/* remove from the list of attaches of the shm segment */
shp = shm_lock(ns, sfd->id);
+
+ /*
+ * We raced in the idr lookup or with shm_destroy().
+ * Either way, the ID is busted.
+ */
+ if (WARN_ON_ONCE(IS_ERR(shp)))
+ goto done; /* no-op */
+
shp->shm_lprid = task_tgid_vnr(current);
shp->shm_dtim = get_seconds();
shp->shm_nattch--;
@@ -267,6 +291,7 @@ static void shm_close(struct vm_area_str
shm_destroy(ns, shp);
else
shm_unlock(shp);
+done:
up_write(&shm_ids(ns).rwsem);
}

@@ -388,17 +413,25 @@ static int shm_mmap(struct file *file, s
struct shm_file_data *sfd = shm_file_data(file);
int ret;

+ /*
+ * In case of remap_file_pages() emulation, the file can represent
+ * removed IPC ID: propogate shm_lock() error to caller.
+ */
+ ret =__shm_open(vma);
+ if (ret)
+ return ret;
+
ret = sfd->file->f_op->mmap(sfd->file, vma);
- if (ret != 0)
+ if (ret) {
+ shm_close(vma);
return ret;
+ }
sfd->vm_ops = vma->vm_ops;
#ifdef CONFIG_MMU
WARN_ON(!sfd->vm_ops->fault);
#endif
vma->vm_ops = &shm_vm_ops;
- shm_open(vma);
-
- return ret;
+ return 0;
}

static int shm_release(struct inode *ino, struct file *file)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sergey Senozhatsky <sergey.seno...@gmail.com>

commit 72214a24a7677d4c7501eecc9517ed681b5f2db2 upstream.

In Python3+ print is a function so the old syntax is not correct
anymore:

$ ./scripts/bloat-o-meter vmlinux.o vmlinux.o.old
File "./scripts/bloat-o-meter", line 61
print "add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
^
SyntaxError: invalid syntax

Fix by calling print as a function.

Tested on python 2.7.11, 3.5.1

Signed-off-by: Sergey Senozhatsky <sergey.se...@gmail.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
scripts/bloat-o-meter | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/scripts/bloat-o-meter
+++ b/scripts/bloat-o-meter
@@ -58,8 +58,8 @@ for name in common:
delta.sort()
delta.reverse()

-print "add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
- (add, remove, grow, shrink, up, -down, up-down)
-print "%-40s %7s %7s %+7s" % ("function", "old", "new", "delta")
+print("add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
+ (add, remove, grow, shrink, up, -down, up-down))
+print("%-40s %7s %7s %+7s" % ("function", "old", "new", "delta"))
for d, n in delta:
- if d: print "%-40s %7s %7s %+7d" % (n, old.get(n,"-"), new.get(n,"-"), d)
+ if d: print("%-40s %7s %7s %+7d" % (n, old.get(n,"-"), new.get(n,"-"), d))

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:20 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Toshi Kani <toshi...@hpe.com>

commit 9273a8bbf58a15051e53a777389a502420ddc60e upstream.

The pmem driver calls devm_memremap() to map a persistent memory range.
When the pmem driver is unloaded, this memremap'd range is not released
so the kernel will leak a vma.

Fix devm_memremap_release() to handle a given memremap'd address
properly.

Signed-off-by: Toshi Kani <toshi...@hpe.com>
Acked-by: Dan Williams <dan.j.w...@intel.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Ross Zwisler <ross.z...@linux.intel.com>
Cc: Matthew Wilcox <wi...@linux.intel.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/memremap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -111,7 +111,7 @@ EXPORT_SYMBOL(memunmap);

static void devm_memremap_release(struct device *dev, void *res)
{
- memunmap(res);
+ memunmap(*(void **)res);
}

static int devm_memremap_match(struct device *dev, void *res, void *match_data)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:20:22 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <ti...@suse.de>

commit 13d5e5d4725c64ec06040d636832e78453f477b7 upstream.

The commit [7f0973e973cd: ALSA: seq: Fix lockdep warnings due to
double mutex locks] split the management of two linked lists (source
and destination) into two individual calls for avoiding the AB/BA
deadlock. However, this may leave the possible double deletion of one
of two lists when the counterpart is being deleted concurrently.
It ends up with a list corruption, as revealed by syzkaller fuzzer.

This patch fixes it by checking the list emptiness and skipping the
deletion and the following process.

BugLink: http://lkml.kernel.org/r/CACT4Y+bay9qsrz6dQu31EcGaH9XwfW7o3oBzSQUG9fMszoh=S...@mail.gmail.com
Fixes: 7f0973e973cd ('ALSA: seq: Fix lockdep warnings due to 'double mutex locks)
Reported-by: Dmitry Vyukov <dvy...@google.com>
Tested-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
sound/core/seq/seq_ports.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/sound/core/seq/seq_ports.c
+++ b/sound/core/seq/seq_ports.c
@@ -535,19 +535,22 @@ static void delete_and_unsubscribe_port(
bool is_src, bool ack)
{
struct snd_seq_port_subs_info *grp;
+ struct list_head *list;
+ bool empty;

grp = is_src ? &port->c_src : &port->c_dest;
+ list = is_src ? &subs->src_list : &subs->dest_list;
down_write(&grp->list_mutex);
write_lock_irq(&grp->list_lock);
- if (is_src)
- list_del(&subs->src_list);
- else
- list_del(&subs->dest_list);
+ empty = list_empty(list);
+ if (!empty)
+ list_del_init(list);
grp->exclusive = 0;
write_unlock_irq(&grp->list_lock);
up_write(&grp->list_mutex);

- unsubscribe_port(client, port, grp, &subs->info, ack);
+ if (!empty)
+ unsubscribe_port(client, port, grp, &subs->info, ack);
}

/* connect two ports */

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:21:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.ca...@oracle.com>

commit b1d353ad3d5835b16724653b33c05124e1b5acf1 upstream.

"count" is controlled by the user and it can be negative. Let's prevent
that by making it unsigned. You have to have CAP_SYS_RAWIO to call this
function so the bug is not as serious as it could be.

Fixes: 5369c02d951a ('intel_scu_ipc: Utility driver for intel scu ipc')
Signed-off-by: Dan Carpenter <dan.ca...@oracle.com>
Signed-off-by: Darren Hart <dvh...@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/platform/x86/intel_scu_ipcutil.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/platform/x86/intel_scu_ipcutil.c
+++ b/drivers/platform/x86/intel_scu_ipcutil.c
@@ -49,7 +49,7 @@ struct scu_ipc_data {

static int scu_reg_access(u32 cmd, struct scu_ipc_data *data)
{
- int count = data->count;
+ unsigned int count = data->count;

if (count == 0 || count == 3 || count > 4)
return -EINVAL;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:21:13 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <khleb...@yandex-team.ru>

commit 84889d49335627bc770b32787c1ef9ebad1da232 upstream.

This patch fixes kernel crash at removing directory which contains
whiteouts from lower layers.

Cache of directory content passed as "list" contains entries from all
layers, including whiteouts from lower layers. So, lookup in upper dir
(moved into work at this stage) will return negative entry. Plus this
cache is filled long before and we can race with external removal.

Example:
mkdir -p lower0/dir lower1/dir upper work overlay
touch lower0/dir/a lower0/dir/b
mknod lower1/dir/a c 0 0
mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work
rm -fr overlay/dir

Signed-off-by: Konstantin Khlebnikov <khleb...@yandex-team.ru>
Signed-off-by: Miklos Szeredi <mik...@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/overlayfs/readdir.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -571,7 +571,8 @@ void ovl_cleanup_whiteouts(struct dentry
(int) PTR_ERR(dentry));
continue;
}
- ovl_cleanup(upper->d_inode, dentry);
+ if (dentry->d_inode)
+ ovl_cleanup(upper->d_inode, dentry);
dput(dentry);
}
mutex_unlock(&upper->d_inode->i_mutex);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <koc...@gmail.com>

commit 732042821cfa106b3c20b9780e4c60fee9d68900 upstream.

Helper radix_tree_iter_retry() resets next_index to the current index.
In following radix_tree_next_slot current chunk size becomes zero. This
isn't checked and it tries to dereference null pointer in slot.

Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.

Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
Cc: Matthew Wilcox <wi...@linux.intel.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Ohad Ben-Cohen <oh...@wizery.com>
Cc: Jeremiah Mahler <jmma...@gmail.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
include/linux/radix-tree.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -391,7 +391,7 @@ void **radix_tree_iter_retry(struct radi
* @iter: pointer to radix tree iterator
* Returns: current chunk size
*/
-static __always_inline unsigned
+static __always_inline long
radix_tree_chunk_size(struct radix_tree_iter *iter)
{
return iter->next_index - iter->index;
@@ -425,9 +425,9 @@ radix_tree_next_slot(void **slot, struct
return slot + offset + 1;
}
} else {
- unsigned size = radix_tree_chunk_size(iter) - 1;
+ long size = radix_tree_chunk_size(iter);

- while (size--) {
+ while (--size > 0) {
slot++;
iter->index++;
if (likely(*slot))

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Nyman <mathia...@linux.intel.com>

commit 5c82171167adb8e4ac77b91a42cd49fb211a81a0 upstream.

xhci driver frees data for all devices, both usb2 and and usb3 the
first time usb_remove_hcd() is called, including td_list and and xhci_ring
structures.

When usb_remove_hcd() is called a second time for the second xhci bus it
will try to dequeue all pending urbs, and touches td_list which is already
freed for that endpoint.

Reported-by: Joe Lawrence <joe.la...@stratus.com>
Tested-by: Joe Lawrence <joe.la...@stratus.com>
Signed-off-by: Mathias Nyman <mathia...@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/usb/host/xhci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1549,7 +1549,9 @@ int xhci_urb_dequeue(struct usb_hcd *hcd
xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb,
"HW died, freeing TD.");
urb_priv = urb->hcpriv;
- for (i = urb_priv->td_cnt; i < urb_priv->length; i++) {
+ for (i = urb_priv->td_cnt;
+ i < urb_priv->length && xhci->devs[urb->dev->slot_id];
+ i++) {
td = urb_priv->td[i];
if (!list_empty(&td->td_list))
list_del_init(&td->td_list);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Chinner <da...@fromorbit.com>

commit 3e85286e75224fa3f08bdad20e78c8327742634e upstream.

This reverts commit 24ba16bb3d499c49974669cd8429c3e4138ab102 as it
prevents machines from suspending. This regression occurs when the
xfsaild is idle on entry to suspend, and so there s no activity to
wake it from it's idle sleep and hence see that it is supposed to
freeze. Hence the freezer times out waiting for it and suspend is
cancelled.

There is no obvious fix for this short of freezing the filesystem
properly, so revert this change for now.

Signed-off-by: Dave Chinner <da...@fromorbit.com>
Acked-by: Jiri Kosina <jko...@suse.cz>
Reviewed-by: Brian Foster <bfo...@redhat.com>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/xfs/xfs_trans_ail.c | 1 -
1 file changed, 1 deletion(-)

--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -497,7 +497,6 @@ xfsaild(
long tout = 0; /* milliseconds */

current->flags |= PF_MEMALLOC;
- set_freezable();

while (!kthread_should_stop()) {
if (tout && tout <= 20)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gavin Shan <gws...@linux.vnet.ibm.com>

commit 1bc74f1ccd457832dc515fc1febe6655985fdcd2 upstream.

When PCI bus is unplugged during full hotplug for EEH recovery,
the platform PE instance (struct pnv_ioda_pe) isn't released and
it dereferences the stale PCI bus that has been released. It leads
to kernel crash when referring to the stale PCI bus.

This fixes the issue by correcting the PE's primary bus when it's
oneline at plugging time, in pnv_pci_dma_bus_setup() which is to
be called by pcibios_fixup_bus().

Reported-by: Andrew Donnellan <andrew.d...@au1.ibm.com>
Reported-by: Pradipta Ghosh <prad...@in.ibm.com>
Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.d...@au1.ibm.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/platforms/powernv/pci-ioda.c | 1 +
arch/powerpc/platforms/powernv/pci.c | 20 ++++++++++++++++++++
arch/powerpc/platforms/powernv/pci.h | 1 +
3 files changed, 22 insertions(+)

--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3034,6 +3034,7 @@ static void pnv_pci_ioda_shutdown(struct

static const struct pci_controller_ops pnv_pci_ioda_controller_ops = {
.dma_dev_setup = pnv_pci_dma_dev_setup,
+ .dma_bus_setup = pnv_pci_dma_bus_setup,
#ifdef CONFIG_PCI_MSI
.setup_msi_irqs = pnv_setup_msi_irqs,
.teardown_msi_irqs = pnv_teardown_msi_irqs,
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -762,6 +762,26 @@ void pnv_pci_dma_dev_setup(struct pci_de
phb->dma_dev_setup(phb, pdev);
}

+void pnv_pci_dma_bus_setup(struct pci_bus *bus)
+{
+ struct pci_controller *hose = bus->sysdata;
+ struct pnv_phb *phb = hose->private_data;
+ struct pnv_ioda_pe *pe;
+
+ list_for_each_entry(pe, &phb->ioda.pe_list, list) {
+ if (!(pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL)))
+ continue;
+
+ if (!pe->pbus)
+ continue;
+
+ if (bus->number == ((pe->rid >> 8) & 0xFF)) {
+ pe->pbus = bus;
+ break;
+ }
+ }
+}
+
void pnv_pci_shutdown(void)
{
struct pci_controller *hose;
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -235,6 +235,7 @@ extern void pnv_pci_reset_secondary_bus(
extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);

extern void pnv_pci_dma_dev_setup(struct pci_dev *pdev);
+extern void pnv_pci_dma_bus_setup(struct pci_bus *bus);
extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sudip Mukherjee <sudipm.m...@gmail.com>

commit 601f1db653217f205ffa5fb33514b4e1711e56d1 upstream.

The build of m32104ut_defconfig for m32r arch was failing for long long
time with the error:

ERROR: "memory_start" [fs/udf/udf.ko] undefined!
ERROR: "memory_end" [fs/udf/udf.ko] undefined!
ERROR: "memory_end" [drivers/scsi/sg.ko] undefined!
ERROR: "memory_start" [drivers/scsi/sg.ko] undefined!
ERROR: "memory_end" [drivers/i2c/i2c-dev.ko] undefined!
ERROR: "memory_start" [drivers/i2c/i2c-dev.ko] undefined!

As done in other architectures export the symbols to fix the error.

Reported-by: Fengguang Wu <fenggu...@intel.com>
Signed-off-by: Sudip Mukherjee <su...@vectorindia.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/m32r/kernel/setup.c | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/m32r/kernel/setup.c
+++ b/arch/m32r/kernel/setup.c
@@ -81,7 +81,10 @@ static struct resource code_resource = {
};

unsigned long memory_start;
+EXPORT_SYMBOL(memory_start);
+
unsigned long memory_end;
+EXPORT_SYMBOL(memory_end);

void __init setup_arch(char **);
int get_cpuinfo(char *);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Luis R. Rodriguez <mcg...@suse.com>

commit 4355efbd80482a961cae849281a8ef866e53d55c upstream.

Commit f2411da746985 ("driver-core: add driver module
asynchronous probe support") added async probe support,
in two forms:

* in-kernel driver specification annotation
* generic async_probe module parameter (modprobe foo async_probe)

To support the generic kernel parameter parse_args() was
extended via commit ecc8617053e0 ("module: add extra
argument for parse_params() callback") however commit
failed to f2411da746985 failed to add the required argument.

This causes a crash then whenever async_probe generic
module parameter is used. This was overlooked when the
form in which in-kernel async probe support was reworked
a bit... Fix this as originally intended.

Cc: Hannes Reinecke <ha...@suse.de>
Cc: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Luis R. Rodriguez <mcg...@suse.com>
Signed-off-by: Rusty Russell <ru...@rustcorp.com.au> [minimized]
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/module.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3515,7 +3515,7 @@ static int load_module(struct load_info

/* Module is ready to execute: parsing args may do that. */
after_dashes = parse_args(mod->name, mod->args, mod->kp, mod->num_kp,
- -32768, 32767, NULL,
+ -32768, 32767, mod,
unknown_module_param_cb);
if (IS_ERR(after_dashes)) {
err = PTR_ERR(after_dashes);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Baoquan He <b...@redhat.com>

commit 9b1a12d29109234d2b9718d04d4d404b7da4e794 upstream.

In below commit alias DTE is set when its peripheral is
setting DTE. However there's a code bug here to wrongly
set the alias DTE, correct it in this patch.

commit e25bfb56ea7f046b71414e02f80f620deb5c6362
Author: Joerg Roedel <jro...@suse.de>
Date: Tue Oct 20 17:33:38 2015 +0200

iommu/amd: Set alias DTE in do_attach/do_detach

Signed-off-by: Baoquan He <b...@redhat.com>
Tested-by: Mark Hounschell <ma...@compro.net>
Signed-off-by: Joerg Roedel <jro...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iommu/amd_iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1905,7 +1905,7 @@ static void do_attach(struct iommu_dev_d
/* Update device table */
set_dte_entry(dev_data->devid, domain, ats);
if (alias != dev_data->devid)
- set_dte_entry(dev_data->devid, domain, ats);
+ set_dte_entry(alias, domain, ats);

device_flush_dte(dev_data);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gavin Shan <gws...@linux.vnet.ibm.com>

commit 05ba75f848647135f063199dc0e9f40fee769724 upstream.

When PE is created, its primary bus is cached to pe->bus. At later
point, the cached primary bus is returned from eeh_pe_bus_get().
However, we could get stale cached primary bus and run into kernel
crash in one case: full hotplug as part of fenced PHB error recovery
releases all PCI busses under the PHB at unplugging time and recreate
them at plugging time. pe->bus is still dereferencing the PCI bus
that was released.

This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
of pe->bus. pe->bus is updated when its first child EEH device is
online and the flag is set. Before unplugging in full hotplug for
error recovery, the flag is cleared.

Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
Reported-by: Andrew Donnellan <andrew.d...@au1.ibm.com>
Reported-by: Pradipta Ghosh <prad...@in.ibm.com>
Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.d...@au1.ibm.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/include/asm/eeh.h | 1 +
arch/powerpc/kernel/eeh_driver.c | 3 +++
arch/powerpc/kernel/eeh_pe.c | 2 +-
arch/powerpc/platforms/powernv/eeh-powernv.c | 5 ++++-
4 files changed, 9 insertions(+), 2 deletions(-)

--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -81,6 +81,7 @@ struct pci_dn;
#define EEH_PE_KEEP (1 << 8) /* Keep PE on hotplug */
#define EEH_PE_CFG_RESTRICTED (1 << 9) /* Block config on error */
#define EEH_PE_REMOVED (1 << 10) /* Removed permanently */
+#define EEH_PE_PRI_BUS (1 << 11) /* Cached primary bus */

struct eeh_pe {
int type; /* PE type: PHB/Bus/Device */
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -564,6 +564,7 @@ static int eeh_reset_device(struct eeh_p
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
if (bus) {
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
pci_lock_rescan_remove();
pcibios_remove_pci_devices(bus);
pci_unlock_rescan_remove();
@@ -803,6 +804,7 @@ perm_error:
* the their PCI config any more.
*/
if (frozen_bus) {
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);

pci_lock_rescan_remove();
@@ -886,6 +888,7 @@ static void eeh_handle_special_event(voi
continue;

/* Notify all devices to be down */
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
bus = eeh_pe_bus_get(phb_pe);
eeh_pe_dev_traverse(pe,
eeh_report_failure, NULL);
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -928,7 +928,7 @@ struct pci_bus *eeh_pe_bus_get(struct ee
bus = pe->phb->bus;
} else if (pe->type & EEH_PE_BUS ||
pe->type & EEH_PE_DEVICE) {
- if (pe->bus) {
+ if (pe->state & EEH_PE_PRI_BUS) {
bus = pe->bus;
goto out;
}
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -444,9 +444,12 @@ static void *pnv_eeh_probe(struct pci_dn
* PCI devices of the PE are expected to be removed prior
* to PE reset.
*/
- if (!edev->pe->bus)
+ if (!(edev->pe->state & EEH_PE_PRI_BUS)) {
edev->pe->bus = pci_find_bus(hose->global_number,
pdn->busno);
+ if (edev->pe->bus)
+ edev->pe->state |= EEH_PE_PRI_BUS;
+ }

/*
* Enable EEH explicitly so that we will do EEH check

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:08 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rusty Russell <ru...@rustcorp.com.au>

commit 2e7bac536106236104e9e339531ff0fcdb7b8147 upstream.

This trivial wrapper adds clarity and makes the following patch
smaller.

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/module.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3646,6 +3646,11 @@ static inline int is_arm_mapping_symbol(
&& (str[2] == '\0' || str[2] == '.');
}

+static const char *symname(struct module *mod, unsigned int symnum)
+{
+ return mod->strtab + mod->symtab[symnum].st_name;
+}
+
static const char *get_ksymbol(struct module *mod,
unsigned long addr,
unsigned long *size,
@@ -3668,15 +3673,15 @@ static const char *get_ksymbol(struct mo

/* We ignore unnamed symbols: they're uninformative
* and inserted at a whim. */
+ if (*symname(mod, i) == '\0'
+ || is_arm_mapping_symbol(symname(mod, i)))
+ continue;
+
if (mod->symtab[i].st_value <= addr
- && mod->symtab[i].st_value > mod->symtab[best].st_value
- && *(mod->strtab + mod->symtab[i].st_name) != '\0'
- && !is_arm_mapping_symbol(mod->strtab + mod->symtab[i].st_name))
+ && mod->symtab[i].st_value > mod->symtab[best].st_value)
best = i;
if (mod->symtab[i].st_value > addr
- && mod->symtab[i].st_value < nextval
- && *(mod->strtab + mod->symtab[i].st_name) != '\0'
- && !is_arm_mapping_symbol(mod->strtab + mod->symtab[i].st_name))
+ && mod->symtab[i].st_value < nextval)
nextval = mod->symtab[i].st_value;
}

@@ -3687,7 +3692,7 @@ static const char *get_ksymbol(struct mo
*size = nextval - mod->symtab[best].st_value;
if (offset)
*offset = addr - mod->symtab[best].st_value;
- return mod->strtab + mod->symtab[best].st_name;
+ return symname(mod, best);
}

/* For kallsyms to ask for address resolution. NULL means not found. Careful
@@ -3782,8 +3787,7 @@ int module_get_kallsym(unsigned int symn
if (symnum < mod->num_symtab) {
*value = mod->symtab[symnum].st_value;
*type = mod->symtab[symnum].st_info;
- strlcpy(name, mod->strtab + mod->symtab[symnum].st_name,
- KSYM_NAME_LEN);
+ strlcpy(name, symname(mod, symnum), KSYM_NAME_LEN);
strlcpy(module_name, mod->name, MODULE_NAME_LEN);
*exported = is_exported(name, *value, mod);
preempt_enable();
@@ -3800,7 +3804,7 @@ static unsigned long mod_find_symname(st
unsigned int i;

for (i = 0; i < mod->num_symtab; i++)
- if (strcmp(name, mod->strtab+mod->symtab[i].st_name) == 0 &&
+ if (strcmp(name, symname(mod, i)) == 0 &&
mod->symtab[i].st_info != 'U')
return mod->symtab[i].st_value;
return 0;
@@ -3844,7 +3848,7 @@ int module_kallsyms_on_each_symbol(int (
if (mod->state == MODULE_STATE_UNFORMED)
continue;
for (i = 0; i < mod->num_symtab; i++) {
- ret = fn(data, mod->strtab + mod->symtab[i].st_name,
+ ret = fn(data, symname(mod, i),
mod, mod->symtab[i].st_value);
if (ret != 0)
return ret;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Martijn Coenen <ma...@google.com>

commit 6611d8d76132f86faa501de9451a89bf23fb2371 upstream.

A spare array holding mem cgroup threshold events is kept around to make
sure we can always safely deregister an event and have an array to store
the new set of events in.

In the scenario where we're going from 1 to 0 registered events, the
pointer to the primary array containing 1 event is copied to the spare
slot, and then the spare slot is freed because no events are left.
However, it is freed before calling synchronize_rcu(), which means
readers may still be accessing threshold->primary after it is freed.

Fixed by only freeing after synchronize_rcu().

Signed-off-by: Martijn Coenen <ma...@google.com>
Cc: Johannes Weiner <han...@cmpxchg.org>
Acked-by: Michal Hocko <mho...@suse.com>
Cc: Vladimir Davydov <vdav...@virtuozzo.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
mm/memcontrol.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3522,16 +3522,17 @@ static void __mem_cgroup_usage_unregiste
swap_buffers:
/* Swap primary and spare array */
thresholds->spare = thresholds->primary;
- /* If all events are unregistered, free the spare array */
- if (!new) {
- kfree(thresholds->spare);
- thresholds->spare = NULL;
- }

rcu_assign_pointer(thresholds->primary, new);

/* To be sure that nobody uses thresholds */
synchronize_rcu();
+
+ /* If all events are unregistered, free the spare array */
+ if (!new) {
+ kfree(thresholds->spare);
+ thresholds->spare = NULL;
+ }
unlock:
mutex_unlock(&memcg->thresholds_lock);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <koc...@gmail.com>

commit 12352d3cae2cebe18805a91fab34b534d7444231 upstream.

Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock. We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here. There are
only few users of these legacy helpers. Let's get rid of them.

This patch fixes anon_vma lock imbalance in validate_mm(). Write lock
isn't required here, read lock is enough.

And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9...@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
Acked-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Cc: Andrea Arcangeli <aarc...@redhat.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
include/linux/rmap.h | 14 ------------
mm/mmap.c | 55 +++++++++++++++++++++++----------------------------
2 files changed, 25 insertions(+), 44 deletions(-)

--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -108,20 +108,6 @@ static inline void put_anon_vma(struct a
__put_anon_vma(anon_vma);
}

-static inline void vma_lock_anon_vma(struct vm_area_struct *vma)
-{
- struct anon_vma *anon_vma = vma->anon_vma;
- if (anon_vma)
- down_write(&anon_vma->root->rwsem);
-}
-
-static inline void vma_unlock_anon_vma(struct vm_area_struct *vma)
-{
- struct anon_vma *anon_vma = vma->anon_vma;
- if (anon_vma)
- up_write(&anon_vma->root->rwsem);
-}
-
static inline void anon_vma_lock_write(struct anon_vma *anon_vma)
{
down_write(&anon_vma->root->rwsem);
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -441,12 +441,16 @@ static void validate_mm(struct mm_struct
struct vm_area_struct *vma = mm->mmap;

while (vma) {
+ struct anon_vma *anon_vma = vma->anon_vma;
struct anon_vma_chain *avc;

- vma_lock_anon_vma(vma);
- list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
- anon_vma_interval_tree_verify(avc);
- vma_unlock_anon_vma(vma);
+ if (anon_vma) {
+ anon_vma_lock_read(anon_vma);
+ list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
+ anon_vma_interval_tree_verify(avc);
+ anon_vma_unlock_read(anon_vma);
+ }
+
highest_address = vma->vm_end;
vma = vma->vm_next;
i++;
@@ -2147,32 +2151,27 @@ static int acct_stack_growth(struct vm_a
int expand_upwards(struct vm_area_struct *vma, unsigned long address)
{
struct mm_struct *mm = vma->vm_mm;
- int error;
+ int error = 0;

if (!(vma->vm_flags & VM_GROWSUP))
return -EFAULT;

- /*
- * We must make sure the anon_vma is allocated
- * so that the anon_vma locking is not a noop.
- */
+ /* Guard against wrapping around to address 0. */
+ if (address < PAGE_ALIGN(address+4))
+ address = PAGE_ALIGN(address+4);
+ else
+ return -ENOMEM;
+
+ /* We must make sure the anon_vma is allocated. */
if (unlikely(anon_vma_prepare(vma)))
return -ENOMEM;
- vma_lock_anon_vma(vma);

/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_sem in read mode. We need the
* anon_vma lock to serialize against concurrent expand_stacks.
- * Also guard against wrapping around to address 0.
*/
- if (address < PAGE_ALIGN(address+4))
- address = PAGE_ALIGN(address+4);
- else {
- vma_unlock_anon_vma(vma);
- return -ENOMEM;
- }
- error = 0;
+ anon_vma_lock_write(vma->anon_vma);

/* Somebody else might have raced and expanded it already */
if (address > vma->vm_end) {
@@ -2190,7 +2189,7 @@ int expand_upwards(struct vm_area_struct
* updates, but we only hold a shared mmap_sem
* lock here, so we need to protect against
* concurrent vma expansions.
- * vma_lock_anon_vma() doesn't help here, as
+ * anon_vma_lock_write() doesn't help here, as
* we don't guarantee that all growable vmas
* in a mm share the same root anon vma.
* So, we reuse mm->page_table_lock to guard
@@ -2214,7 +2213,7 @@ int expand_upwards(struct vm_area_struct
}
}
}
- vma_unlock_anon_vma(vma);
+ anon_vma_unlock_write(vma->anon_vma);
khugepaged_enter_vma_merge(vma, vma->vm_flags);
validate_mm(mm);
return error;
@@ -2230,25 +2229,21 @@ int expand_downwards(struct vm_area_stru
struct mm_struct *mm = vma->vm_mm;
int error;

- /*
- * We must make sure the anon_vma is allocated
- * so that the anon_vma locking is not a noop.
- */
- if (unlikely(anon_vma_prepare(vma)))
- return -ENOMEM;
-
address &= PAGE_MASK;
error = security_mmap_addr(address);
if (error)
return error;

- vma_lock_anon_vma(vma);
+ /* We must make sure the anon_vma is allocated. */
+ if (unlikely(anon_vma_prepare(vma)))
+ return -ENOMEM;

/*
* vma->vm_start/vm_end cannot change under us because the caller
* is required to hold the mmap_sem in read mode. We need the
* anon_vma lock to serialize against concurrent expand_stacks.
*/
+ anon_vma_lock_write(vma->anon_vma);

/* Somebody else might have raced and expanded it already */
if (address < vma->vm_start) {
@@ -2266,7 +2261,7 @@ int expand_downwards(struct vm_area_stru
* updates, but we only hold a shared mmap_sem
* lock here, so we need to protect against
* concurrent vma expansions.
- * vma_lock_anon_vma() doesn't help here, as
+ * anon_vma_lock_write() doesn't help here, as
* we don't guarantee that all growable vmas
* in a mm share the same root anon vma.
* So, we reuse mm->page_table_lock to guard
@@ -2288,7 +2283,7 @@ int expand_downwards(struct vm_area_stru
}
}
}
- vma_unlock_anon_vma(vma);
+ anon_vma_unlock_write(vma->anon_vma);
khugepaged_enter_vma_merge(vma, vma->vm_flags);
validate_mm(mm);
return error;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:12 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gavin Shan <gws...@linux.vnet.ibm.com>

commit 7e56f627768da4e6480986b5145dc3422bc448a5 upstream.

In eeh_pe_loc_get(), the PE location code is retrieved from the
"ibm,loc-code" property of the device node for the bridge of the
PE's primary bus. It's not correct because the property indicates
the parent PE's location code.

This reads the correct PE location code from "ibm,io-base-loc-code"
or "ibm,slot-location-code" property of PE parent bus's device node.

Fixes: 357b2f3dd9b7 ("powerpc/eeh: Dump PE location code")
Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com>
Tested-by: Russell Currey <rus...@russell.cc>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/kernel/eeh_pe.c | 33 +++++++++++++++------------------
1 file changed, 15 insertions(+), 18 deletions(-)

--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -883,32 +883,29 @@ void eeh_pe_restore_bars(struct eeh_pe *
const char *eeh_pe_loc_get(struct eeh_pe *pe)
{
struct pci_bus *bus = eeh_pe_bus_get(pe);
- struct device_node *dn = pci_bus_to_OF_node(bus);
+ struct device_node *dn;
const char *loc = NULL;

- if (!dn)
- goto out;
+ while (bus) {
+ dn = pci_bus_to_OF_node(bus);
+ if (!dn) {
+ bus = bus->parent;
+ continue;
+ }

- /* PHB PE or root PE ? */
- if (pci_is_root_bus(bus)) {
- loc = of_get_property(dn, "ibm,loc-code", NULL);
- if (!loc)
+ if (pci_is_root_bus(bus))
loc = of_get_property(dn, "ibm,io-base-loc-code", NULL);
+ else
+ loc = of_get_property(dn, "ibm,slot-location-code",
+ NULL);
+
if (loc)
- goto out;
+ return loc;

- /* Check the root port */
- dn = dn->child;
- if (!dn)
- goto out;
+ bus = bus->parent;
}

- loc = of_get_property(dn, "ibm,loc-code", NULL);
- if (!loc)
- loc = of_get_property(dn, "ibm,slot-location-code", NULL);
-
-out:
- return loc ? loc : "N/A";
+ return "N/A";
}

/**

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:15 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Nyman <mathia...@linux.intel.com>

commit a6835090716a85f2297668ba593bd00e1051e662 upstream.

This reverts commit e210c422b6fd ("xhci: don't finish a TD if we get a
short transfer event mid TD")

Turns out that most host controllers do not follow the xHCI specs and never
send the second event for the last TRB in the TD if there was a short event
mid-TD.

Returning the URB directly after the first short-transfer event is far
better than never returning the URB. (class drivers usually timeout
after 30sec). For the hosts that do send the second event we will go
back to treating it as misplaced event and print an error message for it.

The origial patch was sent to stable kernels and needs to be reverted from
there as well

Signed-off-by: Mathias Nyman <mathia...@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/usb/host/xhci-ring.c | 10 ----------
1 file changed, 10 deletions(-)

--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2192,10 +2192,6 @@ static int process_bulk_intr_td(struct x
}
/* Fast path - was this the last TRB in the TD for this URB? */
} else if (event_trb == td->last_trb) {
- if (td->urb_length_set && trb_comp_code == COMP_SHORT_TX)
- return finish_td(xhci, td, event_trb, event, ep,
- status, false);
-
if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
td->urb->actual_length =
td->urb->transfer_buffer_length -
@@ -2247,12 +2243,6 @@ static int process_bulk_intr_td(struct x
td->urb->actual_length +=
TRB_LEN(le32_to_cpu(cur_trb->generic.field[2])) -
EVENT_TRB_LEN(le32_to_cpu(event->transfer_len));
-
- if (trb_comp_code == COMP_SHORT_TX) {
- xhci_dbg(xhci, "mid bulk/intr SP, wait for last TRB event\n");
- td->urb_length_set = true;
- return 0;
- }
}

return finish_td(xhci, td, event_trb, event, ep, status, false);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tg...@linutronix.de>

commit 572c39172684c3711e4a03c9a7380067e2b0661c upstream.

As Helge reported for timerfd we have the same issue in posix timers. We
return remaining time larger than the programmed relative time to user space
in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra
time added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Helge Deller <del...@gmx.de>
Cc: John Stultz <john....@linaro.org>
Cc: linux...@lists.linux-m68k.org
Cc: dhow...@redhat.com
Link: http://lkml.kernel.org/r/201601141641...@linutronix.de
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/time/posix-timers.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -760,7 +760,7 @@ common_timer_get(struct k_itimer *timr,
(timr->it_sigev_notify & ~SIGEV_THREAD_ID) == SIGEV_NONE))
timr->it_overrun += (unsigned int) hrtimer_forward(timer, now, iv);

- remaining = ktime_sub(hrtimer_get_expires(timer), now);
+ remaining = __hrtimer_expires_remaining_adjusted(timer, now);
/* Return 0 only, when the timer is expired and not pending */
if (remaining.tv64 <= 0) {
/*

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andreas Schwab <sch...@linux-m68k.org>

commit f15838e9cac8f78f0cc506529bb9d3b9fa589c1f upstream.

Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. But
dedotify modifies the symbol names in place, which can also modify
unrelated symbols with a name that matches a suffix of a dotted name. To
remove the leading dot of a symbol name we can just increment the pointer
into the STRTAB section instead.

Backport to all stables to avoid breakage when people update their
binutils - mpe.

Signed-off-by: Andreas Schwab <sch...@linux-m68k.org>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/kernel/module_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -340,7 +340,7 @@ static void dedotify(Elf64_Sym *syms, un
if (name[0] == '.') {
if (strcmp(name+1, "TOC.") == 0)
syms[i].st_shndx = SHN_ABS;
- memmove(name, name+1, strlen(name));
+ syms[i].st_name++;
}
}
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Holzheu <hol...@linux.vnet.ibm.com>

commit 5c2ff95e41c9290d16556cd02e35b25d81be8fe0 upstream.

When working with hugetlbfs ptes (which are actually pmds) is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different. This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get(). On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
Therefore these pages are reported as "mapped=xxx" instead of
"dirty=xxx".

Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.

Signed-off-by: Michael Holzheu <hol...@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald....@de.ibm.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/proc/task_mmu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1473,18 +1473,19 @@ static int gather_pte_stats(pmd_t *pmd,
static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask,
unsigned long addr, unsigned long end, struct mm_walk *walk)
{
+ pte_t huge_pte = huge_ptep_get(pte);
struct numa_maps *md;
struct page *page;

- if (!pte_present(*pte))
+ if (!pte_present(huge_pte))
return 0;

- page = pte_page(*pte);
+ page = pte_page(huge_pte);
if (!page)
return 0;

md = walk->private;
- gather_stats(page, md, pte_dirty(*pte), 1);
+ gather_stats(page, md, pte_dirty(huge_pte), 1);
return 0;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jeremy McNicoll <jmcn...@redhat.com>

commit da972fb13bc5a1baad450c11f9182e4cd0a091f6 upstream.

Fix a simple typo when disabling IOTLB on PCI(e) devices.

Fixes: b16d0cb9e2fc ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Signed-off-by: Jeremy McNicoll <jmcn...@redhat.com>
Reviewed-by: Alex Williamson <alex.wi...@redhat.com>
Signed-off-by: Joerg Roedel <jro...@suse.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iommu/intel-iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1489,7 +1489,7 @@ static void iommu_disable_dev_iotlb(stru
{
struct pci_dev *pdev;

- if (dev_is_pci(info->dev))
+ if (!dev_is_pci(info->dev))
return;

pdev = to_pci_dev(info->dev);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rich Felker <dal...@libc.org>

commit 114bf37e04d839b555b3dc460b5e6ce156f49cf0 upstream.

Add Yoshinori Sato and Rich Felker as maintainers for arch/sh
(SUPERH).

Signed-off-by: Rich Felker <dal...@libc.org>
Signed-off-by: Yoshinori Sato <ys...@users.sourceforge.jp>
Acked-by: D. Jeff Dionne <je...@uClinux.org>
Acked-by: Rob Landley <r...@landley.net>
Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Acked-by: Simon Horman <horms+...@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+...@glider.be>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
MAINTAINERS | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10289,9 +10289,11 @@ S: Maintained
F: drivers/net/ethernet/dlink/sundance.c

SUPERH
+M: Yoshinori Sato <ys...@users.sourceforge.jp>
+M: Rich Felker <dal...@libc.org>
L: linu...@vger.kernel.org
Q: http://patchwork.kernel.org/project/linux-sh/list/
-S: Orphan
+S: Maintained
F: Documentation/sh/
F: arch/sh/
F: drivers/sh/

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: James Bottomley <JBott...@Odin.com>

commit 564b026fbd0d28e9f70fb3831293d2922bb7855b upstream.

It was noticed that we lose precision in the final calculation for some
inputs. The most egregious example is size=3000 blk_size=1900 in units
of 10 should yield 5.70 MB but in fact yields 3.00 MB (oops).

This is because the current algorithm doesn't correctly account for
all the remainders in the logarithms. Fix this by doing a correct
calculation in the remainders based on napier's algorithm.

Additionally, now we have the correct result, we have to account for
arithmetic rounding because we're printing 3 digits of precision. This
means that if the fourth digit is five or greater, we have to round up,
so add a section to ensure correct rounding. Finally account for all
possible inputs correctly, including zero for block size.

Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBott...@Odin.com>
Reported-by: Vitaly Kuznetsov <vkuz...@redhat.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
lib/string_helpers.c | 63 ++++++++++++++++++++++++++++++++++-----------------
1 file changed, 43 insertions(+), 20 deletions(-)

--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -43,50 +43,73 @@ void string_get_size(u64 size, u64 blk_s
[STRING_UNITS_10] = 1000,
[STRING_UNITS_2] = 1024,
};
- int i, j;
- u32 remainder = 0, sf_cap, exp;
+ static const unsigned int rounding[] = { 500, 50, 5 };
+ int i = 0, j;
+ u32 remainder = 0, sf_cap;
char tmp[8];
const char *unit;

tmp[0] = '\0';
- i = 0;
- if (!size)
+
+ if (blk_size == 0)
+ size = 0;
+ if (size == 0)
goto out;

- while (blk_size >= divisor[units]) {
- remainder = do_div(blk_size, divisor[units]);
+ /* This is Napier's algorithm. Reduce the original block size to
+ *
+ * coefficient * divisor[units]^i
+ *
+ * we do the reduction so both coefficients are just under 32 bits so
+ * that multiplying them together won't overflow 64 bits and we keep
+ * as much precision as possible in the numbers.
+ *
+ * Note: it's safe to throw away the remainders here because all the
+ * precision is in the coefficients.
+ */
+ while (blk_size >> 32) {
+ do_div(blk_size, divisor[units]);
i++;
}

- exp = divisor[units] / (u32)blk_size;
- /*
- * size must be strictly greater than exp here to ensure that remainder
- * is greater than divisor[units] coming out of the if below.
- */
- if (size > exp) {
- remainder = do_div(size, divisor[units]);
- remainder *= blk_size;
+ while (size >> 32) {
+ do_div(size, divisor[units]);
i++;
- } else {
- remainder *= size;
}

+ /* now perform the actual multiplication keeping i as the sum of the
+ * two logarithms */
size *= blk_size;
- size += remainder / divisor[units];
- remainder %= divisor[units];

+ /* and logarithmically reduce it until it's just under the divisor */
while (size >= divisor[units]) {
remainder = do_div(size, divisor[units]);
i++;
}

+ /* work out in j how many digits of precision we need from the
+ * remainder */
sf_cap = size;
for (j = 0; sf_cap*10 < 1000; j++)
sf_cap *= 10;

- if (j) {
+ if (units == STRING_UNITS_2) {
+ /* express the remainder as a decimal. It's currently the
+ * numerator of a fraction whose denominator is
+ * divisor[units], which is 1 << 10 for STRING_UNITS_2 */
remainder *= 1000;
- remainder /= divisor[units];
+ remainder >>= 10;
+ }
+
+ /* add a 5 to the digit below what will be printed to ensure
+ * an arithmetical round up and carry it through to size */
+ remainder += rounding[j];
+ if (remainder >= 1000) {
+ remainder -= 1000;
+ size += 1;
+ }
+
+ if (j) {
snprintf(tmp, sizeof(tmp), ".%03u", remainder);
tmp[j+1] = '\0';
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:17 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alan Modra <amo...@gmail.com>

commit c153693d7eb9eeb28478aa2deaaf0b4e7b5ff5e9 upstream.

PowerPC64 uses the symbol .TOC. much as other targets use
_GLOBAL_OFFSET_TABLE_. It identifies the value of the GOT pointer (or in
powerpc parlance, the TOC pointer). Global offset tables are generally
local to an executable or shared library, or in the kernel, module. Thus
it does not make sense for a module to resolve a relocation against
.TOC. to the kernel's .TOC. value. A module has its own .TOC., and
indeed the powerpc64 module relocation processing ignores the kernel
value of .TOC. and instead calculates a module-local value.

This patch removes code involved in exporting the kernel .TOC., tweaks
modpost to ignore an undefined .TOC., and the module loader to twiddle
the section symbol so that .TOC. isn't seen as undefined.

Note that if the kernel was compiled with -msingle-pic-base then ELFv2
would not have function global entry code setting up r2. In that case
the module call stubs would need to be modified to set up r2 using the
kernel .TOC. value, requiring some of this code to be reinstated.

mpe: Furthermore a change in binutils master (not yet released) causes
the current way we handle the TOC to no longer work when building with
MODVERSIONS=y and RELOCATABLE=n. The symptom is that modules can not be
loaded due to there being no version found for TOC.

Signed-off-by: Alan Modra <amo...@gmail.com>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/kernel/misc_64.S | 28 ----------------------------
arch/powerpc/kernel/module_64.c | 12 +++++++++---
scripts/mod/modpost.c | 3 ++-
3 files changed, 11 insertions(+), 32 deletions(-)

--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -701,31 +701,3 @@ _GLOBAL(kexec_sequence)
li r5,0
blr /* image->start(physid, image->start, 0); */
#endif /* CONFIG_KEXEC */
-
-#ifdef CONFIG_MODULES
-#if defined(_CALL_ELF) && _CALL_ELF == 2
-
-#ifdef CONFIG_MODVERSIONS
-.weak __crc_TOC.
-.section "___kcrctab+TOC.","a"
-.globl __kcrctab_TOC.
-__kcrctab_TOC.:
- .llong __crc_TOC.
-#endif
-
-/*
- * Export a fake .TOC. since both modpost and depmod will complain otherwise.
- * Both modpost and depmod strip the leading . so we do the same here.
- */
-.section "__ksymtab_strings","a"
-__kstrtab_TOC.:
- .asciz "TOC."
-
-.section "___ksymtab+TOC.","a"
-/* This symbol name is important: it's used by modpost to find exported syms */
-.globl __ksymtab_TOC.
-__ksymtab_TOC.:
- .llong 0 /* .value */
- .llong __kstrtab_TOC.
-#endif /* ELFv2 */
-#endif /* MODULES */
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -326,7 +326,10 @@ static void dedotify_versions(struct mod
}
}

-/* Undefined symbols which refer to .funcname, hack to funcname (or .TOC.) */
+/*
+ * Undefined symbols which refer to .funcname, hack to funcname. Make .TOC.
+ * seem to be defined (value set later).
+ */
static void dedotify(Elf64_Sym *syms, unsigned int numsyms, char *strtab)
{
unsigned int i;
@@ -334,8 +337,11 @@ static void dedotify(Elf64_Sym *syms, un
for (i = 1; i < numsyms; i++) {
if (syms[i].st_shndx == SHN_UNDEF) {
char *name = strtab + syms[i].st_name;
- if (name[0] == '.')
+ if (name[0] == '.') {
+ if (strcmp(name+1, "TOC.") == 0)
+ syms[i].st_shndx = SHN_ABS;
memmove(name, name+1, strlen(name));
+ }
}
}
}
@@ -351,7 +357,7 @@ static Elf64_Sym *find_dot_toc(Elf64_Shd
numsyms = sechdrs[symindex].sh_size / sizeof(Elf64_Sym);

for (i = 1; i < numsyms; i++) {
- if (syms[i].st_shndx == SHN_UNDEF
+ if (syms[i].st_shndx == SHN_ABS
&& strcmp(strtab + syms[i].st_name, "TOC.") == 0)
return &syms[i];
}
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -594,7 +594,8 @@ static int ignore_undef_symbol(struct el
if (strncmp(symname, "_restgpr0_", sizeof("_restgpr0_") - 1) == 0 ||
strncmp(symname, "_savegpr0_", sizeof("_savegpr0_") - 1) == 0 ||
strncmp(symname, "_restvr_", sizeof("_restvr_") - 1) == 0 ||
- strncmp(symname, "_savevr_", sizeof("_savevr_") - 1) == 0)
+ strncmp(symname, "_savevr_", sizeof("_savevr_") - 1) == 0 ||
+ strcmp(symname, ".TOC.") == 0)
return 1;
/* Do not ignore this symbol */
return 0;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:17 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Matthew Wilcox <wi...@linux.intel.com>

commit 46437f9a554fbe3e110580ca08ab703b59f2f95a upstream.

If the indirect_ptr bit is set on a slot, that indicates we need to redo
the lookup. Introduce a new function radix_tree_iter_retry() which
forces the loop to retry the lookup by setting 'slot' to NULL and
turning the iterator back to point at the problematic entry.

This is a pretty rare problem to hit at the moment; the lookup has to
race with a grow of the radix tree from a height of 0. The consequences
of hitting this race are that gang lookup could return a pointer to a
radix_tree_node instead of a pointer to whatever the user had inserted
in the tree.

Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Ohad Ben-Cohen <oh...@wizery.com>
Cc: Konstantin Khlebnikov <khleb...@openvz.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
include/linux/radix-tree.h | 16 ++++++++++++++++
lib/radix-tree.c | 12 ++++++++++--
2 files changed, 26 insertions(+), 2 deletions(-)

--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -370,6 +370,22 @@ void **radix_tree_next_chunk(struct radi
struct radix_tree_iter *iter, unsigned flags);

/**
+ * radix_tree_iter_retry - retry this chunk of the iteration
+ * @iter: iterator state
+ *
+ * If we iterate over a tree protected only by the RCU lock, a race
+ * against deletion or creation may result in seeing a slot for which
+ * radix_tree_deref_retry() returns true. If so, call this function
+ * and continue the iteration.
+ */
+static inline __must_check
+void **radix_tree_iter_retry(struct radix_tree_iter *iter)
+{
+ iter->next_index = iter->index;
+ return NULL;
+}
+
+/**
* radix_tree_chunk_size - get current chunk size
*
* @iter: pointer to radix tree iterator
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1019,9 +1019,13 @@ radix_tree_gang_lookup(struct radix_tree
return 0;

radix_tree_for_each_slot(slot, root, &iter, first_index) {
- results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+ results[ret] = rcu_dereference_raw(*slot);
if (!results[ret])
continue;
+ if (radix_tree_is_indirect_ptr(results[ret])) {
+ slot = radix_tree_iter_retry(&iter);
+ continue;
+ }
if (++ret == max_items)
break;
}
@@ -1098,9 +1102,13 @@ radix_tree_gang_lookup_tag(struct radix_
return 0;

radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) {
- results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot));
+ results[ret] = rcu_dereference_raw(*slot);
if (!results[ret])
continue;
+ if (radix_tree_is_indirect_ptr(results[ret])) {
+ slot = radix_tree_iter_retry(&iter);
+ continue;
+ }
if (++ret == max_items)
break;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:17 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Kravetz <mike.k...@oracle.com>

commit 9aacdd354d197ad64685941b36d28ea20ab88757 upstream.

Hillf Danton noticed bugs in the hugetlb_vmtruncate_list routine. The
argument end is of type pgoff_t. It was being converted to a vaddr
offset and passed to unmap_hugepage_range. However, end was also being
used as an argument to the vma_interval_tree_foreach controlling loop.
In addition, the conversion of end to vaddr offset was incorrect.

hugetlb_vmtruncate_list is called as part of a file truncate or
fallocate hole punch operation.

When truncating a hugetlbfs file, this bug could prevent some pages from
being unmapped. This is possible if there are multiple vmas mapping the
file, and there is a sufficiently sized hole between the mappings. The
size of the hole between two vmas (A,B) must be such that the starting
virtual address of B is greater than (ending virtual address of A <<
PAGE_SHIFT). In this case, the pages in B would not be unmapped. If
pages are not properly unmapped during truncate, the following BUG is
hit:

kernel BUG at fs/hugetlbfs/inode.c:428!

In the fallocate hole punch case, this bug could prevent pages from
being unmapped as in the truncate case. However, for hole punch the
result is that unmapped pages will not be removed during the operation.
For hole punch, it is also possible that more pages than desired will be
unmapped. This unnecessary unmapping will cause page faults to
reestablish the mappings on subsequent page access.

Fixes: 1bfad99ab (" hugetlbfs: hugetlb_vmtruncate_list() needs to take a range")Reported-by: Hillf Danton <hill...@alibaba-inc.com>
Signed-off-by: Mike Kravetz <mike.k...@oracle.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Naoya Horiguchi <n-hor...@ah.jp.nec.com>
Cc: Davidlohr Bueso <da...@stgolabs.net>
Cc: Dave Hansen <dave....@linux.intel.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/hugetlbfs/inode.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -463,6 +463,7 @@ hugetlb_vmdelete_list(struct rb_root *ro
*/
vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) {
unsigned long v_offset;
+ unsigned long v_end;

/*
* Can the expression below overflow on 32-bit arches?
@@ -475,15 +476,17 @@ hugetlb_vmdelete_list(struct rb_root *ro
else
v_offset = 0;

- if (end) {
- end = ((end - start) << PAGE_SHIFT) +
- vma->vm_start + v_offset;
- if (end > vma->vm_end)
- end = vma->vm_end;
- } else
- end = vma->vm_end;
+ if (!end)
+ v_end = vma->vm_end;
+ else {
+ v_end = ((end - vma->vm_pgoff) << PAGE_SHIFT)
+ + vma->vm_start;
+ if (v_end > vma->vm_end)
+ v_end = vma->vm_end;
+ }

- unmap_hugepage_range(vma, vma->vm_start + v_offset, end, NULL);
+ unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
+ NULL);
}
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:18 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Torokhov <dmitry....@gmail.com>

commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862 upstream.

We should set device's capabilities first, and then register it,
otherwise various handlers already present in the kernel will not be
able to connect to the device.

Reported-by: Lauri Kasanen <ca...@gmx.com>
Signed-off-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/input/mouse/vmmouse.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

--- a/drivers/input/mouse/vmmouse.c
+++ b/drivers/input/mouse/vmmouse.c
@@ -458,8 +458,6 @@ int vmmouse_init(struct psmouse *psmouse
priv->abs_dev = abs_dev;
psmouse->private = priv;

- input_set_capability(rel_dev, EV_REL, REL_WHEEL);
-
/* Set up and register absolute device */
snprintf(priv->phys, sizeof(priv->phys), "%s/input1",
psmouse->ps2dev.serio->phys);
@@ -475,10 +473,6 @@ int vmmouse_init(struct psmouse *psmouse
abs_dev->id.version = psmouse->model;
abs_dev->dev.parent = &psmouse->ps2dev.serio->dev;

- error = input_register_device(priv->abs_dev);
- if (error)
- goto init_fail;
-
/* Set absolute device capabilities */
input_set_capability(abs_dev, EV_KEY, BTN_LEFT);
input_set_capability(abs_dev, EV_KEY, BTN_RIGHT);
@@ -488,6 +482,13 @@ int vmmouse_init(struct psmouse *psmouse
input_set_abs_params(abs_dev, ABS_X, 0, VMMOUSE_MAX_X, 0, 0);
input_set_abs_params(abs_dev, ABS_Y, 0, VMMOUSE_MAX_Y, 0, 0);

+ error = input_register_device(priv->abs_dev);
+ if (error)
+ goto init_fail;
+
+ /* Add wheel capability to the relative device */
+ input_set_capability(rel_dev, EV_REL, REL_WHEEL);
+
psmouse->protocol_handler = vmmouse_process_byte;
psmouse->disconnect = vmmouse_disconnect;
psmouse->reconnect = vmmouse_reconnect;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:19 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <kirill....@linux.intel.com>

commit 7162a1e87b3e380133dadc7909081bb70d0a7041 upstream.

Tetsuo Handa reported underflow of NR_MLOCK on munlock.

Testcase:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define BASE ((void *)0x400000000000)
#define SIZE (1UL << 21)

int main(int argc, char *argv[])
{
void *addr;

system("grep Mlocked /proc/meminfo");
addr = mmap(BASE, SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE | MAP_LOCKED | MAP_FIXED,
-1, 0);
if (addr == MAP_FAILED)
printf("mmap() failed\n"), exit(1);
munmap(addr, SIZE);
system("grep Mlocked /proc/meminfo");
return 0;
}

It happens on munlock_vma_page() due to unfortunate choice of nr_pages
data type:

__mod_zone_page_state(zone, NR_MLOCK, -nr_pages);

For unsigned int nr_pages, implicitly casted to long in
__mod_zone_page_state(), it becomes something around UINT_MAX.

munlock_vma_page() usually called for THP as small pages go though
pagevec.

Let's make nr_pages signed int.

Similar fixes in 6cdb18ad98a4 ("mm/vmstat: fix overflow in
mod_zone_page_state()") used `long' type, but `int' here is OK for a
count of the number of sub-pages in a huge page.

Fixes: ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages")
Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Reported-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Tested-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Cc: Michel Lespinasse <wal...@google.com>
Acked-by: Michal Hocko <mho...@suse.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
mm/mlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -172,7 +172,7 @@ static void __munlock_isolation_failed(s
*/
unsigned int munlock_vma_page(struct page *page)
{
- unsigned int nr_pages;
+ int nr_pages;
struct zone *zone = page_zone(page);

/* For try_to_munlock() and to serialize with page migration */

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:30:23 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Matthew Wilcox <wi...@linux.intel.com>

commit c6400ba7e13a41539342f1b6e1f9e78419cb0148 upstream.

of_hwspin_lock_get_id() is protected by the RCU lock, which means that
insertions can occur simultaneously with the lookup. If the radix tree
transitions from a height of 0, we can see a slot with the indirect_ptr
bit set, which will cause us to at least read random memory, and could
cause other havoc.

Fix this by using the newly introduced radix_tree_iter_retry().

Signed-off-by: Matthew Wilcox <wi...@linux.intel.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Ohad Ben-Cohen <oh...@wizery.com>
Cc: Konstantin Khlebnikov <khleb...@openvz.org>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/hwspinlock/hwspinlock_core.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/hwspinlock/hwspinlock_core.c
+++ b/drivers/hwspinlock/hwspinlock_core.c
@@ -313,6 +313,10 @@ int of_hwspin_lock_get_id(struct device_
hwlock = radix_tree_deref_slot(slot);
if (unlikely(!hwlock))
continue;
+ if (radix_tree_is_indirect_ptr(hwlock)) {
+ slot = radix_tree_iter_retry(&iter);
+ continue;
+ }

if (hwlock->bank->dev->of_node == args.np) {
ret = 0;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mateusz Guzik <mgu...@redhat.com>

commit ddf1d398e517e660207e2c807f76a90df543a217 upstream.

An unprivileged user can trigger an oops on a kernel with
CONFIG_CHECKPOINT_RESTORE.

proc_pid_cmdline_read takes mmap_sem for reading and obtains args + env
start/end values. These get sanity checked as follows:
BUG_ON(arg_start > arg_end);
BUG_ON(env_start > env_end);

These can be changed by prctl_set_mm. Turns out also takes the semaphore for
reading, effectively rendering it useless. This results in:

kernel BUG at fs/proc/base.c:240!
invalid opcode: 0000 [#1] SMP
Modules linked in: virtio_net
CPU: 0 PID: 925 Comm: a.out Not tainted 4.4.0-rc8-next-20160105dupa+ #71
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff880077a68000 ti: ffff8800784d0000 task.ti: ffff8800784d0000
RIP: proc_pid_cmdline_read+0x520/0x530
RSP: 0018:ffff8800784d3db8 EFLAGS: 00010206
RAX: ffff880077c5b6b0 RBX: ffff8800784d3f18 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 00007f78e8857000 RDI: 0000000000000246
RBP: ffff8800784d3e40 R08: 0000000000000008 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000050
R13: 00007f78e8857800 R14: ffff88006fcef000 R15: ffff880077c5b600
FS: 00007f78e884a740(0000) GS:ffff88007b200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f78e8361770 CR3: 00000000790a5000 CR4: 00000000000006f0
Call Trace:
__vfs_read+0x37/0x100
vfs_read+0x82/0x130
SyS_read+0x58/0xd0
entry_SYSCALL_64_fastpath+0x12/0x76
Code: 4c 8b 7d a8 eb e9 48 8b 9d 78 ff ff ff 4c 8b 7d 90 48 8b 03 48 39 45 a8 0f 87 f0 fe ff ff e9 d1 fe ff ff 4c 8b 7d 90 eb c6 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
RIP proc_pid_cmdline_read+0x520/0x530
---[ end trace 97882617ae9c6818 ]---

Turns out there are instances where the code just reads aformentioned
values without locking whatsoever - namely environ_read and get_cmdline.

Interestingly these functions look quite resilient against bogus values,
but I don't believe this should be relied upon.

The first patch gets rid of the oops bug by grabbing mmap_sem for
writing.

The second patch is optional and puts locking around aformentioned
consumers for safety. Consumers of other fields don't seem to benefit
from similar treatment and are left untouched.

This patch (of 2):

The code was taking the semaphore for reading, which does not protect
against readers nor concurrent modifications.

The problem could cause a sanity checks to fail in procfs's cmdline
reader, resulting in an OOPS.

Note that some functions perform an unlocked read of various mm fields,
but they seem to be fine despite possible modificaton.

Signed-off-by: Mateusz Guzik <mgu...@redhat.com>
Acked-by: Cyrill Gorcunov <gorc...@openvz.org>
Cc: Alexey Dobriyan <adob...@gmail.com>
Cc: Jarod Wilson <ja...@redhat.com>
Cc: Jan Stancek <jsta...@redhat.com>
Cc: Al Viro <vi...@zeniv.linux.org.uk>
Cc: Anshuman Khandual <anshuma...@gmail.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/sys.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1853,11 +1853,13 @@ static int prctl_set_mm_map(int opt, con
user_auxv[AT_VECTOR_SIZE - 1] = AT_NULL;
}

- if (prctl_map.exe_fd != (u32)-1)
+ if (prctl_map.exe_fd != (u32)-1) {
error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
- down_read(&mm->mmap_sem);
- if (error)
- goto out;
+ if (error)
+ return error;
+ }
+
+ down_write(&mm->mmap_sem);

/*
* We don't validate if these members are pointing to
@@ -1894,10 +1896,8 @@ static int prctl_set_mm_map(int opt, con
if (prctl_map.auxv_size)
memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));

- error = 0;
-out:
- up_read(&mm->mmap_sem);
- return error;
+ up_write(&mm->mmap_sem);
+ return 0;
}
#endif /* CONFIG_CHECKPOINT_RESTORE */

@@ -1963,7 +1963,7 @@ static int prctl_set_mm(int opt, unsigne

error = -EINVAL;

- down_read(&mm->mmap_sem);
+ down_write(&mm->mmap_sem);
vma = find_vma(mm, addr);

prctl_map.start_code = mm->start_code;
@@ -2056,7 +2056,7 @@ static int prctl_set_mm(int opt, unsigne

error = 0;
out:
- up_read(&mm->mmap_sem);
+ up_write(&mm->mmap_sem);
return error;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.m...@primarydata.com>

commit 13331a551ab4df87f7a027d2cab392da96aba1de upstream.

We're seeing hangs in the NFS client code, with loops of the form:

RPC: 30317 xmit incomplete (267368 left of 524448)
RPC: 30317 call_status (status -11)
RPC: 30317 call_transmit (status 0)
RPC: 30317 xprt_prepare_transmit
RPC: 30317 xprt_transmit(524448)
RPC: xs_tcp_send_request(267368) = -11
RPC: 30317 xmit incomplete (267368 left of 524448)
RPC: 30317 call_status (status -11)
RPC: 30317 call_transmit (status 0)
RPC: 30317 xprt_prepare_transmit
RPC: 30317 xprt_transmit(524448)

Turns out commit ceb5d58b2170 ("net: fix sock_wake_async() rcu protection")
moved SOCKWQ_ASYNC_NOSPACE out of sock->flags and into sk->sk_wq->flags,
however it never tried to fix up the code in net/sunrpc.

The new idiom is to use the flags in the RCU protected struct socket_wq.
While we're at it, clear out the now redundant places where we set/clear
SOCKWQ_ASYNC_NOSPACE and SOCK_NOSPACE. In principle, sk_stream_wait_memory()
is supposed to set these for us, so we only need to clear them in the
particular case of our ->write_space() callback.

Fixes: ceb5d58b2170 ("net: fix sock_wake_async() rcu protection")
Cc: Eric Dumazet <edum...@google.com>
Signed-off-by: Trond Myklebust <trond.m...@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
net/sunrpc/xprtsock.c | 49 +++++++++++++++++++++----------------------------
1 file changed, 21 insertions(+), 28 deletions(-)

--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -398,7 +398,6 @@ static int xs_sendpages(struct socket *s
if (unlikely(!sock))
return -ENOTSOCK;

- clear_bit(SOCKWQ_ASYNC_NOSPACE, &sock->flags);
if (base != 0) {
addr = NULL;
addrlen = 0;
@@ -442,7 +441,6 @@ static void xs_nospace_callback(struct r
struct sock_xprt *transport = container_of(task->tk_rqstp->rq_xprt, struct sock_xprt, xprt);

transport->inet->sk_write_pending--;
- clear_bit(SOCKWQ_ASYNC_NOSPACE, &transport->sock->flags);
}

/**
@@ -467,20 +465,11 @@ static int xs_nospace(struct rpc_task *t

/* Don't race with disconnect */
if (xprt_connected(xprt)) {
- if (test_bit(SOCKWQ_ASYNC_NOSPACE, &transport->sock->flags)) {
- /*
- * Notify TCP that we're limited by the application
- * window size
- */
- set_bit(SOCK_NOSPACE, &transport->sock->flags);
- sk->sk_write_pending++;
- /* ...and wait for more buffer space */
- xprt_wait_for_buffer_space(task, xs_nospace_callback);
- }
- } else {
- clear_bit(SOCKWQ_ASYNC_NOSPACE, &transport->sock->flags);
+ /* wait for more buffer space */
+ sk->sk_write_pending++;
+ xprt_wait_for_buffer_space(task, xs_nospace_callback);
+ } else
ret = -ENOTCONN;
- }

spin_unlock_bh(&xprt->transport_lock);

@@ -616,9 +605,6 @@ process_status:
case -EAGAIN:
status = xs_nospace(task);
break;
- default:
- dprintk("RPC: sendmsg returned unrecognized error %d\n",
- -status);
case -ENETUNREACH:
case -ENOBUFS:
case -EPIPE:
@@ -626,7 +612,10 @@ process_status:
case -EPERM:
/* When the server has died, an ICMP port unreachable message
* prompts ECONNREFUSED. */
- clear_bit(SOCKWQ_ASYNC_NOSPACE, &transport->sock->flags);
+ break;
+ default:
+ dprintk("RPC: sendmsg returned unrecognized error %d\n",
+ -status);
}

return status;
@@ -706,16 +695,16 @@ static int xs_tcp_send_request(struct rp
case -EAGAIN:
status = xs_nospace(task);
break;
- default:
- dprintk("RPC: sendmsg returned unrecognized error %d\n",
- -status);
case -ECONNRESET:
case -ECONNREFUSED:
case -ENOTCONN:
case -EADDRINUSE:
case -ENOBUFS:
case -EPIPE:
- clear_bit(SOCKWQ_ASYNC_NOSPACE, &transport->sock->flags);
+ break;
+ default:
+ dprintk("RPC: sendmsg returned unrecognized error %d\n",
+ -status);
}

return status;
@@ -1609,19 +1598,23 @@ static void xs_tcp_state_change(struct s

static void xs_write_space(struct sock *sk)
{
- struct socket *sock;
+ struct socket_wq *wq;
struct rpc_xprt *xprt;

- if (unlikely(!(sock = sk->sk_socket)))
+ if (!sk->sk_socket)
return;
- clear_bit(SOCK_NOSPACE, &sock->flags);
+ clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);

if (unlikely(!(xprt = xprt_from_sock(sk))))
return;
- if (test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &sock->flags) == 0)
- return;
+ rcu_read_lock();
+ wq = rcu_dereference(sk->sk_wq);
+ if (!wq || test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &wq->flags) == 0)
+ goto out;

xprt_write_space(xprt);
+out:
+ rcu_read_unlock();
}

/**

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit d9db59103305eb5ec2a86369f32063e9921b6ac5 upstream.

We don't want to be writing to .text so it can be set rodata.
Fix error "Unable to handle kernel paging request at virtual address
c012396c" in wait_dll_lock_timed if CONFIG_DEBUG_RODATA is selected.

As these counters are for debugging only and unused, we can just
remove them.

Cc: Kees Cook <kees...@chromium.org>
Cc: Laura Abbott <lab...@redhat.com>
Cc: Nishanth Menon <n...@ti.com>
Cc: Richard Woodruff <r-woo...@ti.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Tero Kristo <t-kr...@ti.com>
Acked-by: Nicolas Pitre <ni...@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/mach-omap2/sleep34xx.S | 22 ----------------------
1 file changed, 22 deletions(-)

--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -289,12 +289,6 @@ wait_sdrc_ready:
bic r5, r5, #0x40
str r5, [r4]

-/*
- * PC-relative stores lead to undefined behaviour in Thumb-2: use a r7 as a
- * base instead.
- * Be careful not to clobber r7 when maintaing this code.
- */
-
is_dll_in_lock_mode:
/* Is dll in lock mode? */
ldr r4, sdrc_dlla_ctrl
@@ -302,11 +296,7 @@ is_dll_in_lock_mode:
tst r5, #0x4
bne exit_nonoff_modes @ Return if locked
/* wait till dll locks */
- adr r7, kick_counter
wait_dll_lock_timed:
- ldr r4, wait_dll_lock_counter
- add r4, r4, #1
- str r4, [r7, #wait_dll_lock_counter - kick_counter]
ldr r4, sdrc_dlla_status
/* Wait 20uS for lock */
mov r6, #8
@@ -330,9 +320,6 @@ kick_dll:
orr r6, r6, #(1<<3) @ enable dll
str r6, [r4]
dsb
- ldr r4, kick_counter
- add r4, r4, #1
- str r4, [r7] @ kick_counter
b wait_dll_lock_timed

exit_nonoff_modes:
@@ -360,15 +347,6 @@ sdrc_dlla_status:
.word SDRC_DLLA_STATUS_V
sdrc_dlla_ctrl:
.word SDRC_DLLA_CTRL_V
- /*
- * When exporting to userspace while the counters are in SRAM,
- * these 2 words need to be at the end to facilitate retrival!
- */
-kick_counter:
- .word 0
-wait_dll_lock_counter:
- .word 0
-
ENTRY(omap3_do_wfi_sz)
.word . - omap3_do_wfi

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Gabbasov <andrew_...@mentor.com>

commit bb00c898ad1ce40c4bb422a8207ae562e9aea7ae upstream.

If a name contains at least some characters with Unicode values
exceeding single byte, the CS0 output should have 2 bytes per character.
And if other input characters have single byte Unicode values, then
the single input byte is converted to 2 output bytes, and the length
of output becomes larger than the length of input. And if the input
name is long enough, the output length may exceed the allocated buffer
length.

All this means that conversion from UTF8 or NLS to CS0 requires
checking of output length in order to stop when it exceeds the given
output buffer size.

[JK: Make code return -ENAMETOOLONG instead of silently truncating the
name]

Signed-off-by: Andrew Gabbasov <andrew_...@mentor.com>
Signed-off-by: Jan Kara <ja...@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/udf/unicode.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

--- a/fs/udf/unicode.c
+++ b/fs/udf/unicode.c
@@ -177,17 +177,22 @@ int udf_CS0toUTF8(struct ustr *utf_o, co
static int udf_UTF8toCS0(dstring *ocu, struct ustr *utf, int length)
{
unsigned c, i, max_val, utf_char;
- int utf_cnt, u_len;
+ int utf_cnt, u_len, u_ch;

memset(ocu, 0, sizeof(dstring) * length);
ocu[0] = 8;
max_val = 0xffU;
+ u_ch = 1;

try_again:
u_len = 0U;
utf_char = 0U;
utf_cnt = 0U;
for (i = 0U; i < utf->u_len; i++) {
+ /* Name didn't fit? */
+ if (u_len + 1 + u_ch >= length)
+ return 0;
+
c = (uint8_t)utf->u_name[i];

/* Complete a multi-byte UTF-8 character */
@@ -229,6 +234,7 @@ try_again:
if (max_val == 0xffU) {
max_val = 0xffffU;
ocu[0] = (uint8_t)0x10U;
+ u_ch = 2;
goto try_again;
}
goto error_out;
@@ -299,15 +305,19 @@ static int udf_NLStoCS0(struct nls_table
int len;
unsigned i, max_val;
uint16_t uni_char;
- int u_len;
+ int u_len, u_ch;

memset(ocu, 0, sizeof(dstring) * length);
ocu[0] = 8;
max_val = 0xffU;
+ u_ch = 1;

try_again:
u_len = 0U;
for (i = 0U; i < uni->u_len; i++) {
+ /* Name didn't fit? */
+ if (u_len + 1 + u_ch >= length)
+ return 0;
len = nls->char2uni(&uni->u_name[i], uni->u_len - i, &uni_char);
if (!len)
continue;
@@ -320,6 +330,7 @@ try_again:
if (uni_char > max_val) {
max_val = 0xffffU;
ocu[0] = (uint8_t)0x10U;
+ u_ch = 2;
goto try_again;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Benjamin Tissoires <benjamin....@redhat.com>

commit 6544a1df11c48c8413071aac3316792e4678fbfb upstream.

When using a protocol v2 or v3 hardware, elantech uses the function
elantech_report_semi_mt_data() to report data. This devices are rather
creepy because if num_finger is 3, (x2,y2) is (0,0). Yes, only one valid
touch is reported.

Anyway, userspace (libinput) is now confused by these (0,0) touches,
and detect them as palm, and rejects them.

Commit 3c0213d17a09 ("Input: elantech - fix semi-mt protocol for v3 HW")
was sufficient enough for xf86-input-synaptics and libinput before it has
palm rejection. Now we need to actually tell libinput that this device is
a semi-mt one and it should not rely on the actual values of the 2 touches.

Signed-off-by: Benjamin Tissoires <benjamin....@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry....@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/input/mouse/elantech.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/input/mouse/elantech.c
+++ b/drivers/input/mouse/elantech.c
@@ -1222,7 +1222,7 @@ static int elantech_set_input_params(str
input_set_abs_params(dev, ABS_TOOL_WIDTH, ETP_WMIN_V2,
ETP_WMAX_V2, 0, 0);
}
- input_mt_init_slots(dev, 2, 0);
+ input_mt_init_slots(dev, 2, INPUT_MT_SEMI_MT);
input_set_abs_params(dev, ABS_MT_POSITION_X, x_min, x_max, 0, 0);
input_set_abs_params(dev, ABS_MT_POSITION_Y, y_min, y_max, 0, 0);
break;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>

commit 3caeaa562733c4836e61086ec07666635006a787 upstream.

While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.

Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc3576
("perf session: Remove wrappers to machines__find").

This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.

This patch is generated from acme perf/core branch.

Below I've mention an example that demonstrate the behaviour before and
after applying patch.

Before applying patch:
[Note: One needs to run guest before recording data in host]

ravi@ravi-bangoria:~$ ./perf kvm record -a
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
[ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]

ravi@ravi-bangoria:~$ ./perf kvm report --stdio
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 285 of event 'cycles'
# Event count (approx.): 88715406
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#

# (For a higher level overview, try: perf report --sort comm,dso)
#

After applying patch:

ravi@ravi-bangoria:~$ ./perf kvm record -a
[ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]

ravi@ravi-bangoria:~$ ./perf kvm report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 17 of event 'cycles'
# Event count (approx.): 700746
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
34.19% :5758 [unknown] [g] 0xffffffff818682ab
22.79% :5758 [unknown] [g] 0xffffffff812dc7f8
22.79% :5758 [unknown] [g] 0xffffffff818650d0
14.83% :5758 [unknown] [g] 0xffffffff8161a1b6
2.49% :5758 [unknown] [g] 0xffffffff818692bf
0.48% :5758 [unknown] [g] 0xffffffff81869253
0.05% :5758 [unknown] [g] 0xffffffff81869250

Signed-off-by: Ravi Bangoria <ravi.b...@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen...@linux.vnet.ibm.com>
Fixes: 54245fdc3576 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
tools/perf/util/session.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -972,7 +972,7 @@ static struct machine *machines__find_fo

machine = machines__find(machines, pid);
if (!machine)
- machine = machines__find(machines, DEFAULT_GUEST_KERNEL_ID);
+ machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID);
return machine;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Walleij <linus....@linaro.org>

commit e972c37459c813190461dabfeaac228e00aae259 upstream.

Since the dawn of time the ICST code has only supported divide
by one or hang in an eternal loop. Luckily we were always dividing
by one because the reference frequency for the systems using
the ICSTs is 24MHz and the [min,max] values for the PLL input
if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop
will always terminate immediately without assigning any divisor
for the reference frequency.

But for the code to make sense, let's insert the missing i++

Reported-by: David Binderman <dcb...@hotmail.com>
Signed-off-by: Linus Walleij <linus....@linaro.org>
Signed-off-by: Russell King <rmk+k...@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/common/icst.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/arm/common/icst.c
+++ b/arch/arm/common/icst.c
@@ -58,6 +58,7 @@ icst_hz_to_vco(const struct icst_params

if (f > p->vco_min && f <= p->vco_max)
break;
+ i++;
} while (i < 8);

if (i >= 8)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marek Szyprowski <m.szyp...@samsung.com>

commit 722ec35f7faefcc34d12616eca7976a848870f9d upstream.

This patch ensures that devices, which got registered before arch_initcall
will be handled correctly by IOMMU-based DMA-mapping code.

Fixes: 13b8629f6511 ("arm64: Add IOMMU dma_ops")
Acked-by: Robin Murphy <robin....@arm.com>
Signed-off-by: Marek Szyprowski <m.szyp...@samsung.com>
Signed-off-by: Will Deacon <will....@arm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm64/mm/dma-mapping.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -933,6 +933,10 @@ static int __init __iommu_dma_init(void)
ret = register_iommu_dma_ops_notifier(&platform_bus_type);
if (!ret)
ret = register_iommu_dma_ops_notifier(&amba_bustype);
+
+ /* handle devices queued before this arch_initcall */
+ if (!ret)
+ __iommu_attach_notifier(NULL, BUS_NOTIFY_ADD_DEVICE, NULL);
return ret;
}
arch_initcall(__iommu_dma_init);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit eeaf9646aca89d097861caa24d9818434e48810e upstream.

We don't want to write to .text section. Let's move l2dis_3630
to .data and access it via a pointer.

For calculating the offset, let's optimize out the add and do it
in ldr/str as suggested by Nicolas Pitre <nicola...@linaro.org>.

Cc: Kees Cook <kees...@chromium.org>
Cc: Laura Abbott <lab...@redhat.com>
Cc: Nishanth Menon <n...@ti.com>
Cc: Richard Woodruff <r-woo...@ti.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Tero Kristo <t-kr...@ti.com>
Acked-by: Nicolas Pitre <ni...@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/mach-omap2/sleep34xx.S | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -86,8 +86,9 @@ ENTRY(enable_omap3630_toggle_l2_on_resto
stmfd sp!, {lr} @ save registers on stack
/* Setup so that we will disable and enable l2 */
mov r1, #0x1
- adrl r2, l2dis_3630 @ may be too distant for plain adr
- str r1, [r2]
+ adrl r3, l2dis_3630_offset @ may be too distant for plain adr
+ ldr r2, [r3] @ value for offset
+ str r1, [r2, r3] @ write to l2dis_3630
ldmfd sp!, {pc} @ restore regs and return
ENDPROC(enable_omap3630_toggle_l2_on_restore)

@@ -415,7 +416,9 @@ ENTRY(omap3_restore)
cmp r2, #0x0 @ Check if target power state was OFF or RET
bne logic_l1_restore

- ldr r0, l2dis_3630
+ adr r1, l2dis_3630_offset @ address for offset
+ ldr r0, [r1] @ value for offset
+ ldr r0, [r1, r0] @ value at l2dis_3630
cmp r0, #0x1 @ should we disable L2 on 3630?
bne skipl2dis
mrc p15, 0, r0, c1, c0, 1
@@ -486,7 +489,9 @@ l2_inv_gp:
mov r12, #0x2
smc #0 @ Call SMI monitor (smieq)
logic_l1_restore:
- ldr r1, l2dis_3630
+ adr r0, l2dis_3630_offset @ adress for offset
+ ldr r1, [r0] @ value for offset
+ ldr r1, [r0, r1] @ value at l2dis_3630
cmp r1, #0x1 @ Test if L2 re-enable needed on 3630
bne skipl2reen
mrc p15, 0, r1, c1, c0, 1
@@ -515,6 +520,10 @@ control_stat:
.word CONTROL_STAT
control_mem_rta:
.word CONTROL_MEM_RTA_CTRL
+l2dis_3630_offset:
+ .long l2dis_3630 - .
+
+ .data
l2dis_3630:
.word 0

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit af756bbccff85504ce05c63a50f80b9d7823c500 upstream.

The palmas PMIC has two control lines that need to be muxed properly
for things to work. The sys_nirq pin is used for interrupts, and msecure
pin is used for enabling writes to some PMIC registers.

Without these pins configured properly things can fail in mysterious
ways. For example, we can't update the RTC registers on palmas PMIC
unless the msecure pin is configured. And this is probably the reason
why we had RTC missing from the omap5 dts file.

According to "OMAP5430 ES2.0 Data Manual [Public] VErsion A (Rev. F)"
swps052f.pdf, mux mode 1 is for sys_drm_msecure so in theory there's
should be no need to configure it as a GPIO pin.

However, it seems there are some reliability issues using the msecure
mux mode. And the TI trees configure the msecure pin as GPIO out high
instead.

As the PMIC only cares that the msecure line is high to allow access
to the RTC registers, let's use a GPIO hog as suggested by Nishanth
Menon <n...@ti.com>. Also the use of the internal pull was considered
but supposedly that may not be capable of keeping the line high in
a noisy environment.

If we ever see high security omap5 products in the mainline tree,
those need to skip the msecure pin muxing and ignore setting the GPIO
hog. Chances are the related pin mux registers are locked in that case
and the msecure pin is managed by whatever software may be running in
the ARM TrustZone.

Who knows what the original intention of the msecure pin was. Maybe
it was supposed to prevent the system time to be set back for some
game demo modes to time out? Anyways, it seems that later PMICs like
tps659037 have recycled this pin for "powerhold" and devices like
beagle-x15 do not need changes to the msecure pin configuration.

To avoid further confusion with TWL variant PMICs, beagle-x15 does
not have a back-up battery for RTC palmas. Instead the mcp79410 RTC
is used with rtc-ds1307 driver. There is a "powerhold" jumper j5
holes near the palmas PMIC, and shorting it seems to power up
beagle-x15 automatically. It is unknown if it also has other side
effects to the beagle-x15 power up sequence.

Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/omap5-board-common.dtsi | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

--- a/arch/arm/boot/dts/omap5-board-common.dtsi
+++ b/arch/arm/boot/dts/omap5-board-common.dtsi
@@ -130,6 +130,16 @@
};
};

+&gpio8 {
+ /* TI trees use GPIO instead of msecure, see also muxing */
+ p234 {
+ gpio-hog;
+ gpios = <10 GPIO_ACTIVE_HIGH>;
+ output-high;
+ line-name = "gpio8_234/msecure";
+ };
+};
+
&omap5_pmx_core {
pinctrl-names = "default";
pinctrl-0 = <
@@ -213,6 +223,13 @@
>;
};

+ /* TI trees use GPIO mode; msecure mode does not work reliably? */
+ palmas_msecure_pins: palmas_msecure_pins {
+ pinctrl-single,pins = <
+ OMAP5_IOPAD(0x180, PIN_OUTPUT | MUX_MODE6) /* gpio8_234 */
+ >;
+ };
+
usbhost_pins: pinmux_usbhost_pins {
pinctrl-single,pins = <
0x84 (PIN_INPUT | MUX_MODE0) /* usbb2_hsic_strobe */
@@ -278,6 +295,12 @@
&usbhost_wkup_pins
>;

+ palmas_sys_nirq_pins: pinmux_palmas_sys_nirq_pins {
+ pinctrl-single,pins = <
+ OMAP5_IOPAD(0x068, PIN_INPUT_PULLUP | MUX_MODE0) /* sys_nirq1 */
+ >;
+ };
+
usbhost_wkup_pins: pinmux_usbhost_wkup_pins {
pinctrl-single,pins = <
0x1A (PIN_OUTPUT | MUX_MODE0) /* fref_clk1_out, USB hub clk */
@@ -345,6 +368,8 @@
interrupt-controller;
#interrupt-cells = <2>;
ti,system-power-controller;
+ pinctrl-names = "default";
+ pinctrl-0 = <&palmas_sys_nirq_pins &palmas_msecure_pins>;

extcon_usb3: palmas_usb {
compatible = "ti,palmas-usb-vid";

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:07 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Walleij <linus....@linaro.org>

commit 5070fb14a0154f075c8b418e5bc58a620ae85a45 upstream.

When trying to set the ICST 307 clock to 25174000 Hz I ran into
this arithmetic error: the icst_hz_to_vco() correctly figure out
DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of
25174000 Hz out of the VCO. (I replicated the icst_hz() function
in a spreadsheet to verify this.)

However, when I called icst_hz() on these VCO settings it would
instead return 4122709 Hz. This causes an error in the common
clock driver for ICST as the common clock framework will call
.round_rate() on the clock which will utilize icst_hz_to_vco()
followed by icst_hz() suggesting the erroneous frequency, and
then the clock gets set to this.

The error did not manifest in the old clock framework since
this high frequency was only used by the CLCD, which calls
clk_set_rate() without first calling clk_round_rate() and since
the old clock framework would not call clk_round_rate() before
setting the frequency, the correct values propagated into
the VCO.

After some experimenting I figured out that it was due to a simple
arithmetic overflow: the divisor for 24Mhz reference frequency
as reference becomes 24000000*2*(99+8)=0x132212400 and the "1"
in bit 32 overflows and is lost.

But introducing an explicit 64-by-32 bit do_div() and casting
the divisor into (u64) we get the right frequency back, and the
right frequency gets set.

Tested on the ARM Versatile.

Cc: linu...@vger.kernel.org
Cc: Pawel Moll <pawel...@arm.com>
Signed-off-by: Linus Walleij <linus....@linaro.org>
Signed-off-by: Russell King <rmk+k...@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/common/icst.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/arch/arm/common/icst.c
+++ b/arch/arm/common/icst.c
@@ -16,7 +16,7 @@
*/
#include <linux/module.h>
#include <linux/kernel.h>
-
+#include <asm/div64.h>
#include <asm/hardware/icst.h>

/*
@@ -29,7 +29,11 @@ EXPORT_SYMBOL(icst525_s2div);

unsigned long icst_hz(const struct icst_params *p, struct icst_vco vco)
{
- return p->ref * 2 * (vco.v + 8) / ((vco.r + 2) * p->s2div[vco.s]);
+ u64 dividend = p->ref * 2 * (u64)(vco.v + 8);
+ u32 divisor = (vco.r + 2) * p->s2div[vco.s];
+
+ do_div(dividend, divisor);
+ return (unsigned long)dividend;
}

EXPORT_SYMBOL(icst_hz);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:09 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jamie Bainbridge <jamie.ba...@gmail.com>

commit ec7147a99e33a9e4abad6fc6e1b40d15df045d53 upstream.

Under some conditions, CIFS can repeatedly call the cifs_dbg() logging
wrapper. If done rapidly enough, the console framebuffer can softlockup
or "rcu_sched self-detected stall". Apply the built-in log ratelimiters
to prevent such hangs.

Signed-off-by: Jamie Bainbridge <jamie.ba...@gmail.com>
Signed-off-by: Steve French <smfr...@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/cifs/cifs_debug.c | 2 +-
fs/cifs/cifs_debug.h | 9 ++++-----
2 files changed, 5 insertions(+), 6 deletions(-)

--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -50,7 +50,7 @@ void cifs_vfs_err(const char *fmt, ...)
vaf.fmt = fmt;
vaf.va = &args;

- pr_err("CIFS VFS: %pV", &vaf);
+ pr_err_ratelimited("CIFS VFS: %pV", &vaf);

va_end(args);
}
--- a/fs/cifs/cifs_debug.h
+++ b/fs/cifs/cifs_debug.h
@@ -51,14 +51,13 @@ __printf(1, 2) void cifs_vfs_err(const c
/* information message: e.g., configuration, major event */
#define cifs_dbg(type, fmt, ...) \
do { \
- if (type == FYI) { \
- if (cifsFYI & CIFS_INFO) { \
- pr_debug("%s: " fmt, __FILE__, ##__VA_ARGS__); \
- } \
+ if (type == FYI && cifsFYI & CIFS_INFO) { \
+ pr_debug_ratelimited("%s: " \
+ fmt, __FILE__, ##__VA_ARGS__); \
} else if (type == VFS) { \
cifs_vfs_err(fmt, ##__VA_ARGS__); \
} else if (type == NOISY && type != 0) { \
- pr_debug(fmt, ##__VA_ARGS__); \
+ pr_debug_ratelimited(fmt, ##__VA_ARGS__); \
} \
} while (0)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Gabbasov <andrew_...@mentor.com>

commit ad402b265ecf6fa22d04043b41444cdfcdf4f52d upstream.

udf_CS0toUTF8 function stops the conversion when the output buffer
length reaches UDF_NAME_LEN-2, which is correct maximum name length,
but, when checking, it leaves the space for a single byte only,
while multi-bytes output characters can take more space, causing
buffer overflow.

Similar error exists in udf_CS0toNLS function, that restricts
the output length to UDF_NAME_LEN, while actual maximum allowed
length is UDF_NAME_LEN-2.

In these cases the output can override not only the current buffer
length field, causing corruption of the name buffer itself, but also
following allocation structures, causing kernel crash.

Adjust the output length checks in both functions to prevent buffer
overruns in case of multi-bytes UTF8 or NLS characters.

Signed-off-by: Andrew Gabbasov <andrew_...@mentor.com>
Signed-off-by: Jan Kara <ja...@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/udf/unicode.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/udf/unicode.c
+++ b/fs/udf/unicode.c
@@ -128,11 +128,15 @@ int udf_CS0toUTF8(struct ustr *utf_o, co
if (c < 0x80U)
utf_o->u_name[utf_o->u_len++] = (uint8_t)c;
else if (c < 0x800U) {
+ if (utf_o->u_len > (UDF_NAME_LEN - 4))
+ break;
utf_o->u_name[utf_o->u_len++] =
(uint8_t)(0xc0 | (c >> 6));
utf_o->u_name[utf_o->u_len++] =
(uint8_t)(0x80 | (c & 0x3f));
} else {
+ if (utf_o->u_len > (UDF_NAME_LEN - 5))
+ break;
utf_o->u_name[utf_o->u_len++] =
(uint8_t)(0xe0 | (c >> 12));
utf_o->u_name[utf_o->u_len++] =
@@ -277,7 +281,7 @@ static int udf_CS0toNLS(struct nls_table
c = (c << 8) | ocu[i++];

len = nls->uni2char(c, &utf_o->u_name[utf_o->u_len],
- UDF_NAME_LEN - utf_o->u_len);
+ UDF_NAME_LEN - 2 - utf_o->u_len);
/* Valid character? */
if (len >= 0)
utf_o->u_len += len;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit 4da597d16602d14405b71a18d45e1c59f28f0fd2 upstream.

We don't want to write to .text so let's move ppa_zero_params and
ppa_por_params to .data and access them via pointers.

Note that I have not been able to test as we I don't have a HS
omap4 to test with. The code has been changed in similar way as
for omap3 though.

Cc: Kees Cook <kees...@chromium.org>
Cc: Laura Abbott <lab...@redhat.com>
Cc: Nishanth Menon <n...@ti.com>
Cc: Richard Woodruff <r-woo...@ti.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Tero Kristo <t-kr...@ti.com>
Acked-by: Nicolas Pitre <ni...@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/mach-omap2/sleep44xx.S | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)

--- a/arch/arm/mach-omap2/sleep44xx.S
+++ b/arch/arm/mach-omap2/sleep44xx.S
@@ -29,12 +29,6 @@
dsb
.endm

-ppa_zero_params:
- .word 0x0
-
-ppa_por_params:
- .word 1, 0
-
#ifdef CONFIG_ARCH_OMAP4

/*
@@ -266,7 +260,9 @@ ENTRY(omap4_cpu_resume)
beq skip_ns_smp_enable
ppa_actrl_retry:
mov r0, #OMAP4_PPA_CPU_ACTRL_SMP_INDEX
- adr r3, ppa_zero_params @ Pointer to parameters
+ adr r1, ppa_zero_params_offset
+ ldr r3, [r1]
+ add r3, r3, r1 @ Pointer to ppa_zero_params
mov r1, #0x0 @ Process ID
mov r2, #0x4 @ Flag
mov r6, #0xff
@@ -303,7 +299,9 @@ skip_ns_smp_enable:
ldr r0, =OMAP4_PPA_L2_POR_INDEX
ldr r1, =OMAP44XX_SAR_RAM_BASE
ldr r4, [r1, #L2X0_PREFETCH_CTRL_OFFSET]
- adr r3, ppa_por_params
+ adr r1, ppa_por_params_offset
+ ldr r3, [r1]
+ add r3, r3, r1 @ Pointer to ppa_por_params
str r4, [r3, #0x04]
mov r1, #0x0 @ Process ID
mov r2, #0x4 @ Flag
@@ -328,6 +326,8 @@ skip_l2en:
#endif

b cpu_resume @ Jump to generic resume
+ppa_por_params_offset:
+ .long ppa_por_params - .
ENDPROC(omap4_cpu_resume)
#endif /* CONFIG_ARCH_OMAP4 */

@@ -380,4 +380,13 @@ ENTRY(omap_do_wfi)
nop

ldmfd sp!, {pc}
+ppa_zero_params_offset:
+ .long ppa_zero_params - .
ENDPROC(omap_do_wfi)
+
+ .data
+ppa_zero_params:
+ .word 0
+
+ppa_por_params:
+ .word 1, 0

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:10 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Petazzoni <thomas.p...@free-electrons.com>

commit 079ae0c121fd23287f4ad2be9e9f8a13f63cae73 upstream.

The Armada 388 GP Device Tree file describes two times a regulator
named 'reg_usb2_1_vbus', with the exact same description. This has
been wrong since Armada 388 GP support was introduced.

Fixes: 928413bd859c0 ("ARM: mvebu: Add Armada 388 General Purpose Development Board support")
Signed-off-by: Thomas Petazzoni <thomas.p...@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory...@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/armada-388-gp.dts | 10 ----------
1 file changed, 10 deletions(-)

--- a/arch/arm/boot/dts/armada-388-gp.dts
+++ b/arch/arm/boot/dts/armada-388-gp.dts
@@ -303,16 +303,6 @@
gpio = <&expander0 4 GPIO_ACTIVE_HIGH>;
};

- reg_usb2_1_vbus: v5-vbus1 {
- compatible = "regulator-fixed";
- regulator-name = "v5.0-vbus1";
- regulator-min-microvolt = <5000000>;
- regulator-max-microvolt = <5000000>;
- enable-active-high;
- regulator-always-on;
- gpio = <&expander0 4 GPIO_ACTIVE_HIGH>;
- };
-
reg_sata0: pwr-sata0 {
compatible = "regulator-fixed";
regulator-name = "pwr_en_sata0";

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit 0a0b13275558c32bbf6241464a7244b1ffd5afb3 upstream.

We don't want to write to .text, so let's move l2_inv_api_params
to .data and access it via a pointer.

Cc: Kees Cook <kees...@chromium.org>
Cc: Laura Abbott <lab...@redhat.com>
Cc: Nishanth Menon <n...@ti.com>
Cc: Richard Woodruff <r-woo...@ti.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Tero Kristo <t-kr...@ti.com>
Acked-by: Nicolas Pitre <ni...@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/mach-omap2/sleep34xx.S | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -427,12 +427,14 @@ skipl2dis:
and r1, #0x700
cmp r1, #0x300
beq l2_inv_gp
+ adr r0, l2_inv_api_params_offset
+ ldr r3, [r0]
+ add r3, r3, r0 @ r3 points to dummy parameters
mov r0, #40 @ set service ID for PPA
mov r12, r0 @ copy secure Service ID in r12
mov r1, #0 @ set task id for ROM code in r1
mov r2, #4 @ set some flags in r2, r6
mov r6, #0xff
- adr r3, l2_inv_api_params @ r3 points to dummy parameters
dsb @ data write barrier
dmb @ data memory barrier
smc #1 @ call SMI monitor (smi #1)
@@ -466,8 +468,8 @@ skipl2dis:
b logic_l1_restore

.align
-l2_inv_api_params:
- .word 0x1, 0x00
+l2_inv_api_params_offset:
+ .long l2_inv_api_params - .
l2_inv_gp:
/* Execute smi to invalidate L2 cache */
mov r12, #0x1 @ set up to invalidate L2
@@ -516,6 +518,10 @@ control_mem_rta:
l2dis_3630:
.word 0

+ .data
+l2_inv_api_params:
+ .word 0x1, 0x00
+
/*
* Internal functions
*/

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Naoya Horiguchi <n-hor...@ah.jp.nec.com>

commit d96b339f453997f2f08c52da3f41423be48c978f upstream.

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
calls migratepages command repeatedly (doing ping-pong among different
NUMA nodes) for the first process:

Soft offlining page 0x60000 at 0x700000600000
__get_any_page: 0x60000 free buddy page
page:ffffea0001800000 count:0 mapcount:-127 mapping: (null) index:0x1
flags: 0x1fffc0000000000()
page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
------------[ cut here ]------------
kernel BUG at /src/linux-dev/include/linux/mm.h:342!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G O 4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
RIP: 0010:[<ffffffff8118998c>] [<ffffffff8118998c>] put_page+0x5c/0x60
RSP: 0018:ffff88007c213e00 EFLAGS: 00010246
Call Trace:
put_hwpoison_page+0x4e/0x80
soft_offline_page+0x501/0x520
SyS_madvise+0x6bc/0x6f0
entry_SYSCALL_64_fastpath+0x12/0x6a
Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
RIP [<ffffffff8118998c>] put_page+0x5c/0x60
RSP <ffff88007c213e00>

The root cause resides in get_any_page() which retries to get a refcount
of the page to be soft-offlined. This function calls
put_hwpoison_page(), expecting that the target page is putback to LRU
list. But it can be also freed to buddy. So the second check need to
care about such case.

Fixes: af8fae7c0886 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <n-hor...@ah.jp.nec.com>
Cc: Sasha Levin <sasha...@oracle.com>
Cc: Aneesh Kumar K.V <aneesh...@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Jerome Marchand <jmar...@redhat.com>
Cc: Andrea Arcangeli <aarc...@redhat.com>
Cc: Hugh Dickins <hu...@google.com>
Cc: Dave Hansen <dave....@intel.com>
Cc: Mel Gorman <mgo...@suse.de>
Cc: Rik van Riel <ri...@redhat.com>
Cc: Steve Capper <steve....@linaro.org>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Michal Hocko <mho...@suse.cz>
Cc: Christoph Lameter <c...@linux.com>
Cc: David Rientjes <rien...@google.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torv...@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1572,7 +1572,7 @@ static int get_any_page(struct page *pag
* Did it turn free?
*/
ret = __get_any_page(page, pfn, 0);
- if (!PageLRU(page)) {
+ if (ret == 1 && !PageLRU(page)) {
/* Drop page reference which is from __get_any_page() */
put_hwpoison_page(page);
pr_info("soft_offline: %#lx: unknown non LRU page type %lx\n",

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:12 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yong Li <sdli...@gmail.com>

commit 97a249e98a72d6b79fb7350a8dd56b147e9d5bdb upstream.

Without this change, the name entity for mcp4725 is missing in
/sys/bus/iio/devices/iio\:device*/name

With this change, name is reported correctly

Signed-off-by: Yong Li <sdli...@gmail.com>
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iio/dac/mcp4725.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/iio/dac/mcp4725.c
+++ b/drivers/iio/dac/mcp4725.c
@@ -300,6 +300,7 @@ static int mcp4725_probe(struct i2c_clie
data->client = client;

indio_dev->dev.parent = &client->dev;
+ indio_dev->name = id->name;
indio_dev->info = &mcp4725_info;
indio_dev->channels = &mcp4725_channel;
indio_dev->num_channels = 1;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:12 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicolas Ferre <nicola...@atmel.com>

commit e873cc022ce5e2c04bbc53b5874494b657e29d3f upstream.

For phy0 KSZ8081, the type of GPIO IRQ should be "level low" instead of
"edge falling".

Signed-off-by: Nicolas Ferre <nicola...@atmel.com>
Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/at91-sama5d4_xplained.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/boot/dts/at91-sama5d4_xplained.dts
+++ b/arch/arm/boot/dts/at91-sama5d4_xplained.dts
@@ -91,7 +91,7 @@

phy0: ethernet-phy@1 {
interrupt-parent = <&pioE>;
- interrupts = <1 IRQ_TYPE_EDGE_FALLING>;
+ interrupts = <1 IRQ_TYPE_LEVEL_LOW>;
reg = <1>;
};
};

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:13 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: H. Nikolaus Schaller <h...@goldelico.com>

commit c08659d431b40ad5beb97d7dde49ad9796cb812c upstream.

tested on OMP5432 EVM

Signed-off-by: H. Nikolaus Schaller <h...@goldelico.com>
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/omap5-board-common.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/arch/arm/boot/dts/omap5-board-common.dtsi
+++ b/arch/arm/boot/dts/omap5-board-common.dtsi
@@ -383,6 +383,14 @@
#clock-cells = <0>;
};

+ rtc {
+ compatible = "ti,palmas-rtc";
+ interrupt-parent = <&palmas>;
+ interrupts = <8 IRQ_TYPE_NONE>;
+ ti,backup-battery-chargeable;
+ ti,backup-battery-charge-high-current;
+ };
+
palmas_pmic {
compatible = "ti,palmas-pmic";
interrupt-parent = <&palmas>;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:13 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Chinner <dchi...@redhat.com>

commit 85bec5460ad8e05e0a8d70fb0f6750eb719ad092 upstream.

Recently I've been seeing xfs/051 fail on 1k block size filesystems.
Trying to trace the events during the test lead to the problem going
away, indicating that it was a race condition that lead to this
ASSERT failure:

XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 156
.....
[<ffffffff814e1257>] xfs_free_perag+0x87/0xb0
[<ffffffff814e21b9>] xfs_mountfs+0x4d9/0x900
[<ffffffff814e5dff>] xfs_fs_fill_super+0x3bf/0x4d0
[<ffffffff811d8800>] mount_bdev+0x180/0x1b0
[<ffffffff814e3ff5>] xfs_fs_mount+0x15/0x20
[<ffffffff811d90a8>] mount_fs+0x38/0x170
[<ffffffff811f4347>] vfs_kern_mount+0x67/0x120
[<ffffffff811f7018>] do_mount+0x218/0xd60
[<ffffffff811f7e5b>] SyS_mount+0x8b/0xd0

When I finally caught it with tracing enabled, I saw that AG 2 had
an elevated reference count and a buffer was responsible for it. I
tracked down the specific buffer, and found that it was missing the
final reference count release that would put it back on the LRU and
hence be found by xfs_wait_buftarg() calls in the log mount failure
handling.

The last four traces for the buffer before the assert were (trimmed
for relevance)

kworker/0:1-5259 xfs_buf_iodone: hold 2 lock 0 flags ASYNC
kworker/0:1-5259 xfs_buf_ioerror: hold 2 lock 0 error -5
mount-7163 xfs_buf_lock_done: hold 2 lock 0 flags ASYNC
mount-7163 xfs_buf_unlock: hold 2 lock 1 flags ASYNC

This is an async write that is completing, so there's nobody waiting
for it directly. Hence we call xfs_buf_relse() once all the
processing is complete. That does:

static inline void xfs_buf_relse(xfs_buf_t *bp)
{
xfs_buf_unlock(bp);
xfs_buf_rele(bp);
}

Now, it's clear that mount is waiting on the buffer lock, and that
it has been released by xfs_buf_relse() and gained by mount. This is
expected, because at this point the mount process is in
xfs_buf_delwri_submit() waiting for all the IO it submitted to
complete.

The mount process, however, is waiting on the lock for the buffer
because it is in xfs_buf_delwri_submit(). This waits for IO
completion, but it doesn't wait for the buffer reference owned by
the IO to go away. The mount process collects all the completions,
fails the log recovery, and the higher level code then calls
xfs_wait_buftarg() to free all the remaining buffers in the
filesystem.

The issue is that on unlocking the buffer, the scheduler has decided
that the mount process has higher priority than the the kworker
thread that is running the IO completion, and so immediately
switched contexts to the mount process from the semaphore unlock
code, hence preventing the kworker thread from finishing the IO
completion and releasing the IO reference to the buffer.

Hence by the time that xfs_wait_buftarg() is run, the buffer still
has an active reference and so isn't on the LRU list that the
function walks to free the remaining buffers. Hence we miss that
buffer and continue onwards to tear down the mount structures,
at which time we get find a stray reference count on the perag
structure. On a non-debug kernel, this will be ignored and the
structure torn down and freed. Hence when the kworker thread is then
rescheduled and the buffer released and freed, it will access a
freed perag structure.

The problem here is that when the log mount fails, we still need to
quiesce the log to ensure that the IO workqueues have returned to
idle before we run xfs_wait_buftarg(). By synchronising the
workqueues, we ensure that all IO completions are fully processed,
not just to the point where buffers have been unlocked. This ensures
we don't end up in the situation above.

Signed-off-by: Dave Chinner <dchi...@redhat.com>
Reviewed-by: Brian Foster <bfo...@redhat.com>
Signed-off-by: Dave Chinner <da...@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/xfs/xfs_buf.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1527,6 +1527,16 @@ xfs_wait_buftarg(
LIST_HEAD(dispose);
int loop = 0;

+ /*
+ * We need to flush the buffer workqueue to ensure that all IO
+ * completion processing is 100% done. Just waiting on buffer locks is
+ * not sufficient for async IO as the reference count held over IO is
+ * not released until after the buffer lock is dropped. Hence we need to
+ * ensure here that all reference counts have been dropped before we
+ * start walking the LRU list.
+ */
+ drain_workqueue(btp->bt_mount->m_buf_workqueue);
+
/* loop until there is nothing left on the lru list. */
while (list_lru_count(&btp->bt_lru)) {
list_lru_walk(&btp->bt_lru, xfs_buftarg_wait_rele,

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:14 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tg...@linutronix.de>

commit 51cbb5242a41700a3f250ecfb48dcfb7e4375ea4 upstream.

As Helge reported for timerfd we have the same issue in itimers. We return
remaining time larger than the programmed relative time to user space in case
of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra time
added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Helge Deller <del...@gmx.de>
Cc: John Stultz <john....@linaro.org>
Cc: linux...@lists.linux-m68k.org
Cc: dhow...@redhat.com
Link: http://lkml.kernel.org/r/201601141641...@linutronix.de
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
kernel/time/itimer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/time/itimer.c
+++ b/kernel/time/itimer.c
@@ -26,7 +26,7 @@
*/
static struct timeval itimer_get_remtime(struct hrtimer *timer)
{
- ktime_t rem = hrtimer_get_remaining(timer);
+ ktime_t rem = __hrtimer_get_remaining(timer, true);

/*
* Racy but safe: if the itimer expires after the above

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tg...@linutronix.de>

commit b62526ed11a1fe3861ab98d40b7fdab8981d788a upstream.

Helge reported that a relative timer can return a remaining time larger than
the programmed relative time on parisc and other architectures which have
CONFIG_TIME_LOW_RES set. This happens because we add a jiffie to the resulting
expiry time to prevent short timeouts.

Use the new function hrtimer_expires_remaining_adjusted() to calculate the
remaining time. It takes that extra added time into account for relative
timers.

Reported-and-tested-by: Helge Deller <del...@gmx.de>
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: John Stultz <john....@linaro.org>
Cc: linux...@lists.linux-m68k.org
Cc: dhow...@redhat.com
Link: http://lkml.kernel.org/r/201601141641...@linutronix.de
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/timerfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -153,7 +153,7 @@ static ktime_t timerfd_get_remaining(str
if (isalarm(ctx))
remaining = alarm_expires_remaining(&ctx->t.alarm);
else
- remaining = hrtimer_expires_remaining(&ctx->t.tmr);
+ remaining = hrtimer_expires_remaining_adjusted(&ctx->t.tmr);

return remaining.tv64 < 0 ? ktime_set(0, 0): remaining;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:16 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Greg Kurz <gk...@linux.vnet.ibm.com>

commit b4d7f161feb3015d6306e1d35b565c888ff70c9d upstream.

The get and set operations got exchanged by mistake when moving the
code from book3s.c to powerpc.c.

Fixes: 3840edc8033ad5b86deee309c1c321ca54257452
Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <pau...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/kvm/powerpc.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -919,21 +919,17 @@ int kvm_vcpu_ioctl_get_one_reg(struct kv
r = -ENXIO;
break;
}
- vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
+ val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
- vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
+ val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
break;
case KVM_REG_PPC_VRSAVE:
- if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
- r = -ENXIO;
- break;
- }
- vcpu->arch.vrsave = set_reg_val(reg->id, val);
+ val = get_reg_val(reg->id, vcpu->arch.vrsave);
break;
#endif /* CONFIG_ALTIVEC */
default:
@@ -974,17 +970,21 @@ int kvm_vcpu_ioctl_set_one_reg(struct kv
r = -ENXIO;
break;
}
- val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
+ vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
- val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
+ vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_VRSAVE:
- val = get_reg_val(reg->id, vcpu->arch.vrsave);
+ if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
+ r = -ENXIO;
+ break;
+ }
+ vcpu->arch.vrsave = set_reg_val(reg->id, val);
break;
#endif /* CONFIG_ALTIVEC */
default:

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:17 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexandre Belloni <alexandr...@free-electrons.com>

commit f505dba762ae826bb68978a85ee5c8ced7dea8d7 upstream.

No interrupt were received from the phy because PIOE 1 may not be properly
muxed. It prevented proper link detection, especially since commit
321beec5047a ("net: phy: Use interrupts when available in NOLINK state")
disables polling.

Signed-off-by: Alexandre Belloni <alexandr...@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicola...@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/at91-sama5d4_xplained.dts | 6 ++++++
1 file changed, 6 insertions(+)

--- a/arch/arm/boot/dts/at91-sama5d4_xplained.dts
+++ b/arch/arm/boot/dts/at91-sama5d4_xplained.dts
@@ -86,6 +86,8 @@
macb0: ethernet@f8020000 {
phy-mode = "rmii";
status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_macb0_rmii &pinctrl_macb0_phy_irq>;

phy0: ethernet-phy@1 {
interrupt-parent = <&pioE>;
@@ -152,6 +154,10 @@
atmel,pins =
<AT91_PIOE 8 AT91_PERIPH_GPIO AT91_PINCTRL_PULL_UP_DEGLITCH>;
};
+ pinctrl_macb0_phy_irq: macb0_phy_irq_0 {
+ atmel,pins =
+ <AT91_PIOE 1 AT91_PERIPH_GPIO AT91_PINCTRL_PULL_UP_DEGLITCH>;
+ };
};
};
};

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:18 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Walleij <linus....@linaro.org>

commit 418d5516568b3fdbc4e7b53677dd78aed8514565 upstream.

The DTSI file for the Nomadik does not properly specify how the
PL180 levelshifter is connected: the Nomadik actually needs all
the five st,sig-dir-* flags set to properly control all lines out.

Further this board supports full power cycling of the card, and
since this variant has no hardware clock gating, it needs a
ridiculously low frequency setting to keep up with the ever
overflowing FIFO.

The pin configuration set-up is a bit of a mystery, because of
course these pins are a mix of inputs and outputs. However the
reference implementation sets all pins to "output" with
unspecified initial value, so let's do that here as well.

Signed-off-by: Linus Walleij <linus....@linaro.org>
Acked-by: Ulf Hansson <ulf.h...@linaro.org>
Signed-off-by: Olof Johansson <ol...@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 37 +++++++++++++++--------------
1 file changed, 20 insertions(+), 17 deletions(-)

--- a/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi
+++ b/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi
@@ -127,22 +127,14 @@
};
mmcsd_default_mode: mmcsd_default {
mmcsd_default_cfg1 {
- /* MCCLK */
- pins = "GPIO8_B10";
- ste,output = <0>;
- };
- mmcsd_default_cfg2 {
- /* MCCMDDIR, MCDAT0DIR, MCDAT31DIR, MCDATDIR2 */
- pins = "GPIO10_C11", "GPIO15_A12",
- "GPIO16_C13", "GPIO23_D15";
- ste,output = <1>;
- };
- mmcsd_default_cfg3 {
- /* MCCMD, MCDAT3-0, MCMSFBCLK */
- pins = "GPIO9_A10", "GPIO11_B11",
- "GPIO12_A11", "GPIO13_C12",
- "GPIO14_B12", "GPIO24_C15";
- ste,input = <1>;
+ /*
+ * MCCLK, MCCMDDIR, MCDAT0DIR, MCDAT31DIR, MCDATDIR2
+ * MCCMD, MCDAT3-0, MCMSFBCLK
+ */
+ pins = "GPIO8_B10", "GPIO9_A10", "GPIO10_C11", "GPIO11_B11",
+ "GPIO12_A11", "GPIO13_C12", "GPIO14_B12", "GPIO15_A12",
+ "GPIO16_C13", "GPIO23_D15", "GPIO24_C15";
+ ste,output = <2>;
};
};
};
@@ -802,10 +794,21 @@
clock-names = "mclk", "apb_pclk";
interrupt-parent = <&vica>;
interrupts = <22>;
- max-frequency = <48000000>;
+ max-frequency = <400000>;
bus-width = <4>;
cap-mmc-highspeed;
cap-sd-highspeed;
+ full-pwr-cycle;
+ /*
+ * The STw4811 circuit used with the Nomadik strictly
+ * requires that all of these signal direction pins be
+ * routed and used for its 4-bit levelshifter.
+ */
+ st,sig-dir-dat0;
+ st,sig-dir-dat2;
+ st,sig-dir-dat31;
+ st,sig-dir-cmd;
+ st,sig-pin-fbclk;
pinctrl-names = "default";
pinctrl-0 = <&mmcsd_default_mux>, <&mmcsd_default_mode>;
vmmc-supply = <&vmmc_regulator>;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:18 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Huth <th...@redhat.com>

commit 760a7364f27d974d100118d88190e574626e18a6 upstream.

In the old DABR register, the BT (Breakpoint Translation) bit
is bit number 61. In the new DAWRX register, the WT (Watchpoint
Translation) bit is bit number 59. So to move the DABR-BT bit
into the position of the DAWRX-WT bit, it has to be shifted by
two, not only by one. This fixes hardware watchpoints in gdb of
older guests that only use the H_SET_DABR/X interface instead
of the new H_SET_MODE interface.

Signed-off-by: Thomas Huth <th...@redhat.com>
Reviewed-by: Laurent Vivier <lvi...@redhat.com>
Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <pau...@samba.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2153,7 +2153,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)

/* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
2: rlwimi r5, r4, 5, DAWRX_DR | DAWRX_DW
- rlwimi r5, r4, 1, DAWRX_WT
+ rlwimi r5, r4, 2, DAWRX_WT
clrrdi r4, r4, 3
std r4, VCPU_DAWR(r3)
std r5, VCPU_DAWRX(r3)

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:19 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tony Lindgren <to...@atomide.com>

commit a5311d4d13df80bd71a9e47f9ecaf327f478fab1 upstream.

We don't want to write to .text and we can move save_secure_ram_context
into .data as it all gets copied into SRAM anyways.

Cc: Kees Cook <kees...@chromium.org>
Cc: Laura Abbott <lab...@redhat.com>
Cc: Nishanth Menon <n...@ti.com>
Cc: Richard Woodruff <r-woo...@ti.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Sergei Shtylyov <sergei....@cogentembedded.com>
Cc: Tero Kristo <t-kr...@ti.com>
Acked-by: Nicolas Pitre <ni...@linaro.org>
Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
non-executable")
Signed-off-by: Tony Lindgren <to...@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/mach-omap2/sleep34xx.S | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -92,8 +92,12 @@ ENTRY(enable_omap3630_toggle_l2_on_resto
ldmfd sp!, {pc} @ restore regs and return
ENDPROC(enable_omap3630_toggle_l2_on_restore)

- .text
-/* Function to call rom code to save secure ram context */
+/*
+ * Function to call rom code to save secure ram context. This gets
+ * relocated to SRAM, so it can be all in .data section. Otherwise
+ * we need to initialize api_params separately.
+ */
+ .data
.align 3
ENTRY(save_secure_ram_context)
stmfd sp!, {r4 - r11, lr} @ save registers on stack
@@ -127,6 +131,8 @@ ENDPROC(save_secure_ram_context)
ENTRY(save_secure_ram_context_sz)
.word . - save_secure_ram_context

+ .text
+
/*
* ======================
* == Idle entry point ==

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:19 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: James Bottomley <James.B...@HansenPartnership.com>

commit 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17 upstream.

The starting node for a klist iteration is often passed in from
somewhere way above the klist infrastructure, meaning there's no
guarantee the node is still on the list. We've seen this in SCSI where
we use bus_find_device() to iterate through a list of devices. In the
face of heavy hotplug activity, the last device returned by
bus_find_device() can be removed before the next call. This leads to

Dec 3 13:22:02 localhost kernel: WARNING: CPU: 2 PID: 28073 at include/linux/kref.h:47 klist_iter_init_node+0x3d/0x50()
Dec 3 13:22:02 localhost kernel: Modules linked in: scsi_debug x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel joydev iTCO_wdt dcdbas ipmi_devintf acpi_power_meter iTCO_vendor_support ipmi_si imsghandler pcspkr wmi acpi_cpufreq tpm_tis tpm shpchp lpc_ich mfd_core nfsd nfs_acl lockd grace sunrpc tg3 ptp pps_core
Dec 3 13:22:02 localhost kernel: CPU: 2 PID: 28073 Comm: cat Not tainted 4.4.0-rc1+ #2
Dec 3 13:22:02 localhost kernel: Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013
Dec 3 13:22:02 localhost kernel: ffffffff81a20e77 ffff880613acfd18 ffffffff81321eef 0000000000000000
Dec 3 13:22:02 localhost kernel: ffff880613acfd50 ffffffff8107ca52 ffff88061176b198 0000000000000000
Dec 3 13:22:02 localhost kernel: ffffffff814542b0 ffff880610cfb100 ffff88061176b198 ffff880613acfd60
Dec 3 13:22:02 localhost kernel: Call Trace:
Dec 3 13:22:02 localhost kernel: [<ffffffff81321eef>] dump_stack+0x44/0x55
Dec 3 13:22:02 localhost kernel: [<ffffffff8107ca52>] warn_slowpath_common+0x82/0xc0
Dec 3 13:22:02 localhost kernel: [<ffffffff814542b0>] ? proc_scsi_show+0x20/0x20
Dec 3 13:22:02 localhost kernel: [<ffffffff8107cb4a>] warn_slowpath_null+0x1a/0x20
Dec 3 13:22:02 localhost kernel: [<ffffffff8167225d>] klist_iter_init_node+0x3d/0x50
Dec 3 13:22:02 localhost kernel: [<ffffffff81421d41>] bus_find_device+0x51/0xb0
Dec 3 13:22:02 localhost kernel: [<ffffffff814545ad>] scsi_seq_next+0x2d/0x40
[...]

And an eventual crash. It can actually occur in any hotplug system
which has a device finder and a starting device.

We can fix this globally by making sure the starting node for
klist_iter_init_node() is actually a member of the list before using it
(and by starting from the beginning if it isn't).

Reported-by: Ewan D. Milne <emi...@redhat.com>
Tested-by: Ewan D. Milne <emi...@redhat.com>
Signed-off-by: James Bottomley <James.B...@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
lib/klist.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/lib/klist.c
+++ b/lib/klist.c
@@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *
struct klist_node *n)
{
i->i_klist = k;
- i->i_cur = n;
- if (n)
- kref_get(&n->n_ref);
+ i->i_cur = NULL;
+ if (n && kref_get_unless_zero(&n->n_ref))
+ i->i_cur = n;
}
EXPORT_SYMBOL_GPL(klist_iter_init_node);

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:40:23 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Anton Protopopov <a.s.pro...@gmail.com>

commit 4b550af519854421dfec9f7732cdddeb057134b2 upstream.

The setup_ntlmv2_rsp() function may return positive value ENOMEM instead
of -ENOMEM in case of kmalloc failure.

Signed-off-by: Anton Protopopov <a.s.pro...@gmail.com>
Signed-off-by: Steve French <smfr...@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/cifs/cifsencrypt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -714,7 +714,7 @@ setup_ntlmv2_rsp(struct cifs_ses *ses, c

ses->auth_key.response = kmalloc(baselen + tilen, GFP_KERNEL);
if (!ses->auth_key.response) {
- rc = ENOMEM;
+ rc = -ENOMEM;
ses->auth_key.len = 0;
goto setup_ntlmv2_rsp_ret;
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:41:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Wenyou Yang <wenyo...@atmel.com>

commit aae6b18f5c95b9dc78de66d1e27e8afeee2763b7 upstream.

On SAMA5D4EK board, the Ethernet doesn't work after resuming from the suspend
state.

Signed-off-by: Wenyou Yang <wenyo...@atmel.com>
[nicola...@atmel.com: adapt to newer kernel]
Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
Signed-off-by: Nicolas Ferre <nicola...@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
arch/arm/boot/dts/at91-sama5d4ek.dts | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/arch/arm/boot/dts/at91-sama5d4ek.dts
+++ b/arch/arm/boot/dts/at91-sama5d4ek.dts
@@ -160,8 +160,15 @@
};

macb0: ethernet@f8020000 {
+ pinctrl-0 = <&pinctrl_macb0_rmii &pinctrl_macb0_phy_irq>;
phy-mode = "rmii";
status = "okay";
+
+ ethernet-phy@1 {
+ reg = <0x1>;
+ interrupt-parent = <&pioE>;
+ interrupts = <1 IRQ_TYPE_LEVEL_LOW>;
+ };
};

mmc1: mmc@fc000000 {
@@ -193,6 +200,10 @@

pinctrl@fc06a000 {
board {
+ pinctrl_macb0_phy_irq: macb0_phy_irq {
+ atmel,pins =
+ <AT91_PIOE 1 AT91_PERIPH_GPIO AT91_PINCTRL_NONE>;
+ };
pinctrl_mmc0_cd: mmc0_cd {
atmel,pins =
<AT91_PIOE 5 AT91_PERIPH_GPIO AT91_PINCTRL_PULL_UP_DEGLITCH>;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:41:11 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andre Przywara <andre.p...@arm.com>

commit b3aff6ccbb1d25e506b60ccd9c559013903f3464 upstream.

Commit 4b4b4512da2a ("arm/arm64: KVM: Rework the arch timer to use
level-triggered semantics") brought the virtual architected timer
closer to the VGIC. There is one occasion were we don't properly
check for the VGIC actually having been initialized before, but
instead go on to check the active state of some IRQ number.
If userland hasn't instantiated a virtual GIC, we end up with a
kernel NULL pointer dereference:
=========
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc9745c5000
[00000000] *pgd=00000009f631e003, *pud=00000009f631e003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#2] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 2144 Comm: kvm_simplest-ar Tainted: G D 4.5.0-rc2+ #1300
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc976da8000 ti: ffffffc976e28000 task.ti: ffffffc976e28000
PC is at vgic_bitmap_get_irq_val+0x78/0x90
LR is at kvm_vgic_map_is_active+0xac/0xc8
pc : [<ffffffc0000b7e28>] lr : [<ffffffc0000b972c>] pstate: 20000145
....
=========

Fix this by bailing out early of kvm_timer_flush_hwstate() if we don't
have a VGIC at all.

Reported-by: Cosmin Gorgovan <cos...@linux-geek.org>
Acked-by: Marc Zyngier <marc.z...@arm.com>
Signed-off-by: Andre Przywara <andre.p...@arm.com>
Signed-off-by: Marc Zyngier <marc.z...@arm.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
virt/kvm/arm/arch_timer.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -143,7 +143,7 @@ static void kvm_timer_update_irq(struct
* Check if there was a change in the timer state (should we raise or lower
* the line level to the GIC).
*/
-static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
+static int kvm_timer_update_state(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;

@@ -154,10 +154,12 @@ static void kvm_timer_update_state(struc
* until we call this function from kvm_timer_flush_hwstate.
*/
if (!vgic_initialized(vcpu->kvm))
- return;
+ return -ENODEV;

if (kvm_timer_should_fire(vcpu) != timer->irq.level)
kvm_timer_update_irq(vcpu, !timer->irq.level);
+
+ return 0;
}

/*
@@ -218,7 +220,8 @@ void kvm_timer_flush_hwstate(struct kvm_
bool phys_active;
int ret;

- kvm_timer_update_state(vcpu);
+ if (kvm_timer_update_state(vcpu))
+ return;

/*
* If we enter the guest with the virtual input level to the VGIC

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Hurley <pe...@hurleysoftware.com>

commit 308bbc9ab838d0ace0298268c7970ba9513e2c65 upstream.

The omap-serial driver emulates RS485 delays using software timers,
but neglects to clamp the input values from the unprivileged
ioctl(TIOCSRS485). Because the software implementation busy-waits,
malicious userspace could stall the cpu for ~49 days.

Clamp the input values to < 100ms.

Fixes: 4a0ac0f55b18 ("OMAP: add RS485 support")
Signed-off-by: Peter Hurley <pe...@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/tty/serial/omap-serial.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -1343,7 +1343,7 @@ static inline void serial_omap_add_conso

/* Enable or disable the rs485 support */
static int
-serial_omap_config_rs485(struct uart_port *port, struct serial_rs485 *rs485conf)
+serial_omap_config_rs485(struct uart_port *port, struct serial_rs485 *rs485)
{
struct uart_omap_port *up = to_uart_omap_port(port);
unsigned int mode;
@@ -1356,8 +1356,12 @@ serial_omap_config_rs485(struct uart_por
up->ier = 0;
serial_out(up, UART_IER, 0);

+ /* Clamp the delays to [0, 100ms] */
+ rs485->delay_rts_before_send = min(rs485->delay_rts_before_send, 100U);
+ rs485->delay_rts_after_send = min(rs485->delay_rts_after_send, 100U);
+
/* store new config */
- port->rs485 = *rs485conf;
+ port->rs485 = *rs485;

/*
* Just as a precaution, only allow rs485

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt <ros...@goodmis.org>

commit 32abc2ede536aae52978d6c0a8944eb1df14f460 upstream.

When a long value is read on 32 bit machines for 64 bit output, the
parsing needs to change "%lu" into "%llu", as the value is read
natively.

Unfortunately, if "%llu" is already there, the code will add another "l"
to it and fail to parse it properly.

Signed-off-by: Steven Rostedt <ros...@goodmis.org>
Acked-by: Namhyung Kim <namh...@kernel.org>
Link: http://lkml.kernel.org/r/20151116172...@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
tools/lib/traceevent/event-parse.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4968,13 +4968,12 @@ static void pretty_print(struct trace_se
sizeof(long) != 8) {
char *p;

- ls = 2;
/* make %l into %ll */
- p = strchr(format, 'l');
- if (p)
+ if (ls == 1 && (p = strchr(format, 'l')))
memmove(p+1, p, strlen(p)+1);
else if (strcmp(format, "%p") == 0)
strcpy(format, "0x%llx");
+ ls = 2;
}
switch (ls) {
case -2:

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jonathan Cameron <ji...@kernel.org>

commit 9d0be85d4e2cfa2519ae16efe7ff4a7150c43c0b upstream.

Whilst this part has a hardware buffer, the identifcation that IIO cares
about is the userspace facing end. It this case we push individual elements
from the hardware fifo into the software interface (specifically a kfifo)
rather than providing direct reads through to a hardware buffer
(as we still do in the sca3000 for example).

Technically the original specification as a hardware buffer could be
considered wrong, but it didn't matter until the patch listed below.

Result is that any attempt to enable the buffer will return -EINVAL

Fixes: 225d59adf1c8 ("iio: Specify supported modes for buffers")
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iio/adc/ti_am335x_adc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iio/adc/ti_am335x_adc.c
+++ b/drivers/iio/adc/ti_am335x_adc.c
@@ -289,7 +289,7 @@ static int tiadc_iio_buffered_hardware_s
goto error_kfifo_free;

indio_dev->setup_ops = setup_ops;
- indio_dev->modes |= INDIO_BUFFER_HARDWARE;
+ indio_dev->modes |= INDIO_BUFFER_SOFTWARE;

return 0;

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gabriele Mazzotta <gabrie...@gmail.com>

commit fa34e6dd44d7c02c8a8468ce4a52a7506f907bef upstream.

As per the ACPI specification (Revision 5.0) [1], the data coming
from the sensor represent the ambient light illuminance reading
expressed in lux. So use IIO_CHAN_INFO_PROCESSED to signify that
the data are pre-processed.

However, to keep backward ABI compatibility, the IIO_CHAN_INFO_RAW
bit is not removed.

[1] http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf

This issue has also been responsible for at least one userspace bug
report hence marking what is a small semantic fix really for stable.
[2] https://github.com/hadess/iio-sensor-proxy/issues/46

Signed-off-by: Gabriele Mazzotta <gabrie...@gmail.com>
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iio/light/acpi-als.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/iio/light/acpi-als.c
+++ b/drivers/iio/light/acpi-als.c
@@ -54,7 +54,9 @@ static const struct iio_chan_spec acpi_a
.realbits = 32,
.storagebits = 32,
},
- .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
+ /* _RAW is here for backward ABI compatibility */
+ .info_mask_separate = BIT(IIO_CHAN_INFO_RAW) |
+ BIT(IIO_CHAN_INFO_PROCESSED),
},
};

@@ -152,7 +154,7 @@ static int acpi_als_read_raw(struct iio_
s32 temp_val;
int ret;

- if (mask != IIO_CHAN_INFO_RAW)
+ if ((mask != IIO_CHAN_INFO_PROCESSED) && (mask != IIO_CHAN_INFO_RAW))
return -EINVAL;

/* we support only illumination (_ALI) so far. */

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Elble <awe...@rit.edu>

commit 361cad3c89070aeb37560860ea8bfc092d545adc upstream.

We've seen this in a packet capture - I've intermixed what I
think was going on. The fix here is to grab the so_lock sooner.

1964379 -> #1 open (for write) reply seqid=1
1964393 -> #2 open (for read) reply seqid=2

__nfs4_close(), state->n_wronly--
nfs4_state_set_mode_locked(), changes state->state = [R]
state->flags is [RW]
state->state is [R], state->n_wronly == 0, state->n_rdonly == 1

1964398 -> #3 open (for write) call -> because close is already running
1964399 -> downgrade (to read) call seqid=2 (close of #1)
1964402 -> #3 open (for write) reply seqid=3

__update_open_stateid()
nfs_set_open_stateid_locked(), changes state->flags
state->flags is [RW]
state->state is [R], state->n_wronly == 0, state->n_rdonly == 1
new sequence number is exposed now via nfs4_stateid_copy()

next step would be update_open_stateflags(), pending so_lock

1964403 -> downgrade reply seqid=2, fails with OLD_STATEID (close of #1)

nfs4_close_prepare() gets so_lock and recalcs flags -> send close

1964405 -> downgrade (to read) call seqid=3 (close of #1 retry)

__update_open_stateid() gets so_lock
* update_open_stateflags() updates state->n_wronly.
nfs4_state_set_mode_locked() updates state->state

state->flags is [RW]
state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1

* should have suppressed the preceding nfs4_close_prepare() from
sending open_downgrade

1964406 -> write call
1964408 -> downgrade (to read) reply seqid=4 (close of #1 retry)

nfs_clear_open_stateid_locked()
state->flags is [R]
state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1

1964409 -> write reply (fails, openmode)

Signed-off-by: Andrew Elble <awe...@rit.edu>
Signed-off-by: Trond Myklebust <trond.m...@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
fs/nfs/nfs4proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1385,6 +1385,7 @@ static void __update_open_stateid(struct
* Protect the call to nfs4_state_set_mode_locked and
* serialise the stateid update
*/
+ spin_lock(&state->owner->so_lock);
write_seqlock(&state->seqlock);
if (deleg_stateid != NULL) {
nfs4_stateid_copy(&state->stateid, deleg_stateid);
@@ -1393,7 +1394,6 @@ static void __update_open_stateid(struct
if (open_stateid != NULL)
nfs_set_open_stateid_locked(state, open_stateid, fmode);
write_sequnlock(&state->seqlock);
- spin_lock(&state->owner->so_lock);
update_open_stateflags(state, fmode);
spin_unlock(&state->owner->so_lock);
}

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Akinobu Mita <akinob...@gmail.com>

commit 431386e783a3a6c8b7707bee32d18c353b8688b2 upstream.

According to the datasheet, the resolusion of temperature sensor is
-5.35 counts/C. Temperature ADC is 472 counts at 25C.
(https://www.sparkfun.com/datasheets/Sensors/Pressure/MPL115A1.pdf
NOTE: This is older revision, but this information is removed from the
latest datasheet from nxp somehow)

Temp [C] = (Tadc - 472) / -5.35 + 25
= (Tadc - 605.750000) * -0.186915888

So the correct offset is -605.750000.

Signed-off-by: Akinobu Mita <akinob...@gmail.com>
Acked-by: Peter Meerwald-Stadler <pme...@pmeerw.net>
Signed-off-by: Jonathan Cameron <ji...@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/iio/pressure/mpl115.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iio/pressure/mpl115.c
+++ b/drivers/iio/pressure/mpl115.c
@@ -117,7 +117,7 @@ static int mpl115_read_raw(struct iio_de
*val = ret >> 6;
return IIO_VAL_INT;
case IIO_CHAN_INFO_OFFSET:
- *val = 605;
+ *val = -605;
*val2 = 750000;
return IIO_VAL_INT_PLUS_MICRO;
case IIO_CHAN_INFO_SCALE:

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:05 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (Red Hat) <ros...@goodmis.org>

commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 upstream.

The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.

This can probuce the following warning:

[ INFO: suspicious RCU usage. ]
4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
-------------------------------
include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/8/0.

stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34
Call Trace:
[c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
[c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
[c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
[c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
[c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
[c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
[c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
[c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
[c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
[c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
[c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
[c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14

This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.

Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.

Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.

Link: http://lkml.kernel.org/r/1455387773-4245-1-...@linux-powerpc.org

Reported-by: Denis Kirjanov <k...@linux-powerpc.org>
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Signed-off-by: Steven Rostedt <ros...@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
include/linux/tracepoint.h | 5 +++++
1 file changed, 5 insertions(+)

--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -14,8 +14,10 @@
* See the file COPYING for more details.
*/

+#include <linux/smp.h>
#include <linux/errno.h>
#include <linux/types.h>
+#include <linux/cpumask.h>
#include <linux/rcupdate.h>
#include <linux/static_key.h>

@@ -146,6 +148,9 @@ extern void syscall_unregfunc(void);
void *it_func; \
void *__data; \
\
+ if (!cpu_online(raw_smp_processor_id())) \
+ return; \
+ \
if (!(cond)) \
return; \
prercu; \

Greg Kroah-Hartman

unread,
Feb 23, 2016, 11:50:06 PM2/23/16
to
4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Christie <mchr...@redhat.com>

commit 9055082fb100cc66e20c048251d05159f5f2cfba upstream.

Another iscsi target that cannot handle large IOs, but does not tell us
a limit.

The Synology iSCSI targets report:

Block limits VPD page (SBC):
Write same no zero (WSNZ): 0
Maximum compare and write length: 0 blocks
Optimal transfer length granularity: 0 blocks
Maximum transfer length: 0 blocks
Optimal transfer length: 0 blocks
Maximum prefetch length: 0 blocks
Maximum unmap LBA count: 0
Maximum unmap block descriptor count: 0
Optimal unmap granularity: 0
Unmap granularity alignment valid: 0
Unmap granularity alignment: 0
Maximum write same length: 0x0 blocks

and the size of the command it can handle seems to depend on how much
memory it can allocate at the time. This results in IO errors when
handling large IOs. This patch just has us use the old 1024 default
sectors for this target by adding it to the scsi blacklist. We do not
have good contacs with this vendors, so I have not been able to try and
fix on their side.

I have posted this a long while back, but it was not merged. This
version just fixes it up for merge/patch failures in the original
version.

Reported-by: Ancoron Luciferis <ancoron....@googlemail.com>
Reported-by: Michael Meyers <ste...@tcnnet.com>
Signed-off-by: Mike Christie <mchr...@redhat.com>
Signed-off-by: Martin K. Petersen <martin....@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
drivers/scsi/scsi_devinfo.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -227,6 +227,7 @@ static struct {
{"Promise", "VTrak E610f", NULL, BLIST_SPARSELUN | BLIST_NO_RSOC},
{"Promise", "", NULL, BLIST_SPARSELUN},
{"QNAP", "iSCSI Storage", NULL, BLIST_MAX_1024},
+ {"SYNOLOGY", "iSCSI Storage", NULL, BLIST_MAX_1024},
{"QUANTUM", "XP34301", "1071", BLIST_NOTQ},
{"REGAL", "CDC-4X", NULL, BLIST_MAX5LUN | BLIST_SINGLELUN},
{"SanDisk", "ImageMate CF-SD1", NULL, BLIST_FORCELUN},
It is loading more messages.
0 new messages