summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-08-02vhost: new device IOTLB APIJason Wang
This patch tries to implement an device IOTLB for vhost. This could be used with userspace(qemu) implementation of DMA remapping to emulate an IOMMU for the guest. The idea is simple, cache the translation in a software device IOTLB (which is implemented as an interval tree) in vhost and use vhost_net file descriptor for reporting IOTLB miss and IOTLB update/invalidation. When vhost meets an IOTLB miss, the fault address, size and access can be read from the file. After userspace finishes the translation, it writes the translated address to the vhost_net file to update the device IOTLB. When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq addresses set by ioctl are treated as iova instead of virtual address and the accessing can only be done through IOTLB instead of direct userspace memory access. Before each round or vq processing, all vq metadata is prefetched in device IOTLB to make sure no translation fault happens during vq processing. In most cases, virtqueues are contiguous even in virtual address space. The IOTLB translation for virtqueue itself may make it a little slower. We might add fast path cache on top of this patch. Signed-off-by: Jason Wang <jasowang@redhat.com> [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ] [mst: fix build warnings ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [ weiyj.lk: missing unlock on error ] Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
2016-08-02vhost: drop vringh dependencyMichael S. Tsirkin
vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: convert pre sorted vhost memory array to interval treeJason Wang
Current pre-sorted memory region array has some limitations for future device IOTLB conversion: 1) need extra work for adding and removing a single region, and it's expected to be slow because of sorting or memory re-allocation. 2) need extra work of removing a large range which may intersect several regions with different size. 3) need trick for a replacement policy like LRU To overcome the above shortcomings, this patch convert it to interval tree which can easily address the above issue with almost no extra work. The patch could be used for: - Extend the current API and only let the userspace to send diffs of memory table. - Simplify Device IOTLB implementation. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: introduce vhost memory accessorsJason Wang
This patch introduces vhost memory accessors which were just wrappers for userspace address access helpers. This is a requirement for vhost device iotlb implementation which will add iotlb translations in those accessors. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Add Makefile and KconfigAsias He
Enable virtio-vsock and vhost-vsock. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Introduce vhost_vsock.koAsias He
VM sockets vhost transport implementation. This driver runs on the host. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Introduce virtio_transport.koAsias He
VM sockets virtio transport implementation. This driver runs in the guest. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: Introduce virtio_vsock_common.koAsias He
This module contains the common code and header files for the following virtio_transporto and vhost_vsock kernel modules. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: defer sock removal to transportsStefan Hajnoczi
The virtio transport will implement graceful shutdown and the related SO_LINGER socket option. This requires orphaning the sock but keeping it in the table of connections after .release(). This patch adds the vsock_remove_sock() function and leaves it up to the transport when to remove the sock. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02VSOCK: transport-specific vsock_transport functionsStefan Hajnoczi
struct vsock_transport contains function pointers called by AF_VSOCK core code. The transport may want its own transport-specific function pointers and they can be added after struct vsock_transport. Allow the transport to fetch vsock_transport. It can downcast it to access transport-specific function pointers. The virtio transport will use this. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vhost: drop vringh dependencyMichael S. Tsirkin
vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02vop: pull in vhost KconfigMichael S. Tsirkin
VOP selects VHOST_RING. Pull in Kconfig that includes it to make it self-containing. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01virtio: new feature to detect IOMMU device quirkMichael S. Tsirkin
The interaction between virtio and IOMMUs is messy. On most systems with virtio, physical addresses match bus addresses, and it doesn't particularly matter which one we use to program the device. On some systems, including Xen and any system with a physical device that speaks virtio behind a physical IOMMU, we must program the IOMMU for virtio DMA to work at all. On other systems, including SPARC and PPC64, virtio-pci devices are enumerated as though they are behind an IOMMU, but the virtio host ignores the IOMMU, so we must either pretend that the IOMMU isn't there or somehow map everything as the identity. Add a feature bit to detect that quirk: VIRTIO_F_IOMMU_PLATFORM. Any device with this feature bit set to 0 needs a quirk and has to be passed physical addresses (as opposed to bus addresses) even though the device is behind an IOMMU. Note: it has to be a per-device quirk because for example, there could be a mix of passed-through and virtual virtio devices. As another example, some devices could be implemented by an out of process hypervisor backend (in case of qemu vhost, or vhost-user) and so support for an IOMMU needs to be coded up separately. It would be cleanest to handle this in IOMMU core code, but that needs per-device DMA ops. While we are waiting for that to be implemented, use a work-around in virtio core. Note: a "noiommu" feature is a quirk - add a wrapper to make that clear. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01balloon: check the number of available pages in leak balloonKonstantin Neumoin
The balloon has a special mechanism that is subscribed to the oom notification which leads to deflation for a fixed number of pages. The number is always fixed even when the balloon is fully deflated. But leak_balloon did not expect that the pages to deflate will be more than taken, and raise a "BUG" in balloon_page_dequeue when page list will be empty. So, the simplest solution would be to check that the number of releases pages is less or equal to the number taken pages. Cc: stable@vger.kernel.org Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01vhost: lockless enqueuingJason Wang
We use spinlock to synchronize the work list now which may cause unnecessary contentions. So this patch switch to use llist to remove this contention. Pktgen tests shows about 5% improvement: Before: ~1300000 pps After: ~1370000 pps Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01vhost: simplify work flushingJason Wang
We used to implement the work flushing through tracking queued seq, done seq, and the number of flushing. This patch simplify this by just implement work flushing through another kind of vhost work with completion. This will be used by lockless enqueuing patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-24Linux 4.7Linus Torvalds
2016-07-24Merge tag 'ceph-for-4.7-rc8' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A fix for a long-standing bug in the incremental osdmap handling code that caused misdirected requests, tagged for stable" The tag is signed with a brand new key - Sage is on vacation and I didn't anticipate this" * tag 'ceph-for-4.7-rc8' of git://github.com/ceph/ceph-client: libceph: apply new_state before new_up_client on incrementals
2016-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix memory leak in nftables, from Liping Zhang. 2) Need to check result of vlan_insert_tag() in batman-adv otherwise we risk NULL skb derefs, from Sven Eckelmann. 3) Check for dev_alloc_skb() failures in cfg80211, from Gregory Greenman. 4) Handle properly when we have ppp_unregister_channel() happening in parallel with ppp_connect_channel(), from WANG Cong. 5) Fix DCCP deadlock, from Eric Dumazet. 6) Bail out properly in UDP if sk_filter() truncates the packet to be smaller than even the space that the protocol headers need. From Michal Kubecek. 7) Similarly for rose, dccp, and sctp, from Willem de Bruijn. 8) Make TCP challenge ACKs less predictable, from Eric Dumazet. 9) Fix infinite loop in bgmac_dma_tx_add() from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) packet: propagate sock_cmsg_send() error net/mlx5e: Fix del vxlan port command buffer memset packet: fix second argument of sock_tx_timestamp() net: switchdev: change ageing_time type to clock_t Update maintainer for EHEA driver. net/mlx4_en: Add resilience in low memory systems net/mlx4_en: Move filters cleanup to a proper location sctp: load transport header after sk_filter net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata net: nb8800: Fix SKB leak in nb8800_receive() et131x: Fix logical vs bitwise check in et131x_tx_timeout() vlan: use a valid default mtu value for vlan over macsec net: bgmac: Fix infinite loop in bgmac_dma_tx_add() mlxsw: spectrum: Prevent invalid ingress buffer mapping mlxsw: spectrum: Prevent overwrite of DCB capability fields mlxsw: spectrum: Don't emit errors when PFC is disabled mlxsw: spectrum: Indicate support for autonegotiation mlxsw: spectrum: Force link training according to admin state r8152: add MODULE_VERSION ...
2016-07-23Merge branch 'overlayfs-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "This contains a fix for a potential crash/corruption issue and another where the suid/sgid bits weren't cleared on write" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: verify upper dentry in ovl_remove_and_whiteout() ovl: Copy up underlying inode's ->i_mode to overlay inode ovl: handle ATTR_KILL*
2016-07-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Five fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: pps: do not crash when failed to register tools/vm/slabinfo: fix an unintentional printf testing/radix-tree: fix a macro expansion bug radix-tree: fix radix_tree_iter_retry() for tagged iterators. mm: memcontrol: fix cgroup creation failure after many small jobs
2016-07-23Merge tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull intel kabylake drm fixes from Dave Airlie: "As mentioned Intel has gathered all the Kabylake fixes from -next, which we've enabled in 4.7 for the first time, these are pretty much limited in scope to only affects kabylake, which is hw that isn't shipping yet. So I'm mostly okay with it going in now. If we don't land this, it might be a good idea to disable kabylake support in 4.7 before we ship" * tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux: (28 commits) drm/i915/kbl: Introduce the first official DMC for Kabylake. drm/i915: Introduce Kabypoint PCH for Kabylake H/DT. drm/i915/gen9: implement WaConextSwitchWithConcurrentTLBInvalidate drm/i915/gen9: Add WaFbcHighMemBwCorruptionAvoidance drm/i195/fbc: Add WaFbcNukeOnHostModify drm/i915/gen9: Add WaFbcWakeMemOn drm/i915/gen9: Add WaFbcTurnOffFbcWatermark drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch drm/i915/gen9: Add WaEnableChickenDCPR drm/i915/kbl: Add WaDisableSbeCacheDispatchPortSharing drm/i915/kbl: Add WaDisableGafsUnitClkGating drm/i915/kbl: Add WaForGAMHang drm/i915: Add WaInsertDummyPushConstP for bxt and kbl drm/i915/kbl: Add WaDisableDynamicCreditSharing drm/i915/kbl: Add WaDisableGamClockGating drm/i915/gen9: Enable must set chicken bits in config0 reg drm/i915/kbl: Add WaDisableLSQCROPERFforOCL drm/i915/kbl: Add WaDisableSDEUnitClockGating drm/i915/kbl: Add WaDisableFenceDestinationToSLM for A0 drm/i915/kbl: Add WaEnableGapsTsvCreditFix ...
2016-07-23Merge tag 'drm-fixes-for-v4.7-rc8-intel' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Two i915 regression fixes. Intel have submitted some Kabylake fixes I'll send separately, since this is the first kernel with kabylake support and they don't go much outside that area I think they should be fine" * tag 'drm-fixes-for-v4.7-rc8-intel' of git://people.freedesktop.org/~airlied/linux: drm/i915: add missing condition for committing planes on crtc drm/i915: Treat eDP as always connected, again
2016-07-23Merge tag 'm68k-for-v4.8-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k upddates from Geert Uytterhoeven: - assorted spelling fixes - defconfig updates * tag 'm68k-for-v4.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/defconfig: Update defconfigs for v4.7-rc2 m68k: Assorted spelling fixes
2016-07-23Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "A handful of fixes before final release: Marvell Armada: - One to fix a typo in the devicetree specifying memory ranges for the crypto engine - Two to deal with marking PCI and device-memory as strongly ordered to avoid hardware deadlocks, in particular when enabling above crypto driver. - Compile fix for PM Allwinner: - DT clock fixes to deal with u-boot-enabled framebuffer (simplefb). - Make R8 (C.H.I.P. SoC) inherit system compatibility from A13 to make clocks register proper. Tegra: - Fix SD card voltage setting on the Tegra3 Beaver dev board Misc: - Two maintainers updates for STM32 and STi platforms" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: tegra: beaver: Allow SD card voltage to be changed MAINTAINERS: update STi maintainer list MAINTAINERS: update STM32 maintainers list ARM: mvebu: compile pm code conditionally ARM: dts: sun7i: Fix pll3x2 and pll7x2 not having a parent clock ARM: dts: sunxi: Add pll3 to simplefb nodes clocks lists ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys ARM: mvebu: map PCI I/O regions strongly ordered ARM: mvebu: fix HW I/O coherency related deadlocks ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13
2016-07-23Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a sporadic build failure in the qat driver as well as a memory corruption bug in rsa-pkcs1pad" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct crypto: qat - make qat_asym_algs.o depend on asn1 headers
2016-07-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key handling fixes from James Morris: "Quoting David Howells: Here are three miscellaneous fixes: (1) Fix a panic in some debugging code in PKCS#7. This can only happen by explicitly inserting a #define DEBUG into the code. (2) Fix the calculation of the digest length in the PE file parser. This causes a failure where there should be a success. (3) Fix the case where an X.509 cert can be added as an asymmetric key to a trusted keyring with no trust restriction if no AKID is supplied. Bugs (1) and (2) aren't particularly problematic, but (3) allows a security check to be bypassed. Happily, this is a recent regression and never made it into a released kernel" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: KEYS: Fix for erroneous trust of incorrectly signed X.509 certs pefile: Fix the failure of calculation for digest PKCS#7: Fix panic when referring to the empty AKID when DEBUG defined
2016-07-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "A few more fixes for the input subsystem: - restore naming for tsc2005 touchscreens as some userspace match on it - fix out of bound access in legacy keyboard driver - fixup in RMI4 driver Everything is tagged for stable as well" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: tsc200x - report proper input_dev name tty/vt/keyboard: fix OOB access in do_compute_shiftstate() Input: synaptics-rmi4 - fix maximum size check for F12 control register 8
2016-07-23Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Dan Williams: "This contains a regression fix for a problem that was introduced in v4.7-rc6. In 4.7-rc1 we introduced auto-probing for the ACPI DSM (device- specific-method) format that the platform firmware implements for nvdimm devices. We initially fixed a regression in probing the QEMU DSM implementation by making acpi_check_dsm() tolerant of the way QEMU reports the "0 DSMs supported" condition. However, that broke HPE platforms since that tolerance caused the driver to mistakenly match the 1-zero-byte response those platforms give to "unknown" commands. Instead, we simply make the driver tolerant of not finding any supported DSMs. This has been tested to work with both QEMU and HPE platforms. This commit has appeared in a -next release with no reported issues" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nfit: make DIMM DSMs optional
2016-07-23Merge tag 'gpio-v4.7-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "Compile problem fix for Tegra, Sorry to send this in the last minute but Ingo says this build failure is very prominent so I'm not going to wait for v4.7 before sending it. It is a case of COMPILE_TEST causing more problems than it solves and I'm already swearing about me shooting myself in the foot with that gun :(" * tag 'gpio-v4.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: tegra: don't auto-enable for COMPILE_TEST
2016-07-23Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Michael Turquette: "Fix a bug in the at91 clk driver, two compile time warnings in sunxi clk drivers, and one bug in a sunxi clk driver introduced in the 4.7 merge window" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: at91: fix clk_programmable_set_parent() clk: sunxi: remove unused variable clk: sunxi: display: Add per-clock flags clk: sunxi: tcon-ch1: Do not return a negative error in get_parent
2016-07-23Merge branch 'for-4.7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fix from Tejun Heo: "Another fallout from max_sectors bump a couple years ago. The lite-on optical drive times out on large requests" * 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: LITE-ON CX1-JB256-HP needs lower max_sectors
2016-07-23Merge tag 'mmc-v4.7-rc7' of git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds
Pull MMC fixes from Ulf Hansson: "Here are a few late mmc fixes intended for v4.7 final. MMC core: - Fix eMMC packed command header endianness - Fix free of uninitialized buffer for mmc ioctl MMC host: - pxamci: Fix potential oops in ->probe()" * tag 'mmc-v4.7-rc7' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: pxamci: fix potential oops mmc: block: fix packed command header endianness mmc: block: fix free of uninitialized 'idata->buf'
2016-07-23Merge tag 'sound-4.7-fix2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "No surprise, just a few small fixes: a couple of changes are seen in the core part, and both of them are rather for unusual error paths. The rest are the regular HD-audio fixes and one USB-audio regression fix" * tag 'sound-4.7-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Fix quirks code is not called ALSA: hda: add AMD Stoney PCI ID with proper driver caps ALSA: hda - fix use-after-free after module unload ALSA: pcm: Free chmap at PCM free callback, too ALSA: ctl: Stop notification after disconnection ALSA: hda/realtek - add new pin definition in alc225 pin quirk table
2016-07-23Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull NVMe fix from Jens Axboe: "Late addition here, it's basically a revert of a patch that was added in this merge window, but has proven to cause problems. This is swapping out the RCU based namespace protection with a good old mutex instead" * 'for-linus' of git://git.kernel.dk/linux-block: nvme: Remove RCU namespace protection
2016-07-23pps: do not crash when failed to registerJiri Slaby
With this command sequence: modprobe plip modprobe pps_parport rmmod pps_parport the partport_pps modules causes this crash: BUG: unable to handle kernel NULL pointer dereference at (null) IP: parport_detach+0x1d/0x60 [pps_parport] Oops: 0000 [#1] SMP ... Call Trace: parport_unregister_driver+0x65/0xc0 [parport] SyS_delete_module+0x187/0x210 The sequence that builds up to this is: 1) plip is loaded and takes the parport device for exclusive use: plip0: Parallel port at 0x378, using IRQ 7. 2) pps_parport then fails to grab the device: pps_parport: parallel port PPS client parport0: cannot grant exclusive access for device pps_parport pps_parport: couldn't register with parport0 3) rmmod of pps_parport is then killed because it tries to access pardev->name, but pardev (taken from port->cad) is NULL. So add a check for NULL in the test there too. Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.cz Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23tools/vm/slabinfo: fix an unintentional printfDan Carpenter
The curly braces are missing here so we print stuff unintentionally. Fixes: 9da4714a2d44 ('slub: slabinfo update for cmpxchg handling') Link: http://lkml.kernel.org/r/20160715211243.GE19522@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Laura Abbott <labbott@fedoraproject.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23testing/radix-tree: fix a macro expansion bugDan Carpenter
There are no parentheses around this macro and it causes a problem when we do: index = rand() % THRASH_SIZE; Link: http://lkml.kernel.org/r/20160715210953.GC19522@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23radix-tree: fix radix_tree_iter_retry() for tagged iterators.Andrey Ryabinin
radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags. Then NULL slot and non-zero iter.tags passed to radix_tree_next_slot() leading to crash: RIP: radix_tree_next_slot include/linux/radix-tree.h:473 find_get_pages_tag+0x334/0x930 mm/filemap.c:1452 .... Call Trace: pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960 mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516 ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736 do_writepages+0x97/0x100 mm/page-writeback.c:2364 __filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300 filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490 ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115 vfs_fsync_range+0x10a/0x250 fs/sync.c:195 vfs_fsync fs/sync.c:209 do_fsync+0x42/0x70 fs/sync.c:219 SYSC_fdatasync fs/sync.c:232 SyS_fdatasync+0x19/0x20 fs/sync.c:230 entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207 We must reset iterator's tags to bail out from radix_tree_next_slot() and go to the slow-path in radix_tree_next_chunk(). Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup") Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23mm: memcontrol: fix cgroup creation failure after many small jobsJohannes Weiner
The memory controller has quite a bit of state that usually outlives the cgroup and pins its CSS until said state disappears. At the same time it imposes a 16-bit limit on the CSS ID space to economically store IDs in the wild. Consequently, when we use cgroups to contain frequent but small and short-lived jobs that leave behind some page cache, we quickly run into the 64k limitations of outstanding CSSs. Creating a new cgroup fails with -ENOSPC while there are only a few, or even no user-visible cgroups in existence. Although pinning CSSs past cgroup removal is common, there are only two instances that actually need an ID after a cgroup is deleted: cache shadow entries and swapout records. Cache shadow entries reference the ID weakly and can deal with the CSS having disappeared when it's looked up later. They pose no hurdle. Swap-out records do need to pin the css to hierarchically attribute swapins after the cgroup has been deleted; though the only pages that remain swapped out after offlining are tmpfs/shmem pages. And those references are under the user's control, so they are manageable. This patch introduces a private 16-bit memcg ID and switches swap and cache shadow entries over to using that. This ID can then be recycled after offlining when the CSS remains pinned only by objects that don't specifically need it. This script demonstrates the problem by faulting one cache page in a new cgroup and deleting it again: set -e mkdir -p pages for x in `seq 128000`; do [ $((x % 1000)) -eq 0 ] && echo $x mkdir /cgroup/foo echo $$ >/cgroup/foo/cgroup.procs echo trex >pages/$x echo $$ >/cgroup/cgroup.procs rmdir /cgroup/foo done When run on an unpatched kernel, we eventually run out of possible IDs even though there are no visible cgroups: [root@ham ~]# ./cssidstress.sh [...] 65000 mkdir: cannot create directory '/cgroup/foo': No space left on device After this patch, the IDs get released upon cgroup destruction and the cache and css objects get released once memory reclaim kicks in. [hannes@cmpxchg.org: init the IDR] Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org Fixes: b2052564e66d ("mm: memcontrol: continue cache reclaim from offlined groups") Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: John Garcia <john.garcia@mesosphere.io> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Nikolay Borisov <kernel@kyup.com> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-22gpio: tegra: don't auto-enable for COMPILE_TESTArnd Bergmann
I stumbled over a build error with COMPILE_TEST and CONFIG_OF disabled: drivers/gpio/gpio-tegra.c: In function 'tegra_gpio_probe': drivers/gpio/gpio-tegra.c:603:9: error: 'struct gpio_chip' has no member named 'of_node' The problem is that the newly added GPIO_TEGRA Kconfig symbol does not have a dependency on CONFIG_OF. However, there is another problem here as the driver gets enabled unconditionally whenever COMPILE_TEST is set. This fixes both problems, by making the symbol user-visible when COMPILE_TEST is set and default-enabled for ARCH_TEGRA=y. As a side-effect, it is now possible to compile-test a Tegra kernel with GPIO support disabled, which is harmless. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 4dd4dd1d2120 ("gpio: tegra: Allow compile test") Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-07-22libceph: apply new_state before new_up_client on incrementalsIlya Dryomov
Currently, osd_weight and osd_state fields are updated in the encoding order. This is wrong, because an incremental map may look like e.g. new_up_client: { osd=6, addr=... } # set osd_state and addr new_state: { osd=6, xorstate=EXISTS } # clear osd_state Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down). After applying new_up_client, osd_state is changed to EXISTS | UP. Carrying on with the new_state update, we flip EXISTS and leave osd6 in a weird "!EXISTS but UP" state. A non-existent OSD is considered down by the mapping code 2087 for (i = 0; i < pg->pg_temp.len; i++) { 2088 if (ceph_osd_is_down(osdmap, pg->pg_temp.osds[i])) { 2089 if (ceph_can_shift_osds(pi)) 2090 continue; 2091 2092 temp->osds[temp->size++] = CRUSH_ITEM_NONE; and so requests get directed to the second OSD in the set instead of the first, resulting in OSD-side errors like: [WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680 and hung rbds on the client: [ 493.566367] rbd: rbd0: write 400000 at 11cc00000 (0) [ 493.566805] rbd: rbd0: result -6 xferred 400000 [ 493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688 The fix is to decouple application from the decoding and: - apply new_weight first - apply new_state before new_up_client - twiddle osd_state flags if marking in - clear out some of the state if osd is destroyed Fixes: http://tracker.ceph.com/issues/14901 Cc: stable@vger.kernel.org # 3.15+: 6dd74e44dc1d: libceph: set 'exists' flag for newly up osd Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-07-22crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request structHerbert Xu
To allow for child request context the struct akcipher_request child_req needs to be at the end of the structure. Cc: stable@vger.kernel.org Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-22ovl: verify upper dentry in ovl_remove_and_whiteout()Maxim Patlasov
The upper dentry may become stale before we call ovl_lock_rename_workdir. For example, someone could (mistakenly or maliciously) manually unlink(2) it directly from upperdir. To ensure it is not stale, let's lookup it after ovl_lock_rename_workdir and and check if it matches the upper dentry. Essentially, it is the same problem and similar solution as in commit 11f3710417d0 ("ovl: verify upper dentry before unlink and rename"). Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-07-22packet: propagate sock_cmsg_send() errorSoheil Hassas Yeganeh
sock_cmsg_send() can return different error codes and not only -EINVAL, and we should properly propagate them. Fixes: c14ac9451c34 ("sock: enable timestamping using control messages") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-21crypto: qat - make qat_asym_algs.o depend on asn1 headersJan Stancek
Parallel build can sporadically fail because asn1 headers may not be built yet by the time qat_asym_algs.o is compiled: drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory #include "qat_rsapubkey-asn1.h" Cc: stable@vger.kernel.org Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-20Input: tsc200x - report proper input_dev nameMichael Welling
Passes input_id struct to the common probe function for the tsc200x drivers instead of just the bustype. This allows for the use of the product variable to set the input_dev->name variable according to the type of touchscreen used. Note that when we introduced support for TSC2004 we started calling everything TSC200X, so let's keep this quirk. Signed-off-by: Michael Welling <mwelling@ieee.org> Cc: stable@vger.kernel.org Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-07-20tty/vt/keyboard: fix OOB access in do_compute_shiftstate()Dmitry Torokhov
The size of individual keymap in drivers/tty/vt/keyboard.c is NR_KEYS, which is currently 256, whereas number of keys/buttons in input device (and therefor in key_down) is much larger - KEY_CNT - 768, and that can cause out-of-bound access when we do sym = U(key_maps[0][k]); with large 'k'. To fix it we should not attempt iterating beyond smaller of NR_KEYS and KEY_CNT. Also while at it let's switch to for_each_set_bit() instead of open-coding it. Reported-by: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-07-20net/mlx5e: Fix del vxlan port command buffer memsetSaeed Mahameed
memset the command buffers rather than the pointers to them. Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-19packet: fix second argument of sock_tx_timestamp()Yoshihiro Shimoda
This patch fixes an issue that a syscall (e.g. sendto syscall) cannot work correctly. Since the sendto syscall doesn't have msg_control buffer, the sock_tx_timestamp() in packet_snd() cannot work correctly because the socks.tsflags is set to 0. So, this patch sets the socks.tsflags to sk->sk_tsflags as default. Fixes: c14ac9451c34 ("sock: enable timestamping using control messages") Reported-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> Reported-by: Keita Kobayashi <keita.kobayashi.ym@renesas.com> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>