summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-06-13mm/mlock.c: change count_mm_mlocked_page_nr return typeswkhack
On a 64-bit machine the value of "vma->vm_end - vma->vm_start" may be negative when using 32 bit ints and the "count >> PAGE_SHIFT"'s result will be wrong. So change the local variable and return value to unsigned long to fix the problem. Link: http://lkml.kernel.org/r/20190513023701.83056-1-swkhack@gmail.com Fixes: 0cf2f6f6dc60 ("mm: mlock: check against vma for actual mlock() size") Signed-off-by: swkhack <swkhack@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm: mmu_gather: remove __tlb_reset_range() for force flushYang Shi
A few new fields were added to mmu_gather to make TLB flush smarter for huge page by telling what level of page table is changed. __tlb_reset_range() is used to reset all these page table state to unchanged, which is called by TLB flush for parallel mapping changes for the same range under non-exclusive lock (i.e. read mmap_sem). Before commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap"), the syscalls (e.g. MADV_DONTNEED, MADV_FREE) which may update PTEs in parallel don't remove page tables. But, the forementioned commit may do munmap() under read mmap_sem and free page tables. This may result in program hang on aarch64 reported by Jan Stancek. The problem could be reproduced by his test program with slightly modified below. ---8<--- static int map_size = 4096; static int num_iter = 500; static long threads_total; static void *distant_area; void *map_write_unmap(void *ptr) { int *fd = ptr; unsigned char *map_address; int i, j = 0; for (i = 0; i < num_iter; i++) { map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (map_address == MAP_FAILED) { perror("mmap"); exit(1); } for (j = 0; j < map_size; j++) map_address[j] = 'b'; if (munmap(map_address, map_size) == -1) { perror("munmap"); exit(1); } } return NULL; } void *dummy(void *ptr) { return NULL; } int main(void) { pthread_t thid[2]; /* hint for mmap in map_write_unmap() */ distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); munmap(distant_area, (size_t)DISTANT_MMAP_SIZE); distant_area += DISTANT_MMAP_SIZE / 2; while (1) { pthread_create(&thid[0], NULL, map_write_unmap, NULL); pthread_create(&thid[1], NULL, dummy, NULL); pthread_join(thid[0], NULL); pthread_join(thid[1], NULL); } } ---8<--- The program may bring in parallel execution like below: t1 t2 munmap(map_address) downgrade_write(&mm->mmap_sem); unmap_region() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); free_pgtables() tlb->freed_tables = 1 tlb->cleared_pmds = 1 pthread_exit() madvise(thread_stack, 8M, MADV_DONTNEED) zap_page_range() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); tlb_finish_mmu() if (mm_tlb_flush_nested(tlb->mm)) __tlb_reset_range() __tlb_reset_range() would reset freed_tables and cleared_* bits, but this may cause inconsistency for munmap() which do free page tables. Then it may result in some architectures, e.g. aarch64, may not flush TLB completely as expected to have stale TLB entries remained. Use fullmm flush since it yields much better performance on aarch64 and non-fullmm doesn't yields significant difference on x86. The original proposed fix came from Jan Stancek who mainly debugged this issue, I just wrapped up everything together. Jan's testing results: v5.2-rc2-24-gbec7550cca10 -------------------------- mean stddev real 37.382 2.780 user 1.420 0.078 sys 54.658 1.855 v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush" ---------------------------------------------------------------------------------------_ mean stddev real 37.119 2.105 user 1.548 0.087 sys 55.698 1.357 [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.alibaba.com Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Jan Stancek <jstancek@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Suggested-by: Will Deacon <will.deacon@arm.com> Tested-by: Will Deacon <will.deacon@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> [4.20+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13fs/ocfs2: fix race in ocfs2_dentry_attach_lock()Wengang Wang
ocfs2_dentry_attach_lock() can be executed in parallel threads against the same dentry. Make that race safe. The race is like this: thread A thread B (A1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias, so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl1 ..... (B1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl2. ...... (A2) set dentry->d_fsdata with dl1, call ocfs2_dentry_lock() and increase dl1->dl_lockres.l_ro_holders to 1 on success. ...... (B2) set dentry->d_fsdata with dl2 call ocfs2_dentry_lock() and increase dl2->dl_lockres.l_ro_holders to 1 on success. ...... (A3) call ocfs2_dentry_unlock() and decrease dl2->dl_lockres.l_ro_holders to 0 on success. .... (B3) call ocfs2_dentry_unlock(), decreasing dl2->dl_lockres.l_ro_holders, but see it's zero now, panic Link: http://lkml.kernel.org/r/20190529174636.22364-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reported-by: Daniel Sobe <daniel.sobe@nxp.com> Tested-by: Daniel Sobe <daniel.sobe@nxp.com> Reviewed-by: Changwei Ge <gechangwei@live.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/vmscan.c: fix recent_rotated historyKirill Tkhai
Johannes pointed out that after commit 886cf1901db9 ("mm: move recent_rotated pages calculation to shrink_inactive_list()") we lost all zone_reclaim_stat::recent_rotated history. This fixes it. Link: http://lkml.kernel.org/r/155905972210.26456.11178359431724024112.stgit@localhost.localdomain Fixes: 886cf1901db9 ("mm: move recent_rotated pages calculation to shrink_inactive_list()") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/mlock.c: mlockall error for flag MCL_ONFAULTPotyra, Stefan
If mlockall() is called with only MCL_ONFAULT as flag, it removes any previously applied lockings and does nothing else. This behavior is counter-intuitive and doesn't match the Linux man page. For mlockall(): EINVAL Unknown flags were specified or MCL_ONFAULT was specified without either MCL_FUTURE or MCL_CURRENT. Consequently, return the error EINVAL, if only MCL_ONFAULT is passed. That way, applications will at least detect that they are calling mlockall() incorrectly. Link: http://lkml.kernel.org/r/20190527075333.GA6339@er01809n.ebgroup.elektrobit.com Fixes: b0f205c2a308 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage") Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILEManuel Traut
At least for ARM64 kernels compiled with the crosstoolchain from Debian/stretch or with the toolchain from kernel.org the line number is not decoded correctly by 'decode_stacktrace.sh': $ echo "[ 136.513051] f1+0x0/0xc [kcrash]" | \ CROSS_COMPILE=/opt/gcc-8.1.0-nolibc/aarch64-linux/bin/aarch64-linux- \ ./scripts/decode_stacktrace.sh /scratch/linux-arm64/vmlinux \ /scratch/linux-arm64 \ /nfs/debian/lib/modules/4.20.0-devel [ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:68) kcrash If addr2line from the toolchain is used the decoded line number is correct: [ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:57) kcrash Link: http://lkml.kernel.org/r/20190527083425.3763-1-manut@linutronix.de Signed-off-by: Manuel Traut <manut@linutronix.de> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/list_lru.c: fix memory leak in __memcg_init_list_lru_nodeShakeel Butt
Syzbot reported following memory leak: ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79 BUG: memory leak unreferenced object 0xffff888114f26040 (size 32): comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s) hex dump (first 32 bytes): 40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff @`......@`...... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: slab_post_alloc_hook mm/slab.h:439 [inline] slab_alloc mm/slab.c:3326 [inline] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 kmalloc include/linux/slab.h:547 [inline] __memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352 memcg_init_list_lru_node mm/list_lru.c:375 [inline] memcg_init_list_lru mm/list_lru.c:459 [inline] __list_lru_init+0x193/0x2a0 mm/list_lru.c:626 alloc_super+0x2e0/0x310 fs/super.c:269 sget_userns+0x94/0x2a0 fs/super.c:609 sget+0x8d/0xb0 fs/super.c:660 mount_nodev+0x31/0xb0 fs/super.c:1387 fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236 legacy_get_tree+0x27/0x80 fs/fs_context.c:661 vfs_get_tree+0x2e/0x120 fs/super.c:1476 do_new_mount fs/namespace.c:2790 [inline] do_mount+0x932/0xc50 fs/namespace.c:3110 ksys_mount+0xab/0x120 fs/namespace.c:3319 __do_sys_mount fs/namespace.c:3333 [inline] __se_sys_mount fs/namespace.c:3330 [inline] __x64_sys_mount+0x26/0x30 fs/namespace.c:3330 do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This is a simple off by one bug on the error path. Link: http://lkml.kernel.org/r/20190528043202.99980-1-shakeelb@google.com Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists") Reported-by: syzbot+f90a420dfe2b1b03cb2c@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm: memcontrol: don't batch updates of local VM stats and eventsJohannes Weiner
The kernel test robot noticed a 26% will-it-scale pagefault regression from commit 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty"). This appears to be caused by bouncing the additional cachelines from the new hierarchical statistics counters. We can fix this by getting rid of the batched local counters instead. Originally, there were *only* group-local counters, and they were fully maintained per cpu. A reader of a stats file high up in the cgroup tree would have to walk the entire subtree and collect each level's per-cpu counters to get the recursive view. This was prohibitively expensive, and so we switched to per-cpu batched updates of the local counters during a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting"), reducing the complexity from nr_subgroups * nr_cpus to nr_subgroups. With growing machines and cgroup trees, the tree walk itself became too expensive for monitoring top-level groups, and this is when the culprit patch added hierarchy counters on each cgroup level. When the per-cpu batch size would be reached, both the local and the hierarchy counters would get batch-updated from the per-cpu delta simultaneously. This makes local and hierarchical counter reads blazingly fast, but it unfortunately makes the write-side too cache line intense. Since local counter reads were never a problem - we only centralized them to accelerate the hierarchy walk - and use of the local counters are becoming rarer due to replacement with hierarchical views (ongoing rework in the page reclaim and workingset code), we can make those local counters unbatched per-cpu counters again. The scheme will then be as such: when a memcg statistic changes, the writer will: - update the local counter (per-cpu) - update the batch counter (per-cpu). If the batch is full: - spill the batch into the group's atomic_t - spill the batch into all ancestors' atomic_ts - empty out the batch counter (per-cpu) when a local memcg counter is read, the reader will: - collect the local counter from all cpus when a hiearchy memcg counter is read, the reader will: - read the atomic_t We might be able to simplify this further and make the recursive counters unbatched per-cpu counters as well (batch upward propagation, but leave per-cpu collection to the readers), but that will require a more in-depth analysis and testing of all the callsites. Deal with the immediate regression for now. Link: http://lkml.kernel.org/r/20190521151647.GB2870@cmpxchg.org Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: kernel test robot <rong.a.chen@intel.com> Tested-by: kernel test robot <rong.a.chen@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13Merge branch 'bpf-ppc-div-fix'Daniel Borkmann
Naveen N. Rao says: ==================== The first patch updates DIV64 overflow tests to properly detect error conditions. The second patch fixes powerpc64 JIT to generate the proper unsigned division instruction for BPF_ALU64. ==================== Acked-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13powerpc/bpf: use unsigned division instruction for 64-bit operationsNaveen N. Rao
BPF_ALU64 div/mod operations are currently using signed division, unlike BPF_ALU32 operations. Fix the same. DIV64 and MOD64 overflow tests pass with this fix. Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13bpf: fix div64 overflow tests to properly detect errorsNaveen N. Rao
If the result of the division is LLONG_MIN, current tests do not detect the error since the return value is truncated to a 32-bit value and ends up being 0. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapiMartynas Pumputis
Sync the changes to the flags made in "bpf: simplify definition of BPF_FIB_LOOKUP related flags" with the BPF UAPI headers. Doing in a separate commit to ease syncing of github/libbpf. Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13bpf: simplify definition of BPF_FIB_LOOKUP related flagsMartynas Pumputis
Previously, the BPF_FIB_LOOKUP_{DIRECT,OUTPUT} flags in the BPF UAPI were defined with the help of BIT macro. This had the following issues: - In order to use any of the flags, a user was required to depend on <linux/bits.h>. - No other flag in bpf.h uses the macro, so it seems that an unwritten convention is to use (1 << (nr)) to define BPF-related flags. Fixes: 87f5fc7e48dd ("bpf: Provide helper to do forwarding lookups in kernel FIB table") Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13x86/fpu: Don't use current->mm to check for a kthreadChristoph Hellwig
current->mm can be non-NULL if a kthread calls use_mm(). Check for PF_KTHREAD instead to decide when to store user mode FP state. Fixes: 2722146eb784 ("x86/fpu: Remove fpu->initialized") Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Aubrey Li <aubrey.li@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190604175411.GA27477@lst.de
2019-06-13Merge tag 'timers-v5.2-rc1' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent Pull timer fixes from Daniel Lezcano: - Fix missing notrace leading to deadlock on arch_arm_timer (Julien Thierry) - Fix compilation warning on timer-ti-dm (Philippe Mazenauer)
2019-06-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - regression fixes (reverts) for module loading changes that turned out to be incompatible with some userspace, from Benjamin Tissoires - regression fix for special Logitech unifiying receiver 0xc52f, from Hans de Goede - a few device ID additions to logitech driver, from Hans de Goede - fix for Bluetooth support on 2nd-gen Wacom Intuos Pro, from Jason Gerecke * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: logitech-dj: Fix 064d:c52f receiver support Revert "HID: core: Call request_module before doing device_add" Revert "HID: core: Do not call request_module() in async context" Revert "HID: Increase maximum report size allowed by hid_field_extract()" HID: a4tech: fix horizontal scrolling HID: hyperv: Add a module description line HID: logitech-hidpp: Add support for the S510 remote control HID: multitouch: handle faulty Elo touch device HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact HID: wacom: Don't report anything prior to the tool entering range HID: wacom: Don't set tool type until we're in range HID: rmi: Use SET_REPORT request on control endpoint for Acer Switch 3 and 5 HID: logitech-hidpp: add support for the MX5500 keyboard HID: logitech-dj: add support for the Logitech MX5500's Bluetooth Mini-Receiver HID: i2c-hid: add iBall Aer3 to descriptor override
2019-06-13Merge tag 'asoc-fix-v5.2-rc4' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v5.2 There's an awful lot of fixes here, almost all for the newly introduced SoF DSP drivers (including a few things it turned up in shared code). This is a large and complex piece of code so it's not surprising that there have been quite a few issues here, fortunately things seem to have mostly calmed down now. Otherwise there's just a smattering of small fixes.
2019-06-13Merge tag 'drm-intel-fixes-2019-06-13' of ↵Daniel Vetter
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.2-rc5: - Fix DMC firmware input validation to avoid buffer overflow - Fix perf register access whitelist for userspace - Fix DSI panel on GPD MicroPC - Fix per-pixel alpha with CCS - Fix HDMI audio for SDVO Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87y325x22w.fsf@intel.com
2019-06-13block/ps3vram: Use %llu to format sector_t after LBDAF removalGeert Uytterhoeven
The removal of CONFIG_LBDAF changed the type of sector_t from "unsigned long" to "u64" aka "unsigned long long" on 64-bit platforms, leading to a compiler warning regression: drivers/block/ps3vram.c: In function ‘ps3vram_probe’: drivers/block/ps3vram.c:770:23: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘sector_t {aka long long unsigned int}’ [-Wformat=] Fix this by using "%llu" instead. Fixes: 72deb455b5ec619f ("block: remove CONFIG_LBDAF") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13libata: Extend quirks for the ST1000LM024 drives with NOLPM quirkHans de Goede
We've received a bugreport that using LPM with ST1000LM024 drives leads to system lockups. So it seems that these models are buggy in more then 1 way. Add NOLPM quirk to the existing quirks entry for BROKEN_FPDMA_AA. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1571330 Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13bcache: only set BCACHE_DEV_WB_RUNNING when cached device attachedColy Li
When people set a writeback percent via sysfs file, /sys/block/bcache<N>/bcache/writeback_percent current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags and schedules kworker dc->writeback_rate_update. If there is no cache set attached to, the writeback kernel thread is not running indeed, running dc->writeback_rate_update does not make sense and may cause NULL pointer deference when reference cache set pointer inside update_writeback_rate(). This patch checks whether the cache set point (dc->disk.c) is NULL in sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and schedule dc->writeback_rate_update when dc->disk.c is not NULL (it means the cache device is attached to a cache set). This problem might be introduced from initial bcache commit, but commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly") changes part of the original code piece, so I add 'Fixes: 3fd47bfe55b0' to indicate from which commit this patch can be applied. Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly") Reported-by: Bjørn Forsman <bjorn.forsman@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Bjørn Forsman <bjorn.forsman@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13bcache: fix stack corruption by PRECEDING_KEY()Coly Li
Recently people report bcache code compiled with gcc9 is broken, one of the buggy behavior I observe is that two adjacent 4KB I/Os should merge into one but they don't. Finally it turns out to be a stack corruption caused by macro PRECEDING_KEY(). See how PRECEDING_KEY() is defined in bset.h, 437 #define PRECEDING_KEY(_k) \ 438 ({ \ 439 struct bkey *_ret = NULL; \ 440 \ 441 if (KEY_INODE(_k) || KEY_OFFSET(_k)) { \ 442 _ret = &KEY(KEY_INODE(_k), KEY_OFFSET(_k), 0); \ 443 \ 444 if (!_ret->low) \ 445 _ret->high--; \ 446 _ret->low--; \ 447 } \ 448 \ 449 _ret; \ 450 }) At line 442, _ret points to address of a on-stack variable combined by KEY(), the life range of this on-stack variable is in line 442-446, once _ret is returned to bch_btree_insert_key(), the returned address points to an invalid stack address and this address is overwritten in the following called bch_btree_iter_init(). Then argument 'search' of bch_btree_iter_init() points to some address inside stackframe of bch_btree_iter_init(), exact address depends on how the compiler allocates stack space. Now the stack is corrupted. Fixes: 0eacac22034c ("bcache: PRECEDING_KEY()") Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Rolf Fokkens <rolf@rolffokkens.nl> Reviewed-by: Pierre JUHEN <pierre.juhen@orange.fr> Tested-by: Shenghui Wang <shhuiw@foxmail.com> Tested-by: Pierre JUHEN <pierre.juhen@orange.fr> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Nix <nix@esperi.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13arm64/sve: Fix missing SVE/FPSIMD endianness conversionsDave Martin
The in-memory representation of SVE and FPSIMD registers is different: the FPSIMD V-registers are stored as single 128-bit host-endian values, whereas SVE registers are stored in an endianness-invariant byte order. This means that the two representations differ when running on a big-endian host. But we blindly copy data from one representation to another when converting between the two, resulting in the register contents being unintentionally byteswapped in certain situations. Currently this can be triggered by the first SVE instruction after a syscall, for example (though the potential trigger points may vary in future). So, fix the conversion functions fpsimd_to_sve(), sve_to_fpsimd() and sve_sync_from_fpsimd_zeropad() to swab where appropriate. There is no common swahl128() or swab128() that we could use here. Maybe it would be worth making this generic, but for now add a simple local hack. Since the byte order differences are exposed in ABI, also clarify the documentation. Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Alan Hayward <alan.hayward@arm.com> Cc: Julien Grall <julien.grall@arm.com> Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support") Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Signed-off-by: Dave Martin <Dave.Martin@arm.com> [will: Fix typos in comments and docs spotted by Julien] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-13blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requestsMing Lei
blk_mq_sched_free_requests() may be called in failure path in which q->elevator may not be setup yet, so remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests for avoiding the false positive. This function is actually safe to call in case of !q->elevator because hctx->sched_tags is checked. Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Yi Zhang <yi.zhang@redhat.com> Fixes: c3e2219216c9 ("block: free sched's request pool in blk_cleanup_queue") Reported-by: syzbot+b9d0d56867048c7bcfde@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13blkio-controller.txt: Remove references to CFQAndreas Herrmann
CFQ is gone. No need anymore to document its "proportional weight time based division of disk policy". Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13block/switching-sched.txt: Update to blk-mq schedulersAndreas Herrmann
Remove references to CFQ and legacy block layer which are gone. Update example with what's available under blk-mq. Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13null_blk: remove duplicate check for report zoneChaitanya Kulkarni
This patch removes the check in the null_blk_zoned for report zone command, where it checks for the dev-,>zoned before executing the report zone. The null_zone_report() function is a block_device operation callback which is initialized in the null_blk_main.c and gets called as a part of blkdev for report zone IOCTL (BLKREPORTZONE). blkdev_ioctl() blkdev_report_zones_ioctl() blkdev_report_zones() blk_report_zones() disk->fops->report_zones() nullb_zone_report(); The null_zone_report() will never get executed on the non-zoned block device, in the non zoned block device blk_queue_is_zoned() will always be false which is first check the blkdev_report_zones_ioctl() before actual low level driver report zone callback is executed. Here is the detailed scenario:- 1. modprobe null_blk null_init null_alloc_dev dev->zoned = 0 null_add_dev dev->zoned == 0 so we don't set the q->limits.zoned = BLK_ZONED_HR 2. blkzone report /dev/nullb0 blkdev_ioctl() blkdev_report_zones_ioctl() blk_queue_is_zoned() blk_queue_is_zoned q->limits.zoned == 0 return false if (!blk_queue_is_zoned(q)) <--- true return -ENOTTY; Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13blk-mq: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. When all of these checks are cleaned up, lots of the functions used in the blk-mq-debugfs code can now return void, as no need to check the return value of them either. Overall, this ends up cleaning up the code and making it smaller, always a nice win. Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13io_uring: fix memory leak of UNIX domain socket inodeEric Biggers
Opening and closing an io_uring instance leaks a UNIX domain socket inode. This is because the ->file of the io_uring instance's internal UNIX domain socket is set to point to the io_uring file, but then sock_release() sees the non-NULL ->file and assumes the inode reference is held by the file so doesn't call iput(). That's not the case here, since the reference is still meant to be held by the socket; the actual inode of the io_uring file is different. Fix this leak by NULL-ing out ->file before releasing the socket. Reported-by: syzbot+111cb28d9f583693aefa@syzkaller.appspotmail.com Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13block: force select mq-deadline for zoned block devicesDamien Le Moal
In most use cases of zoned block devices (aka SMR disks), the mq-deadline scheduler is mandatory as it implements sequential write command processing guarantees with zone write locking. So make sure that this scheduler is always enabled if CONFIG_BLK_DEV_ZONED is selected. Tested-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-12Merge tag 'selinux-pr-20190612' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fixes from Paul Moore: "Three patches for v5.2. One fixes a problem where we weren't correctly logging raw SELinux labels, the other two fix problems where we weren't properly checking calls to kmemdup()" * tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts() selinux: fix a missing-check bug in selinux_add_mnt_opt( ) selinux: log raw contexts as untrusted strings
2019-06-12drm/amdgpu: return 0 by default in amdgpu_pm_load_smu_firmwareAlex Deucher
Fixes SI cards running on amdgpu. Fixes: 1929059893022 ("drm/amd/amdgpu: add RLC firmware to support raven1 refresh") Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110883 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-12drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported()Dan Carpenter
The "block" variable can be set by the user through debugfs, so it can be quite large which leads to shift wrapping here. This means we report a "block" as supported when it's not, and that leads to array overflows later on. This bug is not really a security issue in real life, because debugfs is generally root only. Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-12Merge branch 'net-mvpp2-prs-Fixes-for-VID-filtering'David S. Miller
Maxime Chevallier says: ==================== net: mvpp2: prs: Fixes for VID filtering This series fixes some issues with VID filtering offload, mainly due to the wrong ranges being used in the TCAM header parser. The first patch fixes a bug where removing a VLAN from a port's whitelist would also remove it from other port's, if they are on the same PPv2 instance. The second patch makes so that we don't invalidate the wrong TCAM entries when clearing the whole whitelist. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12net: mvpp2: prs: Use the correct helpers when removing all VID filtersMaxime Chevallier
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be called with the TCAM id incorrectly used as a VID, causing the wrong TCAM entries to be invalidated. Fix this by directly invalidating entries in the VID range. Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev <yuric@marvell.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12net: mvpp2: prs: Fix parser range for VID filteringMaxime Chevallier
VID filtering is implemented in the Header Parser, with one range of 11 vids being assigned for each no-loopback port. Make sure we use the per-port range when looking for existing entries in the Parser. Since we used a global range instead of a per-port one, this causes VIDs to be removed from the whitelist from all ports of the same PPv2 instance. Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering") Suggested-by: Yuri Chipchev <yuric@marvell.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12Merge branch 'mlxsw-Various-fixes'David S. Miller
Ido Schimmel says: ==================== mlxsw: Various fixes This patchset contains various fixes for mlxsw. Patch #1 fixes an hash polarization problem when a nexthop device is a LAG device. This is caused by the fact that the same seed is used for the LAG and ECMP hash functions. Patch #2 fixes an issue in which the driver fails to refresh a nexthop neighbour after it becomes dead. This prevents the nexthop from ever being written to the adjacency table and used to forward traffic. Patch Patch #4 fixes a wrong extraction of TOS value in flower offload code. Patch #5 is a test case. Patch #6 works around a buffer issue in Spectrum-2 by reducing the default sizes of the shared buffer pools. Patch #7 prevents prio-tagged packets from entering the switch when PVID is removed from the bridge port. Please consider patches #2, #4 and #6 for 5.1.y ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum: Disallow prio-tagged packets when PVID is removedIdo Schimmel
When PVID is removed from a bridge port, the Linux bridge drops both untagged and prio-tagged packets. Align mlxsw with this behavior. Fixes: 148f472da5db ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register") Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_buffers: Reduce pool size on Spectrum-2Petr Machata
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out of 32 port buffers cannot be used. To work around this, the next FW release will mark them as unused, and will report correspondingly lower total shared buffer size. mlxsw will pick up the new value through a query to cap_total_buffer_size resource. However the initial size for shared buffer pool 0 is hard-coded and therefore needs to be updated. Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the total size of 42 MiB), and round down to the whole number of cells. Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12selftests: tc_flower: Add TOS matching testJiri Pirko
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_flower: Fix TOS matchingJiri Pirko
The TOS value was not extracted correctly. Fix it. Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos") Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12selftests: mlxsw: Test nexthop offload indicationIdo Schimmel
Test that IPv4 and IPv6 nexthops are correctly marked with offload indication in response to neighbour events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes deadIdo Schimmel
The driver tries to periodically refresh neighbours that are used to reach nexthops. This is done by periodically calling neigh_event_send(). However, if the neighbour becomes dead, there is nothing we can do to return it to a connected state and the above function call is basically a NOP. This results in the nexthop never being written to the device's adjacency table and therefore never used to forward packets. Fix this by dropping our reference from the dead neighbour and associating the nexthop with a new neigbhour which we will try to refresh. Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alex Veber <alexve@mellanox.com> Tested-by: Alex Veber <alexve@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12mlxsw: spectrum: Use different seeds for ECMP and LAG hashIdo Schimmel
The same hash function and seed are used for both ECMP and LAG hash. Therefore, when a LAG device is used as a nexthop device as part of an ECMP group, hash polarization can occur and all the traffic will be hashed to a single LAG slave. Fix this by using a different seed for the LAG hash. Fixes: fa73989f2697 ("mlxsw: spectrum: Use a stable ECMP/LAG seed") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alex Veber <alexve@mellanox.com> Tested-by: Alex Veber <alexve@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12net: tls, correctly account for copied bytes with multiple sk_msgsJohn Fastabend
tls_sw_do_sendpage needs to return the total number of bytes sent regardless of how many sk_msgs are allocated. Unfortunately, copied (the value we return up the stack) is zero'd before each new sk_msg is allocated so we only return the copied size of the last sk_msg used. The caller (splice, etc.) of sendpage will then believe only part of its data was sent and send the missing chunks again. However, because the data actually was sent the receiver will get multiple copies of the same data. To reproduce this do multiple sendfile calls with a length close to the max record size. This will in turn call splice/sendpage, sendpage may use multiple sk_msg in this case and then returns the incorrect number of bytes. This will cause splice to resend creating duplicate data on the receiver. Andre created a C program that can easily generate this case so we will push a similar selftest for this to bpf-next shortly. The fix is to _not_ zero the copied field so that the total sent bytes is returned. Reported-by: Steinar H. Gunderson <steinar+kernel@gunderson.no> Reported-by: Andre Tomt <andre@tomt.net> Tested-by: Andre Tomt <andre@tomt.net> Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12vrf: Increment Icmp6InMsgs on the original netdevStephen Suryaputra
Get the ingress interface and increment ICMP counters based on that instead of skb->dev when the the dev is a VRF device. This is a follow up on the following message: https://www.spinics.net/lists/netdev/msg560268.html v2: Avoid changing skb->dev since it has unintended effect for local delivery (David Ahern). Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12cpuset: restore sanity to cpuset_cpus_allowed_fallback()Joel Savitz
In the case that a process is constrained by taskset(1) (i.e. sched_setaffinity(2)) to a subset of available cpus, and all of those are subsequently offlined, the scheduler will set tsk->cpus_allowed to the current value of task_cs(tsk)->effective_cpus. This is done via a call to do_set_cpus_allowed() in the context of cpuset_cpus_allowed_fallback() made by the scheduler when this case is detected. This is the only call made to cpuset_cpus_allowed_fallback() in the latest mainline kernel. However, this is not sane behavior. I will demonstrate this on a system running the latest upstream kernel with the following initial configuration: # grep -i cpu /proc/$$/status Cpus_allowed: ffffffff,fffffff Cpus_allowed_list: 0-63 (Where cpus 32-63 are provided via smt.) If we limit our current shell process to cpu2 only and then offline it and reonline it: # taskset -p 4 $$ pid 2272's current affinity mask: ffffffffffffffff pid 2272's new affinity mask: 4 # echo off > /sys/devices/system/cpu/cpu2/online # dmesg | tail -3 [ 2195.866089] process 2272 (bash) no longer affine to cpu2 [ 2195.872700] IRQ 114: no longer affine to CPU2 [ 2195.879128] smpboot: CPU 2 is now offline # echo on > /sys/devices/system/cpu/cpu2/online # dmesg | tail -1 [ 2617.043572] smpboot: Booting Node 0 Processor 2 APIC 0x4 We see that our current process now has an affinity mask containing every cpu available on the system _except_ the one we originally constrained it to: # grep -i cpu /proc/$$/status Cpus_allowed: ffffffff,fffffffb Cpus_allowed_list: 0-1,3-63 This is not sane behavior, as the scheduler can now not only place the process on previously forbidden cpus, it can't even schedule it on the cpu it was originally constrained to! Other cases result in even more exotic affinity masks. Take for instance a process with an affinity mask containing only cpus provided by smt at the moment that smt is toggled, in a configuration such as the following: # taskset -p f000000000 $$ # grep -i cpu /proc/$$/status Cpus_allowed: 000000f0,00000000 Cpus_allowed_list: 36-39 A double toggle of smt results in the following behavior: # echo off > /sys/devices/system/cpu/smt/control # echo on > /sys/devices/system/cpu/smt/control # grep -i cpus /proc/$$/status Cpus_allowed: ffffff00,ffffffff Cpus_allowed_list: 0-31,40-63 This is even less sane than the previous case, as the new affinity mask excludes all smt-provided cpus with ids less than those that were previously in the affinity mask, as well as those that were actually in the mask. With this patch applied, both of these cases end in the following state: # grep -i cpu /proc/$$/status Cpus_allowed: ffffffff,ffffffff Cpus_allowed_list: 0-63 The original policy is discarded. Though not ideal, it is the simplest way to restore sanity to this fallback case without reinventing the cpuset wheel that rolls down the kernel just fine in cgroup v2. A user who wishes for the previous affinity mask to be restored in this fallback case can use that mechanism instead. This patch modifies scheduler behavior by instead resetting the mask to task_cs(tsk)->cpus_allowed by default, and cpu_possible mask in legacy mode. I tested the cases above on both modes. Note that the scheduler uses this fallback mechanism if and only if _every_ other valid avenue has been traveled, and it is the last resort before calling BUG(). Suggested-by: Waiman Long <longman@redhat.com> Suggested-by: Phil Auld <pauld@redhat.com> Signed-off-by: Joel Savitz <jsavitz@redhat.com> Acked-by: Phil Auld <pauld@redhat.com> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-06-12net: ethtool: Allow matching on vlan DEI bitMaxime Chevallier
Using ethtool, users can specify a classification action matching on the full vlan tag, which includes the DEI bit (also previously called CFI). However, when converting the ethool_flow_spec to a flow_rule, we use dissector keys to represent the matching patterns. Since the vlan dissector key doesn't include the DEI bit, this information was silently discarded when translating the ethtool flow spec in to a flow_rule. This commit adds the DEI bit into the vlan dissector key, and allows propagating the information to the driver when parsing the ethtool flow spec. Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12linux-next: DOC: RDS: Fix a typo in rds.txtMasanari Iida
This patch fixes a spelling typo in rds.txt Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()Matt Mullins
err must be nonzero in order to reach text_poke(), which caused kgdb to fail to set breakpoints: (gdb) break __x64_sys_sync Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124. (gdb) c Continuing. Warning: Cannot insert breakpoint 1. Cannot access memory at address 0xffffffff81288910 Command aborted. Fixes: 86a22057127d ("x86/kgdb: Avoid redundant comparison of patched code") Signed-off-by: Matt Mullins <mmullins@fb.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nadav Amit <namit@vmware.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190531194755.6320-1-mmullins@fb.com