summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2018-03-08x86/spectre_v2: Don't check microcode versions when running under hypervisorsKonrad Rzeszutek Wilk
As: 1) It's known that hypervisors lie about the environment anyhow (host mismatch) 2) Even if the hypervisor (Xen, KVM, VMWare, etc) provided a valid "correct" value, it all gets to be very murky when migration happens (do you provide the "new" microcode of the machine?). And in reality the cloud vendors are the ones that should make sure that the microcode that is running is correct and we should just sing lalalala and trust them. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: kvm <kvm@vger.kernel.org> Cc: Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180226213019.GE9497@char.us.oracle.com
2018-03-08x86/vsyscall/64: Drop "native" vsyscallsAndy Lutomirski
Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2 on, Linux had three vsyscall modes: "native", "emulate", and "none". "emulate" is the default. All known user programs work correctly in emulate mode, but vsyscalls turn into page faults and are emulated. This is very slow. In "native" mode, the vsyscall page is easily usable as an exploit gadget, but vsyscalls are a bit faster -- they turn into normal syscalls. (This is in contrast to vDSO functions, which can be much faster than syscalls.) In "none" mode, there are no vsyscalls. For all practical purposes, "native" was really just a chicken bit in case something went wrong with the emulation. It's been over six years, and nothing has gone wrong. Delete it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/519fee5268faea09ae550776ce969fa6e88668b0.1520449896.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-03-08 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix various BPF helpers which adjust the skb and its GSO information with regards to SCTP GSO. The latter is a special case where gso_size is of value GSO_BY_FRAGS, so mangling that will end up corrupting the skb, thus bail out when seeing SCTP GSO packets, from Daniel(s). 2) Fix a compilation error in bpftool where BPF_FS_MAGIC is not defined due to too old kernel headers in the system, from Jiri. 3) Increase the number of x64 JIT passes in order to allow larger images to converge instead of punting them to interpreter or having them rejected when the interpreter is not built into the kernel, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07bpf, x64: increase number of passesDaniel Borkmann
In Cilium some of the main programs we run today are hitting 9 passes on x64's JIT compiler, and we've had cases already where we surpassed the limit where the JIT then punts the program to the interpreter instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON or insertion failures due to the prog array owner being JITed but the program to insert not (both must have the same JITed/non-JITed property). One concrete case the program image shrunk from 12,767 bytes down to 10,288 bytes where the image converged after 16 steps. I've measured that this took 340us in the JIT until it converges on my i7-6600U. Thus, increase the original limit we had from day one where the JIT covered cBPF only back then before we run into the case (as similar with the complexity limit) where we trip over this and hit program rejections. Also add a cond_resched() into the compilation loop, the JIT process runs without any locks and may sleep anyway. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-03-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Nine bug fixes for s390: - Three fixes for the expoline code, one of them is strictly speaking a cleanup but as it relates to code added with 4.16 I would like to include the patch. - Three timer related fixes in the common I/O layer - A fix for the handling of internal DASD request which could cause panics. - One correction in regard to the accounting of pud page tables vs. compat tasks. - The register scrubbing in entry.S caused spurious crashes, this is fixed now as well" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/entry.S: fix spurious zeroing of r0 s390: Fix runtime warning about negative pgtables_bytes s390: do not bypass BPENTER for interrupt system calls s390/cio: clear timer when terminating driver I/O s390/cio: fix return code after missing interrupt s390/cio: fix ccw_device_start_timeout API s390/clean-up: use CFI_* macros in entry.S s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*) s390/dasd: fix handling of internal requests
2018-03-07x86/entry/64/compat: Save one instruction in entry_INT80_compat()Dominik Brodowski
As %rdi is never user except in the following push, there is no need to restore %rdi to the original value. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/entry: Do not special-case clone(2) in compat entryDominik Brodowski
With the CPU renaming registers on its own, and all the overhead of the syscall entry/exit, it is doubtful whether the compiled output of mov %r8, %rax mov %rcx, %r8 mov %rax, %rcx jmpq sys_clone is measurably slower than the hand-crafted version of xchg %r8, %rcx So get rid of this special case. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscallsDominik Brodowski
While at it, convert declarations of type "unsigned" to "unsigned int". Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls: Use proper syscall definition for sys_ioperm()Dominik Brodowski
Using SYSCALL_DEFINEx() is recommended, so use it also here. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/entry: Remove stale syscall prototypeDominik Brodowski
sys32_vm86_warning() is long gone. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls/32: Simplify $entry == $compat entriesDominik Brodowski
If the compat entry point is equivalent to the native entry point, it does not need to be specified explicitly. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-06Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull sigingo fix from Eric Biederman: "The kbuild test robot found that I accidentally moved si_pkey when I was cleaning up siginfo_t. A short followed by an int with the int having 8 byte alignment. Sheesh siginfo_t is a weird structure. I have now corrected it and added build time checks that with a little luck will catch any similar future mistakes. The build time checks were sufficient for me to verify the bug and to verify my fix. So they are at least useful this once." * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal/x86: Include the field offsets in the build time checks signal: Correct the offset of si_pkey in struct siginfo
2018-03-06Merge tag 'kvm-s390-master-4.16-3' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: Fixes - Fix random memory corruption when running as guest2 (e.g. KVM in LPAR) and starting guest3 (nested KVM) with many CPUs (e.g. a nested guest with 200 vcpu) - io interrupt delivery counter was not exported
2018-03-06Merge tag 'kvm-ppc-fixes-4.16-1' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc Fixes for PPC KVM: - Fix guest time accounting in the host - Fix large-page backing for radix guests on POWER9 - Fix HPT guests on POWER9 backed by 2M or 1G pages - Compile fixes for some configs and gcc versions
2018-03-06KVM: s390: fix memory overwrites when not using SCA entriesDavid Hildenbrand
Even if we don't have extended SCA support, we can have more than 64 CPUs if we don't enable any HW features that might use the SCA entries. Now, this works just fine, but we missed a return, which is why we would actually store the SCA entries. If we have more than 64 CPUs, this means writing outside of the basic SCA - bad. Let's fix this. This allows > 64 CPUs when running nested (under vSIE) without random crashes. Fixes: a6940674c384 ("KVM: s390: allow 255 VCPUs when sca entries aren't used") Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180306132758.21034-1-david@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-06powerpc/pseries: Fix vector5 in ibm architecture vector tableBharata B Rao
With ibm,dynamic-memory-v2 and ibm,drc-info coming around the same time, byte22 in vector5 of ibm architecture vector table got set twice separately. The end result is that guest kernel isn't advertising support for ibm,dynamic-memory-v2. Fix this by removing the duplicate assignment of byte22. Fixes: 02ef6dd8109b ("powerpc: Enable support for ibm,drc-info devtree property") Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-06s390/entry.S: fix spurious zeroing of r0Christian Borntraeger
when a system call is interrupted we might call the critical section cleanup handler that re-does some of the operations. When we are between .Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the problem state registers r0-r7: .Lcleanup_system_call: [...] 0: # update accounting time stamp mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer # fill pt_regs mvc __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC ---> stmg %r0,%r7,__PT_R0(%r9) The problem is now, that we might have already zeroed out r0. The fix is to move the zeroing of r0 after sysc_do_svc. Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com> Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-03-06signal/x86: Include the field offsets in the build time checksEric W. Biederman
Due to an oversight when refactoring siginfo_t si_pkey has been in the wrong position since 4.16-rc1. Add an explicit check of the offset of every user space field in siginfo_t and compat_siginfo_t to make a mistake like this hard to make in the future. I have run this code on 4.15 and 4.16-rc1 with the position of si_pkey fixed and all of the fields show up in the same location. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-03-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
All of the conflicts were cases of overlapping changes. In net/core/devlink.c, we have to make care that the resouce size_params have become a struct member rather than a pointer to such an object. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05Merge tag 'please-pull-ia64_misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 cleanups from Tony Luck: - More atomic cleanup from willy - Fix a python script to work with version 3 - Some other small cleanups * tag 'please-pull-ia64_misc' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: ia64/err-inject: fix spelling mistake: "capapbilities" -> "capabilities" ia64/err-inject: Use get_user_pages_fast() ia64: doc: tweak whitespace for 'console=' parameter ia64: Convert remaining atomic operations ia64: convert unwcheck.py to python3
2018-03-05MIPS: BMIPS: Do not mask IPIs during suspendJustin Chen
Commit a3e6c1eff548 ("MIPS: IRQ: Fix disable_irq on CPU IRQs") fixes an issue where disable_irq did not actually disable the irq. The bug caused our IPIs to not be disabled, which actually is the correct behavior. With the addition of commit a3e6c1eff548 ("MIPS: IRQ: Fix disable_irq on CPU IRQs"), the IPIs were getting disabled going into suspend, thus schedule_ipi() was not being called. This caused deadlocks where schedulable task were not being scheduled and other cpus were waiting for them to do something. Add the IRQF_NO_SUSPEND flag so an irq_disable will not be called on the IPIs during suspend. Signed-off-by: Justin Chen <justinpopo6@gmail.com> Fixes: a3e6c1eff548 ("MIPS: IRQ: Fix disabled_irq on CPU IRQs") Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17385/ [jhogan@kernel.org: checkpatch: wrap long lines and fix commit refs] Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-05ia64/err-inject: fix spelling mistake: "capapbilities" -> "capabilities"Colin Ian King
Trivial fix to spelling mistake in debug message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2018-03-05ia64/err-inject: Use get_user_pages_fast()Davidlohr Bueso
At the point of sysfs callback, the call to gup is done without mmap_sem (or any lock for that matter). This is racy. As such, use the get_user_pages_fast() alternative and safely avoid taking the lock, if possible. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2018-03-05ia64: Convert remaining atomic operationsMatthew Wilcox
While we've only seen inlining problems with atomic_sub_return(), the other atomic operations could have the same problem. Convert all remaining operations to use the same solution as atomic_sub_return(). Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2018-03-05ia64: convert unwcheck.py to python3Corentin Labbe
Since my system use python3 as default, arch/ia64/scripts/unwcheck.py no longer run. This patch convert it to the python3 syntax. I have ran it with python2/python3 while printing values of start/end/rlen_sum which could be impacted by this change and I see no difference. Fixes: 94a47083522e ("scripts: change scripts to use system python instead of env") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2018-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use an appropriate TSQ pacing shift in mac80211, from Toke Høiland-Jørgensen. 2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk in ip6_route_me_harder, from Eric Dumazet. 3) Fix several shutdown races and similar other problems in l2tp, from James Chapman. 4) Handle missing XDP flush properly in tuntap, for real this time. From Jason Wang. 5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel Borkmann. 6) Fix phy_resume() locking, from Andrew Lunn. 7) IFLA_MTU values are ignored on newlink for some tunnel types, fix from Xin Long. 8) Revert F-RTO middle box workarounds, they only handle one dimension of the problem. From Yuchung Cheng. 9) Fix socket refcounting in RDS, from Ka-Cheong Poon. 10) Don't allow ppp unit registration to an unregistered channel, from Guillaume Nault. 11) Various hv_netvsc fixes from Stephen Hemminger. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits) hv_netvsc: propagate rx filters to VF hv_netvsc: filter multicast/broadcast hv_netvsc: defer queue selection to VF hv_netvsc: use napi_schedule_irqoff hv_netvsc: fix race in napi poll when rescheduling hv_netvsc: cancel subchannel setup before halting device hv_netvsc: fix error unwind handling if vmbus_open fails hv_netvsc: only wake transmit queue if link is up hv_netvsc: avoid retry on send during shutdown virtio-net: re enable XDP_REDIRECT for mergeable buffer ppp: prevent unregistered channels from connecting to PPP units tc-testing: skbmod: fix match value of ethertype mlxsw: spectrum_switchdev: Check success of FDB add operation net: make skb_gso_*_seglen functions private net: xfrm: use skb_gso_validate_network_len() to check gso sizes net: sched: tbf: handle GSO_BY_FRAGS case in enqueue net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len rds: Incorrect reference counting in TCP socket creation net: ethtool: don't ignore return from driver get_fecparam method vrf: check forwarding on the original netdevice when generating ICMP dest unreachable ...
2018-03-05MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_SERIOHuacai Chen
Commit 7a407aa5e0d3 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_SERIO select down to various platforms, but doesn't add it to Loongson64 platforms which need it, so add the selects to these platforms too. Fixes: 7a407aa5e0d3 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level") Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18704/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-05MIPS: Loongson64: Select ARCH_MIGHT_HAVE_PC_PARPORTHuacai Chen
Commit a211a0820d3c ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_PARPORT select down to various platforms, but doesn't add it to Loongson64 platforms which need it, so add the selects to these platforms too. Fixes: a211a0820d3c ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level") Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18703/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-04Merge branch 'x86/urgent' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of fixes for x86: - Add missing instruction suffixes to assembly code so it can be compiled by newer GAS versions without warnings. - Switch refcount WARN exceptions to UD2 as we did in general - Make the reboot on Intel Edison platforms work - A small documentation update so text and sample command match" * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation, x86, resctrl: Make text and sample command match x86/platform/intel-mid: Handle Intel Edison reboot correctly x86/asm: Add instruction suffixes to bitops x86/entry/64: Add instruction suffix x86/refcounts: Switch to UD2 for exceptions
2018-03-04Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti fixes from Thomas Gleixner: "Three fixes related to melted spectrum: - Sync the cpu_entry_area page table to initial_page_table on 32 bit. Otherwise suspend/resume fails because resume uses initial_page_table and triggers a triple fault when accessing the cpu entry area. - Zero the SPEC_CTL MRS on XEN before suspend to address a shortcoming in the hypervisor. - Fix another switch table detection issue in objtool" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table objtool: Fix another switch table detection issue x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
2018-03-04perf/x86/intel/uncore: Fix Skylake UPI event formatKan Liang
There is no event extension (bit 21) for SKX UPI, so use 'event' instead of 'event_ext'. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1520004150-4855-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-03Merge tag 'kbuild-fixes-v4.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - suppress sparse warnings about unknown attributes - fix typos and stale comments - fix build error of arch/sh - fix wrong use of ld-option vs cc-ldoption - remove redundant GCC_PLUGINS_CFLAGS assignment - fix another memory leak of Kconfig - fix line number in error messages of Kconfig - do not write confusing CONFIG_DEFCONFIG_LIST out to .config - add xstrdup() to Kconfig to handle memory shortage errors - show also a Debian package name if ncurses is missing * tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: MAINTAINERS: take over Kconfig maintainership kconfig: fix line number in recursive inclusion error message Coccinelle: memdup: Fix typo in warning messages kconfig: Update ncurses package names for menuconfig kbuild/kallsyms: trivial typo fix kbuild: test --build-id linker flag by ld-option instead of cc-ldoption kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment kconfig: Don't leak choice names during parsing sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list kconfig: add xstrdup() helper kbuild: disable sparse warnings about unknown attributes Makefile: Fix lying comment re. silentoldconfig
2018-03-03KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GENLaurent Vivier
Since commit 8b24e69fc47e ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry"), if CONFIG_VIRT_CPU_ACCOUNTING_GEN is set, the guest time is not accounted to guest time and user time, but instead to system time. This is because guest_enter()/guest_exit() are called while interrupts are disabled and the tick counter cannot be updated between them. To fix that, move guest_exit() after local_irq_enable(), and as guest_enter() is called with IRQ disabled, call guest_enter_irqoff() instead. Fixes: 8b24e69fc47e ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "x86: - fix NULL dereference when using userspace lapic - optimize spectre v1 mitigations by allowing guests to use LFENCE - make microcode revision configurable to prevent guests from unnecessarily blacklisting spectre v2 mitigation feature" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix vcpu initialization with userspace lapic KVM: X86: Allow userspace to define the microcode version KVM: X86: Introduce kvm_get_msr_feature() KVM: SVM: Add MSR-based feature support for serializing LFENCE KVM: x86: Add a framework for supporting MSR-based features
2018-03-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-03 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Extend bpftool to build up CFG information of eBPF programs and add an option to dump this in DOT format such that this can later be used with DOT graphic tools (xdot, graphviz, etc) to visualize it. Part of the analysis performed is sub-program detection and basic-block partitioning, from Jiong. 2) Multiple enhancements for bpftool's batch mode, more specifically the parser now understands comments (#), continuation lines (\), and arguments enclosed between quotes. Also, allow to read from stdin via '-' as input file, all from Quentin. 3) Improve BPF kselftests by i) unifying the rlimit handling into a helper that is then used by all tests, and ii) add support for testing tail calls to test_verifier plus add tests covering all corner cases. The latter is especially useful for testing JITs, from Daniel. 4) Remove x64 JIT's bpf_flush_icache() since flush_icache_range() is a noop on x64, from Daniel. 5) Fix one more occasion in BPF samples where we do not detach the BPF program from the cgroup after completion, from Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02Merge branch 'parisc-4.16-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - a patch to change the ordering of cache and TLB flushes to hopefully fix the random segfaults we very rarely face (by Dave Anglin). - a patch to hide the virtual kernel memory layout due to security reasons. - two small patches to make the kernel run more smoothly under qemu. * 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Reduce irq overhead when run in qemu parisc: Use cr16 interval timers unconditionally on qemu parisc: Check if secondary CPUs want own PDC calls parisc: Hide virtual kernel memory layout parisc: Fix ordering of cache and TLB flushes
2018-03-02Merge tag 'for-linus-4.16a-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Five minor fixes for Xen-specific drivers" * tag 'for-linus-4.16a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: pvcalls-front: 64-bit align flags x86/xen: add tty0 and hvc0 as preferred consoles for dom0 xen-netfront: Fix hang on device removal xen/pirq: fix error path cleanup when binding MSIs xen/pvcalls: fix null pointer dereference on map->sock
2018-03-02s390: Fix runtime warning about negative pgtables_bytesGuenter Roeck
When running s390 images with 'compat' processes, the following BUG is seen repeatedly. BUG: non-zero pgtables_bytes on freeing mm: -16384 Bisect points to commit b4e98d9ac775 ("mm: account pud page tables"). Analysis shows that init_new_context() is called with mm->context.asce_limit set to _REGION3_SIZE. In this situation, pgtables_bytes remains set to 0 and is not increased. The message is displayed when the affected process dies and mm_dec_nr_puds() is called. Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Fixes: b4e98d9ac775 ("mm: account pud page tables") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-03-02parisc: Reduce irq overhead when run in qemuHelge Deller
When run under QEMU, calling mfctl(16) creates some overhead because the qemu timer has to be scaled and moved into the register. This patch reduces the number of calls to mfctl(16) by moving the calls out of the loops. Additionally, increase the minimal time interval to 8000 cycles instead of 500 to compensate possible QEMU delays when delivering interrupts. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.14+
2018-03-02parisc: Use cr16 interval timers unconditionally on qemuHelge Deller
When running on qemu we know that the (emulated) cr16 cpu-internal clocks are syncronized. So let's use them unconditionally on qemu. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.14+
2018-03-02parisc: Check if secondary CPUs want own PDC callsHelge Deller
The architecture specification says (for 64-bit systems): PDC is a per processor resource, and operating system software must be prepared to manage separate pointers to PDCE_PROC for each processor. The address of PDCE_PROC for the monarch processor is stored in the Page Zero location MEM_PDC. The address of PDCE_PROC for each non-monarch processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ. Currently we still use one PDC for all CPUs, but in case we face a machine which is following the specification let's warn about it. Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02parisc: Hide virtual kernel memory layoutHelge Deller
For security reasons do not expose the virtual kernel memory layout to userspace. Signed-off-by: Helge Deller <deller@gmx.de> Suggested-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org # 4.15 Reviewed-by: Kees Cook <keescook@chromium.org>
2018-03-02parisc: Fix ordering of cache and TLB flushesJohn David Anglin
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the SMP stalls.  The problem is some drivers call these routines with interrupts disabled.  Interrupts need to be enabled for flush_tlb_all() and flush_cache_all() to work.  This version adds checks to ensure interrupts are not disabled before calling routines that need IPI interrupts.  When interrupts are disabled, we now drop into slower code. The attached change fixes the ordering of cache and TLB flushes in several cases.  When we flush the cache using the existing PTE/TLB entries, we need to flush the TLB after doing the cache flush.  We don't need to do this when we flush the entire instruction and data caches as these flushes don't use the existing TLB entries.  The same is true for tmpalias region flushes. The flush_kernel_vmap_range() and invalidate_kernel_vmap_range() routines have been updated. Secondly, we added a new purge_kernel_dcache_range_asm() routine to pacache.S and use it in invalidate_kernel_vmap_range().  Nominally, purges are faster than flushes as the cache lines don't have to be written back to memory. Hopefully, this is sufficient to resolve the remaining problems due to cache speculation.  So far, testing indicates that this is the case.  I did work up a patch using tmpalias flushes, but there is a performance hit because we need the physical address for each page, and we also need to sequence access to the tmpalias flush code.  This increases the probability of stalls. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backingPaul Mackerras
The current code for initializing the VRMA (virtual real memory area) for HPT guests requires the page size of the backing memory to be one of 4kB, 64kB or 16MB. With a radix host we have the possibility that the backing memory page size can be 2MB or 1GB. In these cases, if the guest switches to HPT mode, KVM will not initialize the VRMA and the guest will fail to run. In fact it is not necessary that the VRMA page size is the same as the backing memory page size; any VRMA page size less than or equal to the backing memory page size is acceptable. Therefore we now choose the largest page size out of the set {4k, 64k, 16M} which is not larger than the backing memory page size. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-02KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handlerPaul Mackerras
This fixes several bugs in the radix page fault handler relating to the way large pages in the memory backing the guest were handled. First, the check for large pages only checked for explicit huge pages and missed transparent huge pages. Then the check that the addresses (host virtual vs. guest physical) had appropriate alignment was wrong, meaning that the code never put a large page in the partition scoped radix tree; it was always demoted to a small page. Fixing this exposed bugs in kvmppc_create_pte(). We were never invalidating a 2MB PTE, which meant that if a page was initially faulted in without write permission and the guest then attempted to store to it, we would never update the PTE to have write permission. If we find a valid 2MB PTE in the PMD, we need to clear it and do a TLB invalidation before installing either the new 2MB PTE or a pointer to a page table page. This also corrects an assumption that get_user_pages_fast would set the _PAGE_DIRTY bit if we are writing, which is not true. Instead we mark the page dirty explicitly with set_page_dirty_lock(). This also means we don't need the dirty bit set on the host PTE when providing write access on a read fault. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-02-28 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Add schedule points and reduce the number of loop iterations the test_bpf kernel module is performing in order to not hog the CPU for too long, from Eric. 2) Fix an out of bounds access in tail calls in the ppc64 BPF JIT compiler, from Daniel. 3) Fix a crash on arm64 on unaligned BPF xadd operations that could be triggered via interpreter and JIT, from Daniel. Please not that once you merge net into net-next at some point, there is a minor merge conflict in test_verifier.c since test cases had been added at the end in both trees. Resolution is trivial: keep all the test cases from both trees. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01net/mac89x0: Convert to platform_driverFinn Thain
Apparently these Dayna cards don't have a pseudoslot declaration ROM which means they can't be probed like NuBus cards. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCEMasahiro Yamada
If CONFIG_USE_BUILTIN_DTB is enabled, but CONFIG_BUILTIN_DTB_SOURCE is empty (for example, allmodconfig), it fails to build, like this: make[2]: *** No rule to make target 'arch/sh/boot/dts/.dtb.o', needed by 'arch/sh/boot/dts/built-in.o'. Stop. Surround obj-y with ifneq ... endif. I replaced $(CONFIG_USE_BUILTIN_DTB) with 'y' since this is always the case from the following code from arch/sh/Makefile: core-$(CONFIG_USE_BUILTIN_DTB) += arch/sh/boot/dts/ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-01Merge tag 'arc-4.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - MCIP aka ARconnect fixes for SMP builds [Euginey] - preventive fix for SLC (L2 cache) flushing [Euginey] - Kconfig default fix [Ulf Magnusson] - trailing semicolon fixes [Luis de Bethencourt] - other assorted minor fixes * tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: setup cpu possible mask according to possible-cpus dts property ARC: mcip: update MCIP debug mask when the new cpu came online ARC: mcip: halt GFRC counter when ARC cores halt ARCv2: boot log: fix HS48 release number arc: dts: use 'atmel' as manufacturer for at24 in axs10x_mb ARC: Fix malformed ARC_EMUL_UNALIGNED default ARC: boot log: Fix trailing semicolon ARC: dw2 unwind: Fix trailing semicolon ARC: Enable fatal signals on boot for dev platforms ARCv2: Don't pretend we may set L-bit in STATUS32 with kflag instruction ARCv2: cache: fix slc_entire_op: flush only instead of flush-n-inv
2018-03-01KVM: x86: fix vcpu initialization with userspace lapicRadim Krčmář
Moving the code around broke this rare configuration. Use this opportunity to finally call lapic reset from vcpu reset. Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Fixes: 0b2e9904c159 ("KVM: x86: move LAPIC initialization after VMCS creation") Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>