summaryrefslogtreecommitdiff
path: root/arch/powerpc/kvm
AgeCommit message (Collapse)Author
2020-02-04Merge tag 'powerpc-5.6-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "A pretty small batch for us, and apologies for it being a bit late, I wanted to sneak Christophe's user_access_begin() series in. Summary: - Implement user_access_begin() and friends for our platforms that support controlling kernel access to userspace. - Enable CONFIG_VMAP_STACK on 32-bit Book3S and 8xx. - Some tweaks to our pseries IOMMU code to allow SVMs ("secure" virtual machines) to use the IOMMU. - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 32-bit VDSO, and some other improvements. - A series to use the PCI hotplug framework to control opencapi card's so that they can be reset and re-read after flashing a new FPGA image. As well as other minor fixes and improvements as usual. Thanks to: Alastair D'Silva, Alexandre Ghiti, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Bai Yingjie, Chen Zhou, Christophe Leroy, Frederic Barrat, Greg Kurz, Jason A. Donenfeld, Joel Stanley, Jordan Niethe, Julia Lawall, Krzysztof Kozlowski, Laurent Dufour, Laurentiu Tudor, Linus Walleij, Michael Bringmann, Nathan Chancellor, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Peter Ujfalusi, Pingfan Liu, Ram Pai, Randy Dunlap, Russell Currey, Sam Bobroff, Sebastian Andrzej Siewior, Shawn Anastasio, Stephen Rothwell, Steve Best, Sukadev Bhattiprolu, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain" * tag 'powerpc-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (131 commits) powerpc: configs: Cleanup old Kconfig options powerpc/configs/skiroot: Enable some more hardening options powerpc/configs/skiroot: Disable xmon default & enable reboot on panic powerpc/configs/skiroot: Enable security features powerpc/configs/skiroot: Update for symbol movement only powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIV powerpc/configs/skiroot: Drop HID_LOGITECH powerpc/configs: Drop NET_VENDOR_HP which moved to staging powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCE powerpc/configs: Drop CONFIG_QLGE which moved to staging powerpc: Do not consider weak unresolved symbol relocations as bad powerpc/32s: Fix kasan_early_hash_table() for CONFIG_VMAP_STACK powerpc: indent to improve Kconfig readability powerpc: Provide initial documentation for PAPR hcalls powerpc: Implement user_access_save() and user_access_restore() powerpc: Implement user_access_begin and friends powerpc/32s: Prepare prevent_user_access() for user_access_end() powerpc/32s: Drop NULL addr verification powerpc/kuap: Fix set direction in allow/prevent_user_access() powerpc/32s: Fix bad_kuap_fault() ...
2020-01-30Merge tag 'kvm-ppc-next-5.6-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second KVM PPC update for 5.6 * Fix compile warning on 32-bit machines * Fix locking error in secure VM support
2020-01-29KVM: PPC: Book3S PR: Fix -Werror=return-type build failureDavid Michael
Fixes: 3a167beac07c ("kvm: powerpc: Add kvmppc_ops callback") Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-29KVM: PPC: Book3S HV: Release lock on page-out failure pathBharata B Rao
When migrate_vma_setup() fails in kvmppc_svm_page_out(), release kvm->arch.uvmem_lock before returning. Fixes: ca9f4942670 ("KVM: PPC: Book3S HV: Support for running secure guests") Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-27KVM: Use vcpu-specific gva->hva translation when querying host page sizeSean Christopherson
Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the correct set of memslots is used when handling x86 page faults in SMM. Fixes: 54bf36aac520 ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit()Sean Christopherson
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: PPC: Move all vcpu init code into kvm_arch_vcpu_create()Sean Christopherson
Fold init() into create() now that the two are called back-to-back by common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create() immediately thereafter). Rinse and repeat for kvm_arch_vcpu_uninit() and kvm_arch_vcpu_destroy(). This paves the way for removing kvm_arch_vcpu_{un}init() entirely. Note, calling kvmppc_mmu_destroy() if kvmppc_core_vcpu_create() fails may or may not be necessary. Move it along with the more obvious call to kvmppc_subarch_vcpu_uninit() so as not to inadvertantly introduce a functional change and/or bug. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Drop kvm_arch_vcpu_setup()Sean Christopherson
Remove kvm_arch_vcpu_setup() now that all arch specific implementations are nops. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: PPC: BookE: Setup vcpu during kvmppc_core_vcpu_create()Sean Christopherson
Fold setup() into create() now that the two are called back-to-back by common KVM code. This paves the way for removing kvm_arch_vcpu_setup(). Note, BookE directly implements kvm_arch_vcpu_setup() and PPC's common kvm_arch_vcpu_create() is responsible for its own cleanup, thus the only cleanup required when directly invoking kvmppc_core_vcpu_setup() is to call .vcpu_free(), which is the BookE specific portion of PPC's kvm_arch_vcpu_destroy() by way of kvmppc_core_vcpu_free(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27KVM: Move vcpu alloc and init invocation to common codeSean Christopherson
Now that all architectures tightly couple vcpu allocation/free with the mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to common KVM code. Move both allocation and initialization in a single patch to eliminate thrash in arch specific code. The bisection benefits of moving the two pieces in separate patches is marginal at best, whereas the odds of introducing a transient arch specific bug are non-zero. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-26powerpc/mm: Remove kvm radix prefetch workaround for Power9 DD2.2Jordan Niethe
Commit a25bd72badfa ("powerpc/mm/radix: Workaround prefetch issue with KVM") introduced a number of workarounds as coming out of a guest with the mmu enabled would make the cpu would start running in hypervisor state with the PID value from the guest. The cpu will then start prefetching for the hypervisor with that PID value. In Power9 DD2.2 the cpu behaviour was modified to fix this. When accessing Quadrant 0 in hypervisor mode with LPID != 0 prefetching will not be performed. This means that we can get rid of the workarounds for Power9 DD2.2 and later revisions. Add a new cpu feature CPU_FTR_P9_RADIX_PREFETCH_BUG to indicate if the workarounds are needed. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191206031722.25781-1-jniethe5@gmail.com
2020-01-26powerpc: use probe_user_read() and probe_user_write()Christophe Leroy
Instead of opencoding, use probe_user_read() to failessly read a user location and probe_user_write() for writing to user. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e041f5eedb23f09ab553be8a91c3de2087147320.1579800517.git.christophe.leroy@c-s.fr
2020-01-24KVM: Introduce kvm_vcpu_destroy()Sean Christopherson
Add kvm_vcpu_destroy() and wire up all architectures to call the common function instead of their arch specific implementation. The common destruction function will be used by future patches to move allocation and initialization of vCPUs to common KVM code, i.e. to free resources that are allocated by arch agnostic code. No functional change intended. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: Add kvm_arch_vcpu_precreate() to handle pre-allocation issuesSean Christopherson
Add a pre-allocation arch hook to handle checks that are currently done by arch specific code prior to allocating the vCPU object. This paves the way for moving the allocation to common KVM code. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Drop kvm_arch_vcpu_free()Sean Christopherson
Remove the superfluous kvm_arch_vcpu_free() as it is no longer called from commmon KVM code. Note, kvm_arch_vcpu_destroy() *is* called from common code, i.e. choosing which function to whack is not completely arbitrary. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Move kvm_vcpu_init() invocation to common codeSean Christopherson
Move the kvm_cpu_{un}init() calls to common PPC code as an intermediate step towards removing kvm_cpu_{un}init() altogether. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: e500mc: Move reset of oldpir below call to kvm_vcpu_init()Sean Christopherson
Move the initialization of oldpir so that the call to kvm_vcpu_init() is at the top of kvmppc_core_vcpu_create_e500mc(). oldpir is only use when loading/putting a vCPU, which currently cannot be done until after kvm_arch_vcpu_create() completes. Reording the call to kvm_vcpu_init() paves the way for moving the invocation to common PPC code. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Book3S PR: Allocate book3s and shadow vcpu after common initSean Christopherson
Call kvm_vcpu_init() in kvmppc_core_vcpu_create_pr() prior to allocating the book3s and shadow_vcpu objects in preparation of moving said call to common PPC code. Although kvm_vcpu_init() has an arch callback, the callback is empty for Book3S PR, i.e. barring unseen black magic, moving the allocation has no real functional impact. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Allocate vcpu struct in common PPC codeSean Christopherson
Move allocation of all flavors of PPC vCPUs to common PPC code. All variants either allocate 'struct kvm_vcpu' directly, or require that the embedded 'struct kvm_vcpu' member be located at offset 0, i.e. guarantee that the allocation can be directly interpreted as a 'struct kvm_vcpu' object. Remove the message from the build-time assertion regarding placement of the struct, as compatibility with the arch usercopy region is no longer the sole dependent on 'struct kvm_vcpu' being at offset zero. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: e500mc: Add build-time assert that vcpu is at offset 0Sean Christopherson
In preparation for moving vcpu allocation to common PPC code, add an explicit, albeit redundant, build-time assert to ensure the vcpu member is located at offset 0. The assert is redundant in the sense that kvmppc_core_vcpu_create_e500() contains a functionally identical assert. The motiviation for adding the extra assert is to provide visual confirmation of the correctness of moving vcpu allocation to common code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Book3S PR: Free shared page if mmu initialization failsSean Christopherson
Explicitly free the shared page if kvmppc_mmu_init() fails during kvmppc_core_vcpu_create(), as the page is freed only in kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit(). Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page") Cc: stable@vger.kernel.org Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24KVM: PPC: Book3S HV: Uninit vCPU if vcore creation failsSean Christopherson
Call kvm_vcpu_uninit() if vcore creation fails to avoid leaking any resources allocated by kvm_vcpu_init(), i.e. the vcpu->run page. Fixes: 371fefd6f2dc4 ("KVM: PPC: Allow book3s_hv guests to use SMT processor modes") Cc: stable@vger.kernel.org Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-23KVM: PPC: Book3S HV: XIVE: Fix typo in commentGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156219139988.578018.1046848908285019838.stgit@bahia.lan
2020-01-17KVM: PPC: Book3S HV: Implement H_SVM_INIT_ABORT hcallSukadev Bhattiprolu
Implement the H_SVM_INIT_ABORT hcall which the Ultravisor can use to abort an SVM after it has issued the H_SVM_INIT_START and before the H_SVM_INIT_DONE hcalls. This hcall could be used when Ultravisor encounters security violations or other errors when starting an SVM. Note that this hcall is different from UV_SVM_TERMINATE ucall which is used by HV to terminate/cleanup an VM that has becore secure. The H_SVM_INIT_ABORT basically undoes operations that were done since the H_SVM_INIT_START hcall - i.e page-out all the VM pages back to normal memory, and terminate the SVM. (If we do not bring the pages back to normal memory, the text/data of the VM would be stuck in secure memory and since the SVM did not go secure, its MSR_S bit will be clear and the VM wont be able to access its pages even to do a clean exit). Based on patches and discussion with Paul Mackerras, Ram Pai and Bharata Rao. Signed-off-by: Ram Pai <linuxram@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-17KVM: PPC: Add skip_page_out parameter to uvmem functionsSukadev Bhattiprolu
Add 'skip_page_out' parameter to kvmppc_uvmem_drop_pages() so the callers can specify whetheter or not to skip paging out pages. This will be needed in a follow-on patch that implements H_SVM_INIT_ABORT hcall. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-17KVM: PPC: Book3E: Replace current->mm by kvm->mmLeonardo Bras
Given that in kvm_create_vm() there is: kvm->mm = current->mm; And that on every kvm_*_ioctl we have: if (kvm->mm != current->mm) return -EIO; I see no reason to keep using current->mm instead of kvm->mm. By doing so, we would reduce the use of 'global' variables on code, relying more in the contents of kvm struct. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-17KVM: PPC: Book3S: Replace current->mm by kvm->mmLeonardo Bras
Given that in kvm_create_vm() there is: kvm->mm = current->mm; And that on every kvm_*_ioctl we have: if (kvm->mm != current->mm) return -EIO; I see no reason to keep using current->mm instead of kvm->mm. By doing so, we would reduce the use of 'global' variables on code, relying more in the contents of kvm struct. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-01-17KVM: PPC: Remove set but not used variable 'ra', 'rs', 'rt'zhengbin
Fixes gcc '-Wunused-but-set-variable' warning: arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore: arch/powerpc/kvm/emulate_loadstore.c:87:6: warning: variable ra set but not used [-Wunused-but-set-variable] arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore: arch/powerpc/kvm/emulate_loadstore.c:87:10: warning: variable rs set but not used [-Wunused-but-set-variable] arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore: arch/powerpc/kvm/emulate_loadstore.c:87:14: warning: variable rt set but not used [-Wunused-but-set-variable] They are not used since commit 2b33cb585f94 ("KVM: PPC: Reimplement LOAD_FP/STORE_FP instruction mmio emulation with analyse_instr() input") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-12-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "PPC: - Fix a bug where we try to do an ultracall on a system without an ultravisor KVM: - Fix uninitialised sysreg accessor - Fix handling of demand-paged device mappings - Stop spamming the console on IMPDEF sysregs - Relax mappings of writable memslots - Assorted cleanups MIPS: - Now orphan, James Hogan is stepping down x86: - MAINTAINERS change, so long Radim and thanks for all the fish - supported CPUID fixes for AMD machines without SPEC_CTRL" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MAINTAINERS: remove Radim from KVM maintainers MAINTAINERS: Orphan KVM for MIPS kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor KVM: arm/arm64: Properly handle faulting of device mappings KVM: arm64: Ensure 'params' is initialised when looking up sys register KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region KVM: arm64: Don't log IMP DEF sysreg traps KVM: arm64: Sanely ratelimit sysreg messages KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create() KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy() KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
2019-12-18KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisorPaul Mackerras
Commit 22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest") added a call to uv_svm_terminate, which is an ultravisor call, without any check that the guest is a secure guest or even that the system has an ultravisor. On a system without an ultravisor, the ultracall will degenerate to a hypercall, but since we are not in KVM guest context, the hypercall will get treated as a system call, which could have random effects depending on what happens to be in r0, and could also corrupt the current task's kernel stack. Hence this adds a test for the guest being a secure guest before doing uv_svm_terminate(). Fixes: 22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-12-17KVM: PPC: Book3S HV: Fix regression on big endian hostsMarcus Comstedt
VCPU_CR is the offset of arch.regs.ccr in kvm_vcpu. arch/powerpc/include/asm/kvm_host.h defines arch.regs as a struct pt_regs, and arch/powerpc/include/asm/ptrace.h defines the ccr field of pt_regs as "unsigned long ccr". Since unsigned long is 64 bits, a 64-bit load needs to be used to load it, unless an endianness specific correction offset is added to access the desired subpart. In this case there is no reason to _not_ use a 64 bit load though. Fixes: 6c85b7bc637b ("powerpc/kvm: Use UV_RETURN ucall to return to ultravisor") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Marcus Comstedt <marcus@mc.pp.se> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191215094900.46740-1-marcus@mc.pp.se
2019-12-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more KVM updates from Paolo Bonzini: - PPC secure guest support - small x86 cleanup - fix for an x86-specific out-of-bounds write on a ioctl (not guest triggerable, data not attacker-controlled) * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: vmx: Stop wasting a page for guest_msrs KVM: x86: fix out-of-bounds write in KVM_GET_EMULATED_CPUID (CVE-2019-19332) Documentation: kvm: Fix mention to number of ioctls classes powerpc: Ultravisor: Add PPC_UV config option KVM: PPC: Book3S HV: Support reset of secure guest KVM: PPC: Book3S HV: Handle memory plug/unplug to secure VM KVM: PPC: Book3S HV: Radix changes for secure guest KVM: PPC: Book3S HV: Shared pages support for secure guests KVM: PPC: Book3S HV: Support for running secure guests mm: ksm: Export ksm_madvise() KVM x86: Move kvm cpuid support out of svm
2019-11-28KVM: PPC: Book3S HV: Support reset of secure guestBharata B Rao
Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF. This ioctl will be issued by QEMU during reset and includes the the following steps: - Release all device pages of the secure guest. - Ask UV to terminate the guest via UV_SVM_TERMINATE ucall - Unpin the VPA pages so that they can be migrated back to secure side when guest becomes secure again. This is required because pinned pages can't be migrated. - Reinit the partition scoped page tables After these steps, guest is ready to issue UV_ESM call once again to switch to secure mode. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [Implementation of uv_svm_terminate() and its call from guest shutdown path] Signed-off-by: Ram Pai <linuxram@us.ibm.com> [Unpinning of VPA pages] Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Handle memory plug/unplug to secure VMBharata B Rao
Register the new memslot with UV during plug and unregister the memslot during unplug. In addition, release all the device pages during unplug. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Radix changes for secure guestBharata B Rao
- After the guest becomes secure, when we handle a page fault of a page belonging to SVM in HV, send that page to UV via UV_PAGE_IN. - Whenever a page is unmapped on the HV side, inform UV via UV_PAGE_INVAL. - Ensure all those routines that walk the secondary page tables of the guest don't do so in case of secure VM. For secure guest, the active secondary page tables are in secure memory and the secondary page tables in HV are freed when guest becomes secure. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Shared pages support for secure guestsBharata B Rao
A secure guest will share some of its pages with hypervisor (Eg. virtio bounce buffers etc). Support sharing of pages between hypervisor and ultravisor. Shared page is reachable via both HV and UV side page tables. Once a secure page is converted to shared page, the device page that represents the secure page is unmapped from the HV side page tables. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-28KVM: PPC: Book3S HV: Support for running secure guestsBharata B Rao
A pseries guest can be run as secure guest on Ultravisor-enabled POWER platforms. On such platforms, this driver will be used to manage the movement of guest pages between the normal memory managed by hypervisor (HV) and secure memory managed by Ultravisor (UV). HV is informed about the guest's transition to secure mode via hcalls: H_SVM_INIT_START: Initiate securing a VM H_SVM_INIT_DONE: Conclude securing a VM As part of H_SVM_INIT_START, register all existing memslots with the UV. H_SVM_INIT_DONE call by UV informs HV that transition of the guest to secure mode is complete. These two states (transition to secure mode STARTED and transition to secure mode COMPLETED) are recorded in kvm->arch.secure_guest. Setting these states will cause the assembly code that enters the guest to call the UV_RETURN ucall instead of trying to enter the guest directly. Migration of pages betwen normal and secure memory of secure guest is implemented in H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls. H_SVM_PAGE_IN: Move the content of a normal page to secure page H_SVM_PAGE_OUT: Move the content of a secure page to normal page Private ZONE_DEVICE memory equal to the amount of secure memory available in the platform for running secure guests is created. Whenever a page belonging to the guest becomes secure, a page from this private device memory is used to represent and track that secure page on the HV side. The movement of pages between normal and secure memory is done via migrate_vma_pages() using UV_PAGE_IN and UV_PAGE_OUT ucalls. In order to prevent the device private pages (that correspond to pages of secure guest) from participating in KSM merging, H_SVM_PAGE_IN calls ksm_madvise() under read version of mmap_sem. However ksm_madvise() needs to be under write lock. Hence we call kvmppc_svm_page_in with mmap_sem held for writing, and it then downgrades to a read lock after calling ksm_madvise. [paulus@ozlabs.org - roll in patch "KVM: PPC: Book3S HV: Take write mmap_sem when calling ksm_madvise"] Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-27Merge tag 'powerpc-spectre-rsb' of powerpc-CVE-2019-18660.bundleLinus Torvalds
Pull powerpc Spectre-RSB fixes from Michael Ellerman: "We failed to activate the mitigation for Spectre-RSB (Return Stack Buffer, aka. ret2spec) on context switch, on CPUs prior to Power9 DD2.3. That allows a process to poison the RSB (called Link Stack on Power CPUs) and possibly misdirect speculative execution of another process. If the victim process can be induced to execute a leak gadget then it may be possible to extract information from the victim via a side channel. The fix is to correctly activate the link stack flush mitigation on all CPUs that have any mitigation of Spectre v2 in userspace enabled. There's a second commit which adds a link stack flush in the KVM guest exit path. A leak via that path has not been demonstrated, but we believe it's at least theoretically possible. This is the fix for CVE-2019-18660" * tag 'powerpc-spectre-rsb' of /home/torvalds/Downloads/powerpc-CVE-2019-18660.bundle: KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel powerpc/book3s64: Fix link stack flush on context switch
2019-11-25Merge tag 'kvm-ppc-next-5.5-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second KVM PPC update for 5.5 - Two fixes from Greg Kurz to fix memory leak bugs in the XIVE code.
2019-11-21KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error pathGreg Kurz
We need to check the host page size is big enough to accomodate the EQ. Let's do this before taking a reference on the EQ page to avoid a potential leak if the check fails. Cc: stable@vger.kernel.org # v5.2 Fixes: 13ce3297c576 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-21KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new oneGreg Kurz
The EQ page is allocated by the guest and then passed to the hypervisor with the H_INT_SET_QUEUE_CONFIG hcall. A reference is taken on the page before handing it over to the HW. This reference is dropped either when the guest issues the H_INT_RESET hcall or when the KVM device is released. But, the guest can legitimately call H_INT_SET_QUEUE_CONFIG several times, either to reset the EQ (vCPU hot unplug) or to set a new EQ (guest reboot). In both cases the existing EQ page reference is leaked because we simply overwrite it in the XIVE queue structure without calling put_page(). This is especially visible when the guest memory is backed with huge pages: start a VM up to the guest userspace, either reboot it or unplug a vCPU, quit QEMU. The leak is observed by comparing the value of HugePages_Free in /proc/meminfo before and after the VM is run. Ideally we'd want the XIVE code to handle the EQ page de-allocation at the platform level. This isn't the case right now because the various XIVE drivers have different allocation needs. It could maybe worth introducing hooks for this purpose instead of exposing XIVE internals to the drivers, but this is certainly a huge work to be done later. In the meantime, for easier backport, fix both vCPU unplug and guest reboot leaks by introducing a wrapper around xive_native_configure_queue() that does the necessary cleanup. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v5.2 Fixes: 13ce3297c576 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org> Tested-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-14KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernelMichael Ellerman
On some systems that are vulnerable to Spectre v2, it is up to software to flush the link stack (return address stack), in order to protect against Spectre-RSB. When exiting from a guest we do some house keeping and then potentially exit to C code which is several stack frames deep in the host kernel. We will then execute a series of returns without preceeding calls, opening up the possiblity that the guest could have poisoned the link stack, and direct speculative execution of the host to a gadget of some sort. To prevent this we add a flush of the link stack on exit from a guest. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-01Merge tag 'kvm-ppc-next-5.5-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD KVM PPC update for 5.5 * Add capability to tell userspace whether we can single-step the guest. * Improve the allocation of XIVE virtual processor IDs, to reduce the risk of running out of IDs when running many VMs on POWER9. * Rewrite interrupt synthesis code to deliver interrupts in virtual mode when appropriate. * Minor cleanups and improvements.
2019-10-22KVM: Add separate helper for putting borrowed reference to kvmSean Christopherson
Add a new helper, kvm_put_kvm_no_destroy(), to handle putting a borrowed reference[*] to the VM when installing a new file descriptor fails. KVM expects the refcount to remain valid in this case, as the in-progress ioctl() has an explicit reference to the VM. The primary motiviation for the helper is to document that the 'kvm' pointer is still valid after putting the borrowed reference, e.g. to document that doing mutex(&kvm->lock) immediately after putting a ref to kvm isn't broken. [*] When exposing a new object to userspace via a file descriptor, e.g. a new vcpu, KVM grabs a reference to itself (the VM) prior to making the object visible to userspace to avoid prematurely freeing the VM in the scenario where userspace immediately closes file descriptor. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22KVM: PPC: Book3S HV: Reject mflags=2 (LPCR[AIL]=2) ADDR_TRANS_MODE modeNicholas Piggin
AIL=2 mode has no known users, so is not well tested or supported. Disallow guests from selecting this mode because it may become deprecated in future versions of the architecture. This policy decision is not left to QEMU because KVM support is required for AIL=2 (when injecting interrupts). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-22KVM: PPC: Book3S HV: Implement LPCR[AIL]=3 mode for injected interruptsNicholas Piggin
kvmppc_inject_interrupt does not implement LPCR[AIL]!=0 modes, which can result in the guest receiving interrupts as if LPCR[AIL]=0 contrary to the ISA. In practice, Linux guests cope with this deviation, but it should be fixed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-22KVM: PPC: Book3S HV: Reuse kvmppc_inject_interrupt for async guest deliveryNicholas Piggin
This consolidates the HV interrupt delivery logic into one place. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-22KVM: PPC: Book3S: Replace reset_msr mmu op with inject_interrupt arch opNicholas Piggin
reset_msr sets the MSR for interrupt injection, but it's cleaner and more flexible to provide a single op to set both MSR and PC for the interrupt. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-22KVM: PPC: Book3S: Define and use SRR1_MSR_BITSNicholas Piggin
Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-22KVM: PPC: Book3S HV: XIVE: Allow userspace to set the # of VPsGreg Kurz
Add a new attribute to both XIVE and XICS-on-XIVE KVM devices so that userspace can tell how many interrupt servers it needs. If a VM needs less than the current default of KVM_MAX_VCPUS (2048), we can allocate less VPs in OPAL. Combined with a core stride (VSMT) that matches the number of guest threads per core, this may substantially increases the number of VMs that can run concurrently with an in-kernel XIVE device. Since the legacy XIVE KVM device is exposed to userspace through the XICS KVM API, a new attribute group is added to it for this purpose. While here, fix the syntax of the existing KVM_DEV_XICS_GRP_SOURCES in the XICS documentation. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>