summaryrefslogtreecommitdiff
path: root/arch/s390/kvm
AgeCommit message (Collapse)Author
2017-07-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "PPC: - Better machine check handling for HV KVM - Ability to support guests with threads=2, 4 or 8 on POWER9 - Fix for a race that could cause delayed recognition of signals - Fix for a bug where POWER9 guests could sleep with interrupts pending. ARM: - VCPU request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups s390: - initial machine check forwarding - migration support for the CMMA page hinting information - cleanups and fixes x86: - nested VMX bugfixes and improvements - more reliable NMI window detection on AMD - APIC timer optimizations Generic: - VCPU request overhaul + documentation of common code patterns - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits) Update my email address kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12 kvm: x86: mmu: allow A/D bits to be disabled in an mmu x86: kvm: mmu: make spte mmio mask more explicit x86: kvm: mmu: dead code thanks to access tracking KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code KVM: PPC: Book3S HV: Close race with testing for signals on guest entry KVM: PPC: Book3S HV: Simplify dynamic micro-threading code KVM: x86: remove ignored type attribute KVM: LAPIC: Fix lapic timer injection delay KVM: lapic: reorganize restart_apic_timer KVM: lapic: reorganize start_hv_timer kvm: nVMX: Check memory operand to INVVPID KVM: s390: Inject machine check into the nested guest KVM: s390: Inject machine check into the guest tools/kvm_stat: add new interactive command 'b' tools/kvm_stat: add new command line switch '-i' tools/kvm_stat: fix error on interactive command 'g' KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit ...
2017-07-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The bulk of the s390 patches for 4.13. Some new things but mostly bug fixes and cleanups. Noteworthy changes: - The SCM block driver is converted to blk-mq - Switch s390 to 5 level page tables. The virtual address space for a user space process can now have up to 16EB-4KB. - Introduce a ELF phdr flag for qemu to avoid the global vm.alloc_pgste which forces all processes to large page tables - A couple of PCI improvements to improve error recovery - Included is the merge of the base support for proper machine checks for KVM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) s390/dasd: Fix faulty ENODEV for RO sysfs attribute s390/pci: recognize name clashes with uids s390/pci: provide more debug information s390/pci: fix handling of PEC 306 s390/pci: improve pci hotplug s390/pci: introduce clp_get_state s390/pci: improve error handling during fmb (de)registration s390/pci: improve unreg_ioat error handling s390/pci: improve error handling during interrupt deregistration s390/pci: don't cleanup in arch_setup_msi_irqs KVM: s390: Backup the guest's machine check info s390/nmi: s390: New low level handling for machine check happening in guest s390/fpu: export save_fpu_regs for all configs s390/kvm: avoid global config of vm.alloc_pgste=1 s390: rename struct psw_bits members s390: rename psw_bits enums s390/mm: use correct address space when enabling DAT s390/cio: introduce io_subchannel_type s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL s390/dumpstack: remove raw stack dump ...
2017-06-30Merge tag 'kvmarm-for-4.13' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups Conflicts: arch/s390/include/asm/kvm_host.h
2017-06-28Merge tag 'nmiforkvm' of ↵Martin Schwidefsky
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features Pull kvm patches from Christian Borntraeger: "s390,kvm: provide plumbing for machines checks when running guests" This provides the basic plumbing for handling machine checks when running guests
2017-06-28KVM: s390: Inject machine check into the nested guestQingFeng Hao
With vsie feature enabled, kvm can support nested guests (guest-3). So inject machine check to the guest-2 if it happens when the nested guest is running. And guest-2 will detect the machine check belongs to guest-3 and reinject it into guest-3. The host (guest-1) tries to inject the machine check to the picked destination vcpu if it's a floating machine check. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-28KVM: s390: Inject machine check into the guestQingFeng Hao
If the exit flag of SIE indicates that a machine check has happened during guest's running and needs to be injected, inject it to the guest accordingly. But some machine checks, e.g. Channel Report Pending (CRW), refer to host conditions only (the guest's channel devices are not managed by the kernel directly) and are therefore not injected into the guest. External Damage (ED) is also not reinjected into the guest because ETR conditions are gone in Linux and STP conditions are not enabled in the guest, and ED contains only these 8 ETR and STP conditions. In general, instruction-processing damage, system recovery, storage error, service-processor damage and channel subsystem damage will be reinjected into the guest, and the remain (System damage, timing-facility damage, warning, ED and CRW) will be handled on the host. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-28Merge tag 'nmiforkvm' of ↵Christian Borntraeger
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext s390,kvm: provide plumbing for machines checks when running guests This provides the basic plumbing for handling machine checks when running guests
2017-06-27KVM: s390: Backup the guest's machine check infoQingFeng Hao
When a machine check happens in the guest, related mcck info (mcic, external damage code, ...) is stored in the vcpu's lowcore on the host. Then the machine check handler's low-level part is executed, followed by the high-level part. If the high-level part's execution is interrupted by a new machine check happening on the same vcpu on the host, the mcck info in the lowcore is overwritten with the new machine check's data. If the high-level part's execution is scheduled to a different cpu, the mcck info in the lowcore is uncertain. Therefore, for both cases, the further reinjection to the guest will use the wrong data. Let's backup the mcck info in the lowcore to the sie page for further reinjection, so that the right data will be used. Add new member into struct sie_page to store related machine check's info of mcic, failing storage address and external damage code. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: s390: gaccess: fix real-space designation asce handling for gmap shadowsHeiko Carstens
For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: s390: avoid packed attributeMartin Schwidefsky
For naturally aligned and sized data structures avoid superfluous packed and aligned attributes. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: S390: add new group for flicYi Min Zhao
In some cases, userspace needs to get or set all ais states for example migration. So we introduce a new group KVM_DEV_FLIC_AISM_ALL to provide interfaces to get or set the adapter-interruption-suppression mode for all ISCs. The corresponding documentation is updated. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: s390: implement instruction execution protection for emulatedChristian Borntraeger
ifetch While currently only used to fetch the original instruction on failure for getting the instruction length code, we should make the page table walking code future proof. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: s390: ioctls to get and set guest storage attributesClaudio Imbrenda
* Add the struct used in the ioctls to get and set CMMA attributes. * Add the two functions needed to get and set the CMMA attributes for guest pages. * Add the two ioctls that use the aforementioned functions. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22KVM: s390: CMMA tracking, ESSA emulation, migration modeClaudio Imbrenda
* Add a migration state bitmap to keep track of which pages have dirty CMMA information. * Disable CMMA by default, so we can track if it's used or not. Enable it on first use like we do for storage keys (unless we are doing a migration). * Creates a VM attribute to enter and leave migration mode. * In migration mode, CMMA is disabled in the SIE block, so ESSA is always interpreted and emulated in software. * Free the migration state on VM destroy. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-12s390: rename struct psw_bits membersHeiko Carstens
Rename a couple of the struct psw_bits members so it is more obvious for what they are good. Initially I thought using the single character names from the PoP would be sufficient and obvious, but admittedly that is not true. The current implementation is not easy to use, if one has to look into the source file to figure out which member represents the 'per' bit (which is the 'r' member). Therefore rename the members to sane names that are identical to the uapi psw mask defines: r -> per i -> io e -> ext t -> dat m -> mcheck w -> wait p -> pstate Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-06-12s390: rename psw_bits enumsHeiko Carstens
The address space enums that must be used when modifying the address space part of a psw with the psw_bits() macro can easily be confused with the psw defines that are used to mask and compare directly the mask part of a psw. We have e.g. PSW_AS_PRIMARY vs PSW_ASC_PRIMARY. To avoid confusion rename the PSW_AS_* enums to PSW_BITS_AS_*. In addition also rename the PSW_AMODE_* enums, so they also follow the same naming scheme: PSW_BITS_AMODE_*. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-06-04KVM: add kvm_request_pendingRadim Krčmář
A first step in vcpu->requests encapsulation. Additionally, we now use READ_ONCE() when accessing vcpu->requests, which ensures we always load vcpu->requests when it's accessed. This is important as other threads can change it any time. Also, READ_ONCE() documents that vcpu->requests is used with other threads, likely requiring memory barriers, which it does. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> [ Documented the new use of READ_ONCE() and converted another check in arch/mips/kvm/vz.c ] Signed-off-by: Andrew Jones <drjones@redhat.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-05-31KVM: s390: fix ais handling vs cpu modelChristian Borntraeger
If ais is disabled via cpumodel, we must act accordingly, even if KVM_CAP_S390_AIS was enabled. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
2017-05-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: - the rest of MM - various misc things - procfs updates - lib/ updates - checkpatch updates - kdump/kexec updates - add kvmalloc helpers, use them - time helper updates for Y2038 issues. We're almost ready to remove current_fs_time() but that awaits a btrfs merge. - add tracepoints to DAX * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits) drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4 selftests/vm: add a test for virtual address range mapping dax: add tracepoint to dax_insert_mapping() dax: add tracepoint to dax_writeback_one() dax: add tracepoints to dax_writeback_mapping_range() dax: add tracepoints to dax_load_hole() dax: add tracepoints to dax_pfn_mkwrite() dax: add tracepoints to dax_iomap_pte_fault() mtd: nand: nandsim: convert to memalloc_noreclaim_*() treewide: convert PF_MEMALLOC manipulations to new helpers mm: introduce memalloc_noreclaim_{save,restore} mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required mm/huge_memory.c: use zap_deposited_table() more time: delete CURRENT_TIME_SEC and CURRENT_TIME gfs2: replace CURRENT_TIME with current_time apparmorfs: replace CURRENT_TIME with current_time() lustre: replace CURRENT_TIME macro fs: ubifs: replace CURRENT_TIME_SEC with current_time fs: ufs: use ktime_get_real_ts64() for birthtime ...
2017-05-08treewide: use kv[mz]alloc* rather than opencoded variantsMichal Hocko
There are many code paths opencoding kvmalloc. Let's use the helper instead. The main difference to kvmalloc is that those users are usually not considering all the aspects of the memory allocator. E.g. allocation requests <= 32kB (with 4kB pages) are basically never failing and invoke OOM killer to satisfy the allocation. This sounds too disruptive for something that has a reasonable fallback - the vmalloc. On the other hand those requests might fallback to vmalloc even when the memory allocator would succeed after several more reclaim/compaction attempts previously. There is no guarantee something like that happens though. This patch converts many of those places to kv[mz]alloc* helpers because they are more conservative. Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390 Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim Acked-by: David Sterba <dsterba@suse.com> # btrfs Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4 Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5 Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Santosh Raspatur <santosh@chelsio.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Yishai Hadas <yishaih@mellanox.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: "Yan, Zheng" <zyan@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - HYP mode stub supports kexec/kdump on 32-bit - improved PMU support - virtual interrupt controller performance improvements - support for userspace virtual interrupt controller (slower, but necessary for KVM on the weird Broadcom SoCs used by the Raspberry Pi 3) MIPS: - basic support for hardware virtualization (ImgTec P5600/P6600/I6400 and Cavium Octeon III) PPC: - in-kernel acceleration for VFIO s390: - support for guests without storage keys - adapter interruption suppression x86: - usual range of nVMX improvements, notably nested EPT support for accessed and dirty bits - emulation of CPL3 CPUID faulting generic: - first part of VCPU thread request API - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) kvm: nVMX: Don't validate disabled secondary controls KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick Revert "KVM: Support vCPU-based gfn->hva cache" tools/kvm: fix top level makefile KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING KVM: Documentation: remove VM mmap documentation kvm: nVMX: Remove superfluous VMX instruction fault checks KVM: x86: fix emulation of RSM and IRET instructions KVM: mark requests that need synchronization KVM: return if kvm_vcpu_wake_up() did wake up the VCPU KVM: add explicit barrier to kvm_vcpu_kick KVM: perform a wake_up in kvm_make_all_cpus_request KVM: mark requests that do not need a wakeup KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up KVM: x86: always use kvm_make_request instead of set_bit KVM: add kvm_{test,clear}_request to replace {test,clear}_bit s390: kvm: Cpu model support for msa6, msa7 and msa8 KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK kvm: better MWAIT emulation for guests KVM: x86: virtualize cpuid faulting ...
2017-05-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - three merges for KVM/s390 with changes for vfio-ccw and cpacf. The patches are included in the KVM tree as well, let git sort it out. - add the new 'trng' random number generator - provide the secure key verification API for the pkey interface - introduce the z13 cpu counters to perf - add a new system call to set up the guarded storage facility - simplify TASK_SIZE and arch_get_unmapped_area - export the raw STSI data related to CPU topology to user space - ... and the usual churn of bug-fixes and cleanups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (74 commits) s390/crypt: use the correct module alias for paes_s390. s390/cpacf: Introduce kma instruction s390/cpacf: query instructions use unique parameters for compatibility with KMA s390/trng: Introduce s390 TRNG device driver. s390/crypto: Provide s390 specific arch random functionality. s390/crypto: Add new subfunctions to the cpacf PRNO function. s390/crypto: Renaming PPNO to PRNO. s390/pageattr: avoid unnecessary page table splitting s390/mm: simplify arch_get_unmapped_area[_topdown] s390/mm: make TASK_SIZE independent from the number of page table levels s390/gs: add regset for the guarded storage broadcast control block s390/kvm: Add use_cmma field to mm_context_t s390/kvm: Add PGSTE manipulation functions vfio: ccw: improve error handling for vfio_ccw_mdev_remove vfio: ccw: remove unnecessary NULL checks of a pointer s390/spinlock: remove compare and delay instruction s390/spinlock: use atomic primitives for spinlocks s390/cpumf: simplify detection of guest samples s390/pci: remove forward declaration s390/pci: increase the PCI_NR_FUNCTIONS default ...
2017-04-27KVM: add kvm_{test,clear}_request to replace {test,clear}_bitRadim Krčmář
Users were expected to use kvm_check_request() for testing and clearing, but request have expanded their use since then and some users want to only test or do a faster clear. Make sure that requests are not directly accessed with bit operations. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-26s390: kvm: Cpu model support for msa6, msa7 and msa8Jason J. Herne
msa6 and msa7 require no changes. msa8 adds kma instruction and feature area. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-26s390/crypto: Renaming PPNO to PRNO.Harald Freudenberger
The PPNO (Perform Pseudorandom Number Operation) instruction has been renamed to PRNO (Perform Random Number Operation). To avoid confusion and conflicts with future extensions with this instruction (like e.g. provide a true random number generator) this patch renames all occurences in cpacf.h and adjusts the only exploiter code which is the prng device driver and one line in the s390 kvm feature check. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-25s390/mm: make TASK_SIZE independent from the number of page table levelsMartin Schwidefsky
The TASK_SIZE for a process should be maximum possible size of the address space, 2GB for a 31-bit process and 8PB for a 64-bit process. The number of page table levels required for a given memory layout is a consequence of the mapped memory areas and their location. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-21KVM: s390: Support keyless subset guest modeFarhan Ali
If the KSS facility is available on the machine, we also make it available for our KVM guests. The KSS facility bypasses storage key management as long as the guest does not issue a related instruction. When that happens, the control is returned to the host, which has to turn off KSS for a guest vcpu before retrying the instruction. Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-12KVM: s390: fix stale machine check data for guarded storageChristian Borntraeger
When delivering a machine check the CPU state is "loaded", which means that some registers are already in the host registers. Before writing the register content into the machine check save area, we must make sure that we save the content of the registers into the data structures that are used for delivering the machine check. We already do the right thing for access, vector/floating point registers, let's do the same for guarded storage. Fixes: 4e0b1ab72b8a ("KVM: s390: gs support for kvm guests") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-12KVM: s390: Fix sdnxo setting for nested guestsChristian Borntraeger
If the guest does not use the host register management, but it uses the sdnx area, we must fill in a proper sdnxo value (address of sdnx and the sdnxc). Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-07KVM: s390: introduce AIS capabilityYi Min Zhao
Introduce a cap to enable AIS facility bit, and add documentation for this capability. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-06KVM: s390: introduce adapter interrupt inject functionYi Min Zhao
Inject adapter interrupts on a specified adapter which allows to retrieve the adapter flags, e.g. if the adapter is subject to AIS facility or not. And add documentation for this interface. For adapters subject to AIS, handle the airq injection suppression for a given ISC according to the interruption mode: - before injection, if NO-Interruptions Mode, just return 0 and suppress, otherwise, allow the injection. - after injection, if SINGLE-Interruption Mode, change it to NO-Interruptions Mode to suppress the following interrupts. Besides, add tracepoint for suppressed airq and AIS mode transitions. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-06KVM: s390: introduce ais mode modify functionFei Li
Provide an interface for userspace to modify AIS (adapter-interruption-suppression) mode state, and add documentation for the interface. Allowed target modes are ALL-Interruptions mode and SINGLE-Interruption mode. We introduce the 'simm' and 'nimm' fields in kvm_s390_float_interrupt to store interruption modes for each ISC. Each bit in 'simm' and 'nimm' targets to one ISC, and collaboratively indicate three modes: ALL-Interruptions, SINGLE-Interruption and NO-Interruptions. This interface can initiate most transitions between the states; transition from SINGLE-Interruption to NO-Interruptions via adapter interrupt injection will be introduced in a following patch. The meaningful combinations are as follows: interruption mode | simm bit | nimm bit ------------------|----------|---------- ALL | 0 | 0 SINGLE | 1 | 0 NO | 1 | 1 Besides, add tracepoint to track AIS mode transitions. Co-Authored-By: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-06KVM: s390: interface for suppressible I/O adaptersFei Li
In order to properly implement adapter-interruption suppression, we need a way for userspace to specify which adapters are subject to suppression. Let's convert the existing (and unused) 'pad' field into a 'flags' field and define a flag value for suppressible adapters. Besides, add documentation for the interface. Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-03KVM: s390: remove change-recording override supportHeiko Carstens
Change-recording override (CO) was never implemented in any machine. According to the architecture it is unpredictable if a translation-specification exception will be recognized if the bit is set and EDAT1 does not apply. Therefore the easiest solution is to simply ignore the bit. This also fixes commit cd1836f583d7 ("KVM: s390: instruction-execution-protection support"). A guest may enable instruction-execution-protection (IEP) but not EDAT1. In such a case the guest_translate() function (arch/s390/kvm/gaccess.c) will report a specification exception on pages that have the IEP bit set while it should not. It might make sense to add full IEP support to guest_translate() and the GACC_IFETCH case. However, as far as I can tell the GACC_IFETCH case is currently only used after an instruction was executed in order to fetch the failing instruction. So there is no additional problem *currently*. Fixes: cd1836f583d7 ("KVM: s390: instruction-execution-protection support") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-03-22KVM: s390: gs support for kvm guestsFan Zhang
This patch adds guarded storage support for KVM guest. We need to setup the necessary control blocks, the kvm_run structure for the new registers, the necessary wrappers for VSIE, as well as the machine check save areas. GS is enabled lazily and the register saving and reloading is done in KVM code. As this feature adds new content for migration, we provide a new capability for enablement (KVM_CAP_S390_GS). Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-03-22Merge remote-tracking branch 's390/guarded-storage' into kvms390/nextChristian Borntraeger
2017-03-22s390: add a system call for guarded storageMartin Schwidefsky
This adds a new system call to enable the use of guarded storage for user space processes. The system call takes two arguments, a command and pointer to a guarded storage control block: s390_guarded_storage(int command, struct gs_cb *gs_cb); The second argument is relevant only for the GS_SET_BC_CB command. The commands in detail: 0 - GS_ENABLE Enable the guarded storage facility for the current task. The initial content of the guarded storage control block will be all zeros. After the enablement the user space code can use load-guarded-storage-controls instruction (LGSC) to load an arbitrary control block. While a task is enabled the kernel will save and restore the current content of the guarded storage registers on context switch. 1 - GS_DISABLE Disables the use of the guarded storage facility for the current task. The kernel will cease to save and restore the content of the guarded storage registers, the task specific content of these registers is lost. 2 - GS_SET_BC_CB Set a broadcast guarded storage control block. This is called per thread and stores a specific guarded storage control block in the task struct of the current task. This control block will be used for the broadcast event GS_BROADCAST. 3 - GS_CLEAR_BC_CB Clears the broadcast guarded storage control block. The guarded- storage control block is removed from the task struct that was established by GS_SET_BC_CB. 4 - GS_BROADCAST Sends a broadcast to all thread siblings of the current task. Every sibling that has established a broadcast guarded storage control block will load this control block and will be enabled for guarded storage. The broadcast guarded storage control block is used up, a second broadcast without a refresh of the stored control block with GS_SET_BC_CB will not have any effect. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-03-21KVM: s390: Use defines for intercept codeFarhan Ali
Let's use #define values for better readability. Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-03-16KVM: s390: use defines for execution controlsDavid Hildenbrand
Let's replace the bitmasks by defines. Reconstructed from code, comments and commit messages. Tried to keep the defines short and map them to feature names. In case they don't completely map to features, keep them in the stye of ICTL defines. This effectively drops all "U" from the existing numbers. I think this should be fine (as similarly done for e.g. ICTL defines). I am not 100% sure about the ECA_MVPGI and ECA_PROTEXCI bits as they are always used in pairs. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170313104828.13362-1-david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [some renames, add one missing place]
2017-03-16KVM: s390: Handle sthyi also for instruction interceptChristian Borntraeger
Right now we handle the STHYI only via the operation exception intercept (illegal instruction). If hardware ever decides to provide an instruction intercept for STHYI, we should handle that as well. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-03-16KVM: s390: log runtime instrumentation enablementChristian Borntraeger
We handle runtime instrumentation enablement either lazy or via sync_regs on migration. Make sure to add a debug log entry for that per CPU on the first occurrence. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-03-02sched/headers: Prepare to remove the <linux/mm_types.h> dependency from ↵Ingo Molnar
<linux/sched.h> Update code that relied on sched.h including various MM types for them. This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "4.11 is going to be a relatively large release for KVM, with a little over 200 commits and noteworthy changes for most architectures. ARM: - GICv3 save/restore - cache flushing fixes - working MSI injection for GICv3 ITS - physical timer emulation MIPS: - various improvements under the hood - support for SMP guests - a large rewrite of MMU emulation. KVM MIPS can now use MMU notifiers to support copy-on-write, KSM, idle page tracking, swapping, ballooning and everything else. KVM_CAP_READONLY_MEM is also supported, so that writes to some memory regions can be treated as MMIO. The new MMU also paves the way for hardware virtualization support. PPC: - support for POWER9 using the radix-tree MMU for host and guest - resizable hashed page table - bugfixes. s390: - expose more features to the guest - more SIMD extensions - instruction execution protection - ESOP2 x86: - improved hashing in the MMU - faster PageLRU tracking for Intel CPUs without EPT A/D bits - some refactoring of nested VMX entry/exit code, preparing for live migration support of nested hypervisors - expose yet another AVX512 CPUID bit - host-to-guest PTP support - refactoring of interrupt injection, with some optimizations thrown in and some duct tape removed. - remove lazy FPU handling - optimizations of user-mode exits - optimizations of vcpu_is_preempted() for KVM guests generic: - alternative signaling mechanism that doesn't pound on tsk->sighand->siglock" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (195 commits) x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 x86/paravirt: Change vcp_is_preempted() arg type to long KVM: VMX: use correct vmcs_read/write for guest segment selector/base x86/kvm/vmx: Defer TR reload after VM exit x86/asm/64: Drop __cacheline_aligned from struct x86_hw_tss x86/kvm/vmx: Simplify segment_base() x86/kvm/vmx: Get rid of segment_base() on 64-bit kernels x86/kvm/vmx: Don't fetch the TSS base from the GDT x86/asm: Define the kernel TSS limit in a macro kvm: fix page struct leak in handle_vmon KVM: PPC: Book3S HV: Disable HPT resizing on POWER9 for now KVM: Return an error code only as a constant in kvm_get_dirty_log() KVM: Return an error code only as a constant in kvm_get_dirty_log_protect() KVM: Return directly after a failed copy_from_user() in kvm_vm_compat_ioctl() KVM: x86: remove code for lazy FPU handling KVM: race-free exit from KVM_RUN without POSIX signals KVM: PPC: Book3S HV: Turn "KVM guest htab" message into a debug message KVM: PPC: Book3S PR: Ratelimit copy data failure error messages KVM: Support vCPU-based gfn->hva cache KVM: use separate generations for each address space ...
2017-02-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - New entropy generation for the pseudo random number generator. - Early boot printk output via sclp to help debug crashes on boot. This needs to be enabled with a kernel parameter. - Add proper no-execute support with a bit in the page table entry. - Bug fixes and cleanups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (65 commits) s390/syscall: fix single stepped system calls s390/zcrypt: make ap_bus explicitly non-modular s390/zcrypt: Removed unneeded debug feature directory creation. s390: add missing "do {} while (0)" loop constructs to multiline macros s390/mm: add cond_resched call to kernel page table dumper s390: get rid of MACHINE_HAS_PFMF and MACHINE_HAS_HPAGE s390/mm: make memory_block_size_bytes available for !MEMORY_HOTPLUG s390: replace ACCESS_ONCE with READ_ONCE s390: Audit and remove any remaining unnecessary uses of module.h s390: mm: Audit and remove any unnecessary uses of module.h s390: kernel: Audit and remove any unnecessary uses of module.h s390/kdump: Use "LINUX" ELF note name instead of "CORE" s390: add no-execute support s390: report new vector facilities s390: use correct input data address for setup_randomness s390/sclp: get rid of common response code handling s390/sclp: don't add new lines to each printed string s390/sclp: make early sclp code readable s390/sclp: disable early sclp code as soon as the base sclp driver is active s390/sclp: move early printk code to drivers ...
2017-02-17KVM: race-free exit from KVM_RUN without POSIX signalsPaolo Bonzini
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick" a VCPU out of KVM_RUN through a POSIX signal. A signal is attached to a dummy signal handler; by blocking the signal outside KVM_RUN and unblocking it inside, this possible race is closed: VCPU thread service thread -------------------------------------------------------------- check flag set flag raise signal (signal handler does nothing) KVM_RUN However, one issue with KVM_SET_SIGNAL_MASK is that it has to take tsk->sighand->siglock on every KVM_RUN. This lock is often on a remote NUMA node, because it is on the node of a thread's creator. Taking this lock can be very expensive if there are many userspace exits (as is the case for SMP Windows VMs without Hyper-V reference time counter). As an alternative, we can put the flag directly in kvm_run so that KVM can see it: VCPU thread service thread -------------------------------------------------------------- raise signal signal handler set run->immediate_exit KVM_RUN check run->immediate_exit Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-17s390: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as implicit use of ptrace.h and string.h headers. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-06KVM: s390: detect some program check loopsChristian Borntraeger
Sometimes (e.g. early boot) a guest is broken in such ways that it loops 100% delivering operation exceptions (illegal operation) but the pgm new PSW is not set properly. This will result in code being read from address zero, which usually contains another illegal op. Let's detect this case and return to userspace. Instead of only detecting this for address zero apply a heuristic that will work for any program check new psw. We do not want guest problem state to be able to trigger a guest panic, e.g. by faulting on an address that is the same as the program check new PSW, so we check for the problem state bit being off. With proper handling in userspace we a: get rid of CPU consumption of such broken guests b: keep the program old PSW. This allows to find out the original illegal operation - making debugging such early boot issues much easier than with single stepping Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-02-06KVM: s390: Disable dirty log retrieval for UCONTROL guestsJanosch Frank
User controlled KVM guests do not support the dirty log, as they have no single gmap that we can check for changes. As they have no single gmap, kvm->arch.gmap is NULL and all further referencing to it for dirty checking will result in a NULL dereference. Let's return -EINVAL if a caller tries to sync dirty logs for a UCONTROL guest. Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.") Cc: <stable@vger.kernel.org> # 3.16+ Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-01-30KVM: s390: Add debug logging to basic cpu model interfaceChristian Borntraeger
Let's log something for changes in facilities, cpuid and ibc now that we have a cpu model in QEMU. All of these calls are pretty seldom, so we will not spill the log, the they will help to understand pontential guest issues, for example if some instructions are fenced off. As the s390 debug feature has a limited amount of parameters and strings must not go away we limit the facility printing to 3 double words, instead of building that list dynamically. This should be enough for several years. If we ever exceed 3 double words then the logging will be incomplete but no functional impact will happen. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>