summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2016-06-23MIPS: KVM: Combine entry trace events into classJames Hogan
Combine the kvm_enter, kvm_reenter and kvm_out trace events into a single kvm_transition event class to reduce duplication and bloat. Suggested-by: Steven Rostedt <rostedt@goodmis.org> Fixes: 93258604ab6d ("MIPS: KVM: Add guest mode switch trace events") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-23kvm: x86: use getboottime64Arnd Bergmann
KVM reads the current boottime value as a struct timespec in order to calculate the guest wallclock time, resulting in an overflow in 2038 on 32-bit systems. The data then gets passed as an unsigned 32-bit number to the guest, and that in turn overflows in 2106. We cannot do much about the second overflow, which affects both 32-bit and 64-bit hosts, but we can ensure that they both behave the same way and don't overflow until 2106, by using getboottime64() to read a timespec64 value. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-23KVM: VMX: enable guest access to LMCE related MSRsAshok Raj
On Intel platforms, this patch adds LMCE to KVM MCE supported capabilities and handles guest access to LMCE related MSRs. Signed-off-by: Ashok Raj <ashok.raj@intel.com> [Haozhong: macro KVM_MCE_CAP_SUPPORTED => variable kvm_mce_cap_supported Only enable LMCE on Intel platform Check MSR_IA32_FEATURE_CONTROL when handling guest access to MSR_IA32_MCG_EXT_CTL] Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-23KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROLHaozhong Zhang
KVM currently does not check the value written to guest MSR_IA32_FEATURE_CONTROL, though bits corresponding to disabled features may be set. This patch makes KVM to validate individual bits written to guest MSR_IA32_FEATURE_CONTROL according to enabled features. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-23KVM: VMX: move msr_ia32_feature_control to vcpu_vmxHaozhong Zhang
msr_ia32_feature_control will be used for LMCE and not depend only on nested anymore, so move it from struct nested_vmx to struct vcpu_vmx. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-21KVM: s390: vsie: add module parameter "nested"David Hildenbrand
Let's be careful first and allow nested virtualization only if enabled by the system administrator. In addition, user space still has to explicitly enable it via SCLP features for it to work. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: add indication for future featuresDavid Hildenbrand
We have certain SIE features that we cannot support for now. Let's add these features, so user space can directly prepare to enable them, so we don't have to update yet another component. In addition, add a comment block, telling why it is for now not possible to forward/enable these features. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: correctly set and handle guest TODDavid Hildenbrand
Guest 2 sets up the epoch of guest 3 from his point of view. Therefore, we have to add the guest 2 epoch to the guest 3 epoch. We also have to take care of guest 2 epoch changes on STP syncs. This will work just fine by also updating the guest 3 epoch when a vsie_block has been set for a VCPU. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: speed up VCPU external callsDavid Hildenbrand
Whenever a SIGP external call is injected via the SIGP external call interpretation facility, the VCPU is not kicked. When a VCPU is currently in the VSIE, the external call might not be processed immediately. Therefore we have to provoke partial execution exceptions, which leads to a kick of the VCPU and therefore also kick out of VSIE. This is done by simulating the WAIT state. This bit has no other side effects. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: don't use CPUSTAT_WAIT to detect if a VCPU is idleDavid Hildenbrand
As we want to make use of CPUSTAT_WAIT also when a VCPU is not idle but to force interception of external calls, let's check in the bitmap instead. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: speed up VCPU irq delivery when handling vsieDavid Hildenbrand
Whenever we want to wake up a VCPU (e.g. when injecting an IRQ), we have to kick it out of vsie, so the request will be handled faster. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: try to refault after a reported fault to g2David Hildenbrand
We can avoid one unneeded SIE entry after we reported a fault to g2. Theoretically, g2 resolves the fault and we can create the shadow mapping directly, instead of failing again when entering the SIE. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support IBS interpretationDavid Hildenbrand
We can easily enable ibs for guest 2, so he can use it for guest 3. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support conditional-external-interceptionDavid Hildenbrand
We can easily enable cei for guest 2, so he can use it for guest 3. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support intervention-bypassDavid Hildenbrand
We can easily enable intervention bypass for guest 2, so it can use it for guest 3. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support guest-storage-limit-suppressionDavid Hildenbrand
We can easily forward guest-storage-limit-suppression if available. One thing to care about is keeping the prefix properly mapped when gsls in toggled on/off or the mso changes in between. Therefore we better remap the prefix on any mso changes just like we already do with the prefix. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support guest-PER-enhancementDavid Hildenbrand
We can easily forward the guest-PER-enhancement facility to guest 2 if available. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support shared IPTE-interlock facilityDavid Hildenbrand
As we forward the whole SCA provided by guest 2, we can directly forward SIIF if available. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support 64-bit-SCAODavid Hildenbrand
Let's provide the 64-bit-SCAO facility to guest 2, so he can set up a SCA for guest 3 that has a 64 bit address. Please note that we already require the 64 bit SCAO for our vsie implementation, in order to forward the SCA directly (by pinning the page). Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support run-time-instrumentationDavid Hildenbrand
As soon as guest 2 is allowed to use run-time-instrumentation (indicated via via STFLE), it can also enable it for guest 3. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support vectory facility (SIMD)David Hildenbrand
As soon as guest 2 is allowed to use the vector facility (indicated via STFLE), it can also enable it for guest 3. We have to take care of the sattellite block that might be used when not relying on lazy vector copying (not the case for KVM). Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support transactional executionDavid Hildenbrand
As soon as guest 2 is allowed to use transactional execution (indicated via STFLE), he can also enable it for guest 3. Active transactional execution requires also the second prefix page to be mapped. If that page cannot be mapped, a validity icpt has to be presented to the guest. We have to take care of tx being toggled on/off, otherwise we might get wrong prefix validity icpt. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support aes dea wrapping keysDavid Hildenbrand
As soon as message-security-assist extension 3 is enabled for guest 2, we have to allow key wrapping for guest 3. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support STFLE interpretationDavid Hildenbrand
Issuing STFLE is extremely rare. Instead of copying 2k on every VSIE call, let's do this lazily, when a guest 3 tries to execute STFLE. We can setup the block and retry. Unfortunately, we can't directly forward that facility list, as we only have a 31 bit address for the facility list designation. So let's use a DMA allocation for our vsie_page instead for now. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support host-protection-interruptionDavid Hildenbrand
Introduced with ESOP, therefore available for the guest if it is allowed to use ESOP. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support edat1 / edat2David Hildenbrand
If guest 2 is allowed to use edat 1 / edat 2, it can also set it up for guest 3, so let's properly check and forward the edat cpuflags. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: support setting the ibcDavid Hildenbrand
As soon as we forward an ibc to guest 2 (indicated via kvm->arch.model.ibc), he can also use it for guest 3. Let's properly round the ibc up/down, so we avoid any potential validity icpts from the underlying SIE, if it doesn't simply round the values. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: optimize gmap prefix mappingDavid Hildenbrand
In order to not always map the prefix, we have to take care of certain aspects that implicitly unmap the prefix: - Changes to the prefix address - Changes to MSO, because the HVA of the prefix is changed - Changes of the gmap shadow (e.g. unshadowed, asce or edat changes) By properly handling these cases, we can stop remapping the prefix when there is no reason to do so. This also allows us now to not acquire any gmap shadow locks when rerunning the vsie and still having a valid gmap shadow. Please note, to detect changing gmap shadows, we have to keep the reference of the gmap shadow. The address of a gmap shadow does otherwise not reliably indicate if the gmap shadow has changed (the memory chunk could get reused). Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21KVM: s390: vsie: initial support for nested virtualizationDavid Hildenbrand
This patch adds basic support for nested virtualization on s390x, called VSIE (virtual SIE) and allows it to be used by the guest if the necessary facilities are supported by the hardware and enabled for the guest. In order to make this work, we have to shadow the sie control block provided by guest 2. In order to gain some performance, we have to reuse the same shadow blocks as good as possible. For now, we allow as many shadow blocks as we have VCPUs (that way, every VCPU can run the VSIE concurrently). We have to watch out for the prefix getting unmapped out of our shadow gmap and properly get the VCPU out of VSIE in that case, to fault the prefix pages back in. We use the PROG_REQUEST bit for that purpose. This patch is based on an initial prototype by Tobias Elpelt. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390: introduce page_to_virt() and pfn_to_virt()David Hildenbrand
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20KVM: s390: backup the currently enabled gmap when scheduled outDavid Hildenbrand
Nested virtualization will have to enable own gmaps. Current code would enable the wrong gmap whenever scheduled out and back in, therefore resulting in the wrong gmap being enabled. This patch reenables the last enabled gmap, therefore avoiding having to touch vcpu->arch.gmap when enabling a different gmap. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20KVM: s390: fast path for shadow gmaps in gmap notifierDavid Hildenbrand
The default kvm gmap notifier doesn't have to handle shadow gmaps. So let's just directly exit in case we get notified about one. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: don't fault everything in read-write in gmap_pte_op_fixup()David Hildenbrand
Let's not fault in everything in read-write but limit it to read-only where possible. When restricting access rights, we already have the required protection level in our hands. When reading from guest 2 storage (gmap_read_table), it is obviously PROT_READ. When shadowing a pte, the required protection level is given via the guest 2 provided pte. Based on an initial patch by Martin Schwidefsky. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: allow to check if a gmap shadow is validDavid Hildenbrand
It will be very helpful to have a mechanism to check without any locks if a given gmap shadow is still valid and matches the given properties. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: remember the int code for the last gmap faultDavid Hildenbrand
For nested virtualization, we want to know if we are handling a protection exception, because these can directly be forwarded to the guest without additional checks. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: limit number of real-space gmap shadowsDavid Hildenbrand
We have no known user of real-space designation and only support it to be architecture compliant. Gmap shadows with real-space designation are never unshadowed automatically, as there is nothing to protect for the top level table. So let's simply limit the number of such shadows to one by removing existing ones on creation of another one. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support real-space for gmap shadowsDavid Hildenbrand
We can easily support real-space designation just like EDAT1 and EDAT2. So guest2 can provide for guest3 an asce with the real-space control being set. We simply have to allocate the biggest page table possible and fake all levels. There is no protection to consider. If we exceed guest memory, vsie code will inject an addressing exception (via program intercept). In the future, we could limit the fake table level to the gmap page table. As the top level page table can never go away, such gmap shadows will never get unshadowed, we'll have to come up with another way to limit the number of kept gmap shadows. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: push rte protection down to shadow pteDavid Hildenbrand
Just like we already do with ste protection, let's take rte protection into account. This way, the host pte doesn't have to be mapped writable. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support EDAT2 for gmap shadowsDavid Hildenbrand
If the guest is enabled for EDAT2, we can easily create shadows for guest2 -> guest3 provided tables that make use of EDAT2. If guest2 references a 2GB page, this memory looks consecutive for guest2, but it does not have to be so for us. Therefore we have to create fake segment and page tables. This works just like EDAT1 support, so page tables are removed when the parent table (r3t table entry) is changed. We don't hve to care about: - ACCF-Validity Control in RTTE - Access-Control Bits in RTTE - Fetch-Protection Bit in RTTE - Common-Region Bit in RTTE Just like for EDAT1, all bits might be dropped and there is no guaranteed that they are active. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support EDAT1 for gmap shadowsDavid Hildenbrand
If the guest is enabled for EDAT1, we can easily create shadows for guest2 -> guest3 provided tables that make use of EDAT1. If guest2 references a 1MB page, this memory looks consecutive for guest2, but it might not be so for us. Therefore we have to create fake page tables. We can easily add that to our existing infrastructure. The invalidation mechanism will make sure that fake page tables are removed when the parent table (sgt table entry) is changed. As EDAT1 also introduced protection on all page table levels, we have to also shadow these correctly. We don't have to care about: - ACCF-Validity Control in STE - Access-Control Bits in STE - Fetch-Protection Bit in STE - Common-Segment Bit in STE As all bits might be dropped and there is no guaranteed that they are active ("unpredictable whether the CPU uses these bits", "may be used"). Without using EDAT1 in the shadow ourselfes (STE-format control == 0), simply shadowing these bits would not be enough. They would be ignored. Please note that we are using the "fake" flag to make this look consistent with further changes (EDAT2, real-space designation support) and don't let the shadow functions handle fc=1 stes. In the future, with huge pages in the host, gmap_shadow_pgt() could simply try to map a huge host page if "fake" is set to one and indicate via return value that no lower fake tables / shadow ptes are required. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: prepare for EDAT1/EDAT2 support in gmap shadowDavid Hildenbrand
In preparation for EDAT1/EDAT2 support for gmap shadows, we have to store the requested edat level in the gmap shadow. The edat level used during shadow translation is a property of the gmap shadow. Depending on that level, the gmap shadow will look differently for the same guest tables. We have to store it internally in order to support it later. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: push ste protection down to shadow pteDavid Hildenbrand
If a guest ste is read-only, it doesn't make sense to force the ptes in as writable in the host. If the source page is read-only in the host, it won't have to be made writable. Please note that if the source page is not available, it will still be faulted in writable. This can be changed internally later on. If ste protection is removed, underlying shadow tables are also removed, therefore this change does not affect the guest. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: take ipte_lock during shadow faultsDavid Hildenbrand
Let's take the ipte_lock while working on guest 2 provided page table, just like the other gaccess functions. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: protection exceptions are corrrectly shadowedDavid Hildenbrand
As gmap shadows contains correct protection permissions, protection exceptons can directly be forwarded to guest 3. If we would encounter a protection exception while faulting, the next guest 3 run will automatically handle that for us. Keep the dat_protection logic in place, as it will be helpful later. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: take the mmap_sem in kvm_s390_shadow_fault()David Hildenbrand
Instead of doing it in the caller, let's just take the mmap_sem in kvm_s390_shadow_fault(). By taking it as read, we allow parallel faulting on shadow page tables, gmap shadow code is prepared for that. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: fix races on gmap_shadow creationDavid Hildenbrand
Before any thread is allowed to use a gmap_shadow, it has to be fully initialized. However, for invalidation to work properly, we have to register the new gmap_shadow before we protect the parent gmap table. Because locking is tricky, and we have to avoid duplicate gmaps, let's introduce an initialized field, that signalizes other threads if that gmap_shadow can already be used or if they have to retry. Let's properly return errors using ERR_PTR() instead of simply returning NULL, so a caller can properly react on the error. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: avoid races on region/segment/page table shadowingDavid Hildenbrand
We have to unlock sg->guest_table_lock in order to call gmap_protect_rmap(). If we sleep just before that call, another VCPU might pick up that shadowed page table (while it is not protected yet) and use it. In order to avoid these races, we have to introduce a third state - "origin set but still invalid" for an entry. This way, we can avoid another thread already using the entry before the table is fully protected. As soon as everything is set up, we can clear the invalid bit - if we had no race with the unshadowing code. Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: shadow pages with real guest requested protectionDavid Hildenbrand
We really want to avoid manually handling protection for nested virtualization. By shadowing pages with the protection the guest asked us for, the SIE can handle most protection-related actions for us (e.g. special handling for MVPG) and we can directly forward protection exceptions to the guest. PTEs will now always be shadowed with the correct _PAGE_PROTECT flag. Unshadowing will take care of any guest changes to the parent PTE and any host changes to the host PTE. If the host PTE doesn't have the fitting access rights or is not available, we have to fix it up. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: flush tlb of shadows in all situationsDavid Hildenbrand
For now, the tlb of shadow gmap is only flushed when the parent is removed, not when it is removed upfront. Therefore other shadow gmaps can reuse the tables without the tlb getting flushed. Fix this by simply flushing the tlb 1. Before the shadow tables are removed (analogouos to other unshadow functions) 2. When the gmap is freed and therefore the top level pages are freed. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: add kvm shadow fault functionMartin Schwidefsky
This patch introduces function kvm_s390_shadow_fault() used to resolve a fault on a shadow gmap. This function will do validity checking and build up the shadow page table hierarchy in order to fault in the requested page into the shadow page table structure. If an exception occurs while shadowing, guest 2 has to be notified about it using either an exception or a program interrupt intercept. If concurrent unshadowing occurres, this function will simply return with -EAGAIN and the caller has to retry. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>