diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2016-03-16 09:55:35 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-03-16 09:55:35 -0700 |
commit | 10dc3747661bea9215417b659449bb7b8ed3df2c (patch) | |
tree | d943974b4941203a7db2fabe4896852cf0f16bc4 /Documentation | |
parent | 047486d8e7c2a7e8d75b068b69cb67b47364f5d4 (diff) | |
parent | f958ee745f70b60d0e41927cab2c073104bc70c2 (diff) |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"One of the largest releases for KVM... Hardly any generic
changes, but lots of architecture-specific updates.
ARM:
- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
- PMU support for guests
- 32bit world switch rewritten in C
- various optimizations to the vgic save/restore code.
PPC:
- enabled KVM-VFIO integration ("VFIO device")
- optimizations to speed up IPIs between vcpus
- in-kernel handling of IOMMU hypercalls
- support for dynamic DMA windows (DDW).
s390:
- provide the floating point registers via sync regs;
- separated instruction vs. data accesses
- dirty log improvements for huge guests
- bugfixes and documentation improvements.
x86:
- Hyper-V VMBus hypercall userspace exit
- alternative implementation of lowest-priority interrupts using
vector hashing (for better VT-d posted interrupt support)
- fixed guest debugging with nested virtualizations
- improved interrupt tracking in the in-kernel IOAPIC
- generic infrastructure for tracking writes to guest
memory - currently its only use is to speedup the legacy shadow
paging (pre-EPT) case, but in the future it will be used for
virtual GPUs as well
- much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits)
KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
KVM: x86: disable MPX if host did not enable MPX XSAVE features
arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
arm64: KVM: vgic-v3: Reset LRs at boot time
arm64: KVM: vgic-v3: Do not save an LR known to be empty
arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
arm64: KVM: vgic-v3: Avoid accessing ICH registers
KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
KVM: arm/arm64: vgic-v2: Reset LRs at boot time
KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
KVM: s390: allocate only one DMA page per VM
KVM: s390: enable STFLE interpretation only if enabled for the guest
KVM: s390: wake up when the VCPU cpu timer expires
KVM: s390: step the VCPU timer while in enabled wait
KVM: s390: protect VCPU cpu timer with a seqcount
KVM: s390: step VCPU cpu timer during kvm_run ioctl
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/virtual/kvm/api.txt | 99 | ||||
-rw-r--r-- | Documentation/virtual/kvm/devices/s390_flic.txt | 2 | ||||
-rw-r--r-- | Documentation/virtual/kvm/devices/vcpu.txt | 33 | ||||
-rw-r--r-- | Documentation/virtual/kvm/devices/vm.txt | 52 | ||||
-rw-r--r-- | Documentation/virtual/kvm/mmu.txt | 6 |
5 files changed, 185 insertions, 7 deletions
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 07e4cdf02407..4d0542c5206b 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2507,8 +2507,9 @@ struct kvm_create_device { 4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR -Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device -Type: device ioctl, vm ioctl +Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device, + KVM_CAP_VCPU_ATTRIBUTES for vcpu device +Type: device ioctl, vm ioctl, vcpu ioctl Parameters: struct kvm_device_attr Returns: 0 on success, -1 on error Errors: @@ -2533,8 +2534,9 @@ struct kvm_device_attr { 4.81 KVM_HAS_DEVICE_ATTR -Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device -Type: device ioctl, vm ioctl +Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device, + KVM_CAP_VCPU_ATTRIBUTES for vcpu device +Type: device ioctl, vm ioctl, vcpu ioctl Parameters: struct kvm_device_attr Returns: 0 on success, -1 on error Errors: @@ -2577,6 +2579,8 @@ Possible features: Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only). - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU. Depends on KVM_CAP_ARM_PSCI_0_2. + - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU. + Depends on KVM_CAP_ARM_PMU_V3. 4.83 KVM_ARM_PREFERRED_TARGET @@ -3035,6 +3039,87 @@ Returns: 0 on success, -1 on error Queues an SMI on the thread's vcpu. +4.97 KVM_CAP_PPC_MULTITCE + +Capability: KVM_CAP_PPC_MULTITCE +Architectures: ppc +Type: vm + +This capability means the kernel is capable of handling hypercalls +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user +space. This significantly accelerates DMA operations for PPC KVM guests. +User space should expect that its handlers for these hypercalls +are not going to be called if user space previously registered LIOBN +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls). + +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest, +user space might have to advertise it for the guest. For example, +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is +present in the "ibm,hypertas-functions" device-tree property. + +The hypercalls mentioned above may or may not be processed successfully +in the kernel based fast path. If they can not be handled by the kernel, +they will get passed on to user space. So user space still has to have +an implementation for these despite the in kernel acceleration. + +This capability is always enabled. + +4.98 KVM_CREATE_SPAPR_TCE_64 + +Capability: KVM_CAP_SPAPR_TCE_64 +Architectures: powerpc +Type: vm ioctl +Parameters: struct kvm_create_spapr_tce_64 (in) +Returns: file descriptor for manipulating the created TCE table + +This is an extension for KVM_CAP_SPAPR_TCE which only supports 32bit +windows, described in 4.62 KVM_CREATE_SPAPR_TCE + +This capability uses extended struct in ioctl interface: + +/* for KVM_CAP_SPAPR_TCE_64 */ +struct kvm_create_spapr_tce_64 { + __u64 liobn; + __u32 page_shift; + __u32 flags; + __u64 offset; /* in pages */ + __u64 size; /* in pages */ +}; + +The aim of extension is to support an additional bigger DMA window with +a variable page size. +KVM_CREATE_SPAPR_TCE_64 receives a 64bit window size, an IOMMU page shift and +a bus offset of the corresponding DMA window, @size and @offset are numbers +of IOMMU pages. + +@flags are not used at the moment. + +The rest of functionality is identical to KVM_CREATE_SPAPR_TCE. + +4.98 KVM_REINJECT_CONTROL + +Capability: KVM_CAP_REINJECT_CONTROL +Architectures: x86 +Type: vm ioctl +Parameters: struct kvm_reinject_control (in) +Returns: 0 on success, + -EFAULT if struct kvm_reinject_control cannot be read, + -ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier. + +i8254 (PIT) has two modes, reinject and !reinject. The default is reinject, +where KVM queues elapsed i8254 ticks and monitors completion of interrupt from +vector(s) that i8254 injects. Reinject mode dequeues a tick and injects its +interrupt whenever there isn't a pending interrupt from i8254. +!reinject mode injects an interrupt as soon as a tick arrives. + +struct kvm_reinject_control { + __u8 pit_reinject; + __u8 reserved[31]; +}; + +pit_reinject = 0 (!reinject mode) is recommended, unless running an old +operating system that uses the PIT for timing (e.g. Linux 2.4.x). + 5. The kvm_run structure ------------------------ @@ -3339,6 +3424,7 @@ EOI was received. struct kvm_hyperv_exit { #define KVM_EXIT_HYPERV_SYNIC 1 +#define KVM_EXIT_HYPERV_HCALL 2 __u32 type; union { struct { @@ -3347,6 +3433,11 @@ EOI was received. __u64 evt_page; __u64 msg_page; } synic; + struct { + __u64 input; + __u64 result; + __u64 params[2]; + } hcall; } u; }; /* KVM_EXIT_HYPERV */ diff --git a/Documentation/virtual/kvm/devices/s390_flic.txt b/Documentation/virtual/kvm/devices/s390_flic.txt index d1ad9d5cae46..e3e314cb83e8 100644 --- a/Documentation/virtual/kvm/devices/s390_flic.txt +++ b/Documentation/virtual/kvm/devices/s390_flic.txt @@ -88,6 +88,8 @@ struct kvm_s390_io_adapter_req { perform a gmap translation for the guest address provided in addr, pin a userspace page for the translated address and add it to the list of mappings + Note: A new mapping will be created unconditionally; therefore, + the calling code should avoid making duplicate mappings. KVM_S390_IO_ADAPTER_UNMAP release a userspace page for the translated address specified in addr diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt new file mode 100644 index 000000000000..c04165868faf --- /dev/null +++ b/Documentation/virtual/kvm/devices/vcpu.txt @@ -0,0 +1,33 @@ +Generic vcpu interface +==================================== + +The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR, +KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct +kvm_device_attr as other devices, but targets VCPU-wide settings and controls. + +The groups and attributes per virtual cpu, if any, are architecture specific. + +1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL +Architectures: ARM64 + +1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ +Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a + pointer to an int +Returns: -EBUSY: The PMU overflow interrupt is already set + -ENXIO: The overflow interrupt not set when attempting to get it + -ENODEV: PMUv3 not supported + -EINVAL: Invalid PMU overflow interrupt number supplied + +A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt +number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt +type must be same for each vcpu. As a PPI, the interrupt number is the same for +all vcpus, while as an SPI it must be a separate number per vcpu. + +1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT +Parameters: no additional parameter in kvm_device_attr.addr +Returns: -ENODEV: PMUv3 not supported + -ENXIO: PMUv3 not properly configured as required prior to calling this + attribute + -EBUSY: PMUv3 already initialized + +Request the initialization of the PMUv3. diff --git a/Documentation/virtual/kvm/devices/vm.txt b/Documentation/virtual/kvm/devices/vm.txt index f083a168eb35..a9ea8774a45f 100644 --- a/Documentation/virtual/kvm/devices/vm.txt +++ b/Documentation/virtual/kvm/devices/vm.txt @@ -84,3 +84,55 @@ Returns: -EBUSY in case 1 or more vcpus are already activated (only in write -EFAULT if the given address is not accessible from kernel space -ENOMEM if not enough memory is available to process the ioctl 0 in case of success + +3. GROUP: KVM_S390_VM_TOD +Architectures: s390 + +3.1. ATTRIBUTE: KVM_S390_VM_TOD_HIGH + +Allows user space to set/get the TOD clock extension (u8). + +Parameters: address of a buffer in user space to store the data (u8) to +Returns: -EFAULT if the given address is not accessible from kernel space + -EINVAL if setting the TOD clock extension to != 0 is not supported + +3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW + +Allows user space to set/get bits 0-63 of the TOD clock register as defined in +the POP (u64). + +Parameters: address of a buffer in user space to store the data (u64) to +Returns: -EFAULT if the given address is not accessible from kernel space + +4. GROUP: KVM_S390_VM_CRYPTO +Architectures: s390 + +4.1. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_AES_KW (w/o) + +Allows user space to enable aes key wrapping, including generating a new +wrapping key. + +Parameters: none +Returns: 0 + +4.2. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_DEA_KW (w/o) + +Allows user space to enable dea key wrapping, including generating a new +wrapping key. + +Parameters: none +Returns: 0 + +4.3. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_AES_KW (w/o) + +Allows user space to disable aes key wrapping, clearing the wrapping key. + +Parameters: none +Returns: 0 + +4.4. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_DEA_KW (w/o) + +Allows user space to disable dea key wrapping, clearing the wrapping key. + +Parameters: none +Returns: 0 diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt index c81731096a43..481b6a9c25d5 100644 --- a/Documentation/virtual/kvm/mmu.txt +++ b/Documentation/virtual/kvm/mmu.txt @@ -392,11 +392,11 @@ To instantiate a large spte, four constraints must be satisfied: write-protected pages - the guest page must be wholly contained by a single memory slot -To check the last two conditions, the mmu maintains a ->write_count set of +To check the last two conditions, the mmu maintains a ->disallow_lpage set of arrays for each memory slot and large page size. Every write protected page -causes its write_count to be incremented, thus preventing instantiation of +causes its disallow_lpage to be incremented, thus preventing instantiation of a large spte. The frames at the end of an unaligned memory slot have -artificially inflated ->write_counts so they can never be instantiated. +artificially inflated ->disallow_lpages so they can never be instantiated. Zapping all pages (page generation count) ========================================= |