aboutsummaryrefslogtreecommitdiff
path: root/target/i386/kvm
AgeCommit message (Collapse)Author
2022-10-01target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= ↵Ray Zhang
migration_tsc' failed New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d) ``` + #define KVM_CLOCK_VALID_FLAGS \ + (KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC) case KVM_CAP_ADJUST_CLOCK: - r = KVM_CLOCK_TSC_STABLE; + r = KVM_CLOCK_VALID_FLAGS; ``` kvm_has_adjust_clock_stable needs to handle additional flags, so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called. Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-01i386: do kvm_put_msr_feature_control() first thing when vCPU is resetVitaly Kuznetsov
kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when it is in VMX root operation. Do kvm_put_msr_feature_control() before kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also seems logical to do kvm_put_msr_feature_control() before kvm_put_nested_state() and not after it, especially when 'real' nested state is set. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-01i386: reset KVM nested state upon CPU resetVitaly Kuznetsov
Make sure env->nested_state is cleaned up when a vCPU is reset, it may be stale after an incoming migration, kvm_arch_put_registers() may end up failing or putting vCPU in a weird state. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25i386: Hyper-V Direct TLB flush hypercallVitaly Kuznetsov
Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25i386: Hyper-V Support extended GVA ranges for TLB flush hypercallsVitaly Kuznetsov
KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25i386: Hyper-V XMM fast hypercall input featureVitaly Kuznetsov
Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. KVM supports the feature since v5.14. Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to comply with KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25i386: Hyper-V Enlightened MSR bitmap featureVitaly Kuznetsov
The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURESVitaly Kuznetsov
Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently as it was only used to encode the supported eVMCS version range. In fact, there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-23target/i386: Remove LBREn bit check when access Arch LBR MSRsYang Weijiang
Live migration can happen when Arch LBR LBREn bit is cleared, e.g., when migration happens after guest entered SMM mode. In this case, we still need to migrate Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220517155024.33270-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-16Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Richard Henderson
into staging virtio,pc,pci: fixes,cleanups,features most of CXL support fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi # MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv # SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77 # 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR # PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1 # f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA= # =xdSk # -----END PGP SIGNATURE----- # gpg: Signature made Mon 16 May 2022 01:48:50 PM PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (86 commits) vhost-user-scsi: avoid unlink(NULL) with fd passing virtio-net: don't handle mq request in userspace handler for vhost-vdpa vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request() vhost-vdpa: backend feature should set only once vhost-net: fix improper cleanup in vhost_net_start vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated hw/i386/amd_iommu: Fix IOMMU event log encoding errors hw/i386: Make pic a property of common x86 base machine type hw/i386: Make pit a property of common x86 base machine type include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG vhost-user: more master/slave things virtio: add vhost support for virtio devices virtio: drop name parameter for virtio_init() virtio/vhost-user: dynamically assign VhostUserHostNotifiers hw/virtio/vhost-user: don't suppress F_CONFIG when supported include/hw: start documenting the vhost API ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-05-16target/i386: Fix sanity check on max APIC ID / X2APIC enablementDavid Woodhouse
The check on x86ms->apic_id_limit in pc_machine_done() had two problems. Firstly, we need KVM to support the X2APIC API in order to allow IRQ delivery to APICs >= 255. So we need to call/check kvm_enable_x2apic(), which was done elsewhere in *some* cases but not all. Secondly, microvm needs the same check. So move it from pc_machine_done() to x86_cpus_init() where it will work for both. The check in kvm_cpu_instance_init() is now redundant and can be dropped. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220314142544.150555-1-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-14target/i386: Add MSR access interface for Arch LBRYang Weijiang
In the first generation of Arch LBR, the max support Arch LBR depth is 32, both host and guest use the value to set depth MSR. This can simplify the implementation of patch given the side-effect of mismatch of host/guest depth MSR: XRSTORS will reset all recording MSRs to 0s if the saved depth mismatches MSR_ARCH_LBR_DEPTH. In most of the cases Arch LBR is not in active status, so check the control bit before save/restore the big chunck of Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-7-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-14target/i386: Add kvm_get_one_msr helperYang Weijiang
When try to get one msr from KVM, I found there's no such kind of existing interface while kvm_put_one_msr() is there. So here comes the patch. It'll remove redundant preparation code before finally call KVM_GET_MSRS IOCTL. No functional change intended. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-4-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06hw: hyperv: Initial commit for Synthetic Debugging deviceJon Doron
Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-5-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06hyperv: Add support to process syndbg commandsJon Doron
SynDbg commands can come from two different flows: 1. Hypercalls, in this mode the data being sent is fully encapsulated network packets. 2. SynDbg specific MSRs, in this mode only the data that needs to be transfered is passed. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-4-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06hyperv: Add definitions for syndbgJon Doron
Add all required definitions for hyperv synthetic debugger interface. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-3-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06Remove qemu-common.h include from most unitsMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-23KVM: x86: workaround invalid CPUID[0xD,9] info on some AMD processorsPaolo Bonzini
Some AMD processors expose the PKRU extended save state even if they do not have the related PKU feature in CPUID. Worse, when they do they report a size of 64, whereas the expected size of the PKRU extended save state is 8, therefore the esa->size == eax assertion does not hold. The state is already ignored by KVM_GET_SUPPORTED_CPUID because it was not enabled in the host XCR0. However, QEMU kvm_cpu_xsave_init() runs before QEMU invokes arch_prctl() to enable dynamically-enabled save states such as XTILEDATA, and KVM_GET_SUPPORTED_CPUID hides save states that have yet to be enabled. Therefore, kvm_cpu_xsave_init() needs to consult the host CPUID instead of KVM_GET_SUPPORTED_CPUID, and dies with an assertion failure. When setting up the ExtSaveArea array to match the host, ignore features that KVM does not report as supported. This will cause QEMU to skip the incorrect CPUID leaf instead of tripping the assertion. Closes: https://gitlab.com/qemu-project/qemu/-/issues/916 Reported-by: Daniel P. Berrangé <berrange@redhat.com> Analyzed-by: Yang Zhong <yang.zhong@intel.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-23i386: Set MCG_STATUS_RIPV bit for mce SRAR errorluofei
In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-23target/i386/kvm: Free xsave_buf when destroying vCPUPhilippe Mathieu-Daudé
Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-20target/i386: kvm: do not access uninitialized variable on older kernelsPaolo Bonzini
KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit 19db68ca68 ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Support XFD and AMX xsave data migrationZeng Guang
XFD(eXtended Feature Disable) allows to enable a feature on xsave state while preventing specific user threads from using the feature. Support save and restore XFD MSRs if CPUID.D.1.EAX[4] enumerate to be valid. Likewise migrate the MSRs and related xsave state necessarily. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-8-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: add support for KVM_CAP_XSAVE2 and AMX state migrationJing Liu
When dynamic xfeatures (e.g. AMX) are used by the guest, the xsave area would be larger than 4KB. KVM_GET_XSAVE2 and KVM_SET_XSAVE under KVM_CAP_XSAVE2 works with a xsave buffer larger than 4KB. Always use the new ioctls under KVM_CAP_XSAVE2 when KVM supports it. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-7-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Add AMX CPUIDs enumerationJing Liu
Add AMX primary feature bits XFD and AMX_TILE to enumerate the CPU's AMX capability. Meanwhile, add AMX TILE and TMUL CPUID leaf and subleaves which exist when AMX TILE is present to provide the maximum capability of TILE and TMUL. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-6-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Grant AMX permission for guestYang Zhong
Kernel allocates 4K xstate buffer by default. For XSAVE features which require large state component (e.g. AMX), Linux kernel dynamically expands the xstate buffer only after the process has acquired the necessary permissions. Those are called dynamically- enabled XSAVE features (or dynamic xfeatures). There are separate permissions for native tasks and guests. Qemu should request the guest permissions for dynamic xfeatures which will be exposed to the guest. This only needs to be done once before the first vcpu is created. KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to get host side supported_xcr0 and Qemu can decide if it can request dynamically enabled XSAVE features permission. https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Message-Id: <20220217060434.52460-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Fix the 64-byte boundary enumeration for extended stateJing Liu
The extended state subleaves (EAX=0Dh, ECX=n, n>1).ECX[1] indicate whether the extended state component locates on the next 64-byte boundary following the preceding state component when the compacted format of an XSAVE area is used. Right now, they are all zero because no supported component needed the bit to be set, but the upcoming AMX feature will use it. Fix the subleaves value according to KVM's supported cpuid. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15kvm/msi: do explicit commit when adding msi routesLongpeng(Mike)
We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-07osdep: Move memalign-related functions to their own headerPeter Maydell
Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-01-12KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGSPaolo Bonzini
This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-12KVM: use KVM_{GET|SET}_SREGS2 when supported.Maxim Levitsky
This allows to make PDPTRs part of the migration stream and thus not reload them after migration which is against X86 spec. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-17target/i386/kvm: Replace use of __u32 typePhilippe Mathieu-Daudé
QEMU coding style mandates to not use Linux kernel internal types for scalars types. Replace __u32 by uint32_t. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211116193955.2793171-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-11-02KVM: SVM: add migration support for nested TSC scalingMaxim Levitsky
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/sev: Declare system-specific functions in 'sev.h'Philippe Mathieu-Daudé
"sysemu/sev.h" is only used from x86-specific files. Let's move it to include/hw/i386, and merge it with target/i386/sev.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-16-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/sev: Rename sev_i386.h -> sev.hPhilippe Mathieu-Daudé
SEV is a x86 specific feature, and the "sev_i386.h" header is already in target/i386/. Rename it as "sev.h" to simplify. Patch created mechanically using: $ git mv target/i386/sev_i386.h target/i386/sev.h $ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h) Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211007161716.453984-15-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/kvm: Restrict SEV stubs to x86 architecturePhilippe Mathieu-Daudé
SEV is x86-specific, no need to add its stub to other architectures. Move the stub file to target/i386/kvm/. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-5-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/kvm: Introduce i386_softmmu_kvm Meson source setPhilippe Mathieu-Daudé
Introduce the i386_softmmu_kvm Meson source set to be able to add features dependent on CONFIG_KVM. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-4-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Make Hyper-V version id configurableVitaly Kuznetsov
Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenmentVitaly Kuznetsov
The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids()Vitaly Kuznetsov
In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Support KVM_CAP_HYPERV_ENFORCE_CPUIDVitaly Kuznetsov
By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUIDVitaly Kuznetsov
By default, KVM allows the guest to use all currently supported PV features even when they were not announced in guest visible CPUIDs. Introduce a new "kvm-pv-enforce-cpuid" flag to limit the supported feature set to the exposed features. The feature is supported by Linux >= 5.10 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30memory: Name all the memory listenersPeter Xu
Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Propagate SGX CPUID sub-leafs to KVMSean Christopherson
The SGX sub-leafs are enumerated at CPUID 0x12. Indices 0 and 1 are always present when SGX is supported, and enumerate SGX features and capabilities. Indices >=2 are directly correlated with the platform's EPC sections. Because the number of EPC sections is dynamic and user defined, the number of SGX sub-leafs is "NULL" terminated. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-15-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: kvm: Add support for exposing PROVISIONKEY to guestSean Christopherson
If the guest want to fully use SGX, the guest needs to be able to access provisioning key. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM to support provisioning key to KVM guests. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-14-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Add feature control MSR dependency when SGX is enabledSean Christopherson
SGX adds multiple flags to FEATURE_CONTROL to enable SGX and Flexible Launch Control. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-12-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Add get/set/migrate support for SGX_LEPUBKEYHASH MSRsSean Christopherson
On real hardware, on systems that supports SGX Launch Control, those MSRs are initialized to digest of Intel's signing key; on systems that don't support SGX Launch Control, those MSRs are not available but hardware always uses digest of Intel's signing key in EINIT. KVM advertises SGX LC via CPUID if and only if the MSRs are writable. Unconditionally initialize those MSRs to digest of Intel's signing key when CPU is realized and reset to reflect the fact. This avoids potential bug in case kvm_arch_put_registers() is called before kvm_arch_get_registers() is called, in which case guest's virtual SGX_LEPUBKEYHASH MSRs will be set to 0, although KVM initializes those to digest of Intel's signing key by default, since KVM allows those MSRs to be updated by Qemu to support live migration. Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they are writable by the guest. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-11-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-26migration: Unify failure check for migrate_add_blocker()Markus Armbruster
Most callers check the return value. Some check whether it set an error. Functionally equivalent, but the former tends to be easier on the eyes, so do that everywhere. Prior art: commit c6ecec43b2 "qemu-option: Check return value instead of @err where convenient". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-26i386: Never free migration blocker objects instead of sometimesMarkus Armbruster
invtsc_mig_blocker has static storage duration. When a CPU with certain features is initialized, and invtsc_mig_blocker is still null, we add a migration blocker and store it in invtsc_mig_blocker. The object is freed when migrate_add_blocker() fails, leaving invtsc_mig_blocker dangling. It is not freed on later failures. Same for hv_passthrough_mig_blocker and hv_no_nonarch_cs_mig_blocker. All failures are actually fatal, so whether we free or not doesn't really matter, except as bad examples to be copied / imitated. Clean this up in a minimal way: never free these blocker objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-7-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-29i386: assert 'cs->kvm_state' is not nullVitaly Kuznetsov
Coverity reports potential NULL pointer dereference in get_supported_hv_cpuid_legacy() when 'cs->kvm_state' is NULL. While 'cs->kvm_state' can indeed be NULL in hv_cpuid_get_host(), kvm_hyperv_expand_features() makes sure that it only happens when KVM_CAP_SYS_HYPERV_CPUID is supported and KVM_CAP_SYS_HYPERV_CPUID implies KVM_CAP_HYPERV_CPUID so get_supported_hv_cpuid_legacy() is never really called. Add asserts to strengthen the protection against broken KVM behavior. Coverity: CID 1458243 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210716115852.418293-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-23i386: do not call cpudef-only models functions for max, host, baseClaudio Fontana
Some cpu properties have to be set only for cpu models in builtin_x86_defs, registered with x86_register_cpu_model_type, and not for cpu models "base", "max", and the subclass "host". These properties are the ones set by function x86_cpu_apply_props, (also including kvm_default_props, tcg_default_props), and the "vendor" property for the KVM and HVF accelerators. After recent refactoring of cpu, which also affected these properties, they were instead set unconditionally for all x86 cpus. This has been detected as a bug with Nested on AMD with cpu "host", as svm was not turned on by default, due to the wrongful setting of kvm_default_props via x86_cpu_apply_props, which set svm to "off". Rectify the bug introduced in commit "i386: split cpu accelerators" and document the functions that are builtin_x86_defs-only. Signed-off-by: Claudio Fontana <cfontana@suse.de> Tested-by: Alexander Bulekov <alxndr@bu.edu> Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/477 Message-Id: <20210723112921.12637-1-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>