aboutsummaryrefslogtreecommitdiff
path: root/target-i386/kvm.c
AgeCommit message (Collapse)Author
2016-10-17target-i386/kvm: cache the return value of kvm_enable_x2apic()Radim Krčmář
Assume that KVM would have returned the same on subsequent runs. Abstract the memoizaiton pattern into macros and call it memorize as adding the r makes it less obscure. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17intel_iommu: reject broken EIMRadim Krčmář
Cluster x2APIC cannot work without KVM's x2apic API when the maximal APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we forbid other APICs and also the old KVM case with less than 9, to simplify the code. There is no point in enabling EIM in forbidden APICs, so we keep it enabled only for the KVM APIC; unconditionally, because making the option depend on KVM version would be a maintanance burden. Old QEMUs would enable eim whenever intremap was on, which would trick guests into thinking that they can enable cluster x2APIC even if any interrupt destination would get clamped to 8 bits. Depending on your configuration, QEMU could notice that the destination LAPIC is not present and report it with a very non-obvious: KVM: injection failed, MSI lost (Operation not permitted) Or the guest could say something about unexpected interrupts, because clamping leads to aliasing so interrupts were being delivered to incorrect VCPUs. KVM_X2APIC_API is the feature that allows us to enable EIM for KVM. QEMU 2.7 allowed EIM whenever interrupt remapping was enabled. In order to keep backward compatibility, we again allow guests to misbehave in non-obvious ways, and make it the default for old machine types. A user can enable the buggy mode it with "x-buggy-eim=on". Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-28Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-) * license clarification for compiler.h (Felipe) * glib cflags improvement (Marc-André) * checkpatch silencing (Paolo) * SMRAM migration fix (Paolo) * Replay improvements (Pavel) * IOMMU notifier improvements (Peter) * IOAPIC now defaults to version 0x20 (Peter) # gpg: Signature made Tue 27 Sep 2016 10:57:40 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (28 commits) replay: allow replay stopping and restarting replay: vmstate for replay module replay: move internal data to the structure cpus-common: lock-free fast path for cpu_exec_start/end tcg: Make tb_flush() thread safe cpus-common: Introduce async_safe_run_on_cpu() cpus-common: simplify locking for start_exclusive/end_exclusive cpus-common: remove redundant call to exclusive_idle() cpus-common: always defer async_run_on_cpu work items docs: include formal model for TCG exclusive sections cpus-common: move exclusive work infrastructure from linux-user cpus-common: fix uninitialized variable use in run_on_cpu cpus-common: move CPU work item management to common code cpus-common: move CPU list management to common code linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick() linux-user: Use QemuMutex and QemuCond cpus: Rename flush_queued_work() cpus: Move common code out of {async_, }run_on_cpu() cpus: pass CPUState to run_on_cpu helpers build-sys: put glib_cflags in QEMU_CFLAGS ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-27target-i386: Remove has_msr_* global vars for KVM featuresEduardo Habkost
The global variables are not necessary because we can check KVM feature flags in X86CPU directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27target-i386: Remove has_msr_hv_tsc global variableEduardo Habkost
The global variable is not necessary because we can check cpu->hyperv_time directly. We just need to ensure cpu->hyperv_time will be cleared if the feature is not really being exposed to the guest due to missing KVM_CAP_HYPERV_TIME capability. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27target-i386: Remove has_msr_hv_apic global variableEduardo Habkost
The global variable is not necessary because we can check cpu->hyperv_vapic directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27target-i386: Remove has_msr_mtrr global variableEduardo Habkost
The global variable is not necessary because we can check the CPU feature flags directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27cpus: pass CPUState to run_on_cpu helpersAlex Bennée
CPUState is a fairly common pointer to pass to these helpers. This means if you need other arguments for the async_run_on_cpu case you end up having to do a g_malloc to stuff additional data into the routine. For the current users this isn't a massive deal but for MTTCG this gets cumbersome when the only other parameter is often an address. This adds the typedef run_on_cpu_func for helper functions which has an explicit CPUState * passed as the first parameter. All the users of run_on_cpu and async_run_on_cpu have had their helpers updated to use CPUState where available. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [Sergey Fedorov: - eliminate more CPUState in user data; - remove unnecessary user data passing; - fix target-s390x/kvm.c and target-s390x/misc_helper.c] Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts) Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22kvm: fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0Herongguang (Stephen)
Fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0. Signed-off-by: He Rongguang <herongguang.he@huawei.com> Message-Id: <57E38EAC.3020108@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22kvm: apic: set APIC base as part of kvm_apic_putDr. David Alan Gilbert
The parsing of KVM_SET_LAPIC's input depends on the current value of the APIC base MSR---which indeed is stored in APICCommonState---but for historical reasons APIC base is set through KVM_SET_SREGS together with cr8 (which is really just the APIC TPR) and the actual "special CPU registers". APIC base must now be set before the actual LAPIC registers, so do that in kvm_apic_put. It will be set again to the same value with KVM_SET_SREGS, but that's not a big issue. This only happens since Linux 4.8, which checks for x2apic mode in KVM_SET_LAPIC. However it's really a QEMU bug; until the recent commit 78d6a05 ("x86/lapic: Load LAPIC state at post_load", 2016-09-13) QEMU was indeed setting APIC base (via KVM_SET_SREGS) before the other LAPIC registers. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22target-i386: introduce kvm_put_one_msrPaolo Bonzini
Avoid further code duplication in the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13x86/lapic: Load LAPIC state at post_loadDr. David Alan Gilbert
Load the LAPIC state during post_load (rather than when the CPU starts). This allows an interrupt to be delivered from the ioapic to the lapic prior to cpu loading, in particular the RTC that starts ticking as soon as we load it's state. Fixes a case where Windows hangs after migration due to RTC interrupts disappearing. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-16target-i386: kvm: Report kvm_pv_unhalt as unsupported w/o kernel_irqchipEduardo Habkost
The kvm_pv_unhalt feature doesn't work if kernel_irqchip is disabled, so we need to report it as unsupported. Tested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-08-08error: Strip trailing '\n' from error string arguments (again)Markus Armbruster
Commit 9af9e0f, 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but they keep coming back. checkpatch.pl tries to flag them since commit 5d596c2, but it's not very good at it. Offenders tracked down with Coccinelle script scripts/coccinelle/err-bad-newline.cocci, an updated version of the script from commit 312fd5f. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1470224274-31522-2-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-07-21Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, pci, virtio: new features, cleanups, fixes - interrupt remapping for intel iommus - a bunch of virtio cleanups - fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 21 Jul 2016 18:49:30 BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (57 commits) intel_iommu: avoid unnamed fields virtio: Update migration docs virtio-gpu: Wrap in vmstate virtio-gpu: Use migrate_add_blocker for virgl migration blocking virtio-input: Wrap in vmstate 9pfs: Wrap in vmstate virtio-serial: Wrap in vmstate virtio-net: Wrap in vmstate virtio-balloon: Wrap in vmstate virtio-rng: Wrap in vmstate virtio-blk: Wrap in vmstate virtio-scsi: Wrap in vmstate virtio: Migration helper function and macro virtio-serial: Remove old migration version support virtio-net: Remove old migration version support virtio-scsi: Replace HandleOutput typedef Revert "mirror: Workaround for unexpected iohandler events during completion" virtio-scsi: Call virtio_add_queue_aio virtio-blk: Call virtio_add_queue_aio virtio: Introduce virtio_add_queue_aio ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-21kvm-irqchip: do explicit commit when update irqPeter Xu
In the past, we are doing gsi route commit for each irqchip route update. This is not efficient if we are updating lots of routes in the same time. This patch removes the committing phase in kvm_irqchip_update_msi_route(). Instead, we do explicit commit after all routes updated. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21kvm-irqchip: x86: add msi route notify fnPeter Xu
One more IEC notifier is added to let msi routes know about the IEC changes. When interrupt invalidation happens, all registered msi routes will be updated for all PCI devices. Since both vfio and vhost are possible gsi route consumers, this patch will go one step further to keep them safe in split irqchip mode and when irqfd is enabled. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> [move trace-events lines into target-i386/trace-events] Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21kvm-irqchip: i386: add hook for add/remove virqPeter Xu
Adding two hooks to be notified when adding/removing msi routes. There are two kinds of MSI routes: - in kvm_irqchip_add_irq_route(): before assigning IRQFD. Used by vhost, vfio, etc. - in kvm_irqchip_send_msi(): when sending direct MSI message, if direct MSI not allowed, we will first create one MSI route entry in the kernel, then trigger it. This patch only hooks the first one (irqfd case). We do not need to take care for the 2nd one, since it's only used by QEMU userspace (kvm-apic) and the messages will always do in-time translation when triggered. While we need to note them down for the 1st one, so that we can notify the kernel when cache invalidation happens. Also, we do not hook IOAPIC msi routes (we have explicit notifier for IOAPIC to keep its cache updated). We only need to care about irqfd users. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21kvm-irqchip: simplify kvm_irqchip_add_msi_routePeter Xu
Changing the original MSIMessage parameter in kvm_irqchip_add_msi_route into the vector number. Vector index provides more information than the MSIMessage, we can retrieve the MSIMessage using the vector easily. This will avoid fetching MSIMessage every time before adding MSI routes. Meanwhile, the vector info will be used in the coming patches to further enable gsi route update notifications. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21intel_iommu: add support for split irqchipPeter Xu
In split irqchip mode, IOAPIC is working in user space, only update kernel irq routes when entry changed. When IR is enabled, we directly update the kernel with translated messages. It works just like a kernel cache for the remapping entries. Since KVM irqfd is using kernel gsi routes to deliver interrupts, as long as we can support split irqchip, we will support irqfd as well. Also, since kernel gsi routes will cache translated interrupts, irqfd delivery will not suffer from any performance impact due to IR. And, since we supported irqfd, vhost devices will be able to work seamlessly with IR now. Logically this should contain both vhost-net and vhost-user case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [move trace-events lines into target-i386/trace-events] Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20target-i386: Fill high bits of mtrr maskDr. David Alan Gilbert
Fill the bits between 51..number-of-physical-address-bits in the MTRR_PHYSMASKn variable range mtrr masks so that they're consistent in the migration stream irrespective of the physical address space of the source VM in a migration. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-20target-i386: Mask mtrr mask based on CPU physical address limitsDr. David Alan Gilbert
The CPU GPs if we try and set a bit in a variable MTRR mask above the limit of physical address bits on the host. We hit this when loading a migration from a host with a larger physical address limit than our destination (e.g. a Xeon->i7 of same generation) but previously used to get away with it until 48e1a45 started checking that msr writes actually worked. It seems in our case the GP probably comes from KVM emulating that GP. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-07target-i386: kvm: Add basic Intel LMCE supportAshok Raj
This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they are injected to only one VCPU rather than broadcast to all VCPUs. As KVM reports LMCE support on Intel platforms, this features is only available on Intel platforms. LMCE is disabled by default and can be enabled/disabled by cpu option 'lmce=on/off'. Signed-off-by: Ashok Raj <ashok.raj@intel.com> [Haozhong: Enable LMCE only on Intel platforms Disable LMCE by default and add a cpu option 'lmce' Handle the error if LMCE is enabled w/o host support Remove MCG_LMCE_P from MCE_CAP_DEF Add migration support for LMCE Minor code style changes] Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-07target-i386: Report hyperv feature words through qomEvgeny Yakovlev
This change adds hyperv feature words report through qom rpc. When VM is configured with hyperv features enabled libvirt will check that required feature words are set in cpuid leaf 40000003 through qom request. Currently qemu does not report hyperv feature words which prevents windows guests from starting with libvirt. To avoid conflicting with current hyperv properties all added feature words cannot be set directly with -cpu +feature yet. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> CC: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-07target-i386: Show host and VM TSC frequencies on mismatchEduardo Habkost
Improve the TSC frequency mismatch warning to show the host and VM TSC frequencies. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-20coccinelle: Remove unnecessary variables for function return valueEduardo Habkost
Use Coccinelle script to replace 'ret = E; return ret' with 'return E'. The script will do the substitution only when the function return type and variable type are the same. Manual fixups: * audio/audio.c: coding style of "read (...)" and "write (...)" * block/qcow2-cluster.c: wrap line to make it shorter * block/qcow2-refcount.c: change indentation of wrapped line * target-tricore/op_helper.c: fix coding style of "remainder|quotient" * target-mips/dsp_helper.c: reverted changes because I don't want to argue about checkpatch.pl * ui/qemu-pixman.c: fix line indentation * block/rbd.c: restore blank line between declarations and statements Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Unused Coccinelle rule name dropped along with a redundant comment; whitespace touched up in block/qcow2-cluster.c; stale commit message paragraph deleted] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-16target-i386: kvm: cache KVM_GET_SUPPORTED_CPUID dataChao Peng
KVM_GET_SUPPORTED_CPUID ioctl is called frequently when initializing CPU. Depends on CPU features and CPU count, the number of calls can be extremely high which slows down QEMU booting significantly. In our testing, we saw 5922 calls with switches: -cpu SandyBridge -smp 6,sockets=6,cores=1,threads=1 This ioctl takes more than 100ms, which is almost half of the total QEMU startup time. While for most cases the data returned from two different invocations are not changed, that means, we can cache the data to avoid trapping into kernel for the second time. To make sure the cache safe one assumption is desirable: the ioctl is stateless. This is not true for CPUID leaves in general (such as CPUID leaf 0xD, whose value depends on guest XCR0 and IA32_XSS) but it is true of KVM_GET_SUPPORTED_CPUID, which runs before there is a value for XCR0 and IA32_XSS. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1465784487-23482-1-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16os-posix: include sys/mman.hPaolo Bonzini
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check is bogus without a previous inclusion of sys/mman.h. Include it in sysemu/os-posix.h and remove it from everywhere else. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29memory: split memory_region_from_host from qemu_ram_addr_from_hostPaolo Bonzini
Move the old qemu_ram_addr_from_host to memory_region_from_host and make it return an offset within the region. For qemu_ram_addr_from_host return the ram_addr_t directly, similar to what it was before commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host", 2013-07-04). Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23target-i386: kvm: Eliminate kvm_msr_entry_set()Eduardo Habkost
Inline the function inside kvm_msr_entry_add(). Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: kvm: Simplify MSR setting functionsEduardo Habkost
Simplify kvm_put_tscdeadline_msr() and kvm_put_msr_feature_control() using kvm_msr_buf and the kvm_msr_entry_add() helper. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: kvm: Simplify MSR array constructionEduardo Habkost
Add a helper function that appends new entries to the MSR buffer and checks for the buffer size limit. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: kvm: Increase MSR_BUF_SIZEEduardo Habkost
We are dangerously close to the array limits in kvm_put_msrs() and kvm_get_msrs(): with the default mcg_cap configuration, we can set up to 148 MSRs in kvm_put_msrs(), and if we allow mcg_cap to be changed, we can write up to 236 MSRs. Use 4096 bytes for the buffer, that can hold 255 kvm_msr_entry structs. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: kvm: Allocate kvm_msrs struct once per VCPUEduardo Habkost
Instead of using 2400 bytes in the stack for 150 MSR entries in kvm_get_msrs() and kvm_put_msrs(), allocate a buffer once for each VCPU. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: kvm: Use X86XSaveArea struct for xsave save/loadEduardo Habkost
Instead of using offset macros and bit operations in a uint32_t array, use the X86XSaveArea struct to perform the loading/saving operations in kvm_put_xsave() and kvm_get_xsave(). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23target-i386: Define structs for layout of xsave areaEduardo Habkost
Add structs that define the layout of the xsave areas used by Intel processors. Add some QEMU_BUILD_BUG_ON lines to ensure the structs match the XSAVE_* macros in target-i386/kvm.c and the offsets and sizes at target-i386/cpu.c:ext_save_areas. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-19qemu-common: push cpu.h inclusion out of qemu-common.hPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05target-i386: assert that KVM_GET/SET_MSRS can set all requested MSRsPaolo Bonzini
This would have caught the bug in the previous patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05target-i386: do not pass MSR_TSC_AUX to KVM ioctls if CPUID bit is not setPaolo Bonzini
KVM does not let you read or write this MSR if the corresponding CPUID bit is not set. This in turn causes MSRs that come after MSR_TSC_AUX to be ignored by KVM_SET_MSRS. One visible symptom is that s3.flat from kvm-unit-tests fails with CPUs that do not have RDTSCP, because the SMBASE is not reset to 0x30000 after reset. Fixes: c9b8f6b6210847b4381c5b2ee172b1c7eb9985d6 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22include/qemu/osdep.h: Don't include qapi/error.hMarkus Armbruster
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-14hyperv: cpu hotplug fix with HyperV enabledDenis V. Lunev
With Hyper-V enabled CPU hotplug stops working. The CPU appears in device manager on Windows but does not appear in peformance monitor and control panel. The root of the problem is the following. Windows checks HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE bit in CPUID. The presence of this bit is enough to cure the situation. The bit should be set when CPU hotplug is allowed for HyperV VM. The check that hot_add_cpu callback is defined is enough from the protocol point of view. Though this callback is defined almost always thus there is no need to export that knowledge in the other way. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> CC: "Andreas Färber" <afaerber@suse.de> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-02-13target-i386: Enable control registers for MPXRichard Henderson
Enable and disable at CPL changes, MSR changes, and XRSTOR changes. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13target-i386: Add XSAVE extensionRichard Henderson
This includes XSAVE, XRSTOR, XGETBV, XSETBV, which are all related, as well as the associate cpuid bits. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-01-29x86: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-11-git-send-email-peter.maydell@linaro.org
2016-01-21target-i386: Add PKU and and OSPKE supportHuaitong Han
Add PKU and OSPKE CPUID features, including xsave state and migration support. Signed-off-by: Huaitong Han <huaitong.han@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: squashed 3 patches together, edited patch description] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-01-21target-i386: Add support to migrate vcpu's TSC rateHaozhong Zhang
This patch enables migrating vcpu's TSC rate. If KVM on the destination machine supports TSC scaling, guest programs will observe a consistent TSC rate across the migration. If TSC scaling is not supported on the destination machine, the migration will not be aborted and QEMU on the destination will not set vcpu's TSC rate to the migrated value. If vcpu's TSC rate specified by CPU option 'tsc-freq' on the destination machine is inconsistent with the migrated TSC rate, the migration will be aborted. For backwards compatibility, the migration of vcpu's TSC rate is disabled on pc-*-2.5 and older machine types. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: Rewrote comment at kvm_arch_put_registers()] [ehabkost: Moved compat code to pc-2.5] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-01-21target-i386: Reorganize TSC rate setting codeHaozhong Zhang
Following changes are made to the TSC rate setting code in kvm_arch_init_vcpu(): * The code is moved to a new function kvm_arch_set_tsc_khz(). * If kvm_arch_set_tsc_khz() fails, i.e. following two conditions are both satisfied: * KVM does not support the TSC scaling or it fails to set vcpu's TSC rate by KVM_SET_TSC_KHZ, * the TSC rate to be set is different than the value currently used by KVM, then kvm_arch_init_vcpu() will fail. Prevously, * the lack of TSC scaling never failed kvm_arch_init_vcpu(), * the failure of KVM_SET_TSC_KHZ failed kvm_arch_init_vcpu() unconditionally, even though the TSC rate to be set is identical to the value currently used by KVM. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-01-21target-i386: Fallback vcpu's TSC rate to value returned by KVMHaozhong Zhang
If no user-specified TSC rate is present, we will try to set env->tsc_khz to the value returned by KVM_GET_TSC_KHZ. This patch does not change the current functionality of QEMU and just prepares for later patches to enable migrating vcpu's TSC rate. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-01-21target-i386: Rename XMM_[BWLSDQ] helpers to ZMM_*Eduardo Habkost
They are helpers for the ZMMReg fields, so name them accordingly. This is just a global search+replace, no other changes are being introduced. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-12-17target-i386: kvm: clear unusable segments' flags in migrationMichael Chapman
This commit fixes migration of a QEMU/KVM guest from kernel >= v3.9 to kernel <= v3.7 (e.g. from RHEL 7 to RHEL 6). Without this commit a guest migrated across these kernel versions fails to resume on the target host as its segment descriptors are invalid. Two separate kernel commits combined together to result in this bug: commit f0495f9b9992f80f82b14306946444b287193390 Author: Avi Kivity <avi@redhat.com> Date: Thu Jun 7 17:06:10 2012 +0300 KVM: VMX: Relax check on unusable segment Some userspace (e.g. QEMU 1.1) munge the d and g bits of segment descriptors, causing us not to recognize them as unusable segments with emulate_invalid_guest_state=1. Relax the check by testing for segment not present (a non-present segment cannot be usable). Signed-off-by: Avi Kivity <avi@redhat.com> commit 25391454e73e3156202264eb3c473825afe4bc94 Author: Gleb Natapov <gleb@redhat.com> Date: Mon Jan 21 15:36:46 2013 +0200 KVM: VMX: don't clobber segment AR of unusable segments. Usability is returned in unusable field, so not need to clobber entire AR. Callers have to know how to deal with unusable segments already since if emulate_invalid_guest_state=true AR is not zeroed. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> The first commit changed the KVM_SET_SREGS ioctl so that it did no treat segment flags == 0 as an unusable segment, instead only looking at the "present" flag. The second commit changed KVM_GET_SREGS so that it did not clear the flags of an unusable segment. Since QEMU does not itself maintain the "unusable" flag across a migration, the end result is that unusable segments read from a kernel with these commits and loaded into a kernel without these commits are not properly recognised as being unusable. This commit updates both get_seg and set_seg so that the problem is avoided even when migrating to or migrating from a QEMU without this commit. In get_seg, we clear the segment flags if the segment is marked unusable. In set_seg, we mark the segment unusable if the segment's "present" flag is not set. Signed-off-by: Michael Chapman <mike@very.puzzling.org> Message-Id: <1449464047-17467-1-git-send-email-mike@very.puzzling.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>