aboutsummaryrefslogtreecommitdiff
path: root/target-i386/kvm.c
AgeCommit message (Collapse)Author
2015-07-07i386: Introduce ARAT CPU featureJan Kiszka
ARAT signals that the APIC timer does not stop in power saving states. As our APICs are emulated, it's fine to expose this feature to guests, at least when asking for KVM host features or with CPU types that include the flag. The exact model number that introduced the feature is not known, but reports can be found that it's at least available since Sandy Bridge. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-07-06pc: add SMM propertyPaolo Bonzini
The property can take values on, off or auto. The default is "off" for KVM and pre-2.4 machines, otherwise "auto" (which makes it available on TCG or on new-enough kernels). Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06target-i386: register a separate KVM address space including SMRAM regionsPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06target-i386: add support for SMBASE MSR and SMIsPaolo Bonzini
Apart from the MSR, the smi field of struct kvm_vcpu_events has to be translated into the corresponding CPUX86State fields. Also, memory transaction flags depend on SMM state, so pull it from struct kvm_run on every exit from KVM to userspace. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01kvm: First step to push iothread lock out of inner run loopJan Kiszka
This opens the path to get rid of the iothread lock on vmexits in KVM mode. On x86, the in-kernel irqchips has to be used because we otherwise need to synchronize APIC and other per-cpu state accesses that could be changed concurrently. Regarding pre/post-run callbacks, s390x and ARM should be fine without specific locking as the callbacks are empty. MIPS and POWER require locking for the pre-run callback. For the handle_exit callback, it is non-empty in x86, POWER and s390. Some POWER cases could do without the locking, but it is left in place for now. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
2015-06-22Include qapi/qmp/qerror.h exactly where neededMarkus Armbruster
In particular, don't include it into headers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-05target-i386: introduce cpu_get_mem_attrsPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-02kvm: introduce kvm_arch_msi_data_to_gsiEric Auger
On ARM the MSI data corresponds to the shared peripheral interrupt (SPI) ID. This latter equals to the SPI index + 32. to retrieve the SPI index, matching the gsi, an architecture specific function is introduced. Signed-off-by: Eric Auger <eric.auger@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-30kvm: add support for memory transaction attributesPaolo Bonzini
Let kvm_arch_post_run convert fields in the kvm_run struct to MemTxAttrs. These are then passed to address_space_rw. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-12Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
misc fixes and cleanups A bunch of fixes all over the place, some of the bugs fixed are actually regressions. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed Mar 11 17:48:30 2015 GMT using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (25 commits) virtio-scsi: remove empty wrapper for cmd virtio-scsi: clean out duplicate cdb field virtio-scsi: fix cdb/sense size uapi/virtio_scsi: allow overriding CDB/SENSE size virtio-scsi: drop duplicate CDB/SENSE SIZE exec: don't include hw/boards for linux-user acpi: specify format for build_append_namestring MAINTAINERS: drop aliguori@amazon.com tpm: Move memory subregion function into realize function virtio-pci: Convert to realize() pci: Convert pci_nic_init() to Error to avoid qdev_init() machine: query mem-merge machine property machine: query dump-guest-core machine property hw/boards: make it safe to include for linux-user machine: query phandle-start machine property machine: query kvm-shadow-mem machine property kvm: add machine state to kvm_arch_init machine: query kernel-irqchip property machine: allowed/required kernel-irqchip support machine: replace qemu opts with iommu property ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-03-11machine: query kvm-shadow-mem machine propertyMarcel Apfelbaum
Commit e79d5a6 ("machine: remove qemu_machine_opts global list") removed the global option descriptions and moved them to MachineState's QOM properties. Query kvm-shadow-mem by accessing machine properties through designated wrappers. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-11kvm: add machine state to kvm_arch_initMarcel Apfelbaum
Needed to query machine's properties. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-09target-i386: Move CPUX86State::cpuid_apic_id to X86CPU::apic_idEduardo Habkost
The field doesn't need to be inside CPUX86State, and it is not specific for the CPUID instruction, so move and rename it. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-03-03Revert "Merge remote-tracking branch ↵Peter Maydell
'remotes/ehabkost/tags/x86-pull-request' into staging" This reverts commit b8a173b25c887a606681fc35a46702c164d5b2d0, reversing changes made to 5de090464f1ec5360c4f30faa01d8a9f8826cd58. (I applied this pull request when I should not have done so, and am now immediately reverting it.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-25target-i386: Move CPUX86State.cpuid_apic_id to X86CPU.apic_idEduardo Habkost
The field doesn't need to be inside CPUState, and it is not specific for the CPUID instruction, so move and rename it. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-01-26target-i386: make xmm_regs 512-bit widePaolo Bonzini
Right now, the AVX512 registers are split in many different fields: xmm_regs for the low 128 bits of the first 16 registers, ymmh_regs for the next 128 bits of the same first 16 registers, zmmh_regs for the next 256 bits of the same first 16 registers, and finally hi16_zmm_regs for the full 512 bits of the second 16 bit registers. This makes it simple to move data in and out of the xsave region, but would be a nightmare for a hypothetical TCG implementation and leads to a proliferation of [XYZ]MM_[BWLSQD] macros. Instead, this patch marshals data manually from the xsave region to a single 32x512-bit array, simplifying the macro jungle and clarifying which bits are in which vmstate subsection. The migration format is unaffected. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-14Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
Mostly bugfixes and cleanups from qemu-devel. Yet another small patch from the record/replay series, and a few SCSI and i386 patches as well. # gpg: Signature made Wed 14 Jan 2015 09:39:14 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: cpus: consistently use QEMU_CLOCK_VIRTUAL_RT for icount_warp_rt timer qemu-timer: rename timer_init to timer_init_tl scsi: fix cancellation when I/O was completed but DMA was not. rules.mak: Fix module build hw/scsi/lsi53c895a: add support for additional diag / debug registers qemu-common.h: optimise muldiv64 if int128 is available target-i386: do not memcpy in and out of xmm_regs target-i386: fix movntsd on big-endian hosts vl.c: fix regression when reading memory size from config file vl: Don't silently change topology when all -smp options were set vl: fix max_cpus check vl: Avoid unnecessary 'if' nesting 9pfs: changed to use event_notifier instead of qemu_pipe vl.c: fix regression when reading machine type from config file char: restore stdio echo on resume from suspend. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-01-14target-i386: do not memcpy in and out of xmm_regsPaolo Bonzini
After the next patch, we will move the high parts of AVX and AVX512 registers in the same array as the SSE registers. This will make it impossible to memcpy an array of 128-bit values in and out of xmm_regs in one swoop. Use a for loop instead. Similarly, always use XMM_Q in translate.c. This avoids introducing bugs such as the one fixed in the previous patch. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-12kvm: extend kvm_irqchip_add_msi_route to work on s390Frank Blaschka
on s390 MSI-X irqs are presented as thin or adapter interrupts for this we have to reorganize the routing entry to contain valid information for the adapter interrupt code on s390. To minimize impact on existing code we introduce an architecture function to fixup the routing entry. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-15x86: Drop some superfluous casts from void *Markus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15x86: Use g_new() & friends where that makes obvious senseMarkus Armbruster
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15target-i386: get/set/migrate XSAVES stateWanpeng Li
Add xsaves related definition, it also adds corresponding part to kvm_get/put, and vmstate. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15valgrind/i386: avoid false positives on KVM_SET_VCPU_EVENTS ioctlChristian Borntraeger
struct kvm_vcpu_events contains reserved fields. Let's use a designated initializer to avoid false positives in valgrind. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15valgrind/i386: avoid false positives on KVM_GET_MSRS ioctlChristian Borntraeger
struct kvm_msrs contains a pad field. Let's use a designated initializer on the info part to avoid false positives from valgrind/memcheck. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15valgrind/i386: avoid false positives on KVM_SET_MSRS ioctlChristian Borntraeger
struct kvm_msrs contains padding bytes. Let's use a designated initializer on the info part to avoid false positives from valgrind/memcheck. Do the same for generic MSRS, the TSC and feature control. We also need to zero out the reserved fields in the entries. We do this in kvm_msr_entry_set as suggested by Paolo. This avoids a big memset that a designated initializer on the full structure would do. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15valgrind/i386: avoid false positives on KVM_SET_XCRS ioctlChristian Borntraeger
struct kvm_xcrs contains padding bytes. Let's use a designated initializer to avoid false positives from valgrind/memcheck. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checksEric Auger
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension. Remove direct settings in architecture specific files. Add a new kvm_resamplefds_allowed variable, initialized by checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding kvm_resamplefds_enabled() function. A special notice for s390 where KVM_CAP_IRQFD was not immediatly advirtised when irqfd capability was introduced in the kernel. KVM_CAP_IRQ_ROUTING was advertised instead. This was fixed in "KVM: s390: announce irqfd capability", ebc3226202d5956a5963185222982d435378b899 whereas irqfd support was brought in 84223598778ba08041f4297fda485df83414d57e, "KVM: s390: irq routing for adapter interrupts". Both commits first appear in 3.15 so there should not be any kernel version impacted by this QEMU modification. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-24target-i386: add Intel AVX-512 supportChao Peng
Add AVX512 feature bits, register definition and corresponding xsave/vmstate support. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25x86: kvm: Add MTRR support for kvm_get|put_msrs()Alex Williamson
The MTRR state in KVM currently runs completely independent of the QEMU state in CPUX86State.mtrr_*. This means that on migration, the target loses MTRR state from the source. Generally that's ok though because KVM ignores it and maps everything as write-back anyway. The exception to this rule is when we have an assigned device and an IOMMU that doesn't promote NoSnoop transactions from that device to be cache coherent. In that case KVM trusts the guest mapping of memory as configured in the MTRR. This patch updates kvm_get|put_msrs() so that we retrieve the actual vCPU MTRR settings and therefore keep CPUX86State synchronized for migration. kvm_put_msrs() is also used on vCPU reset and therefore allows future modificaitons of MTRR state at reset to be realized. Note that the entries array used by both functions was already slightly undersized for holding every possible MSR, so this patch increases it beyond the 28 new entries necessary for MTRR state. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-25target-i386: block migration and savevm if invariant tsc is exposedMarcelo Tosatti
Invariant TSC documentation mentions that "invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states". This is not the case if migration to a host with different TSC frequency is allowed, or if savevm is performed. So block migration/savevm. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> [AF+mtosatti: Updated error message] Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-06-04kvm: Fix eax for cpuid leaf 0x40000000Jidong Xiao
Since Linux kernel 3.5, KVM has documented eax for leaf 0x40000000 to be KVM_CPUID_FEATURES: https://github.com/torvalds/linux/commit/57c22e5f35aa4b9b2fe11f73f3e62bbf9ef36190 But qemu still tries to set it to 0. It would be better to make qemu and kvm consistent. This patch just fixes this issue. Signed-off-by: Jidong Xiao <jidong.xiao@gmail.com> [Include kvm_base in the value. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-03kvm: Enable -cpu option to hide KVMAlex Williamson
The latest Nvidia driver (337.88) specifically checks for KVM as the hypervisor and reports Code 43 for the driver in a Windows guest when found. Removing or changing the KVM signature is sufficient for the driver to load and work. This patch adds an option to easily allow the KVM hypervisor signature to be hidden using '-cpu kvm=off'. We continue to expose KVM via the cpuid value by default. The state of this option does not supercede or replace -enable-kvm or the accel=kvm machine option. This only changes the visibility of KVM to the guest and paravirtual features specifically tied to the KVM cpuid. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-21target-i386: get CPL from SS.DPLPaolo Bonzini
CS.RPL is not equal to the CPL in the few instructions between setting CR0.PE and reloading CS. We get this right in the common case, because writes to CR0 do not modify the CPL, but it would not be enough if an SMI comes exactly during that brief period. Were this to happen, the RSM instruction would erroneously set CPL to the low two bits of the real-mode selector; and if they are not 00, the next instruction fetch cannot access the code segment and causes a triple fault. However, SS.DPL *is* always equal to the CPL. In real processors (AMD only) there is a weird case of SYSRET setting SS.DPL=SS.RPL from the STAR register while forcing CPL=3, but we do not emulate that. Tested-by: Kevin O'Connor <kevin@koconnor.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13kvm: forward INIT signals coming from the chipsetPaolo Bonzini
Reviewed-by: Gleb Natapov <gnatapov@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13kvm: reset state from the CPU's reset methodPaolo Bonzini
Now that we have a CPU object with a reset method, it is better to keep the KVM reset close to the CPU reset. Using qemu_register_reset as we do now keeps them far apart. With this patch, PPC no longer calls the kvm_arch_ function, so it can get removed there. Other arches call it from their CPU reset handler, and the function gets an ARMCPU/X86CPU/S390CPU. Note that ARM- and s390-specific functions are called kvm_arm_* and kvm_s390_*, while x86-specific functions are called kvm_arch_*. That follows the convention used by the different architectures. Changing that is the topic of a separate patch. Reviewed-by: Gleb Natapov <gnatapov@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13target-i386: Remove unused data from local arrayStefan Weil
Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-27target-i386: Add missing 'static' and 'const' attributesStefan Weil
This fixes warnings from the static code analysis (smatch). Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-03-13cpu: Move watchpoint fields from CPU_COMMON to CPUStateAndreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-02-03kvm: add support for hyper-v timersVadim Rozenfeld
http://msdn.microsoft.com/en-us/library/windows/hardware/ff541625%28v=vs.85%29.aspx This code is generic for activating reference time counter or virtual reference time stamp counter Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-03kvm: make hyperv vapic assist page migratableVadim Rozenfeld
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-03kvm: make hyperv hypercall and guest os id MSRs migratable.Vadim Rozenfeld
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-03kvm: make availability of Hyper-V enlightenments dependent on KVM_CAP_HYPERVPaolo Bonzini
The MS docs specify HV_X64_MSR_HYPERCALL as a mandatory interface, thus we must provide the MSRs even if the user only specified features that, like relaxed timing, in principle don't require them. And the MSRs are only there if the hypervisor has KVM_CAP_HYPERV. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-03KVM: fix coexistence of KVM and Hyper-V leavesPaolo Bonzini
kvm_arch_init_vcpu's initialization of the KVM leaves at 0x40000100 is broken, because KVM_CPUID_FEATURES is left at 0x40000001. Move it to 0x40000101 if Hyper-V is enabled. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-24Merge remote-tracking branch 'qemu-kvm/uq/master' into stagingAnthony Liguori
* qemu-kvm/uq/master: kvm: always update the MPX model specific register KVM: fix addr type for KVM_IOEVENTFD KVM: Retry KVM_CREATE_VM on EINTR mempath prefault: fix off-by-one error kvm: x86: Separately write feature control MSR on reset roms: Flush icache when writing roms to guest memory target-i386: clear guest TSC on reset target-i386: do not special case TSC writeback target-i386: Intel MPX Conflicts: exec.c aliguori: fix trivial merge conflict in exec.c Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-20kvm: always update the MPX model specific registerPaolo Bonzini
The original patch from Liu Jinsong restricted them to reset or full state updates, but that's unnecessary (and wrong) since the BNDCFGS MSR has no side effects. Cc: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-23target-i386: Move apic_state field from CPUX86State to X86CPUChen Fan
This motion is preparing for refactoring vCPU APIC subsequently. Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-12-18kvm: x86: Separately write feature control MSR on resetJan Kiszka
If the guest is running in nested mode on system reset, clearing the feature MSR signals the kernel to leave this mode. Recent kernels processes this properly, but leave the VCPU state undefined behind. It is the job of userspace to bring it to a proper shape. Therefore, write this specific MSR first so that no state transfer gets lost. This allows to cleanly reset a guest with VMX in use. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-12target-i386: clear guest TSC on resetFernando Luis Vázquez Cao
VCPU TSC is not cleared by a warm reset (*), which leaves some types of Linux guests (non-pvops guests and those with the kernel parameter no-kvmclock set) vulnerable to the overflow in cyc2ns_offset fixed by upstream commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow in cyc2ns_offset"). To put it in a nutshell, if such a Linux guest without the patch above applied has been up more than 208 days and attempts a warm reset chances are that the newly booted kernel will panic or hang. (*) Intel Xeon E5 processors show the same broken behavior due to the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon® Processor E5 Family Specification Update - August 2013): "The TSC (Time Stamp Counter MSR 10H) should be cleared on reset. Due to this erratum the TSC is not affected by warm reset." Cc: Will Auld <will.auld@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
2013-12-12target-i386: do not special case TSC writebackFernando Luis Vázquez Cao
Newer kernels are capable of synchronizing TSC values of multiple VCPUs on writeback, but we were excluding the power up case, which is not needed anymore. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
2013-12-12target-i386: Intel MPXLiu Jinsong
Add some MPX related definiation, and hardcode sizes and offsets of xsave features 3 and 4. It also add corresponding part to kvm_get/put_xsave, and vmstate. Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>