Age | Commit message (Collapse) | Author |
|
Previous implementation in hvf_inject_interrupts() would always inject
VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required. Now
correctly determine when VMCS_INTR_T_HWINTR is appropriate versus
VMCS_INTR_T_SWINTR.
Make sure to clear ins_len and has_error_code when ins_len isn't
valid and error_code isn't set.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <bf8d945ea1b423786d7802bbcf769517d1fd01f8.1575330463.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
More accurately match SDM when setting CR0 and PDPTE registers.
Clear PDPTE registers when resetting vcpus.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <464adb39c8699fb8331d8ad6016fc3e2eff53dbc.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In real x86 processors, the REX prefix must come after legacy prefixes.
REX before legacy is ignored. Update the HVF emulation code to properly
handle this. Fix some spelling errors in constants. Fix some decoder
table initialization issues found by Coverity.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <eff30ded8307471936bec5d84c3b6efbc95e3211.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The existing code in QEMU's HVF support to attempt to synchronize TSC
across multiple cores is not sufficient. TSC value on other cores
can go backwards. Until implementation is fixed, remove calls to
hv_vm_sync_tsc(). Pass through TSC to guest OS.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <44c4afd2301b8bf99682b229b0796d84edd6d66f.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If an area is non-RAM and non-ROMD, then remove mappings so accesses
will trap and can be emulated. Change hvf_find_overlap_slot() to take
a size instead of an end address: it wouldn't return a slot because
callers would pass the same address for start and end. Don't always
map area as read/write/execute, respect area flags.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <1d8476c8f86959273fbdf23c86f8b4b611f5e2e1.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We have been trying to avoid adding new aliases for CPU model
versions, but in the case of changes in defaults introduced by
the TAA mitigation patches, the aliases might help avoid user
confusion when applying host software updates.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
One of the mitigation methods for TAA[1] is to disable TSX
support on the host system. Linux added a mechanism to disable
TSX globally through the kernel command line, and many Linux
distributions now default to tsx=off. This makes existing CPU
models that have HLE and RTM enabled not usable anymore.
Add new versions of all CPU models that have the HLE and RTM
features enabled, that can be used when TSX is disabled in the
host system.
References:
[1] TAA, TSX asynchronous Abort:
https://software.intel.com/security-software-guidance/insights/deep-dive-intel-transactional-synchronization-extensions-intel-tsx-asynchronous-abort
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension). By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This allows using "-cpu Haswell,+vmx", which we did not really want to
support in QEMU but was produced by Libvirt when using the "host-model"
CPU model. Without this patch, no VMX feature is _actually_ supported
(only the basic instruction set extensions are) and KVM fails to load
in the guest.
This was produced from the output of scripts/kvm/vmxcap using the following
very ugly Python script:
bits = {
'INS/OUTS instruction information': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_INS_OUTS'],
'IA32_VMX_TRUE_*_CTLS support': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_TRUE_CTLS'],
'External interrupt exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_EXT_INTR_MASK'],
'NMI exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_NMI_EXITING'],
'Virtual NMIs': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VIRTUAL_NMIS'],
'Activate VMX-preemption timer': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VMX_PREEMPTION_TIMER'],
'Process posted interrupts': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_POSTED_INTR'],
'Interrupt window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_INTR_PENDING'],
'Use TSC offsetting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_TSC_OFFSETING'],
'HLT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_HLT_EXITING'],
'INVLPG exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_INVLPG_EXITING'],
'MWAIT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MWAIT_EXITING'],
'RDPMC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDPMC_EXITING'],
'RDTSC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDTSC_EXITING'],
'CR3-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_LOAD_EXITING'],
'CR3-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_STORE_EXITING'],
'CR8-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_LOAD_EXITING'],
'CR8-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_STORE_EXITING'],
'Use TPR shadow': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_TPR_SHADOW'],
'NMI-window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_NMI_PENDING'],
'MOV-DR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MOV_DR_EXITING'],
'Unconditional I/O exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_UNCOND_IO_EXITING'],
'Use I/O bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_IO_BITMAPS'],
'Monitor trap flag': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_TRAP_FLAG'],
'Use MSR bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_MSR_BITMAPS'],
'MONITOR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_EXITING'],
'PAUSE exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_PAUSE_EXITING'],
'Activate secondary control': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS'],
'Virtualize APIC accesses': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES'],
'Enable EPT': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_EPT'],
'Descriptor-table exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_DESC'],
'Enable RDTSCP': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDTSCP'],
'Virtualize x2APIC mode': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE'],
'Enable VPID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VPID'],
'WBINVD exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_WBINVD_EXITING'],
'Unrestricted guest': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST'],
'APIC register emulation': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT'],
'Virtual interrupt delivery': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY'],
'PAUSE-loop exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_PAUSE_LOOP_EXITING'],
'RDRAND exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDRAND_EXITING'],
'Enable INVPCID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_INVPCID'],
'Enable VM functions': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VMFUNC'],
'VMCS shadowing': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_SHADOW_VMCS'],
'RDSEED exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDSEED_EXITING'],
'Enable PML': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_PML'],
'Enable XSAVES/XRSTORS': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_XSAVES'],
'Save debug controls': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_DEBUG_CONTROLS'],
'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL'],
'Acknowledge interrupt on exit': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_ACK_INTR_ON_EXIT'],
'Save IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_PAT'],
'Load IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PAT'],
'Save IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_EFER'],
'Load IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_EFER'],
'Save VMX-preemption timer value': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER'],
'Clear IA32_BNDCFGS': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_CLEAR_BNDCFGS'],
'Load debug controls': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS'],
'IA-32e mode guest': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_IA32E_MODE'],
'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL'],
'Load IA32_PAT': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PAT'],
'Load IA32_EFER': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_EFER'],
'Load IA32_BNDCFGS': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_BNDCFGS'],
'Store EFER.LMA into IA-32e mode guest control': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_STORE_LMA'],
'HLT activity state': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ACTIVITY_HLT'],
'VMWRITE to VM-exit information fields': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_VMWRITE_VMEXIT'],
'Inject event with insn length=0': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ZERO_LEN_INJECT'],
'Execute-only EPT translations': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_EXECONLY'],
'Page-walk length 4': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_PAGE_WALK_LENGTH_4'],
'Paging-structure memory type WB': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_WB'],
'2MB EPT pages': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB'],
'INVEPT supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT'],
'EPT accessed and dirty flags': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_AD_BITS'],
'Single-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT'],
'All-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_ALL_CONTEXT'],
'INVVPID supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID'],
'Individual-address INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_ADDR'],
'Single-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT'],
'All-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_ALL_CONTEXT'],
'Single-context-retaining-globals INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS'],
'EPTP Switching': ['FEAT_VMX_VMFUNC', 'MSR_VMX_VMFUNC_EPT_SWITCHING']
}
import sys
import textwrap
out = {}
for l in sys.stdin.readlines():
l = l.rstrip()
if l.endswith('!!'):
l = l[:-2].rstrip()
if l.startswith(' ') and (l.endswith('default') or l.endswith('yes')):
l = l[4:]
for key, value in bits.items():
if l.startswith(key):
ctl, bit = value
if ctl in out:
out[ctl] = out[ctl] + ' | '
else:
out[ctl] = ' [%s] = ' % ctl
out[ctl] = out[ctl] + bit
for x in sorted(out.keys()):
print("\n ".join(textwrap.wrap(out[x] + ",")))
Note that the script has a bug in that some keys apply to both VM entry
and VM exit controls ("load IA32_PERF_GLOBAL_CTRL", "load IA32_EFER",
"load IA32_PAT". Those have to be fixed by hand.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Attempting to migrate a VM using the microvm machine class results in the source
QEMU aborting with the following message/backtrace:
target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an
instance of type generic-pc-machine
abort()
object_class_dynamic_cast_assert()
vmstate_save_state_v()
vmstate_save_state()
vmstate_save()
qemu_savevm_state_complete_precopy()
migration_thread()
migration_thread()
migration_thread()
qemu_thread_start()
start_thread()
clone()
The access to the machine class returned by MACHINE_GET_CLASS() in
tsc_khz_needed() is crashing as it is trying to dereference a different
type of machine class object (TYPE_PC_MACHINE) to that of this microVM.
This can be resolved by extending the changes in the following commit
f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it")
and moving the save_tsc_khz field in PCMachineClass to X86MachineClass.
Fixes: f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it")
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).
Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is required to disable ITLB multihit mitigations in nested
hypervisors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Change the handling of port F0h writes and FPU exceptions to implement IGNNE.
The implementation mixes a bit what the chipset and processor do in real
hardware, but the effect is the same as what happens with actual FERR#
and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering
FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In the next patch, this will provide a hook to detect clearing of
FSW.ES.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Move it out of pc.c since it is strictly tied to TCG. This is
almost exclusively code movement, the next patch will implement
IGNNE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This merge fixes a semantic conflict with the trivial tree.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Denverton is the Atom Processor of Intel Harrisonville platform.
For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR
index E1H to determines the maximum time in TSC-quanta that the processor
can reside in either C0.1 or C0.2.
This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in
guest.
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-3-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.
The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).
The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".
However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.
KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).
Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
pinned and correct topology is exposed. As CPU pinning is done outside
of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Patch logs MCE AO, AR messages injected to guest or taken by QEMU itself.
We print the QEMU address for guest MCEs, helps on hypervisors that have
another source of MCE logging like mce log, and when they go missing.
For example we found these QEMU logs:
September 26th 2019, 17:36:02.309 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN
September 27th 2019, 06:25:03.234 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN
The first log had a corresponding mce log entry, the second didnt (logging
thresholds) we can infer from second entry same PA and mce type.
Signed-off-by: Mario Smarduch <msmarduch@digitalocean.com>
Message-Id: <20191009164459.8209-3-msmarduch@digitalocean.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add new version of Snowridge CPU model that removes MPX feature.
MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
KVM has a 80-entry limit at KVM_SET_CPUID2. With the
introduction of CPUID[0x1F], it is now possible to hit this limit
with unusual CPU configurations, e.g.:
$ ./x86_64-softmmu/qemu-system-x86_64 \
-smp 1,dies=2,maxcpus=2 \
-cpu EPYC,check=off,enforce=off \
-machine accel=kvm
qemu-system-x86_64: kvm_init_vcpu failed: Argument list too long
This happens because QEMU adds a lot of all-zeroes CPUID entries
for unused CPUID leaves. In the example above, we end up
creating 48 all-zeroes CPUID entries.
KVM already returns all-zeroes when emulating the CPUID
instruction if an entry is missing, so the all-zeroes entries are
redundant. Skip those entries. This reduces the CPUID table
size by half while keeping CPUID output unchanged.
Reported-by: Yumei Huang <yuhuang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1741508
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190822225210.32541-1-ehabkost@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX always equal to 0 even
through guest update xcr0, this will crash legacy guest(e.g., CentOS 6).
Below is the call trace on the guest.
[ 0.000000] kernel BUG at mm/bootmem.c:469!
[ 0.000000] invalid opcode: 0000 [#1] SMP
[ 0.000000] last sysfs file:
[ 0.000000] CPU 0
[ 0.000000] Modules linked in:
[ 0.000000]
[ 0.000000] Pid: 0, comm: swapper Tainted: G --------------- H 2.6.32-279#2 Red Hat KVM
[ 0.000000] RIP: 0010:[<ffffffff81c4edc4>] [<ffffffff81c4edc4>] alloc_bootmem_core+0x7b/0x29e
[ 0.000000] RSP: 0018:ffffffff81a01cd8 EFLAGS: 00010046
[ 0.000000] RAX: ffffffff81cb1748 RBX: ffffffff81cb1720 RCX: 0000000001000000
[ 0.000000] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffffffff81cb1720
[ 0.000000] RBP: ffffffff81a01d38 R08: 0000000000000000 R09: 0000000000001000
[ 0.000000] R10: 02008921da802087 R11: 00000000ffff8800 R12: 0000000000000000
[ 0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000001000000
[ 0.000000] FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406b0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.000000] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
[ 0.000000] Stack:
[ 0.000000] 0000000000000002 81a01dd881eaf060 000000007e5fe227 0000000000001001
[ 0.000000] <d> 0000000000000040 0000000000000001 0000006cffffffff 0000000001000000
[ 0.000000] <d> ffffffff81cb1720 0000000000000000 0000000000000000 0000000000000000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81c4f074>] ___alloc_bootmem_nopanic+0x8d/0xca
[ 0.000000] [<ffffffff81c4f0cf>] ___alloc_bootmem+0x11/0x39
[ 0.000000] [<ffffffff81c4f172>] __alloc_bootmem+0xb/0xd
[ 0.000000] [<ffffffff814d42d9>] xsave_cntxt_init+0x249/0x2c0
[ 0.000000] [<ffffffff814e0689>] init_thread_xstate+0x17/0x25
[ 0.000000] [<ffffffff814e0710>] fpu_init+0x79/0xaa
[ 0.000000] [<ffffffff814e27e3>] cpu_init+0x301/0x344
[ 0.000000] [<ffffffff81276395>] ? sort+0x155/0x230
[ 0.000000] [<ffffffff81c30cf2>] trap_init+0x24e/0x25f
[ 0.000000] [<ffffffff81c2bd73>] start_kernel+0x21c/0x430
[ 0.000000] [<ffffffff81c2b33a>] x86_64_start_reservations+0x125/0x129
[ 0.000000] [<ffffffff81c2b438>] x86_64_start_kernel+0xfa/0x109
[ 0.000000] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b eb fe 48 8b 45 c0 48 83 e8 01 48 85 45
c0 74 04 0f 0b eb
Signed-off-by: Bingsong Si <owen.si@ucloud.cn>
Message-Id: <20190822042901.16858-1-owen.si@ucloud.cn>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Add some comments, clean up comments over 80 chars per line. And there
is an extra line in comment of CPUID_8000_0008_EBX_WBNOINVD, remove
the extra enter and spaces.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-2-tao3.xu@intel.com>
[ehabkost: rebase to latest git master]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
When I run QEMU with KVM under Valgrind, I currently get this warning:
Syscall param ioctl(generic) points to uninitialised byte(s)
at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so)
by 0x429DC3: kvm_ioctl (kvm-all.c:2365)
by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469)
by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765)
by 0x4C4116: x86_cpu_expand_features (cpu.c:5065)
by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242)
by 0x5961F3: device_set_realized (qdev.c:835)
by 0x7038F6: property_set_bool (object.c:2080)
by 0x707EFE: object_property_set_qobject (qom-qobject.c:26)
by 0x705814: object_property_set_bool (object.c:1338)
by 0x498435: pc_new_cpu (pc.c:1549)
by 0x49C67D: pc_cpus_init (pc.c:1681)
Address 0x1ffeffee74 is on thread 1's stack
in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445)
It's harmless, but a little bit annoying, so silence it by properly
initializing the whole structure with zeroes.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Some secondary controls are automatically enabled/disabled based on the CPUID
values that are set for the guest. However, they are still available at a
global level and therefore should be present when KVM_GET_MSRS is sent to
/dev/kvm.
Unfortunately KVM forgot to include those, so fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish. The same
infrastructure enables support for limiting VMX features in named
CPU models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These will be used to compile the list of VMX features for named
CPU models, and/or by the code that sets up the VMX MSRs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Sometimes a CPU feature does not make sense unless another is
present. In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.
Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition. If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce". If not, then the dependent bits are
cleared silently, for convenience.
With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
mark_unavailable_features
The next patch will add a different reason for filtering features, unrelated
to host feature support. Extract a new function that takes care of disabling
the features and optionally reporting them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.
Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The WHPX build is broken since commit 12e9493df92 which removed the
"hw/boards.h" where MachineState is declared:
$ ./configure \
--enable-hax --enable-whpx
$ make x86_64-softmmu/all
[...]
CC x86_64-softmmu/target/i386/whpx-all.o
target/i386/whpx-all.c: In function 'whpx_accel_init':
target/i386/whpx-all.c:1378:25: error: dereferencing pointer to
incomplete type 'MachineState' {aka 'struct MachineState'}
whpx->mem_quota = ms->ram_size;
^~
make[1]: *** [rules.mak:69: target/i386/whpx-all.o] Error 1
CC x86_64-softmmu/trace/generated-helpers.o
make[1]: Target 'all' not remade because of errors.
make: *** [Makefile:471: x86_64-softmmu/all] Error 2
Restore this header, partially reverting commit 12e9493df92.
Fixes: 12e9493df92
Reported-by: Ilias Maratos <i.maratos@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190920113329.16787-2-philmd@redhat.com>
|
|
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190818225414.22590-3-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Allow guest reads CORE cstate when exposing host CPU power management capabilities
to the guest. PKG cstate is restricted to avoid a guest to get the whole package
information in multi-tenant scenario.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1563154124-18579-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
into staging
Monitor patches for 2019-08-21
# gpg: Signature made Wed 21 Aug 2019 16:35:07 BST
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-monitor-2019-08-21:
monitor/qmp: Update comment for commit 4eaca8de268
qdev: Collect HMP handlers command handlers in qdev-monitor.c
qapi: Move query-target from misc.json to machine.json
hw/core: Move cpu.c, cpu.h from qom/ to hw/core/
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
|
|
Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].
The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.
The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Prior patch resets can_do_io flag at the TB entry. Therefore there is no
need in resetting this flag at the end of the block.
This patch removes redundant gen_io_end calls.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
|
|
conversions
The x86 architecture requires that all conversions from floating
point to integer which raise the 'invalid' exception (infinities of
both signs, NaN, and all values which don't fit in the destination
integer) return what the x86 spec calls the "indefinite integer
value", which is 0x8000_0000 for 32-bits or 0x8000_0000_0000_0000 for
64-bits. The softfloat functions return the more usual behaviour of
positive overflows returning the maximum value that fits in the
destination integer format and negative overflows returning the
minimum value that fits.
Wrap the softfloat functions in x86-specific versions which
detect the 'invalid' condition and return the indefinite integer.
Note that we don't use these wrappers for the 3DNow! pf2id and pf2iw
instructions, which do return the minimum value that fits in
an int32 if the input float is a large negative number.
Fixes: https://bugs.launchpad.net/qemu/+bug/1815423
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20190805180332.10185-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Not the whole structure is initialized before passing it to the KVM.
Reduce the number of Valgrind reports.
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Function 'kvm_get_supported_msrs' is only called once
now, get rid of the static variable 'kvm_supported_msrs'.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190725151639.21693-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Adds PV_SCHED_YIELD CPUID feature bit.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1562745771-8414-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add support for halt poll control MSR: save/restore, migration
and new feature name.
The purpose of this MSR is to allow the guest to disable
host halt poll.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|