Age | Commit message (Collapse) | Author |
|
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <086c197db928384b8697edfa64755e2cb46c8100.1575685843.git.dirty@apple.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Legacy PCI device assignment has been already removed in commit ab37bfc7d641
("pci-assign: Remove"), but some codes remain unused.
CC: qemu-trivial@nongnu.org
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Message-Id: <20191209072932.313056-1-devel@etsukata.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This refactors the load library of WHV libraries to make it more
modular. It makes a helper routine that can be called on demand.
This allows future expansion of load library/functions to support
functionality that is dependent on some feature being available.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <MW2PR2101MB1116578040BE1F0C1B662318C0760@MW2PR2101MB1116.namprd21.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These are needed by microvm too, so move them outside of PC-specific files.
With this patch, microvm.c need not include pc.h anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add it to microvm as well, it is a generic property of the x86
architecture.
Suggested-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove the need to include i386/pc.h to get to the i8259 functions.
This is enough to remove the inclusion of hw/i386/pc.h from all non-x86
files.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator. For now
they just forward to the machine object, but this will change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Similar to CPU and machine classes, "-accel" class names are mangled,
so we have to first get a class via accel_find and then instantiate it.
Provide a new function to instantiate a class without going through
object_class_get_name, and use it for CPUs and machines already.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Get rid of 12 explicit g_free() calls.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191025025632.5928-1-ehabkost@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Cooper Lake is intel's successor to Cascade Lake, the new
CPU model inherits features from Cascadelake-Server, while
add one platform associated new feature: AVX512_BF16. Meanwhile,
add STIBP for speculative execution.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
stibp feature is already added through the following commit.
https://github.com/qemu/qemu/commit/0e8916582991b9fd0b94850a8444b8b80d0a0955
Add a macro for it to allow CPU models to report it when host supports.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Define MSR_ARCH_CAP_MDS_NO in the IA32_ARCH_CAPABILITIES MSR to allow
CPU models to report the feature when host supports it.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
If kvm does not support VMX feature by nested=0, the kvm_vmx_basic
can't get the right value from MSR_IA32_VMX_BASIC register, which
make qemu coredump when qemu do KVM_SET_MSRS.
The coredump info:
error: failed to set MSR 0x480 to 0x0
kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20191206071111.12128-1-yang.zhong@intel.com>
Reported-by: Catherine Ho <catherine.hecx@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Previous implementation in hvf_inject_interrupts() would always inject
VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required. Now
correctly determine when VMCS_INTR_T_HWINTR is appropriate versus
VMCS_INTR_T_SWINTR.
Make sure to clear ins_len and has_error_code when ins_len isn't
valid and error_code isn't set.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <bf8d945ea1b423786d7802bbcf769517d1fd01f8.1575330463.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
More accurately match SDM when setting CR0 and PDPTE registers.
Clear PDPTE registers when resetting vcpus.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <464adb39c8699fb8331d8ad6016fc3e2eff53dbc.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In real x86 processors, the REX prefix must come after legacy prefixes.
REX before legacy is ignored. Update the HVF emulation code to properly
handle this. Fix some spelling errors in constants. Fix some decoder
table initialization issues found by Coverity.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <eff30ded8307471936bec5d84c3b6efbc95e3211.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The existing code in QEMU's HVF support to attempt to synchronize TSC
across multiple cores is not sufficient. TSC value on other cores
can go backwards. Until implementation is fixed, remove calls to
hv_vm_sync_tsc(). Pass through TSC to guest OS.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <44c4afd2301b8bf99682b229b0796d84edd6d66f.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If an area is non-RAM and non-ROMD, then remove mappings so accesses
will trap and can be emulated. Change hvf_find_overlap_slot() to take
a size instead of an end address: it wouldn't return a slot because
callers would pass the same address for start and end. Don't always
map area as read/write/execute, respect area flags.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <1d8476c8f86959273fbdf23c86f8b4b611f5e2e1.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We have been trying to avoid adding new aliases for CPU model
versions, but in the case of changes in defaults introduced by
the TAA mitigation patches, the aliases might help avoid user
confusion when applying host software updates.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
One of the mitigation methods for TAA[1] is to disable TSX
support on the host system. Linux added a mechanism to disable
TSX globally through the kernel command line, and many Linux
distributions now default to tsx=off. This makes existing CPU
models that have HLE and RTM enabled not usable anymore.
Add new versions of all CPU models that have the HLE and RTM
features enabled, that can be used when TSX is disabled in the
host system.
References:
[1] TAA, TSX asynchronous Abort:
https://software.intel.com/security-software-guidance/insights/deep-dive-intel-transactional-synchronization-extensions-intel-tsx-asynchronous-abort
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension). By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This allows using "-cpu Haswell,+vmx", which we did not really want to
support in QEMU but was produced by Libvirt when using the "host-model"
CPU model. Without this patch, no VMX feature is _actually_ supported
(only the basic instruction set extensions are) and KVM fails to load
in the guest.
This was produced from the output of scripts/kvm/vmxcap using the following
very ugly Python script:
bits = {
'INS/OUTS instruction information': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_INS_OUTS'],
'IA32_VMX_TRUE_*_CTLS support': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_TRUE_CTLS'],
'External interrupt exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_EXT_INTR_MASK'],
'NMI exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_NMI_EXITING'],
'Virtual NMIs': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VIRTUAL_NMIS'],
'Activate VMX-preemption timer': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VMX_PREEMPTION_TIMER'],
'Process posted interrupts': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_POSTED_INTR'],
'Interrupt window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_INTR_PENDING'],
'Use TSC offsetting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_TSC_OFFSETING'],
'HLT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_HLT_EXITING'],
'INVLPG exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_INVLPG_EXITING'],
'MWAIT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MWAIT_EXITING'],
'RDPMC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDPMC_EXITING'],
'RDTSC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDTSC_EXITING'],
'CR3-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_LOAD_EXITING'],
'CR3-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_STORE_EXITING'],
'CR8-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_LOAD_EXITING'],
'CR8-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_STORE_EXITING'],
'Use TPR shadow': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_TPR_SHADOW'],
'NMI-window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_NMI_PENDING'],
'MOV-DR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MOV_DR_EXITING'],
'Unconditional I/O exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_UNCOND_IO_EXITING'],
'Use I/O bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_IO_BITMAPS'],
'Monitor trap flag': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_TRAP_FLAG'],
'Use MSR bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_MSR_BITMAPS'],
'MONITOR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_EXITING'],
'PAUSE exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_PAUSE_EXITING'],
'Activate secondary control': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS'],
'Virtualize APIC accesses': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES'],
'Enable EPT': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_EPT'],
'Descriptor-table exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_DESC'],
'Enable RDTSCP': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDTSCP'],
'Virtualize x2APIC mode': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE'],
'Enable VPID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VPID'],
'WBINVD exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_WBINVD_EXITING'],
'Unrestricted guest': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST'],
'APIC register emulation': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT'],
'Virtual interrupt delivery': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY'],
'PAUSE-loop exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_PAUSE_LOOP_EXITING'],
'RDRAND exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDRAND_EXITING'],
'Enable INVPCID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_INVPCID'],
'Enable VM functions': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VMFUNC'],
'VMCS shadowing': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_SHADOW_VMCS'],
'RDSEED exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDSEED_EXITING'],
'Enable PML': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_PML'],
'Enable XSAVES/XRSTORS': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_XSAVES'],
'Save debug controls': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_DEBUG_CONTROLS'],
'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL'],
'Acknowledge interrupt on exit': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_ACK_INTR_ON_EXIT'],
'Save IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_PAT'],
'Load IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PAT'],
'Save IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_EFER'],
'Load IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_EFER'],
'Save VMX-preemption timer value': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER'],
'Clear IA32_BNDCFGS': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_CLEAR_BNDCFGS'],
'Load debug controls': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS'],
'IA-32e mode guest': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_IA32E_MODE'],
'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL'],
'Load IA32_PAT': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PAT'],
'Load IA32_EFER': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_EFER'],
'Load IA32_BNDCFGS': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_BNDCFGS'],
'Store EFER.LMA into IA-32e mode guest control': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_STORE_LMA'],
'HLT activity state': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ACTIVITY_HLT'],
'VMWRITE to VM-exit information fields': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_VMWRITE_VMEXIT'],
'Inject event with insn length=0': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ZERO_LEN_INJECT'],
'Execute-only EPT translations': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_EXECONLY'],
'Page-walk length 4': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_PAGE_WALK_LENGTH_4'],
'Paging-structure memory type WB': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_WB'],
'2MB EPT pages': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB'],
'INVEPT supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT'],
'EPT accessed and dirty flags': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_AD_BITS'],
'Single-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT'],
'All-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_ALL_CONTEXT'],
'INVVPID supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID'],
'Individual-address INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_ADDR'],
'Single-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT'],
'All-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_ALL_CONTEXT'],
'Single-context-retaining-globals INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS'],
'EPTP Switching': ['FEAT_VMX_VMFUNC', 'MSR_VMX_VMFUNC_EPT_SWITCHING']
}
import sys
import textwrap
out = {}
for l in sys.stdin.readlines():
l = l.rstrip()
if l.endswith('!!'):
l = l[:-2].rstrip()
if l.startswith(' ') and (l.endswith('default') or l.endswith('yes')):
l = l[4:]
for key, value in bits.items():
if l.startswith(key):
ctl, bit = value
if ctl in out:
out[ctl] = out[ctl] + ' | '
else:
out[ctl] = ' [%s] = ' % ctl
out[ctl] = out[ctl] + bit
for x in sorted(out.keys()):
print("\n ".join(textwrap.wrap(out[x] + ",")))
Note that the script has a bug in that some keys apply to both VM entry
and VM exit controls ("load IA32_PERF_GLOBAL_CTRL", "load IA32_EFER",
"load IA32_PAT". Those have to be fixed by hand.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Attempting to migrate a VM using the microvm machine class results in the source
QEMU aborting with the following message/backtrace:
target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an
instance of type generic-pc-machine
abort()
object_class_dynamic_cast_assert()
vmstate_save_state_v()
vmstate_save_state()
vmstate_save()
qemu_savevm_state_complete_precopy()
migration_thread()
migration_thread()
migration_thread()
qemu_thread_start()
start_thread()
clone()
The access to the machine class returned by MACHINE_GET_CLASS() in
tsc_khz_needed() is crashing as it is trying to dereference a different
type of machine class object (TYPE_PC_MACHINE) to that of this microVM.
This can be resolved by extending the changes in the following commit
f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it")
and moving the save_tsc_khz field in PCMachineClass to X86MachineClass.
Fixes: f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it")
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).
Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is required to disable ITLB multihit mitigations in nested
hypervisors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Change the handling of port F0h writes and FPU exceptions to implement IGNNE.
The implementation mixes a bit what the chipset and processor do in real
hardware, but the effect is the same as what happens with actual FERR#
and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering
FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In the next patch, this will provide a hook to detect clearing of
FSW.ES.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Move it out of pc.c since it is strictly tied to TCG. This is
almost exclusively code movement, the next patch will implement
IGNNE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This merge fixes a semantic conflict with the trivial tree.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Denverton is the Atom Processor of Intel Harrisonville platform.
For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR
index E1H to determines the maximum time in TSC-quanta that the processor
can reside in either C0.1 or C0.2.
This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in
guest.
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-3-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.
The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).
The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".
However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.
KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).
Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
pinned and correct topology is exposed. As CPU pinning is done outside
of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Patch logs MCE AO, AR messages injected to guest or taken by QEMU itself.
We print the QEMU address for guest MCEs, helps on hypervisors that have
another source of MCE logging like mce log, and when they go missing.
For example we found these QEMU logs:
September 26th 2019, 17:36:02.309 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN
September 27th 2019, 06:25:03.234 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN
The first log had a corresponding mce log entry, the second didnt (logging
thresholds) we can infer from second entry same PA and mce type.
Signed-off-by: Mario Smarduch <msmarduch@digitalocean.com>
Message-Id: <20191009164459.8209-3-msmarduch@digitalocean.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add new version of Snowridge CPU model that removes MPX feature.
MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
KVM has a 80-entry limit at KVM_SET_CPUID2. With the
introduction of CPUID[0x1F], it is now possible to hit this limit
with unusual CPU configurations, e.g.:
$ ./x86_64-softmmu/qemu-system-x86_64 \
-smp 1,dies=2,maxcpus=2 \
-cpu EPYC,check=off,enforce=off \
-machine accel=kvm
qemu-system-x86_64: kvm_init_vcpu failed: Argument list too long
This happens because QEMU adds a lot of all-zeroes CPUID entries
for unused CPUID leaves. In the example above, we end up
creating 48 all-zeroes CPUID entries.
KVM already returns all-zeroes when emulating the CPUID
instruction if an entry is missing, so the all-zeroes entries are
redundant. Skip those entries. This reduces the CPUID table
size by half while keeping CPUID output unchanged.
Reported-by: Yumei Huang <yuhuang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1741508
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190822225210.32541-1-ehabkost@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX always equal to 0 even
through guest update xcr0, this will crash legacy guest(e.g., CentOS 6).
Below is the call trace on the guest.
[ 0.000000] kernel BUG at mm/bootmem.c:469!
[ 0.000000] invalid opcode: 0000 [#1] SMP
[ 0.000000] last sysfs file:
[ 0.000000] CPU 0
[ 0.000000] Modules linked in:
[ 0.000000]
[ 0.000000] Pid: 0, comm: swapper Tainted: G --------------- H 2.6.32-279#2 Red Hat KVM
[ 0.000000] RIP: 0010:[<ffffffff81c4edc4>] [<ffffffff81c4edc4>] alloc_bootmem_core+0x7b/0x29e
[ 0.000000] RSP: 0018:ffffffff81a01cd8 EFLAGS: 00010046
[ 0.000000] RAX: ffffffff81cb1748 RBX: ffffffff81cb1720 RCX: 0000000001000000
[ 0.000000] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffffffff81cb1720
[ 0.000000] RBP: ffffffff81a01d38 R08: 0000000000000000 R09: 0000000000001000
[ 0.000000] R10: 02008921da802087 R11: 00000000ffff8800 R12: 0000000000000000
[ 0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000001000000
[ 0.000000] FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406b0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.000000] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
[ 0.000000] Stack:
[ 0.000000] 0000000000000002 81a01dd881eaf060 000000007e5fe227 0000000000001001
[ 0.000000] <d> 0000000000000040 0000000000000001 0000006cffffffff 0000000001000000
[ 0.000000] <d> ffffffff81cb1720 0000000000000000 0000000000000000 0000000000000000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81c4f074>] ___alloc_bootmem_nopanic+0x8d/0xca
[ 0.000000] [<ffffffff81c4f0cf>] ___alloc_bootmem+0x11/0x39
[ 0.000000] [<ffffffff81c4f172>] __alloc_bootmem+0xb/0xd
[ 0.000000] [<ffffffff814d42d9>] xsave_cntxt_init+0x249/0x2c0
[ 0.000000] [<ffffffff814e0689>] init_thread_xstate+0x17/0x25
[ 0.000000] [<ffffffff814e0710>] fpu_init+0x79/0xaa
[ 0.000000] [<ffffffff814e27e3>] cpu_init+0x301/0x344
[ 0.000000] [<ffffffff81276395>] ? sort+0x155/0x230
[ 0.000000] [<ffffffff81c30cf2>] trap_init+0x24e/0x25f
[ 0.000000] [<ffffffff81c2bd73>] start_kernel+0x21c/0x430
[ 0.000000] [<ffffffff81c2b33a>] x86_64_start_reservations+0x125/0x129
[ 0.000000] [<ffffffff81c2b438>] x86_64_start_kernel+0xfa/0x109
[ 0.000000] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b eb fe 48 8b 45 c0 48 83 e8 01 48 85 45
c0 74 04 0f 0b eb
Signed-off-by: Bingsong Si <owen.si@ucloud.cn>
Message-Id: <20190822042901.16858-1-owen.si@ucloud.cn>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
Add some comments, clean up comments over 80 chars per line. And there
is an extra line in comment of CPUID_8000_0008_EBX_WBNOINVD, remove
the extra enter and spaces.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-2-tao3.xu@intel.com>
[ehabkost: rebase to latest git master]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
When I run QEMU with KVM under Valgrind, I currently get this warning:
Syscall param ioctl(generic) points to uninitialised byte(s)
at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so)
by 0x429DC3: kvm_ioctl (kvm-all.c:2365)
by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469)
by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765)
by 0x4C4116: x86_cpu_expand_features (cpu.c:5065)
by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242)
by 0x5961F3: device_set_realized (qdev.c:835)
by 0x7038F6: property_set_bool (object.c:2080)
by 0x707EFE: object_property_set_qobject (qom-qobject.c:26)
by 0x705814: object_property_set_bool (object.c:1338)
by 0x498435: pc_new_cpu (pc.c:1549)
by 0x49C67D: pc_cpus_init (pc.c:1681)
Address 0x1ffeffee74 is on thread 1's stack
in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445)
It's harmless, but a little bit annoying, so silence it by properly
initializing the whole structure with zeroes.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Some secondary controls are automatically enabled/disabled based on the CPUID
values that are set for the guest. However, they are still available at a
global level and therefore should be present when KVM_GET_MSRS is sent to
/dev/kvm.
Unfortunately KVM forgot to include those, so fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish. The same
infrastructure enables support for limiting VMX features in named
CPU models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These will be used to compile the list of VMX features for named
CPU models, and/or by the code that sets up the VMX MSRs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Sometimes a CPU feature does not make sense unless another is
present. In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.
Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition. If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce". If not, then the dependent bits are
cleared silently, for convenience.
With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
mark_unavailable_features
The next patch will add a different reason for filtering features, unrelated
to host feature support. Extract a new function that takes care of disabling
the features and optionally reporting them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.
Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|