slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2019-12-03	hvf: correctly inject VMCS_INTR_T_HWINTR versus VMCS_INTR_T_SWINTR.	Cameron Esfahani
	Previous implementation in hvf_inject_interrupts() would always inject VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required. Now correctly determine when VMCS_INTR_T_HWINTR is appropriate versus VMCS_INTR_T_SWINTR. Make sure to clear ins_len and has_error_code when ins_len isn't valid and error_code isn't set. Signed-off-by: Cameron Esfahani <dirty@apple.com> Message-Id: <bf8d945ea1b423786d7802bbcf769517d1fd01f8.1575330463.git.dirty@apple.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26	hvf: more accurately match SDM when setting CR0 and PDPTE registers	Cameron Esfahani
	More accurately match SDM when setting CR0 and PDPTE registers. Clear PDPTE registers when resetting vcpus. Signed-off-by: Cameron Esfahani <dirty@apple.com> Message-Id: <464adb39c8699fb8331d8ad6016fc3e2eff53dbc.1574625592.git.dirty@apple.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26	hvf: correctly handle REX prefix in relation to legacy prefixes	Cameron Esfahani
	In real x86 processors, the REX prefix must come after legacy prefixes. REX before legacy is ignored. Update the HVF emulation code to properly handle this. Fix some spelling errors in constants. Fix some decoder table initialization issues found by Coverity. Signed-off-by: Cameron Esfahani <dirty@apple.com> Message-Id: <eff30ded8307471936bec5d84c3b6efbc95e3211.1574625592.git.dirty@apple.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26	hvf: remove TSC synchronization code because it isn't fully complete	Cameron Esfahani
	The existing code in QEMU's HVF support to attempt to synchronize TSC across multiple cores is not sufficient. TSC value on other cores can go backwards. Until implementation is fixed, remove calls to hv_vm_sync_tsc(). Pass through TSC to guest OS. Signed-off-by: Cameron Esfahani <dirty@apple.com> Message-Id: <44c4afd2301b8bf99682b229b0796d84edd6d66f.1574625592.git.dirty@apple.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26	hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in	Cameron Esfahani
	If an area is non-RAM and non-ROMD, then remove mappings so accesses will trap and can be emulated. Change hvf_find_overlap_slot() to take a size instead of an end address: it wouldn't return a slot because callers would pass the same address for start and end. Don't always map area as read/write/execute, respect area flags. Signed-off-by: Cameron Esfahani <dirty@apple.com> Message-Id: <1d8476c8f86959273fbdf23c86f8b4b611f5e2e1.1574625592.git.dirty@apple.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26	target/i386: add two missing VMX features for Skylake and CascadeLake Server	Paolo Bonzini
	They are present in client (Core) Skylake but pasted wrong into the server SKUs. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21	i386: Add -noTSX aliases for hle=off, rtm=off CPU models	Eduardo Habkost
	We have been trying to avoid adding new aliases for CPU model versions, but in the case of changes in defaults introduced by the TAA mitigation patches, the aliases might help avoid user confusion when applying host software updates. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21	i386: Add new versions of Skylake/Cascadelake/Icelake without TSX	Eduardo Habkost
	One of the mitigation methods for TAA[1] is to disable TSX support on the host system. Linux added a mechanism to disable TSX globally through the kernel command line, and many Linux distributions now default to tsx=off. This makes existing CPU models that have HLE and RTM enabled not usable anymore. Add new versions of all CPU models that have the HLE and RTM features enabled, that can be used when TSX is disabled in the host system. References: [1] TAA, TSX asynchronous Abort: https://software.intel.com/security-software-guidance/insights/deep-dive-intel-transactional-synchronization-extensions-intel-tsx-asynchronous-abort https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21	target/i386: add support for MSR_IA32_TSX_CTRL	Paolo Bonzini
	The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the Trusty Side-channel Extension). By virtualizing the MSR, KVM guests can disable TSX and avoid paying the price of mitigating TSX-based attacks on microarchitectural side channels. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21	target/i386: add VMX features to named CPU models	Paolo Bonzini
	This allows using "-cpu Haswell,+vmx", which we did not really want to support in QEMU but was produced by Libvirt when using the "host-model" CPU model. Without this patch, no VMX feature is _actually_ supported (only the basic instruction set extensions are) and KVM fails to load in the guest. This was produced from the output of scripts/kvm/vmxcap using the following very ugly Python script: bits = { 'INS/OUTS instruction information': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_INS_OUTS'], 'IA32_VMX_TRUE_*_CTLS support': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_TRUE_CTLS'], 'External interrupt exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_EXT_INTR_MASK'], 'NMI exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_NMI_EXITING'], 'Virtual NMIs': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VIRTUAL_NMIS'], 'Activate VMX-preemption timer': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VMX_PREEMPTION_TIMER'], 'Process posted interrupts': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_POSTED_INTR'], 'Interrupt window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_INTR_PENDING'], 'Use TSC offsetting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_TSC_OFFSETING'], 'HLT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_HLT_EXITING'], 'INVLPG exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_INVLPG_EXITING'], 'MWAIT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MWAIT_EXITING'], 'RDPMC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDPMC_EXITING'], 'RDTSC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDTSC_EXITING'], 'CR3-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_LOAD_EXITING'], 'CR3-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_STORE_EXITING'], 'CR8-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_LOAD_EXITING'], 'CR8-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_STORE_EXITING'], 'Use TPR shadow': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_TPR_SHADOW'], 'NMI-window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_NMI_PENDING'], 'MOV-DR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MOV_DR_EXITING'], 'Unconditional I/O exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_UNCOND_IO_EXITING'], 'Use I/O bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_IO_BITMAPS'], 'Monitor trap flag': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_TRAP_FLAG'], 'Use MSR bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_MSR_BITMAPS'], 'MONITOR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_EXITING'], 'PAUSE exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_PAUSE_EXITING'], 'Activate secondary control': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS'], 'Virtualize APIC accesses': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES'], 'Enable EPT': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_EPT'], 'Descriptor-table exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_DESC'], 'Enable RDTSCP': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDTSCP'], 'Virtualize x2APIC mode': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE'], 'Enable VPID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VPID'], 'WBINVD exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_WBINVD_EXITING'], 'Unrestricted guest': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST'], 'APIC register emulation': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT'], 'Virtual interrupt delivery': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY'], 'PAUSE-loop exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_PAUSE_LOOP_EXITING'], 'RDRAND exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDRAND_EXITING'], 'Enable INVPCID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_INVPCID'], 'Enable VM functions': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VMFUNC'], 'VMCS shadowing': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_SHADOW_VMCS'], 'RDSEED exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDSEED_EXITING'], 'Enable PML': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_PML'], 'Enable XSAVES/XRSTORS': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_XSAVES'], 'Save debug controls': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_DEBUG_CONTROLS'], 'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL'], 'Acknowledge interrupt on exit': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_ACK_INTR_ON_EXIT'], 'Save IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_PAT'], 'Load IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PAT'], 'Save IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_EFER'], 'Load IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_EFER'], 'Save VMX-preemption timer value': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER'], 'Clear IA32_BNDCFGS': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_CLEAR_BNDCFGS'], 'Load debug controls': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS'], 'IA-32e mode guest': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_IA32E_MODE'], 'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL'], 'Load IA32_PAT': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PAT'], 'Load IA32_EFER': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_EFER'], 'Load IA32_BNDCFGS': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_BNDCFGS'], 'Store EFER.LMA into IA-32e mode guest control': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_STORE_LMA'], 'HLT activity state': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ACTIVITY_HLT'], 'VMWRITE to VM-exit information fields': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_VMWRITE_VMEXIT'], 'Inject event with insn length=0': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ZERO_LEN_INJECT'], 'Execute-only EPT translations': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_EXECONLY'], 'Page-walk length 4': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_PAGE_WALK_LENGTH_4'], 'Paging-structure memory type WB': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_WB'], '2MB EPT pages': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_2MB \| MSR_VMX_EPT_1GB'], 'INVEPT supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT'], 'EPT accessed and dirty flags': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_AD_BITS'], 'Single-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT'], 'All-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_ALL_CONTEXT'], 'INVVPID supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID'], 'Individual-address INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_ADDR'], 'Single-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT'], 'All-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_ALL_CONTEXT'], 'Single-context-retaining-globals INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS'], 'EPTP Switching': ['FEAT_VMX_VMFUNC', 'MSR_VMX_VMFUNC_EPT_SWITCHING'] } import sys import textwrap out = {} for l in sys.stdin.readlines(): l = l.rstrip() if l.endswith('!!'): l = l[:-2].rstrip() if l.startswith(' ') and (l.endswith('default') or l.endswith('yes')): l = l[4:] for key, value in bits.items(): if l.startswith(key): ctl, bit = value if ctl in out: out[ctl] = out[ctl] + ' \| ' else: out[ctl] = ' [%s] = ' % ctl out[ctl] = out[ctl] + bit for x in sorted(out.keys()): print("\n ".join(textwrap.wrap(out[x] + ","))) Note that the script has a bug in that some keys apply to both VM entry and VM exit controls ("load IA32_PERF_GLOBAL_CTRL", "load IA32_EFER", "load IA32_PAT". Those have to be fixed by hand. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19	hw/i386: Move save_tsc_khz from PCMachineClass to X86MachineClass	Liam Merwick
	Attempting to migrate a VM using the microvm machine class results in the source QEMU aborting with the following message/backtrace: target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an instance of type generic-pc-machine abort() object_class_dynamic_cast_assert() vmstate_save_state_v() vmstate_save_state() vmstate_save() qemu_savevm_state_complete_precopy() migration_thread() migration_thread() migration_thread() qemu_thread_start() start_thread() clone() The access to the machine class returned by MACHINE_GET_CLASS() in tsc_khz_needed() is crashing as it is trying to dereference a different type of machine class object (TYPE_PC_MACHINE) to that of this microVM. This can be resolved by extending the changes in the following commit f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it") and moving the save_tsc_khz field in PCMachineClass to X86MachineClass. Fixes: f0bb276bf8d5 ("hw/i386: split PCMachineState deriving X86MachineState from it") Signed-off-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19	target/i386: Export TAA_NO bit to guests	Pawan Gupta
	TSX Async Abort (TAA) is a side channel attack on internal buffers in some Intel processors similar to Microachitectural Data Sampling (MDS). Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to TAA. Make this bit available to guests. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19	target/i386: add PSCHANGE_NO bit for the ARCH_CAPABILITIES MSR	Paolo Bonzini
	This is required to disable ITLB multihit mitigations in nested hypervisors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-28	target/i386: fetch code with translator_ld	Emilio G. Cota
	Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-10-26	i386: implement IGNNE	Paolo Bonzini
	Change the handling of port F0h writes and FPU exceptions to implement IGNNE. The implementation mixes a bit what the chipset and processor do in real hardware, but the effect is the same as what happens with actual FERR# and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26	target/i386: introduce cpu_set_fpus	Paolo Bonzini
	In the next patch, this will provide a hook to detect clearing of FSW.ES. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26	target/i386: move FERR handling to target/i386	Paolo Bonzini
	Move it out of pc.c since it is strictly tied to TCG. This is almost exclusively code movement, the next patch will implement IGNNE. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26	Merge commit 'df84f17' into HEAD	Paolo Bonzini
	This merge fixes a semantic conflict with the trivial tree. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-23	target/i386: Introduce Denverton CPU model	Tao Xu
	Denverton is the Atom Processor of Intel Harrisonville platform. For more information: https://ark.intel.com/content/www/us/en/ark/products/\ codename/63508/denverton.html Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190718073405.28301-1-tao3.xu@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-23	target/i386: Add support for save/load IA32_UMWAIT_CONTROL MSR	Tao Xu
	UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR index E1H to determines the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in guest. Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191011074103.30393-3-tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-23	x86/cpu: Add support for UMONITOR/UMWAIT/TPAUSE	Tao Xu
	UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions. This patch adds support for user wait instructions in KVM. Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to set the maximum time. The patch enable the umonitor, umwait and tpause features in KVM. Because umwait and tpause can put a (psysical) CPU into a power saving state, by default we dont't expose it to kvm and enable it only when guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on" (enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE instructions. If the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay (the time to delay relative to the VMâ€™s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause an invalid-opcode exception(#UD). The release document ref below link: https://software.intel.com/sites/default/files/\ managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191011074103.30393-2-tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22	i386/kvm: add NoNonArchitecturalCoreSharing Hyper-V enlightenment	Vitaly Kuznetsov
	Hyper-V TLFS specifies this enlightenment as: "NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never share a physical core with another virtual processor, except for virtual processors that are reported as sibling SMT threads. This can be used as an optimization to avoid the performance overhead of STIBP". However, STIBP is not the only implication. It was found that Hyper-V on KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see NoNonArchitecturalCoreSharing bit. KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to indicate that SMT on the host is impossible (not supported of forcefully disabled). Implement NoNonArchitecturalCoreSharing support in QEMU as tristate: 'off' - the feature is disabled (default) 'on' - the feature is enabled. This is only safe if vCPUS are properly pinned and correct topology is exposed. As CPU pinning is done outside of QEMU the enablement decision will be made on a higher level. 'auto' - copy KVM setting. As during live migration SMT settings on the source and destination host may differ this requires us to add a migration blocker. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20191018163908.10246-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22	target/i386: log MCE guest and host addresses	Mario Smarduch
	Patch logs MCE AO, AR messages injected to guest or taken by QEMU itself. We print the QEMU address for guest MCEs, helps on hypervisors that have another source of MCE logging like mce log, and when they go missing. For example we found these QEMU logs: September 26th 2019, 17:36:02.309 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN September 27th 2019, 06:25:03.234 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN The first log had a corresponding mce log entry, the second didnt (logging thresholds) we can infer from second entry same PA and mce type. Signed-off-by: Mario Smarduch <msmarduch@digitalocean.com> Message-Id: <20191009164459.8209-3-msmarduch@digitalocean.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-15	target/i386: Add Snowridge-v2 (no MPX) CPU model	Xiaoyao Li
	Add new version of Snowridge CPU model that removes MPX feature. MPX support is being phased out by Intel. GCC has dropped it, Linux kernel and KVM are also going to do that in the future. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15	i386: Omit all-zeroes entries from KVM CPUID table	Eduardo Habkost
	KVM has a 80-entry limit at KVM_SET_CPUID2. With the introduction of CPUID[0x1F], it is now possible to hit this limit with unusual CPU configurations, e.g.: $ ./x86_64-softmmu/qemu-system-x86_64 \ -smp 1,dies=2,maxcpus=2 \ -cpu EPYC,check=off,enforce=off \ -machine accel=kvm qemu-system-x86_64: kvm_init_vcpu failed: Argument list too long This happens because QEMU adds a lot of all-zeroes CPUID entries for unused CPUID leaves. In the example above, we end up creating 48 all-zeroes CPUID entries. KVM already returns all-zeroes when emulating the CPUID instruction if an entry is missing, so the all-zeroes entries are redundant. Skip those entries. This reduces the CPUID table size by half while keeping CPUID output unchanged. Reported-by: Yumei Huang <yuhuang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1741508 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190822225210.32541-1-ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15	i386: Fix legacy guest with xsave panic on host kvm without update cpuid.	Bingsong Si
	without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX always equal to 0 even through guest update xcr0, this will crash legacy guest(e.g., CentOS 6). Below is the call trace on the guest. [ 0.000000] kernel BUG at mm/bootmem.c:469! [ 0.000000] invalid opcode: 0000 [#1] SMP [ 0.000000] last sysfs file: [ 0.000000] CPU 0 [ 0.000000] Modules linked in: [ 0.000000] [ 0.000000] Pid: 0, comm: swapper Tainted: G --------------- H 2.6.32-279#2 Red Hat KVM [ 0.000000] RIP: 0010:[<ffffffff81c4edc4>] [<ffffffff81c4edc4>] alloc_bootmem_core+0x7b/0x29e [ 0.000000] RSP: 0018:ffffffff81a01cd8 EFLAGS: 00010046 [ 0.000000] RAX: ffffffff81cb1748 RBX: ffffffff81cb1720 RCX: 0000000001000000 [ 0.000000] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffffffff81cb1720 [ 0.000000] RBP: ffffffff81a01d38 R08: 0000000000000000 R09: 0000000000001000 [ 0.000000] R10: 02008921da802087 R11: 00000000ffff8800 R12: 0000000000000000 [ 0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000001000000 [ 0.000000] FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 [ 0.000000] CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033 [ 0.000000] CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406b0 [ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 0.000000] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020) [ 0.000000] Stack: [ 0.000000] 0000000000000002 81a01dd881eaf060 000000007e5fe227 0000000000001001 [ 0.000000] <d> 0000000000000040 0000000000000001 0000006cffffffff 0000000001000000 [ 0.000000] <d> ffffffff81cb1720 0000000000000000 0000000000000000 0000000000000000 [ 0.000000] Call Trace: [ 0.000000] [<ffffffff81c4f074>] ___alloc_bootmem_nopanic+0x8d/0xca [ 0.000000] [<ffffffff81c4f0cf>] ___alloc_bootmem+0x11/0x39 [ 0.000000] [<ffffffff81c4f172>] __alloc_bootmem+0xb/0xd [ 0.000000] [<ffffffff814d42d9>] xsave_cntxt_init+0x249/0x2c0 [ 0.000000] [<ffffffff814e0689>] init_thread_xstate+0x17/0x25 [ 0.000000] [<ffffffff814e0710>] fpu_init+0x79/0xaa [ 0.000000] [<ffffffff814e27e3>] cpu_init+0x301/0x344 [ 0.000000] [<ffffffff81276395>] ? sort+0x155/0x230 [ 0.000000] [<ffffffff81c30cf2>] trap_init+0x24e/0x25f [ 0.000000] [<ffffffff81c2bd73>] start_kernel+0x21c/0x430 [ 0.000000] [<ffffffff81c2b33a>] x86_64_start_reservations+0x125/0x129 [ 0.000000] [<ffffffff81c2b438>] x86_64_start_kernel+0xfa/0x109 [ 0.000000] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b eb fe 48 8b 45 c0 48 83 e8 01 48 85 45 c0 74 04 0f 0b eb Signed-off-by: Bingsong Si <owen.si@ucloud.cn> Message-Id: <20190822042901.16858-1-owen.si@ucloud.cn> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15	target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro	Tao Xu
	Drop the duplicated definition of cpuid AVX512_VBMI macro and rename it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as CPUID_7_0_ECX_AVX512_VBMI2. Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190926021055.6970-3-tao3.xu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15	target/i386: clean up comments over 80 chars per line	Tao Xu
	Add some comments, clean up comments over 80 chars per line. And there is an extra line in comment of CPUID_8000_0008_EBX_WBNOINVD, remove the extra enter and spaces. Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190926021055.6970-2-tao3.xu@intel.com> [ehabkost: rebase to latest git master] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-04	target/i386/kvm: Silence warning from Valgrind about uninitialized bytes	Thomas Huth
	When I run QEMU with KVM under Valgrind, I currently get this warning: Syscall param ioctl(generic) points to uninitialised byte(s) at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so) by 0x429DC3: kvm_ioctl (kvm-all.c:2365) by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469) by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765) by 0x4C4116: x86_cpu_expand_features (cpu.c:5065) by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242) by 0x5961F3: device_set_realized (qdev.c:835) by 0x7038F6: property_set_bool (object.c:2080) by 0x707EFE: object_property_set_qobject (qom-qobject.c:26) by 0x705814: object_property_set_bool (object.c:1338) by 0x498435: pc_new_cpu (pc.c:1549) by 0x49C67D: pc_cpus_init (pc.c:1681) Address 0x1ffeffee74 is on thread 1's stack in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445) It's harmless, but a little bit annoying, so silence it by properly initializing the whole structure with zeroes. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: work around KVM_GET_MSRS bug for secondary execution controls	Paolo Bonzini
	Some secondary controls are automatically enabled/disabled based on the CPUID values that are set for the guest. However, they are still available at a global level and therefore should be present when KVM_GET_MSRS is sent to /dev/kvm. Unfortunately KVM forgot to include those, so fix that. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: add VMX features	Paolo Bonzini
	Add code to convert the VMX feature words back into MSR values, allowing the user to enable/disable VMX features as they wish. The same infrastructure enables support for limiting VMX features in named CPU models. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: add VMX definitions	Paolo Bonzini
	These will be used to compile the list of VMX features for named CPU models, and/or by the code that sets up the VMX MSRs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: expand feature words to 64 bits	Paolo Bonzini
	VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but actually have only 32 bits of information). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: introduce generic feature dependency mechanism	Paolo Bonzini
	Sometimes a CPU feature does not make sense unless another is present. In the case of VMX features, KVM does not even allow setting the VMX controls to some invalid combinations. Therefore, this patch adds a generic mechanism that looks for bits that the user explicitly cleared, and uses them to remove other bits from the expanded CPU definition. If these dependent bits were also explicitly set by the user, this will be a warning for "-cpu check" and an error for "-cpu enforce". If not, then the dependent bits are cleared silently, for convenience. With VMX features, this will be used so that for example "-cpu host,-rdrand" will also hide support for RDRAND exiting. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	target/i386: handle filtered_features in a new function ↵	Paolo Bonzini
	mark_unavailable_features The next patch will add a different reason for filtering features, unrelated to host feature support. Extract a new function that takes care of disabling the features and optionally reporting them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	Fix wrong behavior of cpu_memory_rw_debug() function in SMM	Dmitry Poletaev
	There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example. Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly. Here the patch to fix it. Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04	i386: Add CPUID bit for CLZERO and XSAVEERPTR	Sebastian Andrzej Siewior
	The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform and could be passed to the guest. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-26	target/i386: Fix broken build with WHPX enabled	Philippe Mathieu-Daudé
	The WHPX build is broken since commit 12e9493df92 which removed the "hw/boards.h" where MachineState is declared: $ ./configure \ --enable-hax --enable-whpx $ make x86_64-softmmu/all [...] CC x86_64-softmmu/target/i386/whpx-all.o target/i386/whpx-all.c: In function 'whpx_accel_init': target/i386/whpx-all.c:1378:25: error: dereferencing pointer to incomplete type 'MachineState' {aka 'struct MachineState'} whpx->mem_quota = ms->ram_size; ^~ make[1]: * [rules.mak:69: target/i386/whpx-all.o] Error 1 CC x86_64-softmmu/trace/generated-helpers.o make[1]: Target 'all' not remade because of errors. make: * [Makefile:471: x86_64-softmmu/all] Error 2 Restore this header, partially reverting commit 12e9493df92. Fixes: 12e9493df92 Reported-by: Ilias Maratos <i.maratos@gmail.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20190920113329.16787-2-philmd@redhat.com>
2019-09-16	hw/i386/pc: Extract e820 memory layout code	Philippe Mathieu-Daudé
	Suggested-by: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190818225414.22590-3-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-16	i386/kvm: support guest access CORE cstate	Wanpeng Li
	Allow guest reads CORE cstate when exposing host CPU power management capabilities to the guest. PKG cstate is restricted to avoid a guest to get the whole package information in multi-tenant scenario. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1563154124-18579-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-03	tcg: TCGMemOp is now accelerator independent MemOp	Tony Nguyen
	Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependant attributes are conditionalized upon NEED_CPU_H. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-08-22	Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2019-08-21' ↵	Peter Maydell
	into staging Monitor patches for 2019-08-21 # gpg: Signature made Wed 21 Aug 2019 16:35:07 BST # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-monitor-2019-08-21: monitor/qmp: Update comment for commit 4eaca8de268 qdev: Collect HMP handlers command handlers in qdev-monitor.c qapi: Move query-target from misc.json to machine.json hw/core: Move cpu.c, cpu.h from qom/ to hw/core/ Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-21	hw/core: Move cpu.c, cpu.h from qom/ to hw/core/	Markus Armbruster
	Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190709152053.16670-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h in comments replaced]
2019-08-20	x86: Intel AVX512_BF16 feature enabling	Jing Liu
	Intel CooperLake cpu adds AVX512_BF16 instruction, defining as CPUID.(EAX=7,ECX=1):EAX[bit 05]. The patch adds a property for setting the subleaf of CPUID leaf 7 in case that people would like to specify it. The release spec link as follows, https://software.intel.com/sites/default/files/managed/c5/15/\ architecture-instruction-set-extensions-programming-reference.pdf Signed-off-by: Jing Liu <jing2.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20	icount: remove unnecessary gen_io_end calls	Pavel Dovgalyuk
	Prior patch resets can_do_io flag at the TB entry. Therefore there is no need in resetting this flag at the end of the block. This patch removes redundant gen_io_end calls. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
2019-08-20	target/i386: Return 'indefinite integer value' for invalid SSE fp->int ↵	Peter Maydell
	conversions The x86 architecture requires that all conversions from floating point to integer which raise the 'invalid' exception (infinities of both signs, NaN, and all values which don't fit in the destination integer) return what the x86 spec calls the "indefinite integer value", which is 0x8000_0000 for 32-bits or 0x8000_0000_0000_0000 for 64-bits. The softfloat functions return the more usual behaviour of positive overflows returning the maximum value that fits in the destination integer format and negative overflows returning the minimum value that fits. Wrap the softfloat functions in x86-specific versions which detect the 'invalid' condition and return the indefinite integer. Note that we don't use these wrappers for the 3DNow! pf2id and pf2iw instructions, which do return the minimum value that fits in an int32 if the input float is a large negative number. Fixes: https://bugs.launchpad.net/qemu/+bug/1815423 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20190805180332.10185-1-peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20	i386/kvm: initialize struct at full before ioctl call	Andrey Shinkevich
	Not the whole structure is initialized before passing it to the KVM. Reduce the number of Valgrind reports. Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20	target-i386: kvm: 'kvm_get_supported_msrs' cleanup	Li Qiang
	Function 'kvm_get_supported_msrs' is only called once now, get rid of the static variable 'kvm_supported_msrs'. Signed-off-by: Li Qiang <liq3ea@163.com> Message-Id: <20190725151639.21693-1-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20	target-i386: adds PV_SCHED_YIELD CPUID feature bit	Wanpeng Li
	Adds PV_SCHED_YIELD CPUID feature bit. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1562745771-8414-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20	kvm: i386: halt poll control MSR support	Marcelo Tosatti
	Add support for halt poll control MSR: save/restore, migration and new feature name. The purpose of this MSR is to allow the guest to disable host halt poll. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20190603230408.GA7938@amt.cnet> [Do not enable by default, as pointed out by Mark Kanda. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>