aboutsummaryrefslogtreecommitdiff
path: root/include/hw/i386
AgeCommit message (Collapse)Author
2017-04-20intel_iommu: enable remote IOTLBPeter Xu
This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch upstream: "IOMMU: enable intel_iommu map and unmap notifiers" https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html However I removed/fixed some content, and added my own codes. Instead of translate() every page for iotlb invalidations (which is slower), we walk the pages when needed and notify in a hook function. This patch enables vfio devices for VT-d emulation. And, since we already have vhost DMAR support via device-iotlb, a natural benefit that this patch brings is that vt-d enabled vhost can live even without ATS capability now. Though more tests are needed. Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20intel_iommu: allow dynamic switch of IOMMU regionPeter Xu
This is preparation work to finally enabled dynamic switching ON/OFF for VT-d protection. The old VT-d codes is using static IOMMU address space, and that won't satisfy vfio-pci device listeners. Let me explain. vfio-pci devices depend on the memory region listener and IOMMU replay mechanism to make sure the device mapping is coherent with the guest even if there are domain switches. And there are two kinds of domain switches: (1) switch from domain A -> B (2) switch from domain A -> no domain (e.g., turn DMAR off) Case (1) is handled by the context entry invalidation handling by the VT-d replay logic. What the replay function should do here is to replay the existing page mappings in domain B. However for case (2), we don't want to replay any domain mappings - we just need the default GPA->HPA mappings (the address_space_memory mapping). And this patch helps on case (2) to build up the mapping automatically by leveraging the vfio-pci memory listeners. Another important thing that this patch does is to seperate IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not depend on the DMAR region (like before this patch). It should be a standalone region, and it should be able to be activated without DMAR (which is a common behavior of Linux kernel - by default it enables IR while disabled DMAR). Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-05tco: do not generate an NMIPaolo Bonzini
This behavior is not indicated in the datasheet and can confuse the OS. The TCO can trap NMIs from SERR# or IOCHK# and convert them to SMIs; but any other TCO event is either delivered as an SMI or completely disabled. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27Revert "apic: save apic_delivered flag"Paolo Bonzini
This reverts commit 07bfa354772f2de67008dc66c201b627acff0106. The global variable is only read as part of a apic_reset_irq_delivered(); qemu_irq_raise(s->irq); if (!apic_get_irq_delivered()) { sequence, so the value never matters at migration time. Reported-by: Dr. David Alan Gilbert <dglibert@redhat.com> Cc: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-10i386: Change stepping of Haswell to non-blacklisted valueEduardo Habkost
glibc blacklists TSX on Haswell CPUs with model==60 and stepping < 4. To make the Haswell CPU model more useful, make those guests actually use TSX by changing CPU stepping to 4. References: * glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359 https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170309181212.18864-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-03x86: Work around SMI migration breakagesDr. David Alan Gilbert
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU due to a disagreement about SM (System management) interrupts. 2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI and this gets into the migration stream, but on 2.3.0 it never got delivered. ~2.4.0 SMI interrupt support was added but was broken - so that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI but never actually caused an interrupt. The SMI delivery was recently fixed by 68c6efe07a, but the effect now is that an incoming 2.3.0 stream takes the interrupt it had flagged but it's bios can't actually handle it(I think partly due to the original interrupt not being taken during boot?). The consequence is a triple(?) fault and a reboot. Tested from: 2.3.1 -M 2.3.0 2.7.0 -M 2.3.0 2.8.0 -M 2.3.0 2.8.0 -M 2.8.0 This corresponds to RH bugzilla entry 1420679. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170223133441.16010-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-22machine: move possible_cpus to MachineStateIgor Mammedov
so that it would be possible to reuse it with spapr/virt-aarch64 targets. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-17intel_iommu: add "caching-mode" optionAviv Ben-David
This capability asks the guest to invalidate cache before each map operation. We can use this invalidation to trap map operations in the hypervisor. Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com> [peterx: using "caching-mode" instead of "cache-mode" to align with spec] [peterx: re-write the subject to make it short and clear] Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-27char: rename CharDriverState ChardevMarc-André Lureau
Pick a uniform chardev type name. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27pc: Enable vmware-cpuid-freq CPU option for 2.9+ machine typesPhil Dennis-Jordan
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu> Message-Id: <1484921496-11257-4-git-send-email-phil@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27hw/isa/lpc_ich9: negotiate SMI broadcast on pc-q35-2.9+ machine typesLaszlo Ersek
Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-4-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27hw/isa/lpc_ich9: add broadcast SMI featureLaszlo Ersek
The generic edk2 SMM infrastructure prefers EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If Trigger() only brings the current processor into SMM, then edk2 handles it in the following ways: (1) If Trigger() is executed by the BSP (which is guaranteed before ExitBootServices(), but is not necessarily true at runtime), then: (a) If edk2 has been configured for "traditional" SMM synchronization, then the BSP sends directed SMIs to the APs with APIC delivery, bringing them into SMM individually. Then the BSP runs the SMI handler / dispatcher. (b) If edk2 has been configured for "relaxed" SMM synchronization, then the APs that are not already in SMM are not brought in, and the BSP runs the SMI handler / dispatcher. (2) If Trigger() is executed by an AP (which is possible after ExitBootServices(), and can be forced e.g. by "taskset -c 1 efibootmgr"), then the AP in question brings in the BSP with a directed SMI, and the BSP runs the SMI handler / dispatcher. The smaller problem with (1a) and (2) is that the BSP and AP synchronization is slow. For example, the "taskset -c 1 efibootmgr" command from (2) can take more than 3 seconds to complete, because efibootmgr accesses non-volatile UEFI variables intensively. The larger problem is that QEMU's current behavior diverges from the behavior usually seen on physical hardware, and that keeps exposing obscure corner cases, race conditions and other instabilities in edk2, which generally expects / prefers a software SMI to affect all CPUs at once. Therefore introduce the "broadcast SMI" feature that causes QEMU to inject the SMI on all VCPUs. While the original posting of this patch <http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html> only intended to speed up (2), based on our recent "stress testing" of SMM this patch actually provides functional improvements. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-3-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27hw/isa/lpc_ich9: add SMI feature negotiation via fw_cfgLaszlo Ersek
Introduce the following fw_cfg files: - "etc/smi/supported-features": a little endian uint64_t feature bitmap, presenting the features known by the host to the guest. Read-only for the guest. The content of this file will be determined via bit-granularity ICH9-LPC device properties, to be introduced later. For now, the bitmask is left zeroed. The bits will be set from machine type compat properties and on the QEMU command line, hence this file is not migrated. - "etc/smi/requested-features": a little endian uint64_t feature bitmap, representing the features the guest would like to request. Read-write for the guest. The guest can freely (re)write this file, it has no direct consequence. Initial value is zero. A nonzero value causes the SMI-related fw_cfg files and fields that are under guest influence to be migrated. - "etc/smi/features-ok": contains a uint8_t value, and it is read-only for the guest. When the guest selects the associated fw_cfg key, the guest features are validated against the host features. In case of error, the negotiation doesn't proceed, and the "features-ok" file remains zero. In case of success, the "features-ok" file becomes (uint8_t)1, and the negotiated features are locked down internally (to which no further changes are possible until reset). The initial value is zero. A nonzero value causes the SMI-related fw_cfg files and fields that are under guest influence to be migrated. The C-language fields backing the "supported-features" and "requested-features" files are uint8_t arrays. This is because they carry guest-side representation (our choice is little endian), while VMSTATE_UINT64() assumes / implies host-side endianness for any uint64_t fields. If we migrate a guest between hosts with different endiannesses (which is possible with TCG), then the host-side value is preserved, and the host-side representation is translated. This would be visible to the guest through fw_cfg, unless we used plain byte arrays. So we do. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20170126014416.11211-2-lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27apic: save apic_delivered flagPavel Dovgalyuk
This patch implements saving/restoring of static apic_delivered variable. v8: saving static variable only for one of the APICs Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20170126123429.5412.94368.stgit@PASHA-ISP> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-23machine: Make possible_cpu_arch_ids() return const pointerIgor Mammedov
make sure that external callers won't try to modify possible_cpus and owner of possible_cpus can access it directly when it modifies it. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1484759609-264075-5-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-01-20pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8Marcelo Tosatti
As noticed by David Gilbert, commit 6053a86 'kvmclock: reduce kvmclock differences on migration' added 'x-mach-use-reliable-get-clock' and a compatibility entry that turns it off; however it got merged after 2.8.0 was released but the entry has gone into PC_COMPAT_2_7 where it should have gone into PC_COMPAT_2_8. Fix it by moving the entry to PC_COMPAT_2_8. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170118175343.GA26873@amt.cnet> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-10intel_iommu: support device iotlb descriptorJason Wang
This patch enables device IOTLB support for intel iommu. The major work is to implement QI device IOTLB descriptor processing and notify the device through iommu notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2016-12-22kvmclock: reduce kvmclock difference on migrationMarcelo Tosatti
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which indicates that KVM_GET_CLOCK returns a value as seen by the guest at that moment. For new machine types, use this value rather than reading from guest memory. This reduces kvmclock difference on migration from 5s to 0.1s (when max_downtime == 5s). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20161121105052.598267440@redhat.com> [Add comment explaining what is going on. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22pc: make pit configurableChao Peng
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-4-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22pc: make sata configurableChao Peng
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-3-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22pc: make smbus configurableChao Peng
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <1478330391-74060-2-git-send-email-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-28migration/pcspk: Turn migration of pcspk off for 2.7 and olderDr. David Alan Gilbert
To keep backwards migration compatibility allow us to turn pcspk migration off. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20161128133201.16104-3-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16pc: fix FW_CFG_NB_CPUS to account for -device added CPUsIgor Mammedov
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479301481-197333-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-16Revert "pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs"Igor Mammedov
This reverts commit 080ac219cc7d9c55adf925c3545b7450055ad625. Legacy FW_CFG_NB_CPUS will be reused instead of 'etc/boot-cpus' fw_cfg file since it does the same and there is no point to maintaing duplicate guest ABI, if it can be helped. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479212236-183810-2-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-15intel_iommu: fix several incorrect endianess and bit fieldsPeter Xu
Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-02PCMachineState: introduce acpi_build_enabled fieldWei Liu
Introduce this field to control whether ACPI build is enabled by a particular machine or accelerator. It defaults to true if the machine itself supports ACPI build. Xen accelerator will disable it because Xen is in charge of building ACPI tables for the guest. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
2016-10-28clean-up: removed duplicate #includesAnand J
Some files contain multiple #includes of the same header file. Removed most of those unnecessary duplicate entries using scripts/clean-includes. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Anand J <anand.indukala@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-24pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUsIgor Mammedov
Currently firmware uses 1 byte at 0x5F offset in RTC CMOS to get number of CPUs present at boot. However 1 byte is not enough to handle more than 255 CPUs. So add a new fw_cfg file that would allow QEMU to tell it. For compat reasons add file only for machine types that support more than 255 CPUs. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24pc: apic_common: Extend APIC ID property to 32bitIgor Mammedov
ACPI ID is 32 bit wide on CPUs with x2APIC support. Extend 'id' property to support it. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17intel_iommu: reject broken EIMRadim Krčmář
Cluster x2APIC cannot work without KVM's x2apic API when the maximal APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we forbid other APICs and also the old KVM case with less than 9, to simplify the code. There is no point in enabling EIM in forbidden APICs, so we keep it enabled only for the KVM APIC; unconditionally, because making the option depend on KVM version would be a maintanance burden. Old QEMUs would enable eim whenever intremap was on, which would trick guests into thinking that they can enable cluster x2APIC even if any interrupt destination would get clamped to 8 bits. Depending on your configuration, QEMU could notice that the destination LAPIC is not present and report it with a very non-obvious: KVM: injection failed, MSI lost (Operation not permitted) Or the guest could say something about unexpected interrupts, because clamping leads to aliasing so interrupts were being delivered to incorrect VCPUs. KVM_X2APIC_API is the feature that allows us to enable EIM for KVM. QEMU 2.7 allowed EIM whenever interrupt remapping was enabled. In order to keep backward compatibility, we again allow guests to misbehave in non-obvious ways, and make it the default for old machine types. A user can enable the buggy mode it with "x-buggy-eim=on". Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17intel_iommu: add OnOffAuto intr_eim as "eim" propertyRadim Krčmář
The default (auto) emulates the current behavior. A user can now control EIM like -device intel-iommu,intremap=on,eim=off Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17apic: add send_msi() to APICCommonClassRadim Krčmář
The MMIO based interface to APIC doesn't work well with MSIs that have upper address bits set (remapped x2APIC MSIs). A specialized interface is a quick and dirty way to avoid the shortcoming. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17apic: add global apic_get_class()Radim Krčmář
Every configuration has only up to one APIC class and we'll be extending the class with a function that can be called without an instanced object, so a direct access to the class is convenient. This patch will break compilation if some code uses apic_get_class() with CONFIG_USER_ONLY. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-10Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* Thread Sanitizer fixes (Alex) * Coverity fixes (David) * test-qht fixes (Emilio) * QOM interface for info irq/info pic (Hervé) * -rtc clock=rt fix (Junlian) * mux chardev fixes (Marc-André) * nicer report on death by signal (Michal) * qemu-tech TLC (Paolo) * MSI support for edu device (Peter) * qemu-nbd --offset fix (Tomáš) # gpg: Signature made Fri 07 Oct 2016 17:25:10 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (39 commits) qemu-doc: merge qemu-tech and qemu-doc qemu-tech: rewrite some parts qemu-tech: reorganize content qemu-tech: move TCG test documentation to tests/tcg/README qemu-tech: move user mode emulation features from qemu-tech qemu-tech: document lazy condition code evaluation in cpu.h qemu-tech: move text from qemu-tech to tcg/README qemu-doc: drop installation and compilation notes qemu-doc: replace introduction with the one from the internals manual qemu-tech: drop index test-qht: perform lookups under rcu_read_lock qht: fix unlock-after-free segfault upon resizing qht: simplify qht_reset_size qemu-nbd: Shrink image size by specified offset qemu_kill_report: Report PID name too util: Introduce qemu_get_pid_name char: update read handler in all cases char: use a fixed idx for child muxed chr i8259: give ISA device when registering ISA ioports .travis.yml: add gcc sanitizer build ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-04intc: make HMP 'info irq' and 'info pic' commands use InterruptStatsProvider ↵Hervé Poussineau
interface Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Message-Id: <1474921408-24710-6-git-send-email-hpoussin@reactos.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-03target-i386: Correct family/model/stepping for Opteron_G3Evgeny Yakovlev
Current CPU definition for AMD Opteron third generation includes features like SSE4a and LAHF_LM support in emulated CPUID. These features are present in K8 rev.E or K10 CPUs and later. However, current G3 family and model describe 2nd generation K8 cores instead. This is incorrect but was considered harmless until our tests found a problem with linux kernels >= 3.10 (and maybe earlier) which specifically check for Opteron K8 model when parsing CPUID leaf 0x80000001: http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/amd.c?v=3.16#L552 This code will disable LAHF_LM feature in /proc/cpuinfo if model number is inconsistent. This change sets Opteron_G3 family/model/stepping to 16/2/3 which is a proper Opteron 3rd generation 2350 CPU. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27target-i386: Automatically set level/xlevel/xlevel2 when neededEduardo Habkost
Instead of requiring users and management software to be aware of required CPUID level/xlevel/xlevel2 values for each feature, automatically increase those values when features need them. This was already done for CPUID[7].EBX, and is now made generic for all CPUID feature flags. Unit test included, to make sure we don't break ABI on older machine-types and don't mess with the CPUID level values if they are explicitly set by the user. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-24hw/i386: AMD IOMMU IVRS tableDavid Kiarie
Add IVRS table for AMD IOMMU. Generate IVRS or DMAR depending on emulated IOMMU. Signed-off-by: David Kiarie <davidkiarie4@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-23target-i386: turn off CPU.l3-cache only for 2.7 and older machine typesIgor Mammedov
commit (14c985cff target-i386: present virtual L3 cache info for vcpus) misplaced compat property putting it in new 2.8 machine type which would effectively to disable feature until 2.9 is released. Intent of commit probably should be to disable feature for 2.7 and older while allowing not yet released 2.8 to have feature enabled by default. Cc: qemu-stable@nongnu.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-23pc: clean up COMPAT macro chainingIgor Mammedov
Since commit bacc344c ("machine: add properties to compat_props incrementaly") there is no need to chain per machine type compat macro. Clean up places where it was done anyway so it will be consistent and won't confuse contributors during addtion of new machine types. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-15Remove unused function declarationsLadi Prosek
Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-09target-i386: present virtual L3 cache info for vcpusLongpeng(Mike)
Some software algorithms are based on the hardware's cache info, for example, for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger a resched IPI and told cpu2 to do the wakeup if they don't share low level cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc. The relevant linux-kernel code as bellow: static void ttwu_queue(struct task_struct *p, int cpu) { struct rq *rq = cpu_rq(cpu); ...... if (... && !cpus_share_cache(smp_processor_id(), cpu)) { ...... ttwu_queue_remote(p, cpu); /* will trigger RES IPI */ return; } ...... ttwu_do_activate(rq, p, 0); /* access target's rq directly */ ...... } In real hardware, the cpus on the same socket share L3 cache, so one won't trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs under some workloads even if the virtual cpus belongs to the same virtual socket. For KVM, there will be lots of vmexit due to guest send IPIs. The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates) and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during the period: No-L3 With-L3(applied this patch) cpu0: 363890 44582 cpu1: 373405 43109 cpu2: 340783 43797 cpu3: 333854 43409 cpu4: 327170 40038 cpu5: 325491 39922 cpu6: 319129 42391 cpu7: 306480 41035 cpu8: 161139 32188 cpu9: 164649 31024 cpu10: 149823 30398 cpu11: 149823 32455 cpu12: 164830 35143 cpu13: 172269 35805 cpu14: 179979 33898 cpu15: 194505 32754 avg: 268963.6 40129.8 The VM's topology is "1*socket 8*cores 2*threads". After present virtual L3 cache info for VM, the amounts of RES IPIs in guest reduce 85%. For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause severe performance degradation. We had tested the overall system performance if vcpus actually run on sparate physical socket. With L3 cache, the performance improves 7.2%~33.1%(avg:15.7%). Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09pc: Add 2.8 machineLongpeng(Mike)
This will used by the next patch. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-08pc: keep gsi referenceMarc-André Lureau
Further cleanup would need to call qemu_free_irq() at the appropriate time, but for now this silences ASAN about direct leaks. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-09-08machine: use class base init generated nameMarc-André Lureau
machine_class_base_init() member name is allocated by machine_class_base_init(), but not freed by machine_class_finalize(). Simply freeing there doesn't work, because DEFINE_PC_MACHINE() overwrites it with a literal string. Fix DEFINE_PC_MACHINE() not to overwrite it, and add the missing free to machine_class_finalize(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-09-08pc: simplify passing qemu_irqMarc-André Lureau
qemu_irq is already a pointer, no need to have an extra pointer level. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-08-03x86: ioapic: add support for explicit EOIPeter Xu
Some old Linux kernels (upstream before v4.0), or any released RHEL kernels has problem in sending APIC EOI when IR is enabled. Meanwhile, many of them only support explicit EOI for IOAPIC, which is only introduced in IOAPIC version 0x20. This patch provide a way to boost QEMU IOAPIC to version 0x20, in order for QEMU to correctly receive EOI messages. Without boosting IOAPIC version to 0x20, kernels before commit d32932d ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces") will have trouble enabling both IR and level-triggered interrupt devices (like e1000). To upgrade IOAPIC to version 0x20, we need to specify: -global ioapic.version=0x20 To be compatible with old systems, 0x11 will still be the default IOAPIC version. Here 0x11 and 0x20 are the only versions to be supported. One thing to mention: this patch only applies to emulated IOAPIC. It does not affect kernel IOAPIC behavior. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1470059959-372-1-git-send-email-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-03apic: fix broken migration for kvm-apicIgor Mammedov
commit f6e98444 (apic: Use apic_id as apic's migration instance_id) breaks migration when in kernel irqchip is used for 2.6 and older machine types. It applies compat property only for userspace 'apic' type instead of applying it to all apic types inherited from 'apic-common' type as it was supposed to do. Fix it by setting compat property 'legacy-instance-id' for 'apic-common' type which affects inherited types (i.e. not only 'apic' but also 'kvm-apic' types) Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1469800542-11402-1-git-send-email-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-21Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, pci, virtio: new features, cleanups, fixes - interrupt remapping for intel iommus - a bunch of virtio cleanups - fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 21 Jul 2016 18:49:30 BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (57 commits) intel_iommu: avoid unnamed fields virtio: Update migration docs virtio-gpu: Wrap in vmstate virtio-gpu: Use migrate_add_blocker for virgl migration blocking virtio-input: Wrap in vmstate 9pfs: Wrap in vmstate virtio-serial: Wrap in vmstate virtio-net: Wrap in vmstate virtio-balloon: Wrap in vmstate virtio-rng: Wrap in vmstate virtio-blk: Wrap in vmstate virtio-scsi: Wrap in vmstate virtio: Migration helper function and macro virtio-serial: Remove old migration version support virtio-net: Remove old migration version support virtio-scsi: Replace HandleOutput typedef Revert "mirror: Workaround for unexpected iohandler events during completion" virtio-scsi: Call virtio_add_queue_aio virtio-blk: Call virtio_add_queue_aio virtio: Introduce virtio_add_queue_aio ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-21intel_iommu: avoid unnamed fieldsMichael S. Tsirkin
Also avoid unnamed fields for portability. Also, rename VTD_IRTE to VTD_IR_TableEntry for coding style compliance. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>