aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2018-11-08riscv: spike: Fix memory leak in the board initAlistair Francis
Coverity caught a malloc() call that was never freed. This patch ensures that we free the memory but also updates the allocation to use g_strdup_printf() instead of malloc(). Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-11-08Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.1-20181108' ↵Peter Maydell
into staging ppc patch queue 2018-11-08 Here's another patch of accumulated ppc patches for qemu-3.1. Highlights are: * Support for nested HV KVM on POWER9 hosts * Remove Alex Graf as ppc maintainer * Emulation of external PID instructions # gpg: Signature made Thu 08 Nov 2018 12:14:27 GMT # gpg: using RSA key 6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-3.1-20181108: (22 commits) ppc/spapr_caps: Add SPAPR_CAP_NESTED_KVM_HV target/ppc: Add one reg id for ptcr This patch fixes processing of rfi instructions in icount mode. hw/ppc/ppc440_uc: Remove dead code in sdram_size() MAINTAINERS: PPC: Remove myself ppc/pnv: check size before data buffer access target/ppc: fix mtmsr instruction for icount hw/ppc/mac_newworld: Free openpic_irqs array after use macio/pmu: Fix missing vmsd terminator spapr_pci: convert g_malloc() to g_new() target/ppc: Split out float_invalid_cvt target/ppc: Split out float_invalid_op_div target/ppc: Split out float_invalid_op_mul target/ppc: Split out float_invalid_op_addsub target/ppc: Introduce fp number classification target/ppc: Remove float_check_status target/ppc: Split up float_invalid_op_excp hw/ppc/spapr_rng: Introduce CONFIG_SPAPR_RNG switch for spapr_rng.c PPC: e500: convert SysBus init method to a realize method ppc4xx_pci: convert SysBus init method to a realize method ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-08Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* icount fix (Clement) * dumping fixes for non-volatile memory (Marc-André, myself) * x86 emulation fix (Rudolf) * recent Hyper-V CPUID flag (Vitaly) * Q35 doc fix (Daniel) * lsi fix (Prasad) * SCSI block limits emulation fixes (myself) * qemu_thread_atexit rework (Peter) * ivshmem memory leak fix (Igor) # gpg: Signature made Tue 06 Nov 2018 21:34:30 GMT # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: util/qemu-thread-posix: Fix qemu_thread_atexit* for OSX include/qemu/thread.h: Document qemu_thread_atexit* API scsi-generic: do not do VPD emulation for sense other than ILLEGAL_REQUEST scsi-generic: avoid invalid access to struct when emulating block limits scsi-generic: avoid out-of-bounds access to VPD page list scsi-generic: keep VPD page list sorted lsi53c895a: check message length value is valid scripts/dump-guest-memory: Synchronize with guest_phys_blocks_region_add memory-mapping: skip non-volatile memory regions in GuestPhysBlockList nvdimm: set non-volatile on the memory region memory: learn about non-volatile memory region target/i386: Clear RF on SYSCALL instruction MAINTAINERS: remove or downgrade myself to reviewer from some subsystems ivshmem: fix memory backend leak i386: clarify that the Q35 machine type implements a P35 chipset x86: hv_evmcs CPU flag support icount: fix deadlock when all cpus are sleeping Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-08ppc/spapr_caps: Add SPAPR_CAP_NESTED_KVM_HVSuraj Jitindar Singh
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the availability of nested kvm-hv to the level 1 (L1) guest. Assuming a hypervisor with support enabled an L1 guest can be allowed to use the kvm-hv module (and thus run it's own kvm-hv guests) by setting: -machine pseries,cap-nested-hv=true or disabled with: -machine pseries,cap-nested-hv=false Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08hw/ppc/ppc440_uc: Remove dead code in sdram_size()Peter Maydell
Coverity points out in CID 1390588 that the test for sh == 0 in sdram_size() can never fire, because we calculate sh with sh = 1024 - ((bcr >> 6) & 0x3ff); which must result in a value between 1 and 1024 inclusive. Without the relevant manual for the SoC, we're not completely sure of the correct behaviour here, but we can remove the dead code without changing how QEMU currently behaves. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08ppc/pnv: check size before data buffer accessPrasad J Pandit
While performing PowerNV memory r/w operations, the access length 'sz' could exceed the data[4] buffer size. Add check to avoid OOB access. Reported-by: Moguofang <moguofang@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08hw/ppc/mac_newworld: Free openpic_irqs array after usePeter Maydell
In ppc_core99_init(), we allocate an openpic_irqs array, which we then use to collect up the various qemu_irqs which we're going to connect to the interrupt controller. Once we've called sysbus_connect_irq() to connect them all up, the array is no longer required, but we forgot to free it. Since board init is only run once at startup, the memory leak is not a significant one. Spotted by Coverity: CID 1192916. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08macio/pmu: Fix missing vmsd terminatorDr. David Alan Gilbert
Fix missing terminator in VMStateDescription Fixes: d811d61fbc6ca5f2be2185fd7cfa916e7ba613ce Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08spapr_pci: convert g_malloc() to g_new()Greg Kurz
When allocating an array, it is a recommended coding practice to call g_new(FooType, n) instead of g_malloc(n * sizeof(FooType)) because it takes care to avoid overflow when calculating the size of the allocated block and it returns FooType *, which allows the compiler to perform type checking. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08hw/ppc/spapr_rng: Introduce CONFIG_SPAPR_RNG switch for spapr_rng.cThomas Huth
The spapr-rng device is suboptimal when compared to virtio-rng, so users might want to disable it in their builds. Thus let's introduce a proper CONFIG switch to allow us to compile QEMU without this device. The function spapr_rng_populate_dt is required for linking, so move it to a different location. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08PPC: e500: convert SysBus init method to a realize methodCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08ppc4xx_pci: convert SysBus init method to a realize methodCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08ppc440_pcix: convert SysBus init method to a realize methodCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-06scsi-generic: do not do VPD emulation for sense other than ILLEGAL_REQUESTPaolo Bonzini
Pass other sense, such as UNIT_ATTENTION or BUSY, directly to the guest. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06scsi-generic: avoid invalid access to struct when emulating block limitsPaolo Bonzini
Emulation of the block limits VPD page called back into scsi-disk.c, which however expected the request to be for a SCSIDiskState and accessed a scsi-generic device outside the bounds of its struct (namely to retrieve s->max_unmap_size and s->max_io_size). To avoid this, move the emulation code to a separate function that takes a new SCSIBlockLimits struct and marshals it into the VPD response format. Reported-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06scsi-generic: avoid out-of-bounds access to VPD page listPaolo Bonzini
A device can report an excessive number of VPD pages when asked for a list; this can cause an out-of-bounds access to buf in scsi_generic_set_vpd_bl_emulation. It should not happen, but it is technically not incorrect so handle it: do not check any byte past the allocation length that was sent to the INQUIRY command. Reported-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06scsi-generic: keep VPD page list sortedPaolo Bonzini
Block limits emulation is just placing 0xb0 as the final byte of the VPD pages list. However, VPD page numbers must be sorted, so change that to an in-place insert. Since I couldn't find any disk that triggered the loop more than once, this was tested by adding manually 0xb1 at the end of the list and checking that 0xb0 was added before. Reported-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06lsi53c895a: check message length value is validPrasad J Pandit
While writing a message in 'lsi_do_msgin', message length value in 'msg_len' could be invalid due to an invalid migration stream. Add an assertion to avoid an out of bounds access, and reject the incoming migration data if it contains an invalid message length. Discovered by Deja vu Security. Reported by Oracle. Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <20181026194314.18663-1-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06nvdimm: set non-volatile on the memory regionMarc-André Lureau
qemu-system-x86_64 -machine pc,nvdimm -m 2G,slots=4,maxmem=16G -enable-kvm -monitor stdio -object memory-backend-file,id=mem1,share=on,mem-path=/tmp/foo,size=1G -device nvdimm,id=nvdimm1,memdev=mem1 HMP info mtree command reflects the flag with "nv-" prefix on memory type: (qemu) info mtree 0000000100000000-000000013fffffff (prio 0, nv-i/o): alias nvdimm-memory @/objects/mem1 0000000000000000-000000003fffffff (qemu) info mtree -f 0000000100000000-000000013fffffff (prio 0, nv-ram): /objects/mem1 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181003114454.5662-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06ivshmem: fix memory backend leakIgor Mammedov
object_new() returns a new backend with refcount == 1 and then later object_property_add_child() increases refcount to 2 So when ivshmem is destroyed, the backend it has created isn't destroyed along with it as children cleanup will bring backend's refcount only to 1, which leaks backend including resources it is using. Drop the original reference from object_new() once backend is attached to its parent. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1541069086-167036-1-git-send-email-imammedo@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Fixes: 5503e285041979dd29698ecb41729b3b22622e8d Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06i386: clarify that the Q35 machine type implements a P35 chipsetDaniel P. Berrangé
The 'q35' machine type implements an Intel Series 3 chipset, of which there are several variants: https://www.intel.com/Assets/PDF/datasheet/316966.pdf The key difference between the 82P35 MCH ('p35', PCI device ID 0x29c0) and 82Q35 GMCH ('q35', PCI device ID 0x29b0) variants is that the latter has an integrated graphics adapter. QEMU does not implement integrated graphics, so uses the PCI ID for the 82P35 chipset, despite calling the machine type 'q35'. Thus we rename the PCI device ID constant to reflect reality, to avoid confusing future developers. The new name more closely matches what pci.ids reports it to be: $ grep P35 /usr/share/hwdata/pci.ids | grep 29 29c0 82G33/G31/P35/P31 Express DRAM Controller 29c1 82G33/G31/P35/P31 Express PCI Express Root Port 29c4 82G33/G31/P35/P31 Express MEI Controller 29c5 82G33/G31/P35/P31 Express MEI Controller 29c6 82G33/G31/P35/P31 Express PT IDER Controller 29c7 82G33/G31/P35/P31 Express Serial KT Controller $ grep Q35 /usr/share/hwdata/pci.ids | grep 29 29b0 82Q35 Express DRAM Controller 29b1 82Q35 Express PCI Express Root Port 29b2 82Q35 Express Integrated Graphics Controller 29b3 82Q35 Express Integrated Graphics Controller 29b4 82Q35 Express MEI Controller 29b5 82Q35 Express MEI Controller 29b6 82Q35 Express PT IDER Controller 29b7 82Q35 Express Serial KT Controller Arguably the QEMU machine type should be named 'p35'. At this point in time, however, it is not worth the churn for management applications & documentation to worry about renaming it. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180830105757.10577-1-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20181106' into staging target-arm queue: * Remove can't-happen if() from handle_vec_simd_shli() * hw/arm/exynos4210: Zero memory allocated for Exynos4210State * Set S and PTW in 64-bit PAR format * Fix ATS1Hx instructions * milkymist: Check for failure trying to load BIOS image # gpg: Signature made Tue 06 Nov 2018 11:37:30 GMT # gpg: using RSA key 3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20181106: target/arm: Fix ATS1Hx instructions target/arm: Set S and PTW in 64-bit PAR format hw/arm/exynos4210: Zero memory allocated for Exynos4210State milkymist: Check for failure trying to load BIOS image target/arm: Remove can't-happen if() from handle_vec_simd_shli() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-06Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pci, pc, virtio: fixes, features AMD IOMMU VAPIC support + fixes all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 05 Nov 2018 18:24:10 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (33 commits) vhost-scsi: prevent using uninitialized vqs piix_pci: fix i440fx data sheet link piix: use TYPE_FOO constants than string constats i440fx: use ARRAY_SIZE for pam_regions pci_bridge: fix typo in comment hw/pci: Add missing include hw/pci-bridge/ioh3420: Remove unuseful header hw/pci-bridge/xio3130: Remove unused functions tests/bios-tables-test: add 64-bit PCI MMIO aperture round-up test on Q35 bios-tables-test: prepare expected files for mmio64 hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned base hw/pci-host/x86: extract get_pci_hole64_start_value() helpers pci-testdev: add optional memory bar MAINTAINERS: list "tests/acpi-test-data" files in ACPI/SMBIOS section x86_iommu/amd: Enable Guest virtual APIC support x86_iommu/amd: Add interrupt remap support when VAPIC is enabled i386: acpi: add IVHD device entry for IOAPIC x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled x86_iommu/amd: Prepare for interrupt remap support x86_iommu/amd: make the address space naming consistent with intel-iommu ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-06hw/arm/exynos4210: Zero memory allocated for Exynos4210StatePeter Maydell
In exynos4210_init() we allocate memory for an Exynos4210State struct. Generally devices can assume that the memory allocated for their state struct is zero-initialized; we broke that assumption here by using g_new(). Use g_new0() instead. (In particular, some code assumes that the various irq arrays in the Exynos4210Irq sub-struct are zero-initialized.) In the longer term, this code should be QOMified, and then the struct memory will be allocated elsewhere and by functions which always zero-initalize it; but for 3.1 this is a simple fix. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20181105151132.13884-1-peter.maydell@linaro.org
2018-11-06milkymist: Check for failure trying to load BIOS imagePeter Maydell
Check the return value from load_image_targphys(), which tells us whether our attempt to load the BIOS image into RAM failed. (Spotted by Coverity, CID 1190305.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael Walle <michael@walle.cc> Message-id: 20181030170032.1844-1-peter.maydell@linaro.org
2018-11-05vhost-scsi: prevent using uninitialized vqsyuchenlin
There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device, but seabios will only set the physical address for the 3rd one (cmd). Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr() will be 0 for ctrl and event vq. In this case, ctrl and event vq are not initialized. vhost_verify_ring_mappings may use uninitialized vhost_virtqueue such that vhost_verify_ring_part_mapping returns ENOMEM. When encountered this problem, we got the following logs: qemu-system-x86_64: Unable to map available ring for ring 0 qemu-system-x86_64: Verify ring failure on region 0 Signed-off-by: Forrest Liu <forrestl@synology.com> Signed-off-by: yuchenlin <yuchenlin@synology.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05piix_pci: fix i440fx data sheet linkLi Qiang
It seems that the intel link is unavailable, change it to point to the qemu site. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05piix: use TYPE_FOO constants than string constatsLi Qiang
Make them more QOMConventional. Cc:qemu-trivial@nongnu.org Signed-off-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05i440fx: use ARRAY_SIZE for pam_regionsLi Qiang
Cc: qemu-trivial@nongnu.org Signed-off-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05pci_bridge: fix typo in commentMao Zhongyi
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05hw/pci-bridge/ioh3420: Remove unuseful headerPhilippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05hw/pci-bridge/xio3130: Remove unused functionsPhilippe Mathieu-Daudé
Introduced in 48ebf2f90f8 and faf1e708d5b, these functions were never used. Remove them. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned baseLaszlo Ersek
In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI hole", 2017-11-16), we meant to expose such a 64-bit PCI MMIO aperture in the ACPI DSDT that would be at least as large as the new "pci-hole64-size" property (2GB on i440fx, 32GB on q35). The goal was to offer "enough" 64-bit MMIO aperture to the guest OS for hotplug purposes. In that commit, we added or modified five functions: - pc_pci_hole64_start(): shared between i440fx and q35. Provides a default 64-bit base, which starts beyond the cold-plugged 64-bit RAM, and skips the DIMM hotplug area too (if any). - i440fx_pcihost_get_pci_hole64_start(), q35_host_get_pci_hole64_start(): board-specific 64-bit base property getters called abstractly by the ACPI generator. Both of these fall back to pc_pci_hole64_start() if the firmware didn't program any 64-bit hole (i.e. if the firmware didn't assign a 64-bit GPA to any MMIO BAR on any device). Otherwise, they honor the firmware's BAR assignments (i.e., they treat the lowest 64-bit GPA programmed by the firmware as the base address for the aperture). - i440fx_pcihost_get_pci_hole64_end(), q35_host_get_pci_hole64_end(): these intended to extend the aperture to our size recommendation, calculated relative to the base of the aperture. Despite the original intent, i440fx_pcihost_get_pci_hole64_end() and q35_host_get_pci_hole64_end() currently only extend the aperture relative to the default base (pc_pci_hole64_start()), ignoring any programming done by the firmware. This means that our size recommendation may not be met. Fix it by honoring the firmware's address assignments. The strange extension sizes were spotted by Alex, in the log of a guest kernel running on top of OVMF (which prefers to assign 64-bit GPAs to 64-bit BARs). This change only affects DSDT generation, therefore no new compat property is being introduced. Using an i440fx OVMF guest with 5GB RAM, an example _CRS change is: > @@ -881,9 +881,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000800000000, // Range Minimum > - 0x000000080001C0FF, // Range Maximum > + 0x000000087FFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x000000000001C100, // Length > + 0x0000000080000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) (On i440fx, the low RAM split is at 3GB, in this case. Therefore, with 5GB guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-3) = 6 GB. Adding the 2GB extension to that yields 8GB, which is below the firmware-programmed base of 32GB, before the patch. Therefore, before the patch, the extension is ineffective. After the patch, we add the 2GB extension to the firmware-programmed base, namely 32GB.) Using a q35 OVMF guest with 5GB RAM, an example _CRS change is: > @@ -3162,9 +3162,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000800000000, // Range Minimum > - 0x00000009BFFFFFFF, // Range Maximum > + 0x0000000FFFFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x00000001C0000000, // Length > + 0x0000000800000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) (On Q35, the low RAM split is at 2GB. Therefore, with 5GB guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-2) = 7 GB. Adding the 32GB extension to that yields 39GB (0x0000_0009_BFFF_FFFF + 1), before the patch. After the patch, we add the 32GB extension to the firmware-programmed base, namely 32GB.) The ACPI test data for the bios-tables-test case that we added earlier in this series are corrected too, as follows: > @@ -3339,9 +3339,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000200000000, // Range Minimum > - 0x00000009BFFFFFFF, // Range Maximum > + 0x00000009FFFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x00000007C0000000, // Length > + 0x0000000800000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Fixes: 9fa99d2519cbf71f871e46871df12cb446dc1c3e Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05hw/pci-host/x86: extract get_pci_hole64_start_value() helpersLaszlo Ersek
Expose the calculated "hole64 start" GPAs as plain uint64_t values, extracting the internals of the current property getters. This patch doesn't change behavior. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05pci-testdev: add optional memory barGerd Hoffmann
Add memory bar to pci-testdev. Size is configurable using the membar property. Setting the size to zero (default) turns it off. Can be used to check whether guests handle large pci bars correctly. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Enable Guest virtual APIC supportSingh, Brijesh
Now that amd-iommu support interrupt remapping, enable the GASup in IVRS table and GASup in extended feature register to indicate that IOMMU support guest virtual APIC mode. GASup provides option to guest OS to make use of 128-bit IRTE. Note that the GAMSup is set to zero to indicate that amd-iommu does not support guest virtual APIC mode (aka AVIC) which would be used for the nested VMs. See Table 21 from IOMMU spec for interrupt virtualization controls Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is enabled. For more information refer: IOMMU spec rev 3.0 (section 2.2.5.2) When VAPIC is enabled, it uses interrupt remapping as defined in Table 22 and Figure 17 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05i386: acpi: add IVHD device entry for IOAPICSingh, Brijesh
When interrupt remapping is enabled, add a special IVHD device (type IOAPIC). Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is not enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is not enabled. For more info Refer: AMD IOMMU spec Rev 3.0 - section 2.2.5.1 When VAPIC is not enabled, it uses interrupt remapping as defined in Table 20 and Figure 15 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Prepare for interrupt remap supportSingh, Brijesh
Register the interrupt remapping callback and read/write ops for the amd-iommu-ir memory region. amd-iommu-ir is set to higher priority to ensure that this region won't be masked out by other memory regions. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: make the address space naming consistent with intel-iommuSingh, Brijesh
To be consistent with intel-iommu: - rename the address space to use '_' instead of '-' - update the memory region relationships Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: remove V=1 check from amdvi_validate_dte()Singh, Brijesh
Currently, the amdvi_validate_dte() assumes that a valid DTE will always have V=1. This is not true. The V=1 means that bit[127:1] are valid. A valid DTE can have IV=1 and V=0 (i.e address translation disabled and interrupt remapping enabled) Remove the V=1 check from amdvi_validate_dte(), make the caller responsible to check for V or IV bits. This also fixes a bug in existing code that when error is detected during the translation we'll fail the translation instead of assuming a passthrough mode. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move vtd_generate_msi_message in common fileSingh, Brijesh
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI Message from IRQ. A similar function will be needed when we add interrupt remapping support in amd-iommu. Moving the function in common file to avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message(). There is no logic changes in the code flow. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move the kernel-irqchip check in common codeSingh, Brijesh
Interrupt remapping needs kernel-irqchip={off|split} on both Intel and AMD platforms. Move the check in common place. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05vhost-user-blk: start vhost when guest kicksYongji Xie
Some old guests (before commit 7a11370e5: "virtio_blk: enable VQs early") kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. This violates the virtio spec. But virtio 1.0 transitional devices support this behaviour. So we should start vhost when guest kicks in this case. Signed-off-by: Yongji Xie <xieyongji@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Ni Xun <nixun@baidu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: handle invalid ce for shadow syncPeter Xu
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing shadow page tables. Having invalid context entry there is perfectly valid when we move a device out of an existing domain. When that happens, instead of posting an error we invalidate the whole region. Without this patch, QEMU will crash if we do these steps: (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe) (2) bind the NICs with vfio-pci in the guest (3) start testpmd with the NICs applied (4) stop testpmd (5) rebind the NIC back to ixgbe kernel driver The patch should fix it. Reported-by: Pei Zhang <pezhang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272 Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: move ce fetching out when sync shadowPeter Xu
There are two callers for vtd_sync_shadow_page_table_range(): one provided a valid context entry and one not. Move that fetching operation into the caller vtd_sync_shadow_page_table() where we need to fetch the context entry. Meanwhile, remove the error_report_once() directly since we're already tracing all the error cases in the previous call. Instead, return error number back to caller. This will not change anything functional since callers are dropping it after all. We do this move majorly because we want to do something more later in vtd_sync_shadow_page_table(). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: better handling of dmar state switchPeter Xu
QEMU is not handling the global DMAR switch well, especially when from "on" to "off". Let's first take the example of system reset. Assuming that a guest has IOMMU enabled. When it reboots, we will drop all the existing DMAR mappings to handle the system reset, however we'll still keep the existing memory layouts which has the IOMMU memory region enabled. So after the reboot and before the kernel reloads again, there will be no mapping at all for the host device. That's problematic since any software (for example, SeaBIOS) that runs earlier than the kernel after the reboot will assume the IOMMU is disabled, so any DMA from the software will fail. For example, a guest that boots on an assigned NVMe device might fail to find the boot device after a system reboot/reset and we'll be able to observe SeaBIOS errors if we capture the debugging log: WARNING - Timeout at nvme_wait:144! Meanwhile, we should see DMAR errors on the host of that NVMe device. It's the DMA fault that caused a NVMe driver timeout. The correct fix should be that we do proper switching of device DMA address spaces when system resets, which will setup correct memory regions and notify the backend of the devices. This might not affect much on non-assigned devices since QEMU VT-d emulation will assume a default passthrough mapping if DMAR is not enabled in the GCMD register (please refer to vtd_iommu_translate). However that's required for an assigned devices, since that'll rebuild the correct GPA to HPA mapping that is needed for any DMA operation during guest bootstrap. Besides the system reset, we have some other places that might change the global DMAR status and we'd better do the same thing there. For example, when we change the state of GCMD register, or the DMAR root pointer. Do the same refresh for all these places. For these two places we'll also need to explicitly invalidate the context entry cache and iotlb cache. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173 CC: QEMU Stable <qemu-stable@nongnu.org> Reported-by: Cong Li <coli@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> -- v2: - do the same for GCMD write, or root pointer update [Alex] - test is carried out by me this time, by observing the vtd_switch_address_space tracepoint after system reboot v3: - rewrite commit message as suggested by Alex Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: introduce vtd_reset_caches()Peter Xu
Provide the function and use it in vtd_init(). Used to reset both context entry cache and iotlb cache for the whole IOMMU unit. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05virtio-blk: fix comment for virtio_blk_rw_completeYaowei Bai
Here should be submit_requests, there is no submit_merged_requests function. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>