aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2023-10-23Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Stefan Hajnoczi
into staging virtio,pc,pci: features, cleanups infrastructure for vhost-vdpa shadow work piix south bridge rework reconnect for vhost-user-scsi dummy ACPI QTG DSM for cxl tests, cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU # pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd # 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT # ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV # 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa # zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg= # =ILek # -----END PGP SIGNATURE----- # gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits) intel-iommu: Report interrupt remapping faults, fix return value MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section vhost-user: Fix protocol feature bit conflict tests/acpi: Update DSDT.cxl with QTG DSM hw/cxl: Add QTG _DSM support for ACPI0017 device tests/acpi: Allow update of DSDT.cxl hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range vhost-user: fix lost reconnect vhost-user-scsi: start vhost when guest kicks vhost-user-scsi: support reconnect to backend vhost: move and rename the conn retry times vhost-user-common: send get_inflight_fd once hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine hw/isa/piix: Implement multi-process QEMU support also for PIIX4 hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4 hw/isa/piix: Rename functions to be shared for PCI interrupt triggering hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4 hw/isa/piix: Share PIIX3's base class with PIIX4 hw/isa/piix: Harmonize names of reset control memory regions ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-22intel-iommu: Report interrupt remapping faults, fix return valueDavid Woodhouse
A generic X86IOMMUClass->int_remap function should not return VT-d specific values; fix it to return 0 if the interrupt was successfully translated or -EINVAL if not. The VTD_FR_IR_xxx values are supposed to be used to actually raise faults through the fault reporting mechanism, so do that instead for the case where the IRQ is actually being injected. There is more work to be done here, as pretranslations for the KVM IRQ routing table can't fault; an untranslatable IRQ should be handled in userspace and the fault raised only when the IRQ actually happens (if indeed the IRTE is still not valid at that time). But we can work on that later; we can at least raise faults for the direct case. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <31bbfc9041690449d3ac891f4431ec82174ee1b4.camel@infradead.org> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/cxl: Add QTG _DSM support for ACPI0017 deviceDave Jiang
Add a simple _DSM call support for the ACPI0017 device to return fake QTG ID values of 0 and 1 in all cases. This for _DSM plumbing testing from the OS. Following edited for readability Device (CXLM) { Name (_HID, "ACPI0017") // _HID: Hardware ID ... Method (_DSM, 4, Serialized) // _DSM: Device-Specific Method { If ((Arg0 == ToUUID ("f365f9a6-a7de-4071-a66a-b40c0b4f8e52"))) { If ((Arg2 == Zero)) { Return (Buffer (One) { 0x01 }) } If ((Arg2 == One)) { Return (Package (0x02) { One, Package (0x02) { Zero, One } }) } } } Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20231012125623.21101-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl rangeAni Sinha
pc_get_device_memory_range() finds the device memory size by calculating the difference between maxram and ram sizes. This calculation makes sense only when maxram is greater than the ram size. Make sure we check for that before calling pc_get_device_memory_range(). Signed-off-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20231011105335.42296-1-anisinha@redhat.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: fix lost reconnectLi Feng
When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever. The reason is: When the vhost-user fails at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. All vhost-user devices have this issue, including vhost-user-blk/scsi. With this patch, if the vdev->vdev is null, the fd callback will still be reinstalled. Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20231009044735.941655-6-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user-scsi: start vhost when guest kicksLi Feng
Let's keep the same behavior as vhost-user-blk. Some old guests kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20231009044735.941655-5-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user-scsi: support reconnect to backendLi Feng
If the backend crashes and restarts, the device is broken. This patch adds reconnect for vhost-user-scsi. This patch also improves the error messages, and reports some silent errors. Tested with spdk backend. Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20231009044735.941655-4-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
2023-10-22vhost: move and rename the conn retry timesLi Feng
Multiple devices need this macro, move it to a common header. Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20231009044735.941655-3-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user-common: send get_inflight_fd onceLi Feng
Currently the get_inflight_fd will be sent every time the device is started, and the backend will allocate shared memory to save the inflight state. If the backend finds that it receives the second get_inflight_fd, it will release the previous shared memory, which breaks inflight working logic. This patch is a preparation for the following patches. Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20231009044735.941655-2-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machineBernhard Beschow
QEMU's PIIX3 implementation actually models the real PIIX4, but with different PCI IDs. Usually, guests deal just fine with it. Still, in order to provide a more consistent illusion to guests, allow QEMU's PIIX4 implementation to be used in the PC machine. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-30-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Implement multi-process QEMU support also for PIIX4Bernhard Beschow
So far multi-process QEMU was only implemented for PIIX3. Move the support into the base class to achieve feature parity between both device models. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-29-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiringBernhard Beschow
Now that both PIIX3 and PIIX4 use piix_set_irq() to trigger PCI IRQs the wiring in the respective realize methods can be shared, too. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-28-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4Bernhard Beschow
Speeds up PIIX4 which resolves an old TODO. Also makes PIIX4 compatible with Xen which relies on pci_bus_fire_intx_routing_notifier() to be fired. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-27-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Rename functions to be shared for PCI interrupt triggeringBernhard Beschow
PIIX4 will get the same optimizations which are already implemented for PIIX3. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-26-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4Bernhard Beschow
Resolves duplicate code. Also makes PIIX4 respect the PIIX3 properties which get added, too. This allows for using PIIX4 in the PC machine. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-25-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Share PIIX3's base class with PIIX4Bernhard Beschow
Having a common base class will allow for futher code sharing between PIIX3 and PIIX4. Moreover, it makes PIIX4 implement the acpi-dev-aml-interface. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-24-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Harmonize names of reset control memory regionsBernhard Beschow
There is no need for having different names here. Having the same name further allows code to be shared between PIIX3 and PIIX4. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-23-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Allow for optional PIT creation in PIIX3Bernhard Beschow
In the PC machine, the PIT is created in board code to allow it to be virtualized with various virtualization techniques. So explicitly disable its creation in the PC machine via a property which defaults to enabled. Once the PIIX implementations are consolidated this default will keep Malta working without further ado. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-22-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix: Allow for optional PIC creation in PIIX3Bernhard Beschow
In the PC machine, the PIC is created in board code to allow it to be virtualized with various virtualization techniques. So explicitly disable its creation in the PC machine via a property which defaults to enabled. Once the PIIX implementations are consolidated this default will keep Malta working without further ado. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-21-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Merge hw/isa/piix4.cBernhard Beschow
Now that the PIIX3 and PIIX4 device models are sufficiently prepared, their implementations can be merged into one file for further consolidation. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-20-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix4: Reuse struct PIIXState from PIIX3Bernhard Beschow
PIIX4 has its own, private PIIX4State structure. PIIX3 has almost the same structure, provided in a public header. So reuse it and add a cpu_intr attribute to it which is only used by PIIX4. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-19-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix4: Rename reset control operations to match PIIX3Bernhard Beschow
Both implementations are the same and will be shared upon merging. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-18-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix4: Rename "isa" attribute to "isa_irqs_in"Bernhard Beschow
Rename the "isa" attribute to align it with PIIX3 for consolidation. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-17-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix4: Remove unused inbound ISA interrupt linesBernhard Beschow
The Malta board, which is the only user of PIIX4, doesn't connect to the exported interrupt lines. PIIX3 doesn't expose such interrupt lines either, so remove them for PIIX4 for simplicity and consistency. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-16-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Drop the "3" from PIIX base class nameBernhard Beschow
TYPE_PIIX3_PCI_DEVICE was the former base class of the Xen and non-Xen variants of the PIIX3 ISA device models. It will become the base class for the PIIX3 and PIIX4 device models, so drop the "3" from the type names. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-15-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Create power management controller in host deviceBernhard Beschow
The power management controller is an integral part of PIIX3 (function 3). So create it as part of the south bridge. Note that the ACPI function is optional in QEMU. This is why it gets object_initialize_child()'ed in realize rather than in instance_init. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-14-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Create USB controller in host deviceBernhard Beschow
The USB controller is an integral part of PIIX3 (function 2). So create it as part of the south bridge. Note that the USB function is optional in QEMU. This is why it gets object_initialize_child()'ed in realize rather than in instance_init. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-13-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Create IDE controller in host deviceBernhard Beschow
The IDE controller is an integral part of PIIX3 (function 1). So create it as part of the south bridge. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-12-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc: Wire RTC ISA IRQs in south bridgesBernhard Beschow
Makes the south bridges a bit more self-contained and aligns PIIX3 more with PIIX4. The latter is needed for consolidating the PIIX south bridges. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-11-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Wire PIC IRQs to ISA bus in host deviceBernhard Beschow
Thie PIIX3 south bridge implements both the PIC and the ISA bus, so wiring the interrupts there makes the device model more self-contained. Furthermore, this allows the ISA interrupts to be wired to internal child devices in pci_piix3_realize() which will be performed in subsequent patches. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-10-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_q35: Wire ICH9 LPC function's interrupts before its realize()Bernhard Beschow
When the board assigns the ISA IRQs after the device's realize(), internal devices such as the RTC can't be wired in ich9_lpc_realize() since the qemu_irqs are still NULL. Fix that by assigning the ISA interrupts before realize(). This change is necessary for PIIX consolidation because PIIX4 wires the RTC interrupts in its realize() method, so PIIX3 needs to do so as well. Since the PC and Q35 boards share RTC code, and since PIIX3 needs the change, ICH9 needs to be adapted as well. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-9-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Rename "pic" attribute to "isa_irqs_in"Bernhard Beschow
TYPE_PIIX3_DEVICE doesn't instantiate a PIC since it relies on the board to do so. The "pic" attribute, however, suggests that there is one. Rename the attribute to reflect that it represents ISA interrupt lines. Use the same naming convention as in the VIA south bridges as well as in TYPE_I82378. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-8-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_piix: Remove redundant "piix3" variableBernhard Beschow
The variable is never used by its declared type. Eliminate it. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-7-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_piix: Wire PIIX3's ISA interrupts by new "isa-irqs" propertyBernhard Beschow
Avoid assigning the private member of struct PIIX3State from outside which goes against best QOM practices. Instead, implement best QOM practice by adding an "isa-irqs" array property to TYPE_PIIX3_DEVICE and assign it in board code, i.e. from outside. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-6-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/isa/piix3: Resolve redundant PIIX_NUM_PIC_IRQSBernhard Beschow
PIIX_NUM_PIC_IRQS is assumed to be the same as ISA_NUM_IRQS, otherwise inconsistencies can occur. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-5-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_piix: Assign PIIX3's ISA interrupts before its realize()Bernhard Beschow
Unlike its PIIX4 counterpart, TYPE_PIIX3_DEVICE doesn't instantiate a PIC itself. Instead, it relies on the board to do so. This means that the board needs to wire the ISA IRQs to the PIIX3 device model. As long as the board assigns the ISA IRQs after PIIX3's realize(), internal devices can't be wired in pci_piix3_realize() since the qemu_irqs are still NULL. Fix that by assigning the ISA interrupts before realize(). This will allow for embedding child devices into the host device as already done for PIIX4. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-4-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc_piix: Allow for setting properties before realizing PIIX3 south ↵Bernhard Beschow
bridge The next patches will need to take advantage of it. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20231007123843.127151-3-shentey@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/pc: Merge two if statements into oneBernhard Beschow
By being the only entity assigning a non-NULL value to "rtc_irq", the first if statement determines whether the second if statement is executed. So merge the two statements into one. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20231007123843.127151-2-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/display: fix memleak from virtio_add_resourceMatheus Tavares Bernardino
When the given uuid is already present in the hash table, virtio_add_resource() does not add the passed VirtioSharedObject. In this case, free it in the callers to avoid leaking memory. This fixed the following `make check` error, when built with --enable-sanitizers: 4/166 qemu:unit / test-virtio-dmabuf ERROR 1.51s exit status 1 ==7716==ERROR: LeakSanitizer: detected memory leaks Direct leak of 320 byte(s) in 20 object(s) allocated from: #0 0x7f6fc16e3808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 #1 0x7f6fc1503e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98) #2 0x564d63cafb6b in test_add_invalid_resource ../tests/unit/test-virtio-dmabuf.c:100 #3 0x7f6fc152659d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a59d) SUMMARY: AddressSanitizer: 320 byte(s) leaked in 20 allocation(s). The changes at virtio_add_resource() itself are not strictly necessary for the memleak fix, but they make it more obvious that, on an error return, the passed object is not added to the hash. Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com> Message-Id: <c61c13f9a0c67dec473bdbfc8789c29ef26c900b.1696624734.git.quic_mathbern@quicinc.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Albert Esteve <aesteve@redhat.com> Signed-off-by: Matheus Tavares Bernardino &lt;<a href="mailto:quic_mathbern@quicinc.com" target="_blank">quic_mathbern@quicinc.com</a>&gt;<br>
2023-10-22timer/i8254: Fix one shot PIT modeDamien Zammit
Currently, the one-shot (mode 1) PIT expires far too quickly, due to the output being set under the wrong logic. This change fixes the one-shot PIT mode to behave similarly to mode 0. TESTED: using the one-shot PIT mode to calibrate a local apic timer. Signed-off-by: Damien Zammit <damien@zamaudio.com> Message-Id: <20230226015755.52624-1-damien@zamaudio.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22hw/i386/acpi-build: Remove build-time assertion on PIIX/ICH9 reset registers ↵Bernhard Beschow
being identical Commit 6103451aeb74 ("hw/i386: Build-time assertion on pc/q35 reset register being identical.") introduced a build-time check where the addresses of the reset registers are expected to be equal. Back then rev3 of the FADT was used which required the reset register to be populated and there was common code. In commit 3a3fcc75f92a ("pc: acpi: force FADT rev1 for 440fx based machine types") the FADT was downgraded to rev1 for PIIX where the reset register isn't available. Thus, there is no need for the assertion any longer, so remove it. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20231004092355.12929-1-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22virtio: call ->vhost_reset_device() during resetStefan Hajnoczi
vhost-user-scsi has a VirtioDeviceClass->reset() function that calls ->vhost_reset_device(). The other vhost devices don't notify the vhost device upon reset. Stateful vhost devices may need to handle device reset in order to free resources or prevent stale device state from interfering after reset. Call ->vhost_device_reset() from virtio_reset() so that that vhost devices are notified of device reset. This patch affects behavior as follows: - vhost-kernel: No change in behavior since ->vhost_reset_device() is not implemented. - vhost-user: back-ends that negotiate VHOST_USER_PROTOCOL_F_RESET_DEVICE now receive a VHOST_USER_DEVICE_RESET message upon device reset. Otherwise there is no change in behavior. DPDK, SPDK, libvhost-user, and the vhost-user-backend crate do not negotiate VHOST_USER_PROTOCOL_F_RESET_DEVICE automatically. - vhost-vdpa: an extra SET_STATUS 0 call is made during device reset. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20231004014532.1228637-4-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
2023-10-22vhost-backend: remove vhost_kernel_reset_device()Stefan Hajnoczi
vhost_kernel_reset_device() invokes RESET_OWNER, which disassociates the owner process from the device. The device is left non-operational since SET_OWNER is only called once during startup in vhost_dev_init(). vhost_kernel_reset_device() is never called so this latent bug never appears. Get rid of vhost_kernel_reset_device() for now. If someone needs it in the future they'll need to implement it correctly. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20231004014532.1228637-3-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
2023-10-22vhost-user: do not send RESET_OWNER on device resetStefan Hajnoczi
The VHOST_USER_RESET_OWNER message is deprecated in the spec: This is no longer used. Used to be sent to request disabling all rings, but some back-ends interpreted it to also discard connection state (this interpretation would lead to bugs). It is recommended that back-ends either ignore this message, or use it to disable all rings. The only caller of vhost_user_reset_device() is vhost_user_scsi_reset(). It checks that F_RESET_DEVICE was negotiated before calling it: static void vhost_user_scsi_reset(VirtIODevice *vdev) { VHostSCSICommon *vsc = VHOST_SCSI_COMMON(vdev); struct vhost_dev *dev = &vsc->dev; /* * Historically, reset was not implemented so only reset devices * that are expecting it. */ if (!virtio_has_feature(dev->protocol_features, VHOST_USER_PROTOCOL_F_RESET_DEVICE)) { return; } if (dev->vhost_ops->vhost_reset_device) { dev->vhost_ops->vhost_reset_device(dev); } } Therefore VHOST_USER_RESET_OWNER is actually never sent by vhost_user_reset_device(). Remove the dead code. This effectively moves the vhost-user protocol specific code from vhost-user-scsi.c into vhost-user.c where it belongs. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20231004014532.1228637-2-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-10-22vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronouslyLaszlo Ersek
(1) The virtio-1.2 specification <http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes: > 3 General Initialization And Device Operation > 3.1 Device Initialization > 3.1.1 Driver Requirements: Device Initialization > > [...] > > 7. Perform device-specific setup, including discovery of virtqueues for > the device, optional per-bus setup, reading and possibly writing the > device’s virtio configuration space, and population of virtqueues. > > 8. Set the DRIVER_OK status bit. At this point the device is “live”. and > 4 Virtio Transport Options > 4.1 Virtio Over PCI Bus > 4.1.4 Virtio Structure PCI Capabilities > 4.1.4.3 Common configuration structure layout > 4.1.4.3.2 Driver Requirements: Common configuration structure layout > > [...] > > The driver MUST configure the other virtqueue fields before enabling the > virtqueue with queue_enable. > > [...] (The same statements are present in virtio-1.0 identically, at <http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.) These together mean that the following sub-sequence of steps is valid for a virtio-1.0 guest driver: (1.1) set "queue_enable" for the needed queues as the final part of device initialization step (7), (1.2) set DRIVER_OK in step (8), (1.3) immediately start sending virtio requests to the device. (2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES special virtio feature is negotiated, then virtio rings start in disabled state, according to <https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>. In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for enabling vrings. Therefore setting "queue_enable" from the guest (1.1) -- which is technically "buffered" on the QEMU side until the guest sets DRIVER_OK (1.2) -- is a *control plane* operation, which -- after (1.2) -- travels from the guest through QEMU to the vhost-user backend, using a unix domain socket. Whereas sending a virtio request (1.3) is a *data plane* operation, which evades QEMU -- it travels from guest to the vhost-user backend via eventfd. This means that operations ((1.1) + (1.2)) and (1.3) travel through different channels, and their relative order can be reversed, as perceived by the vhost-user backend. That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs against the Rust-language virtiofsd version 1.7.2. (Which uses version 0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost crate.) Namely, when VirtioFsDxe binds a virtiofs device, it goes through the device initialization steps (i.e., control plane operations), and immediately sends a FUSE_INIT request too (i.e., performs a data plane operation). In the Rust-language virtiofsd, this creates a race between two components that run *concurrently*, i.e., in different threads or processes: - Control plane, handling vhost-user protocol messages: The "VhostUserSlaveReqHandlerMut::set_vring_enable" method [crates/vhost-user-backend/src/handler.rs] handles VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled" flag according to the message processed. - Data plane, handling virtio / FUSE requests: The "VringEpollHandler::handle_event" method [crates/vhost-user-backend/src/event_loop.rs] handles the incoming virtio / FUSE request, consuming the virtio kick at the same time. If the vring's "enabled" flag is set, the virtio / FUSE request is processed genuinely. If the vring's "enabled" flag is clear, then the virtio / FUSE request is discarded. Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*. However, if the data plane processor in virtiofsd wins the race, then it sees the FUSE_INIT *before* the control plane processor took notice of VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane processor. Therefore the latter drops FUSE_INIT on the floor, and goes back to waiting for further virtio / FUSE requests with epoll_wait. Meanwhile OVMF is stuck waiting for the FUSET_INIT response -- a deadlock. The deadlock is not deterministic. OVMF hangs infrequently during first boot. However, OVMF hangs almost certainly during reboots from the UEFI shell. The race can be "reliably masked" by inserting a very small delay -- a single debug message -- at the top of "VringEpollHandler::handle_event", i.e., just before the data plane processor checks the "enabled" field of the vring. That delay suffices for the control plane processor to act upon VHOST_USER_SET_VRING_ENABLE. We can deterministically prevent the race in QEMU, by blocking OVMF inside step (1.2) -- i.e., in the write to the device status register that "unleashes" queue enablement -- until VHOST_USER_SET_VRING_ENABLE actually *completes*. That way OVMF's VCPU cannot advance to the FUSE_INIT submission before virtiofsd's control plane processor takes notice of the queue being enabled. Wait for VHOST_USER_SET_VRING_ENABLE completion by: - setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature has been negotiated, or - performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> [lersek@redhat.com: work Eugenio's explanation into the commit message, about QEMU containing step (1.1) until step (1.2)] Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-8-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: allow "vhost_set_vring" to wait for a replyLaszlo Ersek
The "vhost_set_vring" function already centralizes the common parts of "vhost_user_set_vring_num", "vhost_user_set_vring_base" and "vhost_user_set_vring_enable". We'll want to allow some of those callers to wait for a reply. Therefore, rebase "vhost_set_vring" from just "vhost_user_write" to "vhost_user_write_sync", exposing the "wait_for_reply" parameter. This is purely refactoring -- there is no observable change. That's because: - all three callers pass in "false" for "wait_for_reply", which disables all logic in "vhost_user_write_sync" except the call to "vhost_user_write"; - the fds=NULL and fd_num=0 arguments of the original "vhost_user_write" call inside "vhost_set_vring" are hard-coded within "vhost_user_write_sync". Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-7-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: hoist "write_sync", "get_features", "get_u64"Laszlo Ersek
In order to avoid a forward-declaration for "vhost_user_write_sync" in a subsequent patch, hoist "vhost_user_write_sync" -> "vhost_user_get_features" -> "vhost_user_get_u64" just above "vhost_set_vring". This is purely code movement -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-6-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"Laszlo Ersek
At this point, only "vhost_user_write_sync" calls "enforce_reply"; embed the latter into the former. This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-5-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: factor out "vhost_user_write_sync"Laszlo Ersek
The tails of the "vhost_user_set_vring_addr" and "vhost_user_set_u64" functions are now byte-for-byte identical. Factor the common tail out to a new function called "vhost_user_write_sync". This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-4-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: tighten "reply_supported" scope in "set_vring_addr"Laszlo Ersek
In the vhost_user_set_vring_addr() function, we calculate "reply_supported" unconditionally, even though we'll only need it if "wait_for_reply" is also true. Restrict the scope of "reply_supported" to the minimum. This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-3-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>