aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2024-07-01hw/virtio: Free vqs after vhost_dev_cleanup()Akihiko Odaki
This fixes LeakSanitizer warnings. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240627-san-v2-7-750bb0946dbd@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01i386/apic: Add hint on boot failure because of disabling x2APICZhao Liu
Currently, the Q35 supports up to 4096 vCPUs (since v9.0), but for TCG cases, if x2APIC is not actively enabled to boot more than 255 vCPUs ( e.g., qemu-system-i386 -M pc-q35-9.0 -smp 666), the following error is reported: Unexpected error in apic_common_set_id() at ../hw/intc/apic_common.c:449: qemu-system-i386: APIC ID 255 requires x2APIC feature in CPU Aborted (core dumped) This error can be resolved by setting x2apic=on in -cpu. In order to better help users deal with this scenario, add the error hint to instruct users on how to enable the x2apic feature. Then, the error report becomes the following: Unexpected error in apic_common_set_id() at ../hw/intc/apic_common.c:448: qemu-system-i386: APIC ID 255 requires x2APIC feature in CPU Try x2apic=on in -cpu. Aborted (core dumped) Note since @errp is &error_abort, error_append_hint() can't be applied on @errp. And in order to separate the exact error message from the (perhaps effectively) hint, adding a hint via error_append_hint() is also necessary. Therefore, introduce @local_error in apic_common_set_id() to handle both the error message and the error hint. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20240606140858.2157106-1-zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-pci: implement No_Soft_Reset bitJiqian Chen
In current code, when guest does S3, virtio-gpu are reset due to the bit No_Soft_Reset is not set. After resetting, the display resources of virtio-gpu are destroyed, then the display can't come back and only show blank after resuming. Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check this bit, if this bit is set, the devices resetting will not be done, and then the display can work after resuming. No_Soft_Reset bit is implemented for all virtio devices, and was tested only on virtio-gpu device. Set it false by default for safety. Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl: Fix read from bogus memoryIra Weiny
Peter and coverity report: We've passed '&data' to address_space_write(), which means "read from the address on the stack where the function argument 'data' lives", so instead of writing 64 bytes of data to the guest , we'll write 64 bytes which start with a host pointer value and then continue with whatever happens to be on the host stack after that. Indeed the intention was to write 64 bytes of data at the address given. Fix the parameter to address_space_write(). Reported-by: Peter Maydell <peter.maydell@linaro.org> Link: https://lore.kernel.org/all/CAFEAcA-u4sytGwTKsb__Y+_+0O2-WwARntm3x8WNhvL1WfHOBg@mail.gmail.com/ Fixes: 6bda41a69bdc ("hw/cxl: Add clear poison mailbox command support.") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Message-Id: <20240531-fix-poison-set-cacheline-v1-1-e3bc7e8f1158@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-07-01virtio-pci: Fix the failure process in kvm_virtio_pci_vector_use_one()Cindy Lu
In function kvm_virtio_pci_vector_use_one(), the function will only use the irqfd/vector for itself. Therefore, in the undo label, the failing process is incorrect. To fix this, we can just remove this label. Fixes: f9a09ca3ea ("vhost: add support for configure interrupt") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20240528084840.194538-1-lulu@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/misc/pvpanic: add support for normal shutdownsThomas Weißschuh
Shutdown requests are normally hardware dependent. By extending pvpanic to also handle shutdown requests, guests can submit such requests with an easily implementable and cross-platform mechanism. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de> Message-Id: <20240527-pvpanic-shutdown-v8-5-5a28ec02558b@t-8ch.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/misc/pvpanic: centralize definition of supported eventsThomas Weißschuh
The different components of pvpanic duplicate the list of supported events. Move it to the shared header file to minimize changes when new events are added. MST: tweak: keep header included in pvpanic.c to avoid header dependency, rebase. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de> Message-Id: <20240527-pvpanic-shutdown-v8-3-5a28ec02558b@t-8ch.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl_type3: Allow to release extent superset in QMP interfaceFan Ni
Before the change, the QMP interface used for add/release DC extents only allows to release an extent whose DPA range is contained by a single accepted extent in the device. With the change, we relax the constraints. As long as the DPA range of the extent is covered by accepted extents, we allow the release. Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-15-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox supportFan Ni
With the change, we extend the extent release mailbox command processing to allow more flexible release. As long as the DPA range of the extent to release is covered by accepted extent(s) in the device, the release can be performed. Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-14-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl_type3: Add DPA range validation for accesses to DC regionsFan Ni
All DPA ranges in the DC regions are invalid to access until an extent covering the range has been successfully accepted by the host. A bitmap is added to each region to record whether a DC block in the region has been backed by a DC extent. Each bit in the bitmap represents a DC block. When a DC extent is accepted, all the bits representing the blocks in the extent are set, which will be cleared when the extent is released. Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-13-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extentsFan Ni
To simulate FM functionalities for initiating Dynamic Capacity Add (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue add/release dynamic capacity extents requests. With the change, we allow to release an extent only when its DPA range is contained by a single accepted extent in the device. That is to say, extent superset release is not supported yet. 1. Add dynamic capacity extents: For example, the command to add two continuous extents (each 128MiB long) to region 0 (starting at DPA offset 0) looks like below: { "execute": "qmp_capabilities" } { "execute": "cxl-add-dynamic-capacity", "arguments": { "path": "/machine/peripheral/cxl-dcd0", "host-id": 0, "selection-policy": "prescriptive", "region": 0, "extents": [ { "offset": 0, "len": 134217728 }, { "offset": 134217728, "len": 134217728 } ] } } 2. Release dynamic capacity extents: For example, the command to release an extent of size 128MiB from region 0 (DPA offset 128MiB) looks like below: { "execute": "cxl-release-dynamic-capacity", "arguments": { "path": "/machine/peripheral/cxl-dcd0", "host-id": 0, "removal-policy":"prescriptive", "region": 0, "extents": [ { "offset": 134217728, "len": 134217728 } ] } } Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-12-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release ↵Fan Ni
dynamic capacity response Per CXL spec 3.1, two mailbox commands are implemented: Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4. For the process of the above two commands, we use two-pass approach. Pass 1: Check whether the input payload is valid or not; if not, skip Pass 2 and return mailbox process error. Pass 2: Do the real work--add or release extents, respectively. Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-11-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl_type3: Add DC extent list representative and get DC extent list ↵Fan Ni
mailbox support Add dynamic capacity extent list representative to the definition of CXLType3Dev and implement get DC extent list mailbox command per CXL.spec.3.1:.8.2.9.9.9.2. Tested-by: Svetly Todorov <svetly.todorov@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-10-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl_type3: Add host backend and address space handling for DC regionsFan Ni
Add (file/memory backed) host backend for DCD. All the dynamic capacity regions will share a single, large enough host backend. Set up address space for DC regions to support read/write operations to dynamic capacity for DCD. With the change, the following support is added: 1. Add a new property to type3 device "volatile-dc-memdev" to point to host memory backend for dynamic capacity. Currently, all DC regions share one host backend; 2. Add namespace for dynamic capacity for read/write support; 3. Create cdat entries for each dynamic capacity region. Reviewed-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-9-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size ↵Fan Ni
instead of mr as argument The function ct3_build_cdat_entries_for_mr only uses size of the passed memory region argument, refactor the function definition to make the passed arguments more specific. Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-8-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/mem/cxl_type3: Add support to create DC regions to type3 memory devicesFan Ni
With the change, when setting up memory for type3 memory device, we can create DC regions. A property 'num-dc-regions' is added to ct3_props to allow users to pass the number of DC regions to create. To make it easier, other region parameters like region base, length, and block size are hard coded. If needed, these parameters can be added easily. With the change, we can create DC regions with proper kernel side support like below: region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region) echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity echo 1 > /sys/bus/cxl/devices/$region/interleave_ways echo "dc0" >/sys/bus/cxl/devices/decoder2.0/mode echo 0x40000000 >/sys/bus/cxl/devices/decoder2.0/dpa_size echo 0x40000000 > /sys/bus/cxl/devices/$region/size echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0 echo 1 > /sys/bus/cxl/devices/$region/commit echo $region > /sys/bus/cxl/drivers/cxl_region/bind Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-7-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
2024-07-01include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 ↵Fan Ni
memory devices Rename mem_size as static_mem_size for type3 memdev to cover static RAM and pmem capacity, preparing for the introduction of dynamic capacity to support dynamic capacity devices. Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-6-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and ↵Fan Ni
mailbox command support Per cxl spec r3.1, add dynamic capacity (DC) region representative based on Table 8-165 and extend the cxl type3 device definition to include DC region information. Also, based on info in 8.2.9.9.9.1, add 'Get Dynamic Capacity Configuration' mailbox support. Note: we store region decode length as byte-wise length on the device, which should be divided by 256 * MiB before being returned to the host for "Get Dynamic Capacity Configuration" mailbox command per specification. Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-5-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of ↵Fan Ni
identify memory device command Based on CXL spec r3.1 Table 8-127 (Identify Memory Device Output Payload), dynamic capacity event log size should be part of output of the Identify command. Add dc_event_log_size to the output payload for the host to get the info. Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-4-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/mailbox: interface to add CCI commands to an existing CCIGregory Price
This enables wrapper devices to customize the base device's CCI (for example, with custom commands outside the specification) without the need to change the base device. The also enabled the base device to dispatch those commands without requiring additional driver support. Heavily edited by Jonathan Cameron to increase code reuse Signed-off-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-3-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/cxl/mailbox: change CCI cmd set structure to be a member, not a referenceGregory Price
This allows devices to have fully customized CCIs, along with complex devices where wrapper devices can override or add additional CCI commands without having to replicate full command structures or pollute a base device with every command that might ever be used. Signed-off-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Fan Ni <fan.ni@samsung.com> Message-Id: <20240523174651.1089554-2-nifan.cxl@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-user: fix lost reconnect againLi Feng
When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever. The reason is: When the vhost-user fail at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. We need to ensure that even if vhost_dev_init initialization fails, the event handler still needs to be reinstalled when s->connected is false. All vhost-user devices have this issue, including vhost-user-blk/scsi. Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-3-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01Revert "vhost-user: fix lost reconnect"Li Feng
This reverts commit f02a4b8e6431598612466f76aac64ab492849abf. Since the current patch cannot completely fix the lost reconnect problem, there is a scenario that is not considered: - When the virtio-blk driver is removed from the guest os, s->connected has no chance to be set to false, resulting in subsequent reconnection not being executed. The next patch will completely fix this issue with a better approach. Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-2-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-user-gpu: fix import of DMABUFMarc-André Lureau
When using vhost-user-gpu with GL, qemu -display gtk doesn't show output and prints: qemu: eglCreateImageKHR failed Since commit 9ac06df8b ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties"), egl_dmabuf_import_texture() uses backing_{width,height} for the texture dimension. Fixes: 9ac06df8b ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties") Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20240515105237.1074116-1-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-pci: only reset pm state during resettingJiqian Chen
Fix bug imported by 27ce0f3afc9dd ("fix Power Management Control Register for PCI Express virtio devices" After this change, observe that QEMU may erroneously clear the power status of the device, or may erroneously clear non writable registers, such as NO_SOFT_RESET, etc. Only state of PM_CTRL is writable. Only when flag VIRTIO_PCI_FLAG_INIT_PM is set, need to reset state. Fixes: 27ce0f3afc9dd ("fix Power Management Control Register for PCI Express virtio devices" Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240515073526.17297-2-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/virtio: Fix obtain the buffer id from the last descriptorWafer
The virtio-1.3 specification <https://docs.oasis-open.org/virtio/virtio/v1.3/virtio-v1.3.html> writes: 2.8.6 Next Flag: Descriptor Chaining Buffer ID is included in the last descriptor in the list. If the feature (_F_INDIRECT_DESC) has been negotiated, install only one descriptor in the virtqueue. Therefor the buffer id should be obtained from the first descriptor. In descriptor chaining scenarios, the buffer id should be obtained from the last descriptor. Fixes: 86044b24e8 ("virtio: basic packed virtqueue support") Signed-off-by: Wafer <wafer@jaguarmicro.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240510072753.26158-2-wafer@jaguarmicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-vsock: add VIRTIO_F_RING_PACKED to feature_bitsHalil Pasic
Not having VIRTIO_F_RING_PACKED in feature_bits[] is a problem when the vhost-vsock device does not offer the feature bit VIRTIO_F_RING_PACKED but the in QEMU device is configured to try to use the packed layout (the virtio property "packed" is on). As of today, the Linux kernel vhost-vsock device does not support the packed queue layout (as vhost does not support packed), and does not offer VIRTIO_F_RING_PACKED. Thus when for example a vhost-vsock-ccw is used with packed=on, VIRTIO_F_RING_PACKED ends up being negotiated, despite the fact that the device does not actually support it, and one gets to keep the pieces. Fixes: 74b3e46630 ("virtio: add property to enable packed virtqueue") Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20240429113334.2454197-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bitsJonah Palmer
Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety of vhost devices. The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-6-jonah.palmer@oracle.com> Acked-by: Srujana Challa <schalla@marvell.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-ccw: Handle extra notification dataJonah Palmer
Add support to virtio-ccw devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-ccw device when this feature is enabled varies depending on the device's virtqueue layout. That data passed to the virtio-ccw device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-5-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-mmio: Handle extra notification dataJonah Palmer
Add support to virtio-mmio devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-mmio device when this feature is enabled varies depending on the device's virtqueue layout. The data passed to the virtio-mmio device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio: Prevent creation of device using notification-data with ioeventfdJonah Palmer
Prevent the realization of a virtio device that attempts to use the VIRTIO_F_NOTIFICATION_DATA transport feature without disabling ioeventfd. Due to ioeventfd not being able to carry the extra data associated with this feature, having both enabled is a functional mismatch and therefore Qemu should not continue the device's realization process. Although the device does not yet know if the feature will be successfully negotiated, many devices using this feature wont actually work without this extra data and would fail FEATURES_OK anyway. If ioeventfd is able to work with the extra notification data in the future, this compatibility check can be removed. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio/virtio-pci: Handle extra notification dataJonah Palmer
Add support to virtio-pci devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-pci device when this feature is enabled varies depending on the device's virtqueue layout. In a split virtqueue layout, this data includes: - upper 16 bits: shadow_avail_idx - lower 16 bits: virtqueue index In a packed virtqueue layout, this data includes: - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx - lower 16 bits: virtqueue index Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-2-jonah.palmer@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost: Perform memory section dirty scans once per iterationSi-Wei Liu
On setups with one or more virtio-net devices with vhost on, dirty tracking iteration increases cost the bigger the number amount of queues are set up e.g. on idle guests migration the following is observed with virtio-net with vhost=on: 48 queues -> 78.11% [.] vhost_dev_sync_region.isra.13 8 queues -> 40.50% [.] vhost_dev_sync_region.isra.13 1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13 2 devices, 1 queue -> 18.60% [.] vhost_dev_sync_region.isra.14 With high memory rates the symptom is lack of convergence as soon as it has a vhost device with a sufficiently high number of queues, the sufficient number of vhost devices. On every migration iteration (every 100msecs) it will redundantly query the *shared log* the number of queues configured with vhost that exist in the guest. For the virtqueue data, this is necessary, but not for the memory sections which are the same. So essentially we end up scanning the dirty log too often. To fix that, select a vhost device responsible for scanning the log with regards to memory sections dirty tracking. It is selected when we enable the logger (during migration) and cleared when we disable the logger. If the vhost logger device goes away for some reason, the logger will be re-selected from the rest of vhost devices. After making mem-section logger a singleton instance, constant cost of 7%-9% (like the 1 queue report) will be seen, no matter how many queues or how many vhost devices are configured: 48 queues -> 8.71% [.] vhost_dev_sync_region.isra.13 2 devices, 8 queues -> 7.97% [.] vhost_dev_sync_region.isra.14 Co-developed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1710448055-11709-2-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2024-07-01vhost: dirty log should be per backend typeSi-Wei Liu
There could be a mix of both vhost-user and vhost-kernel clients in the same QEMU process, where separate vhost loggers for the specific vhost type have to be used. Make the vhost logger per backend type, and have them properly reference counted. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1710448055-11709-1-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01Merge tag 'pull-target-arm-20240701' of ↵Richard Henderson
https://git.linaro.org/people/pmaydell/qemu-arm into staging target-arm queue: * tests/avocado: update firmware for sbsa-ref and use all cores * hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev * arm: Fix VCMLA Dd, Dn, Dm[idx] * arm: Fix SQDMULH (by element) with Q=0 * arm: Fix FJCVTZS vs flush-to-zero * arm: More conversion of A64 AdvSIMD to decodetree * arm: Enable FEAT_Debugv8p8 for -cpu max * MAINTAINERS: Update family name for Patrick Leis * hw/arm/xilinx_zynq: Add boot-mode property * docs/system/arm: Add a doc for zynq board * hw/misc: In STM32L4x5 EXTI, correct configurable interrupts * tests/qtest: fix minor issues in STM32L4x5 tests # -----BEGIN PGP SIGNATURE----- # # iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmaC1BMZHHBldGVyLm1h # eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3nDOEACCoewjO2FJ4RFXMSmgr0Zf # jxWliu7osw7oeG4ZNq1+xMiXeW0vyS54eW41TMki1f98N/yK8v55BM8kBBvDvZaz # R5DUXpN+MtwD9A62md3B2c4mFXHqk1UOGbKi4btbtFj4lS8pV51mPmApzBUr2iTj # w6dCLciLOt87NWgtLECXsZ3evn+VlTRc+Hmfp1M/C/Rf2Qx3zis/CFHGQsZLGwzG # 2WhTpU1BKeOfsQa1VbSX6un14d72/JATFZN3rSgMbOEbvsCEeP+rnkzX57ejGyxV # 4DUx69gEAqS5bOfkQHLwy82WsunD/oIgp+GpYaYgINHzh6UkEsPoymrHAaPgV1Vh # g0TaBtbv2p89RFY1C2W2Mi4ICQ14a+oIV9FPvDsOE8Wq+wDAy/ZxZs7G6flxqods # s4JvcMqB3kUNBZaMsFVXTKdqT1PufICS+gx0VsKdKDwXcOHwMS10nTlEOPzqvoBA # phAsEbjnjWVhf03XTfCus+l5NT96lswCzPcUovb3CitSc2A1KUye3TyzHnxIqmOt # Owcl+Oiso++cgYzr/BCveTAYKYoRZzVcq5jCl4bBUH/8sLrRDbT0cpFpcMk72eE9 # VhR00kbkDfL3nKrulLsG8FeUlisX5+oGb3G5AdPtU9sqJPJMmBGaF+KniI0wi7VN # 5teHq08upLMF5JAjiKzZIA== # =faXD # -----END PGP SIGNATURE----- # gpg: Signature made Mon 01 Jul 2024 09:06:43 AM PDT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full] # gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown] * tag 'pull-target-arm-20240701' of https://git.linaro.org/people/pmaydell/qemu-arm: (29 commits) tests/qtest: Ensure STM32L4x5 EXTI state is correct at the end of QTests hw/misc: In STM32L4x5 EXTI, correct configurable interrupts tests/qtest: Fix STM32L4x5 SYSCFG irq line 15 state assumption docs/system/arm: Add a doc for zynq board hw/arm/xilinx_zynq: Add boot-mode property hw/misc/zynq_slcr: Add boot-mode property MAINTAINERS: Update my family name target/arm: Enable FEAT_Debugv8p8 for -cpu max target/arm: Move initialization of debug ID registers target/arm: Fix indentation target/arm: Delete dead code from disas_simd_indexed target/arm: Convert FCMLA to decodetree target/arm: Convert FCADD to decodetree target/arm: Add data argument to do_fp3_vector target/arm: Convert BFMMLA, SMMLA, UMMLA, USMMLA to decodetree target/arm: Convert BFMLALB, BFMLALT to decodetree target/arm: Convert BFDOT to decodetree target/arm: Convert SUDOT, USDOT to decodetree target/arm: Convert SDOT, UDOT to decodetree target/arm: Convert SQRDMLAH, SQRDMLSH to decodetree ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-01hw/misc: In STM32L4x5 EXTI, correct configurable interruptsInès Varhol
The implementation of configurable interrupts (interrupts supporting edge selection) was incorrectly expecting alternating input levels : this commits adds a new status field `irq_levels` to actually detect edges. Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr> Message-id: 20240629110800.539969-2-ines.varhol@telecom-paris.fr Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01hw/arm/xilinx_zynq: Add boot-mode propertySai Pavan Boddu
Read boot-mode value as machine property and propagate that to SLCR.BOOT_MODE register. Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com> Acked-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Message-id: 20240621125906.1300995-3-sai.pavan.boddu@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01hw/misc/zynq_slcr: Add boot-mode propertySai Pavan Boddu
boot-mode property sets user values into BOOT_MODE register, on hardware these are derived from board switches. Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Message-id: 20240621125906.1300995-2-sai.pavan.boddu@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01xen-hvm: Avoid livelock while handling buffered ioreqsRoss Lagerwall
A malicious or buggy guest may generated buffered ioreqs faster than QEMU can process them in handle_buffered_iopage(). The result is a livelock - QEMU continuously processes ioreqs on the main thread without iterating through the main loop which prevents handling other events, processing timers, etc. Without QEMU handling other events, it often results in the guest becoming unsable and makes it difficult to stop the source of buffered ioreqs. To avoid this, if we process a full page of buffered ioreqs, stop and reschedule an immediate timer to continue processing them. This lets QEMU go back to the main loop and catch up. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-Id: <20240404140833.1557953-1-ross.lagerwall@citrix.com> Signed-off-by: Anthony PERARD <anthony@xenproject.org>
2024-07-01xen: fix stubdom PCI addrMarek Marczykowski-Górecki
When running in a stubdomain, the config space access via sysfs needs to use BDF as seen inside stubdomain (connected via xen-pcifront), which is different from the real BDF. For other purposes (hypercall parameters etc), the real BDF needs to be used. Get the in-stubdomain BDF by looking up relevant PV PCI xenstore entries. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <35049e99da634a74578a1ff2cb3ae4cc436ede33.1711506237.git-series.marmarek@invisiblethingslab.com> Signed-off-by: Anthony PERARD <anthony@xenproject.org>
2024-07-01hw/xen: detect when running inside stubdomainMarek Marczykowski-Górecki
Introduce global xen_is_stubdomain variable when qemu is running inside a stubdomain instead of dom0. This will be relevant for subsequent patches, as few things like accessing PCI config space need to be done differently. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <e66aa97dca5120f22e015c19710b2ff04f525720.1711506237.git-series.marmarek@invisiblethingslab.com> Signed-off-by: Anthony PERARD <anthony@xenproject.org>
2024-07-01hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdevNicolin Chen
The caller of smmu_iommu_mr wants to get sdev for smmuv3_flush_config(). Do it directly instead of bridging with an iommu mr pointer. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Message-id: 20240619002218.926674-1-nicolinc@nvidia.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01hw/misc: Implement mailbox properties for customer OTP and device specific ↵Rayhan Faizel
private keys Four mailbox properties are implemented as follows: 1. Customer OTP: GET_CUSTOMER_OTP and SET_CUSTOMER_OTP 2. Device-specific private key: GET_PRIVATE_KEY and SET_PRIVATE_KEY. The customer OTP is located in the rows 36-43. The device-specific private key is located in the rows 56-63. The customer OTP can be locked with the magic numbers 0xffffffff 0xaffe0000 when running the SET_CUSTOMER_OTP mailbox command. Bit 6 of row 32 indicates this lock, which is undocumented. The lock also applies to the device-specific private key. Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01hw/arm: Connect OTP device to BCM2835Rayhan Faizel
Replace stubbed OTP memory region with the new OTP device. Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-07-01hw/nvram: Add BCM2835 OTP deviceRayhan Faizel
The OTP device registers are currently stubbed. For now, the device houses the OTP rows which will be accessed directly by other peripherals. Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-06-30Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into stagingRichard Henderson
trivial patches for 2024-06-30 # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCAAdFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmaBjTkACgkQcBtPaxpp # PlmhAAf+PZEsiBvffwwNH5n1q39Hilih35p/GCVpNYKcLsFB6bLmt9A/x062NqTS # ob1Uj134ofHlSQtNjP1KxXdriwc40ZMahkTO+x6gYc+IpoRJGTGYEA0MWh4gPPYK # S6l/nOI9JK1x+ot+bQzGOzOjz3/S7RJteXzwOPlWQ7GChz8NIUPWV3DkcVP0AeT0 # 7Lq7GtDBSV5Jbne2IrvOGadjPOpJiiLEmLawmw1c9qapIKAu2wxNBMlO98ufsg6L # hDFEg6K0CKvM9fcdK8UXhnMa+58QwHhoJT+Q00aQcU1xzu+ifi/CrmgbRCK5ruTA # o0I8q6ONbK33cTzyZ/ZmKtoA8b/Rzw== # =N3GX # -----END PGP SIGNATURE----- # gpg: Signature made Sun 30 Jun 2024 09:52:09 AM PDT # gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59 # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full] # gpg: aka "Michael Tokarev <mjt@debian.org>" [full] # gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full] * tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu: hw/core/loader: gunzip(): fix memory leak on error path vl.c: select_machine(): add selected machine type to error message vl.c: select_machine(): use g_autoptr vl.c: select_machine(): use ERRP_GUARD instead of error propagation docs/system/devices/usb: Replace the non-existing "qemu" binary docs/cxl: fix some typos os-posix: Expand setrlimit() syscall compatibility net/can: Remove unused struct 'CanBusState' hw/arm/bcm2836: Remove unusued struct 'BCM283XClass' linux-user: sparc: Remove unused struct 'target_mc_fq' linux-user: cris: Remove unused struct 'rt_signal_frame' monitor: Remove obsolete stubs target/i386: Advertise MWAIT iff host supports vl: Allow multiple -overcommit commands cpu: fix memleak of 'halt_cond' and 'thread' hmp-commands-info.hx: Add missing info command for stats subcommand Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-30hw/core/loader: gunzip(): fix memory leak on error pathVladimir Sementsov-Ogievskiy
We should call inflateEnd() like on success path to cleanup state in s variable. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-06-30hw/arm/bcm2836: Remove unusued struct 'BCM283XClass'Dr. David Alan Gilbert
This struct has been unused since Commit f932093ae165 ("hw/arm/bcm2836: Split out common part of BCM283X classes") Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-06-30cpu: fix memleak of 'halt_cond' and 'thread'Matheus Tavares Bernardino
Since a4c2735f35 (cpu: move Qemu[Thread|Cond] setup into common code, 2024-05-30) these fields are now allocated at cpu_common_initfn(). So let's make sure we also free them at cpu_common_finalize(). Furthermore, the code also frees these on round robin, but we missed 'halt_cond'. Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-06-30hw/ufs: Fix potential bugs in MMIO read|writeMinwoo Im
This patch fixes two points reported in coverity scan report [1]. Check the MMIO access address with (addr + size), not just with the start offset addr to make sure that the requested memory access not to exceed the actual register region. We also updated (uint8_t *) to (uint32_t *) to represent we are accessing the MMIO registers by dword-sized only. [1] https://lore.kernel.org/qemu-devel/CAFEAcA82L-WZnHMW0X+Dr40bHM-EVq2ZH4DG4pdqop4xxDP2Og@mail.gmail.com/ Cc: Jeuk Kim <jeuk20.kim@gmail.com> Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Jeuk Kim <jeuk20.kim@samsung.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20240623024555.78697-1-minwoo.im.dev@gmail.com> Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>