aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2023-07-31vhost: register and change IOMMU flag depending on Device-TLB stateViktor Prutyanov
The guest can disable or never enable Device-TLB. In these cases, it can't be used even if enabled in QEMU. So, check Device-TLB state before registering IOMMU notifier and select unmap flag depending on that. Also, implement a way to change IOMMU notifier flag if Device-TLB state is changed. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312 Signed-off-by: Viktor Prutyanov <viktor@daynix.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230626091258.24453-2-viktor@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit ee071f67f7a103c66f85f68ffe083712929122e3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31virtio-pci: add handling of PCI ATS and Device-TLB enable/disableViktor Prutyanov
According to PCIe Address Translation Services specification 5.1.3., ATS Control Register has Enable bit to enable/disable ATS. Guest may enable/disable PCI ATS and, accordingly, Device-TLB for the VirtIO PCI device. So, raise/lower a flag and call a trigger function to pass this event to a device implementation. Signed-off-by: Viktor Prutyanov <viktor@daynix.com> Message-Id: <20230512135122.70403-2-viktor@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 206e91d143301414df2deb48a411e402414ba6db) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: include/hw/virtio/virtio.h: skip extra struct field added in 8.0)
2023-07-15hw/ide/piix: properly initialize the BMIBA registerOlaf Hering
According to the 82371FB documentation (82371FB.pdf, 2.3.9. BMIBA-BUS MASTER INTERFACE BASE ADDRESS REGISTER, April 1997), the register is 32bit wide. To properly reset it to default values, all 32bit need to be cleared. Bit #0 "Resource Type Indicator (RTE)" needs to be enabled. The initial change wrote just the lower 8 bit, leaving parts of the "Bus Master Interface Base Address" address at bit 15:4 unchanged. Fixes: e6a71ae327 ("Add support for 82371FB (Step A1) and Improved support for 82371SB (Function 1)") Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230712074721.14728-1-olaf@aepfle.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 230dfd9257e92259876c113e58b5f0d22b056d2e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry pathShameer Kolothum
When vfio_enable_vectors() returns with less than requested nr_vectors we retry with what kernel reported back. But the retry path doesn't call vfio_prepare_kvm_msi_virq_batch() and this results in, qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1 qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed Fixes: dc580d51f7dd ("vfio: defer to commit kvm irq routing when enable msi/msix") Reviewed-by: Longpeng <longpeng2@huawei.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> (cherry picked from commit c17408892319712c12357e5d1c6b305499c58c2a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30vfio/pci: Fix a segfault in vfio_realizeZhenzhong Duan
The kvm irqchip notifier is only registered if the device supports INTx, however it's unconditionally removed in vfio realize error path. If the assigned device does not support INTx, this will cause QEMU to crash when vfio realize fails. Change it to conditionally remove the notifier only if the notify hook is setup. Before fix: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1 Connection closed by foreign host. After fix: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1 Error: vfio 0000:81:11.1: xres and yres properties require display=on (qemu) Fixes: c5478fea27ac ("vfio/pci: Respond to KVM irqchip change notifier") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> (cherry picked from commit 357bd7932a136613d700ee8bc83e9165f059d1f7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30target/ppc: Fix decrementer time underflow and infinite timer loopNicholas Piggin
It is possible to store a very large value to the decrementer that it does not raise the decrementer exception so the timer is scheduled, but the next time value wraps and is treated as in the past. This can occur if (u64)-1 is stored on a zero-triggered exception, or (u64)-1 is stored twice on an underflow-triggered exception, for example. If such a value is set in DECAR, it gets stored to the decrementer by the timer function, which then immediately causes another timer, which hangs QEMU. Clamp the decrementer to the implemented width, and use that as the value for the timer calculation, effectively preventing this overflow. Reported-by: sdicaro@DDCI.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20230530131214.373524-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> (cherry picked from commit 09d2db9f46e38e2da990df8ad914d735d764251a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-28virtio-gpu: Make non-gl display updates work again when blob=trueVivek Kasireddy
In the case where the console does not have gl capability, and if blob is set to true, make sure that the display updates still work. Commit e86a93f55463 accidentally broke this by misplacing the return statement (in resource_flush) causing the updates to be silently ignored. Fixes: e86a93f55463 ("virtio-gpu: splitting one extended mode guest fb into n-scanouts") Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Dongwon Kim <dongwon.kim@intel.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-ID: <20230623060454.3749910-1-vivek.kasireddy@intel.com> (cherry picked from commit 34e29d85a7734802317c4cac9ad52b10d461c1dc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26vhost: release memory_listener object in error pathPrasad Pandit
vhost_dev_start function does not release memory_listener object in case of an error. This may crash the guest when vhost is unable to set memory table: stack trace of thread 125653: Program terminated with signal SIGSEGV, Segmentation fault #0 memory_listener_register (qemu-kvm + 0x6cda0f) #1 vhost_dev_start (qemu-kvm + 0x699301) #2 vhost_net_start (qemu-kvm + 0x45b03f) #3 virtio_net_set_status (qemu-kvm + 0x665672) #4 qmp_set_link (qemu-kvm + 0x548fd5) #5 net_vhost_user_event (qemu-kvm + 0x552c45) #6 tcp_chr_connect (qemu-kvm + 0x88d473) #7 tcp_chr_new_client (qemu-kvm + 0x88cf83) #8 tcp_chr_accept (qemu-kvm + 0x88b429) #9 qio_net_listener_channel_func (qemu-kvm + 0x7ac07c) #10 g_main_context_dispatch (libglib-2.0.so.0 + 0x54e2f) Release memory_listener objects in the error path. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Message-Id: <20230529114333.31686-2-ppandit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Fixes: c471ad0e9b ("vhost_net: device IOTLB support") Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 1e3ffb34f764f8ac4c003b2b2e6a775b2b073a16) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26target/hppa: Provide qemu version via fw_cfg to firmwareHelge Deller
Give current QEMU version string to SeaBIOS-hppa via fw_cfg interface so that the firmware can show the QEMU version in the boot menu info. Signed-off-by: Helge Deller <deller@gmx.de> (cherry picked from commit 069d296669448b9eef72c6332ae84af962d9582c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26target/hppa: Fix OS reboot issuesHelge Deller
When the OS triggers a reboot, the reset helper function sends a qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET) together with an EXCP_HLT exception to halt the CPUs. So, at reboot when initializing the CPUs again, make sure to set all instruction pointers to the firmware entry point, disable any interrupts, disable data and instruction translations, enable PSW_Q bit and tell qemu to unhalt (halted=0) the CPUs again. This fixes the various reboot issues which were seen when rebooting a Linux VM, including the case where even the monarch CPU has been virtually halted from the OS (e.g. via "chcpu -d 0" inside the Linux VM). Signed-off-by: Helge Deller <deller@gmx.de> (cherry picked from commit 50ba97e928b44ff5bc731c9ffe68d86acbe44639) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22hw/timer/nrf51_timer: Don't lose time when timer is queried in tight loopPeter Maydell
The nrf51_timer has a free-running counter which we implement using the pattern of using two fields (update_counter_ns, counter) to track the last point at which we calculated the counter value, and the counter value at that time. Then we can find the current counter value by converting the difference in wall-clock time between then and now to a tick count that we need to add to the counter value. Unfortunately the nrf51_timer's implementation of this has a bug which means it loses time every time update_counter() is called. After updating s->counter it always sets s->update_counter_ns to 'now', even though the actual point when s->counter hit the new value will be some point in the past (half a tick, say). In the worst case (guest code in a tight loop reading the counter, icount mode) the counter is continually queried less than a tick after it was last read, so s->counter never advances but s->update_counter_ns does, and the guest never makes forward progress. The fix for this is to only advance update_counter_ns to the timestamp of the last tick, not all the way to 'now'. (This is the pattern used in hw/misc/mps2-fpgaio.c's counter.) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Message-id: 20230606134917.3782215-1-peter.maydell@linaro.org (cherry picked from commit d2f9a79a8cf6ab992e1d0f27ad05b3e582d2b18a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22hw/intc/allwinner-a10-pic: Handle IRQ levels other than 0 or 1Peter Maydell
In commit 2c5fa0778c3b430 we fixed an endianness bug in the Allwinner A10 PIC model; however in the process we introduced a regression. This is because the old code was robust against the incoming 'level' argument being something other than 0 or 1, whereas the new code was not. In particular, the allwinner-sdhost code treats its IRQ line as 0-vs-non-0 rather than 0-vs-1, so when the SD controller set its IRQ line for any reason other than transmit the interrupt controller would ignore it. The observed effect was a guest timeout when rebooting the guest kernel. Handle level values other than 0 or 1, to restore the old behaviour. Fixes: 2c5fa0778c3b430 ("hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()") (Mjt: 5eb742fce562dc7 in stable-7.2) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Message-id: 20230606104609.3692557-2-peter.maydell@linaro.org (cherry picked from commit f837b468cdaa7e736b5385c7dc4f8c5adcad3bf1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-16aspeed/hace: Initialize g_autofree pointerCédric Le Goater
As mentioned in docs/devel/style.rst "Automatic memory deallocation": * Variables declared with g_auto* MUST always be initialized, otherwise the cleanup function will use uninitialized stack memory This avoids QEMU to coredump when running the "hash test" command under Zephyr. Cc: Steven Lee <steven_lee@aspeedtech.com> Cc: Joel Stanley <joel@jms.id.au> Cc: qemu-stable@nongnu.org Fixes: c5475b3f9a ("hw: Model ASPEED's Hash and Crypto Engine") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-Id: <20230421131547.2177449-1-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> (cherry picked from commit c8f48b120b31f6bbe33135ef5d478e485c37e3c2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-14hw/riscv: qemu crash when NUMA nodes exceed available CPUsYin Wang
Command "qemu-system-riscv64 -machine virt -m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G" would trigger this problem.Backtrace with: #0 0x0000555555b5b1a4 in riscv_numa_get_default_cpu_node_id at ../hw/riscv/numa.c:211 #1 0x00005555558ce510 in machine_numa_finish_cpu_init at ../hw/core/machine.c:1230 #2 0x00005555558ce9d3 in machine_run_board_init at ../hw/core/machine.c:1346 #3 0x0000555555aaedc3 in qemu_init_board at ../softmmu/vl.c:2513 #4 0x0000555555aaf064 in qmp_x_exit_preconfig at ../softmmu/vl.c:2609 #5 0x0000555555ab1916 in qemu_init at ../softmmu/vl.c:3617 #6 0x000055555585463b in main at ../softmmu/main.c:47 This commit fixes the issue by adding parameter checks. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Signed-off-by: Yin Wang <yin.wang@intel.com> Message-Id: <20230519023758.1759434-1-yin.wang@intel.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> (cherry picked from commit b9cedbf19cb4be04908a3a623f0f237875483499) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11hw/remote: Fix vfu_cfg trace offset formatMattias Nissler
The printed offset value is prefixed with 0x, but was actually printed in decimal. To spare others the confusion, adjust the format specifier to hexadecimal. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (cherry picked from commit 5fb9e8295531f957cf7ac20e89736c8963a25e04) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-089pfs: prevent opening special files (CVE-2023-2861)Christian Schoenebeck
The 9p protocol does not specifically define how server shall behave when client tries to open a special file, however from security POV it does make sense for 9p server to prohibit opening any special file on host side in general. A sane Linux 9p client for instance would never attempt to open a special file on host side, it would always handle those exclusively on its guest side. A malicious client however could potentially escape from the exported 9p tree by creating and opening a device file on host side. With QEMU this could only be exploited in the following unsafe setups: - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough' security model. or - Using 9p 'proxy' fs driver (which is running its helper daemon as root). These setups were already discouraged for safety reasons before, however for obvious reasons we are now tightening behaviour on this. Fixes: CVE-2023-2861 Reported-by: Yanwu Shen <ywsPlz@gmail.com> Reported-by: Jietao Xiao <shawtao1125@gmail.com> Reported-by: Jinku Li <jkli@xidian.edu.cn> Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com> (cherry picked from commit f6b0de53fb87ddefed348a39284c8e2f28dc4eda) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: drop adding qemu_fstat wrapper for 7.2 where wrappers aren't used)
2023-05-31hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs numberClément Chigot
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS, the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result in a positive number as ms->smp.cpus is a unsigned int. This will raise the following error afterwards, as Qemu will try to instantiate some additional RPUs. | $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102 | ** | ERROR:../src/tcg/tcg.c:777:tcg_register_thread: | assertion failed: (n < tcg_max_ctxs) Signed-off-by: Clément Chigot <chigot@adacore.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-id: 20230524143714.565792-1-chigot@adacore.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit c9ba1c9f02cfede5329f504cdda6fd3a256e0434) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop.Tommy Wu
When we receive a packet from the xilinx_axienet and then try to s2mem through the xilinx_axidma, if the descriptor ring buffer is full in the xilinx axidma driver, we’ll assert the DMASR.HALTED in the function : stream_process_s2mem and return 0. In the end, we’ll be stuck in an infinite loop in axienet_eth_rx_notify. This patch checks the DMASR.HALTED state when we try to push data from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted, we will not keep pushing the data and then prevent the infinte loop. Signed-off-by: Tommy Wu <tommy.wu@sifive.com> Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com> Reviewed-by: Frank Chang <frank.chang@sifive.com> Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 31afe04586efeccb80cc36ffafcd0e32a3245ffb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31hw/ppc/prep: Fix wiring of PIC -> CPU interruptBernhard Beschow
Commit cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder") passes s->cpu_intr to i8259_init() in i82378_realize() directly. However, s- >cpu_intr isn't initialized yet since that happens after the south bridge's pci_realize_and_unref() in board code. Fix this by initializing s->cpu_intr before realizing the south bridge. Fixes: cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder") Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20230304114043.121024-4-shentey@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> (cherry picked from commit 2237af5e60ada06d90bf714e85523deafd936b9b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-28machine: do not crash if default RAM backend name has been stolenIgor Mammedov
QEMU aborts when default RAM backend should be used (i.e. no explicit '-machine memory-backend=' specified) but user has created an object which 'id' equals to default RAM backend name used by board. $QEMU -machine pc \ -object memory-backend-ram,id=pc.ram,size=4294967296 Actual results: QEMU 7.2.0 monitor - type 'help' for more information (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239: qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container') Aborted (core dumped) Instead of abort, check for the conflicting 'id' and exit with an error, suggesting how to remedy the issue. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230522131717.3780533-1-imammedo@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit a37531f2381c4e294e48b1417089474128388b44) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-28hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)Thomas Huth
We cannot use the generic reentrancy guard in the LSI code, so we have to manually prevent endless reentrancy here. The problematic lsi_execute_script() function has already a way to detect whether too many instructions have been executed - we just have to slightly change the logic here that it also takes into account if the function has been called too often in a reentrant way. The code in fuzz-lsi53c895a-test.c has been taken from an earlier patch by Mauro Matteo Cascella. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563 Message-Id: <20230522091011.1082574-1-thuth@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit b987718bbb1d0eabf95499b976212dd5f0120d75) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-28usb/ohci: Set pad to 0 after frame updatePaolo Bonzini
When the OHCI controller's framenumber is incremented, HccaPad1 register should be set to zero (Ref OHCI Spec 4.4) ReactOS uses hccaPad1 to determine if the OHCI hardware is running, consequently it fails this check in current qemu master. Signed-off-by: Ryan Wendland <wendland@live.com.au> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 6301460ce9f59885e8feb65185bcfb6b128c8eff) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-28rtl8139: fix large_send_mss divide-by-zeroStefan Hajnoczi
If the driver sets large_send_mss to 0 then a divide-by-zero occurs. Even if the division wasn't a problem, the for loop that emits MSS-sized packets would never terminate. Solve these issues by skipping offloading when large_send_mss=0. This issue was found by OSS-Fuzz as part of Alexander Bulekov's device fuzzing work. The reproducer is: $ cat << EOF | ./qemu-system-i386 -display none -machine accel=qtest, -m \ 512M,slots=1,maxmem=0xffff000000000000 -machine q35 -nodefaults -device \ rtl8139,netdev=net0 -netdev user,id=net0 -device \ pc-dimm,id=nv1,memdev=mem1,addr=0xb800a64602800000 -object \ memory-backend-ram,id=mem1,size=2M -qtest stdio outl 0xcf8 0x80000814 outl 0xcfc 0xe0000000 outl 0xcf8 0x80000804 outw 0xcfc 0x06 write 0xe0000037 0x1 0x04 write 0xe00000e0 0x2 0x01 write 0x1 0x1 0x04 write 0x3 0x1 0x98 write 0xa 0x1 0x8c write 0xb 0x1 0x02 write 0xc 0x1 0x46 write 0xd 0x1 0xa6 write 0xf 0x1 0xb8 write 0xb800a646028c000c 0x1 0x08 write 0xb800a646028c000e 0x1 0x47 write 0xb800a646028c0010 0x1 0x02 write 0xb800a646028c0017 0x1 0x06 write 0xb800a646028c0036 0x1 0x80 write 0xe00000d9 0x1 0x40 EOF Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582 Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582 Cc: qemu-stable@nongnu.org Cc: Peter Maydell <peter.maydell@linaro.org> Fixes: 6d71357a3b65 ("rtl8139: honor large send MSS value") Reported-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 792676c165159c11412346870fd58fd243ab2166) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-23e1000e: Fix tx/rx counterstimothee.cocault@gmail.com
The bytes and packets counter registers are cleared on read. Copying the "total counter" registers to the "good counter" registers has side effects. If the "total" register is never read by the OS, it only gets incremented. This leads to exponential growth of the "good" register. This commit increments the counters individually to avoid this. Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 8d689f6aae8be096b4a1859be07c1b083865f755) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: removed hw/net/igb_core.c part: igb introduced in 8.0)
2023-05-23e1000: Count CRC in Tx statisticsAkihiko Odaki
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes) Count" says: > This register counts the number of packets transmitted that are > exactly 64 bytes (from <Destination Address> through <CRC>, > inclusively) in length. It also says similar for the other Tx statistics registers. Add the number of bytes for CRC to those registers. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit c50b152485d4e10dfa1e1d7ea668f29a5fb92e9c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: pick this for 7.2 too: a fix by its own and makes next patch to apply cleanly)
2023-05-22virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_requestMauro Matteo Cascella
Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype. Fixes: 0e660a6f90a ("crypto: Introduce RSA algorithm") Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com> Reported-by: Yiming Tao <taoym@zju.edu.cn> Message-Id: <20230509075317.1132301-1-mcascell@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: zhenwei pi<pizhenwei@bytedance.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 3e69908907f8d3dd20d5753b0777a6e3824ba824) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context tweak after 999c789f00 cryptodev: Introduce cryptodev alg type in QAPI)
2023-05-22virtio-net: not enable vq reset feature unconditionallyEugenio Pérez
The commit 93a97dc5200a ("virtio-net: enable vq reset feature") enables unconditionally vq reset feature as long as the device is emulated. This makes impossible to actually disable the feature, and it causes migration problems from qemu version previous than 7.2. The entire final commit is unneeded as device system already enable or disable the feature properly. This reverts commit 93a97dc5200a95e63b99cb625f20b7ae802ba413. Fixes: 93a97dc5200a ("virtio-net: enable vq reset feature") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230504101447.389398-1-eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 1fac00f70b3261050af5564b20ca55c1b2a3059a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-19vhost: fix possible wrap in SVQ descriptor ringHawkins Jiawei
QEMU invokes vhost_svq_add() when adding a guest's element into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots() to check whether QEMU can add the element into SVQ. If there is enough space, then QEMU combines some out descriptors and some in descriptors into one descriptor chain, and adds it into `svq->vring.desc` by vhost_svq_vring_write_descs(). Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx` in vhost_svq_available_slots() returns the number of occupied elements, or the number of descriptor chains, instead of the number of occupied descriptors, which may cause wrapping in SVQ descriptor ring. Here is an example. In vhost_handle_guest_kick(), QEMU forwards as many available buffers to device by virtqueue_pop() and vhost_svq_add_element(). virtqueue_pop() returns a guest's element, and then this element is added into SVQ by vhost_svq_add_element(), a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and vhost_svq_add_element() `svq->vring.num` times, vhost_svq_available_slots() thinks QEMU just ran out of slots and everything should work fine. But in fact, virtqueue_pop() returns `svq->vring.num` elements or descriptor chains, more than `svq->vring.num` descriptors due to guest memory fragmentation, and this causes wrapping in SVQ descriptor ring. This bug is valid even before marking the descriptors used. If the guest memory is fragmented, SVQ must add chains so it can try to add more descriptors than possible. This patch solves it by adding `num_free` field in VhostShadowVirtqueue structure and updating this field in vhost_svq_add() and vhost_svq_get_buf(), to record the number of free descriptors. Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230509084817.3973-1-yin31149@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> (cherry picked from commit 5d410557dea452f6231a7c66155e29a37e168528) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18scsi-generic: fix buffer overflow on block limits inquiryPaolo Bonzini
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic device makes qemu crash. This is caused by a buffer overflow when scsi-generic patches the block limits VPD page. Do the operations on a temporary on-stack buffer that is guaranteed to be large enough. Reported-by: Théo Maillart <tmaillart@freebox.fr> Analyzed-by: Théo Maillart <tmaillart@freebox.fr> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 9bd634b2f5e2f10fe35d7609eb83f30583f2e15a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18Revert "vhost-user: Introduce nested event loop in vhost_user_read()"Greg Kurz
This reverts commit a7f523c7d114d445c5d83aecdba3efc038e5a692. The nested event loop is broken by design. It's only user was removed. Drop the code as well so that nobody ever tries to use it again. I had to fix a couple of trivial conflicts around return values because of 025faa872bcf ("vhost-user: stick to -errno error return convention"). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-3-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> (cherry picked from commit 4382138f642f69fdbc79ebf4e93d84be8061191f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18Revert "vhost-user: Monitor slave channel in vhost_user_read()"Greg Kurz
This reverts commit db8a3772e300c1a656331a92da0785d81667dc81. Motivation : this is breaking vhost-user with DPDK as reported in [0]. Received unexpected msg type. Expected 22 received 40 Fail to update device iotlb Received unexpected msg type. Expected 40 received 22 Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 1 ring restore failed: -71: Protocol error (71) Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 0 ring restore failed: -71: Protocol error (71) unable to start vhost net: 71: falling back on userspace virtio The failing sequence that leads to the first error is : - QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master socket - QEMU starts a nested event loop in order to wait for the VHOST_USER_GET_STATUS response and to be able to process messages from the slave channel - DPDK sends a couple of legitimate IOTLB miss messages on the slave channel - QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22) updates on the master socket - QEMU assumes to receive a response for the latest VHOST_USER_IOTLB_MSG but it gets the response for the VHOST_USER_GET_STATUS instead The subsequent errors have the same root cause : the nested event loop breaks the order by design. It lures QEMU to expect responses to the latest message sent on the master socket to arrive first. Since this was only needed for DAX enablement which is still not merged upstream, just drop the code for now. A working solution will have to be merged later on. Likely protect the master socket with a mutex and service the slave channel with a separate thread, as discussed with Maxime in the mail thread below. [0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/ Reported-by: Yanghang Liu <yanghliu@redhat.com> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-2-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> (cherry picked from commit f340a59d5a852d75ae34555723694c7e8eafbd0c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18xen/pt: reserve PCI slot 2 for Intel igd-passthruChuck Zmudzinski
Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, as noted in docs/igd-assign.txt in the Qemu source code. Currently, when the xl toolstack is used to configure a Xen HVM guest with Intel IGD passthrough to the guest with the Qemu upstream device model, a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy a different slot. This problem often prevents the guest from booting. The only available workarounds are not good: Configure Xen HVM guests to use the old and no longer maintained Qemu traditional device model available from xenbits.xen.org which does reserve slot 2 for the Intel IGD or use the "pc" machine type instead of the "xenfv" machine type and add the xen platform device at slot 3 using a command line option instead of patching qemu to fix the "xenfv" machine type directly. The second workaround causes some degredation in startup performance such as a longer boot time and reduced resolution of the grub menu that is displayed on the monitor. This patch avoids that reduced startup performance when using the Qemu upstream device model for Xen HVM guests configured with the igd-passthru=on option. To implement this feature in the Qemu upstream device model for Xen HVM guests, introduce the following new functions, types, and macros: * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS * typedef XenPTQdevRealize function pointer * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 * xen_igd_reserve_slot and xen_igd_clear_slot functions Michael Tsirkin: * Introduce XEN_PCI_IGD_DOMAIN, XEN_PCI_IGD_BUS, XEN_PCI_IGD_DEV, and XEN_PCI_IGD_FN - use them to compute the value of XEN_PCI_IGD_SLOT_MASK The new xen_igd_reserve_slot function uses the existing slot_reserved_mask member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using the xl toolstack with the gfx_passthru option enabled, which sets the igd-passthru=on option to Qemu for the Xen HVM machine type. The new xen_igd_reserve_slot function also needs to be implemented in hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, in which case it does nothing. The new xen_igd_clear_slot function overrides qdev->realize of the parent PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was created in hw/i386/pc_piix.c for the case when igd-passthru=on. Move the call to xen_host_pci_device_get, and the associated error handling, from xen_pt_realize to the new xen_igd_clear_slot function to initialize the device class and vendor values which enables the checks for the Intel IGD to succeed. The verification that the host device is an Intel IGD to be passed through is done by checking the domain, bus, slot, and function values as well as by checking that gfx_passthru is enabled, the device class is VGA, and the device vendor in Intel. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <b1b4a21fe9a600b1322742dda55a40e9961daa57.1674346505.git.brchuckz@aol.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> (cherry picked from commit 4f67543bb8c5b031c2ad3785c1a2f3c255d72b25) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-189pfs/xen: Fix segfault on shutdownJason Andryuk
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed out when free is called. Do the teardown in _disconnect(). This matches the setup done in _connect(). trace-events are also added for the XenDevOps functions. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <20230502143722.15613-1-jandryuk@gmail.com> [C.S.: - Remove redundant return in xen_9pfs_free(). - Add comment to trace-events. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> (cherry picked from commit 92e667f6fd5806a6a705a2a43e572bd9ec6819da) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: minor context conflict in hw/9pfs/xen-9p-backend.c)
2023-05-18virtio: fix reachable assertion due to stale value of cached region sizeCarlos López
In virtqueue_{split,packed}_get_avail_bytes() descriptors are read in a loop via MemoryRegionCache regions and calls to vring_{split,packed}_desc_read() - these take a region cache and the index of the descriptor to be read. For direct descriptors we use a cache provided by the caller, whose size matches that of the virtqueue vring. We limit the number of descriptors we can read by the size of that vring: max = vq->vring.num; ... MemoryRegionCache *desc_cache = &caches->desc; For indirect descriptors, we initialize a new cache and limit the number of descriptors by the size of the intermediate descriptor: len = address_space_cache_init(&indirect_desc_cache, vdev->dma_as, desc.addr, desc.len, false); desc_cache = &indirect_desc_cache; ... max = desc.len / sizeof(VRingDesc); However, the first initialization of `max` is done outside the loop where we process guest descriptors, while the second one is done inside. This means that a sequence of an indirect descriptor followed by a direct one will leave a stale value in `max`. If the second descriptor's `next` field is smaller than the stale value, but greater than the size of the virtqueue ring (and thus the cached region), a failed assertion will be triggered in address_space_read_cached() down the call chain. Fix this by initializing `max` inside the loop in both functions. Fixes: 9796d0ac8fb0 ("virtio: use address_space_map/unmap to access descriptors") Signed-off-by: Carlos López <clopez@suse.de> Message-Id: <20230302100358.3613-1-clopez@suse.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit bbc1c327d7974261c61566cdb950cc5fa0196b41) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/virtio/vhost-user: avoid using unitialized errpAlbert Esteve
During protocol negotiation, when we the QEMU stub does not support a backend with F_CONFIG, it throws a warning and supresses the VHOST_USER_PROTOCOL_F_CONFIG bit. However, the warning uses warn_reportf_err macro and passes an unitialized errp pointer. However, the macro tries to edit the 'msg' member of the unitialized Error and segfaults. Instead, just use warn_report, which prints a warning message directly to the output. Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Albert Esteve <aesteve@redhat.com> Message-Id: <20230302121719.9390-1-aesteve@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 90e31232cf8fa7f257263dd431ea954a1ae54bff) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/net/allwinner-sun8i-emac: Correctly byteswap descriptor fieldsPeter Maydell
In allwinner-sun8i-emac we just read directly from guest memory into a host FrameDescriptor struct and back. This only works on little-endian hosts. Reading and writing of descriptors is already abstracted into functions; make those functions also handle the byte-swapping so that TransferDescriptor structs as seen by the rest of the code are always in host-order, and fix two places that were doing ad-hoc descriptor reading without using the functions. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424165053.1428857-3-peter.maydell@linaro.org (cherry picked from commit a4ae17e5ec512862bf73e40dfbb1e7db71f2c1e7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/sd/allwinner-sdhost: Correctly byteswap descriptor fieldsPeter Maydell
In allwinner_sdhost_process_desc() we just read directly from guest memory into a host TransferDescriptor struct and back. This only works on little-endian hosts. Abstract the reading and writing of descriptors into functions that handle the byte-swapping so that TransferDescriptor structs as seen by the rest of the code are always in host-order. This fixes a failure of one of the avocado tests on s390. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424165053.1428857-2-peter.maydell@linaro.org (cherry picked from commit 3e20d90824c262de6887aa1bc52af94db69e4310) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()Peter Maydell
The Allwinner PIC model uses set_bit() and clear_bit() to update the values in its irq_pending[] array when an interrupt arrives. However it is using these functions wrongly: they work on an array of type 'long', and it is passing an array of type 'uint32_t'. Because the code manually figures out the right array element, this works on little-endian hosts and on 32-bit big-endian hosts, where bits 0..31 in a 'long' are in the same place as they are in a 'uint32_t'. However it breaks on 64-bit big-endian hosts. Remove the use of set_bit() and clear_bit() in favour of using deposit32() on the array element. This fixes a bug where on big-endian 64-bit hosts the guest kernel would hang early on in bootup. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424152833.1334136-1-peter.maydell@linaro.org (cherry picked from commit 2c5fa0778c3b4307f9f3af7f27886c46d129c62f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/raspi: Use arm_write_bootloader() to write boot codePeter Maydell
When writing the secondary-CPU stub boot loader code to the guest, use arm_write_bootloader() instead of directly calling rom_add_blob_fixed(). This fixes a bug on big-endian hosts, because arm_write_bootloader() will correctly byte-swap the host-byte-order array values into the guest-byte-order to write into the guest memory. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424152717.1333930-4-peter.maydell@linaro.org (cherry picked from commit 0acbdb4c4ab6b0a09f159bae4899b0737cf64242) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/aspeed: Use arm_write_bootloader() to write the bootloaderCédric Le Goater
When writing the secondary-CPU stub boot loader code to the guest, use arm_write_bootloader() instead of directly calling rom_add_blob_fixed(). This fixes a bug on big-endian hosts, because arm_write_bootloader() will correctly byte-swap the host-byte-order array values into the guest-byte-order to write into the guest memory. Cc: qemu-stable@nongnu.org Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20230424152717.1333930-3-peter.maydell@linaro.org [PMM: Moved the "make arm_write_bootloader() function public" part to its own patch; updated commit message to note that this fixes an actual bug; adjust to the API changes noted in previous commit] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 902bba549fc386b4b9805320ed1a2e5b68478bdd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/boot: Make write_bootloader() public as arm_write_bootloader()Cédric Le Goater
The arm boot.c code includes a utility function write_bootloader() which assists in writing a boot-code fragment into guest memory, including handling endianness and fixing it up with entry point addresses and similar things. This is useful not just for the boot.c code but also in board model code, so rename it to arm_write_bootloader() and make it globally visible. Since we are making it public, make its API a little neater: move the AddressSpace* argument to be next to the hwaddr argument, and allow the fixupcontext array to be const, since we never modify it in this function. Cc: qemu-stable@nongnu.org Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20230424152717.1333930-2-peter.maydell@linaro.org [PMM: Split out from another patch by Cédric, added doc comment] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 0fe43f0abf19bbe24df3dbf0613bb47ed55f1482) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/net/msf2-emac: Don't modify descriptor in-place in emac_store_desc()Peter Maydell
The msf2-emac ethernet controller has functions emac_load_desc() and emac_store_desc() which read and write the in-memory descriptor blocks and handle conversion between guest and host endianness. As currently written, emac_store_desc() does the endianness conversion in-place; this means that it effectively consumes the input EmacDesc struct, because on a big-endian host the fields will be overwritten with the little-endian versions of their values. Unfortunately, in all the callsites the code continues to access fields in the EmacDesc struct after it has called emac_store_desc() -- specifically, it looks at the d.next field. The effect of this is that on a big-endian host networking doesn't work because the address of the next descriptor is corrupted. We could fix this by making the callsite avoid using the struct; but it's more robust to have emac_store_desc() leave its input alone. (emac_load_desc() also does an in-place conversion, but here this is fine, because the function is supposed to be initializing the struct.) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-id: 20230424151919.1333299-1-peter.maydell@linaro.org (cherry picked from commit d565f58b38424e9a390a7ea33ff7477bab693fda) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18acpi: pcihp: allow repeating hot-unplug requestsIgor Mammedov
with Q35 using ACPI PCI hotplug by default, user's request to unplug device is ignored when it's issued before guest OS has been booted. And any additional attempt to request device hot-unplug afterwards results in following error: "Device XYZ is already in the process of unplug" arguably it can be considered as a regression introduced by [2], before which it was possible to issue unplug request multiple times. Accept new uplug requests after timeout (1ms). This brings ACPI PCI hotplug on par with native PCIe unplug behavior [1] and allows user to repeat unplug requests at propper times. Set expire timeout to arbitrary 1msec so user won't be able to flood guest with SCI interrupts by calling device_del in tight loop. PS: ACPI spec doesn't mandate what OSPM can do with GPEx.status bits set before it's booted => it's impl. depended. Status bits may be retained (I tested with one Windows version) or cleared (Linux since 2.6 kernel times) during guest's ACPI subsystem initialization. Clearing status bits (though not wrong per se) hides the unplug event from guest, and it's upto user to repeat device_del later when guest is able to handle unplug requests. 1) 18416c62e3 ("pcie: expire pending delete") 2) Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del") Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> CC: mst@redhat.com CC: anisinha@redhat.com CC: jusual@redhat.com CC: kraxel@redhat.com Message-Id: <20230418090449.2155757-1-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> (cherry picked from commit 0f689cf5ada4d5df5ab95c7f7aa9fc221afa855d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-13hw/nvme: fix memory leak in nvme_dsmKlaus Jensen
The iocb (and the allocated memory to hold LBA ranges) leaks if reading the LBA ranges fails. Fix this by adding a free and an unref of the iocb. Reported-by: Coverity (CID 1508281) Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit 4b32319cdacd99be983e1a74128289ef52c5964e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-10hw/arm: do not free machine->fdt in arm_load_dtb()Markus Armbruster
At this moment, arm_load_dtb() can free machine->fdt when binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to machine->fdt. And, in that case, the existing g_free(fdt) at the end of arm_load_dtb() will make machine->fdt point to an invalid memory region. Since monitor command 'dumpdtb' was introduced a couple of releases ago, running it with any ARM machine that uses arm_load_dtb() will crash QEMU. Let's enable all arm_load_dtb() callers to use dumpdtb properly. Instead of freeing 'fdt', assign it back to ms->fdt. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-arm@nongnu.org Fixes: bf353ad55590f ("qmp/hmp, device_tree.c: introduce dumpdtb") Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-id: 20230328165935.1512846-1-armbru@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 12148d442ec3f4386c8624ffcf44c61a8b344018) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-03-30hw/pvrdma: Protect against buggy or malicious guest driverYuval Shaia
Guest driver might execute HW commands when shared buffers are not yet allocated. This could happen on purpose (malicious guest) or because of some other guest/host address mapping error. We need to protect againts such case. Fixes: CVE-2022-1050 Reported-by: Raven <wxhusst@gmail.com> Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com> Message-Id: <20220403095234.2210-1-yuval.shaia.ml@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu> (cherry picked from commit 31c4b6fb0293e359f9ef8a61892667e76eea4c99) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-03-30hw/net/vmxnet3: allow VMXNET3_MAX_MTU itself as a valueFiona Ebner
Currently, VMXNET3_MAX_MTU itself (being 9000) is not considered a valid value for the MTU, but a guest running ESXi 7.0 might try to set it and fail the assert [0]. In the Linux kernel, dev->max_mtu itself is a valid value for the MTU and for the vmxnet3 driver it's 9000, so a guest running Linux will also fail the assert when trying to set an MTU of 9000. VMXNET3_MAX_MTU and s->mtu don't seem to be used in relation to buffer allocations/accesses, so allowing the upper limit itself as a value should be fine. [0]: https://forum.proxmox.com/threads/114011/ Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate (CVE-2021-20203)") Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 099a63828130843741d317cb28e936f468b2b53b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-03-29intel-iommu: fail DEVIOTLB_UNMAP without dt modeJason Wang
Without dt mode, device IOTLB notifier won't work since guest won't send device IOTLB invalidation descriptor in this case. Let's fail early instead of misbehaving silently. Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Viktor Prutyanov <viktor@daynix.com> Buglink: https://bugzilla.redhat.com/2156876 Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230223065924.42503-3-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 09adb0e021207b60a0c51a68939b4539d98d3ef3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-03-29intel-iommu: fail MAP notifier without caching modeJason Wang
Without caching mode, MAP notifier won't work correctly since guest won't send IOTLB update event when it establishes new mappings in the I/O page tables. Let's fail the IOMMU notifiers early instead of misbehaving silently. Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Viktor Prutyanov <viktor@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230223065924.42503-2-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit b8d78277c091f26fdd64f239bc8bb7e55d74cecf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-03-29vhost: avoid a potential use of an uninitialized variable in vhost_svq_poll()Carlos López
In vhost_svq_poll(), if vhost_svq_get_buf() fails due to a device providing invalid descriptors, len is left uninitialized and returned to the caller, potentally leaking stack data or causing undefined behavior. Fix this by initializing len to 0. Found with GCC 13 and -fanalyzer (abridged): ../hw/virtio/vhost-shadow-virtqueue.c: In function ‘vhost_svq_poll’: ../hw/virtio/vhost-shadow-virtqueue.c:538:12: warning: use of uninitialized value ‘len’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value] 538 | return len; | ^~~ ‘vhost_svq_poll’: events 1-4 | | 522 | size_t vhost_svq_poll(VhostShadowVirtqueue *svq) | | ^~~~~~~~~~~~~~ | | | | | (1) entry to ‘vhost_svq_poll’ |...... | 525 | uint32_t len; | | ~~~ | | | | | (2) region created on stack here | | (3) capacity: 4 bytes |...... | 528 | if (vhost_svq_more_used(svq)) { | | ~ | | | | | (4) inlined call to ‘vhost_svq_more_used’ from ‘vhost_svq_poll’ (...) | 528 | if (vhost_svq_more_used(svq)) { | | ^~~~~~~~~~~~~~~~~~~~~~~~~ | | || | | |(8) ...to here | | (7) following ‘true’ branch... |...... | 537 | vhost_svq_get_buf(svq, &len); | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (9) calling ‘vhost_svq_get_buf’ from ‘vhost_svq_poll’ | +--> ‘vhost_svq_get_buf’: events 10-11 | | 416 | static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq, | | ^~~~~~~~~~~~~~~~~ | | | | | (10) entry to ‘vhost_svq_get_buf’ |...... | 423 | if (!vhost_svq_more_used(svq)) { | | ~ | | | | | (11) inlined call to ‘vhost_svq_more_used’ from ‘vhost_svq_get_buf’ | (...) | ‘vhost_svq_get_buf’: event 14 | | 423 | if (!vhost_svq_more_used(svq)) { | | ^ | | | | | (14) following ‘false’ branch... | ‘vhost_svq_get_buf’: event 15 | |cc1: | (15): ...to here | <------+ | ‘vhost_svq_poll’: events 16-17 | | 537 | vhost_svq_get_buf(svq, &len); | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (16) returning to ‘vhost_svq_poll’ from ‘vhost_svq_get_buf’ | 538 | return len; | | ~~~ | | | | | (17) use of uninitialized value ‘len’ here Note by Laurent Vivier <lvivier@redhat.com>: The return value is only used to detect an error: vhost_svq_poll vhost_vdpa_net_cvq_add vhost_vdpa_net_load_cmd vhost_vdpa_net_load_mac -> a negative return is only used to detect error vhost_vdpa_net_load_mq -> a negative return is only used to detect error vhost_vdpa_net_handle_ctrl_avail -> a negative return is only used to detect error Fixes: d368c0b052ad ("vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush") Signed-off-by: Carlos López <clopez@suse.de> Message-Id: <20230213085747.19956-1-clopez@suse.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit e4dd39c699b7d63a06f686ec06ded8adbee989c1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>