aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-28util/vfio-helpers: Use g_file_read_link()Akihiko Odaki
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is 12.1.0, the compiler complains as follows: In file included from /usr/include/features.h:490, from /usr/include/bits/libc-header-start.h:33, from /usr/include/stdint.h:26, from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9, from /home/alarm/q/var/qemu/include/qemu/osdep.h:94, from ../util/vfio-helpers.c:13: In function 'readlink', inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9, inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18, inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9: /usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull] 119 | return __glibc_fortify (readlink, __len, sizeof (char), | ^~~~~~~~~~~~~~~ This error implies the allocated buffer can be NULL. Use g_file_read_link(), which allocates buffer automatically to avoid the error. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> (cherry picked from commit dbdea0dbfe2cef9ef6c752e9077e4fc98724194c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-28rtl8139: fix large_send_mss divide-by-zeroStefan Hajnoczi
If the driver sets large_send_mss to 0 then a divide-by-zero occurs. Even if the division wasn't a problem, the for loop that emits MSS-sized packets would never terminate. Solve these issues by skipping offloading when large_send_mss=0. This issue was found by OSS-Fuzz as part of Alexander Bulekov's device fuzzing work. The reproducer is: $ cat << EOF | ./qemu-system-i386 -display none -machine accel=qtest, -m \ 512M,slots=1,maxmem=0xffff000000000000 -machine q35 -nodefaults -device \ rtl8139,netdev=net0 -netdev user,id=net0 -device \ pc-dimm,id=nv1,memdev=mem1,addr=0xb800a64602800000 -object \ memory-backend-ram,id=mem1,size=2M -qtest stdio outl 0xcf8 0x80000814 outl 0xcfc 0xe0000000 outl 0xcf8 0x80000804 outw 0xcfc 0x06 write 0xe0000037 0x1 0x04 write 0xe00000e0 0x2 0x01 write 0x1 0x1 0x04 write 0x3 0x1 0x98 write 0xa 0x1 0x8c write 0xb 0x1 0x02 write 0xc 0x1 0x46 write 0xd 0x1 0xa6 write 0xf 0x1 0xb8 write 0xb800a646028c000c 0x1 0x08 write 0xb800a646028c000e 0x1 0x47 write 0xb800a646028c0010 0x1 0x02 write 0xb800a646028c0017 0x1 0x06 write 0xb800a646028c0036 0x1 0x80 write 0xe00000d9 0x1 0x40 EOF Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582 Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582 Cc: qemu-stable@nongnu.org Cc: Peter Maydell <peter.maydell@linaro.org> Fixes: 6d71357a3b65 ("rtl8139: honor large send MSS value") Reported-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 792676c165159c11412346870fd58fd243ab2166) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-23e1000e: Fix tx/rx counterstimothee.cocault@gmail.com
The bytes and packets counter registers are cleared on read. Copying the "total counter" registers to the "good counter" registers has side effects. If the "total" register is never read by the OS, it only gets incremented. This leads to exponential growth of the "good" register. This commit increments the counters individually to avoid this. Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 8d689f6aae8be096b4a1859be07c1b083865f755) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: removed hw/net/igb_core.c part: igb introduced in 8.0)
2023-05-23e1000: Count CRC in Tx statisticsAkihiko Odaki
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes) Count" says: > This register counts the number of packets transmitted that are > exactly 64 bytes (from <Destination Address> through <CRC>, > inclusively) in length. It also says similar for the other Tx statistics registers. Add the number of bytes for CRC to those registers. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit c50b152485d4e10dfa1e1d7ea668f29a5fb92e9c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: pick this for 7.2 too: a fix by its own and makes next patch to apply cleanly)
2023-05-22virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_requestMauro Matteo Cascella
Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype. Fixes: 0e660a6f90a ("crypto: Introduce RSA algorithm") Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com> Reported-by: Yiming Tao <taoym@zju.edu.cn> Message-Id: <20230509075317.1132301-1-mcascell@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: zhenwei pi<pizhenwei@bytedance.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 3e69908907f8d3dd20d5753b0777a6e3824ba824) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context tweak after 999c789f00 cryptodev: Introduce cryptodev alg type in QAPI)
2023-05-22virtio-net: not enable vq reset feature unconditionallyEugenio Pérez
The commit 93a97dc5200a ("virtio-net: enable vq reset feature") enables unconditionally vq reset feature as long as the device is emulated. This makes impossible to actually disable the feature, and it causes migration problems from qemu version previous than 7.2. The entire final commit is unneeded as device system already enable or disable the feature properly. This reverts commit 93a97dc5200a95e63b99cb625f20b7ae802ba413. Fixes: 93a97dc5200a ("virtio-net: enable vq reset feature") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230504101447.389398-1-eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 1fac00f70b3261050af5564b20ca55c1b2a3059a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-19vhost: fix possible wrap in SVQ descriptor ringHawkins Jiawei
QEMU invokes vhost_svq_add() when adding a guest's element into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots() to check whether QEMU can add the element into SVQ. If there is enough space, then QEMU combines some out descriptors and some in descriptors into one descriptor chain, and adds it into `svq->vring.desc` by vhost_svq_vring_write_descs(). Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx` in vhost_svq_available_slots() returns the number of occupied elements, or the number of descriptor chains, instead of the number of occupied descriptors, which may cause wrapping in SVQ descriptor ring. Here is an example. In vhost_handle_guest_kick(), QEMU forwards as many available buffers to device by virtqueue_pop() and vhost_svq_add_element(). virtqueue_pop() returns a guest's element, and then this element is added into SVQ by vhost_svq_add_element(), a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and vhost_svq_add_element() `svq->vring.num` times, vhost_svq_available_slots() thinks QEMU just ran out of slots and everything should work fine. But in fact, virtqueue_pop() returns `svq->vring.num` elements or descriptor chains, more than `svq->vring.num` descriptors due to guest memory fragmentation, and this causes wrapping in SVQ descriptor ring. This bug is valid even before marking the descriptors used. If the guest memory is fragmented, SVQ must add chains so it can try to add more descriptors than possible. This patch solves it by adding `num_free` field in VhostShadowVirtqueue structure and updating this field in vhost_svq_add() and vhost_svq_get_buf(), to record the number of free descriptors. Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230509084817.3973-1-yin31149@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> (cherry picked from commit 5d410557dea452f6231a7c66155e29a37e168528) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/i386: fix avx2 instructions vzeroall and vpermdqXinyu Li
vzeroall: xmm_regs should be used instead of xmm_t0 vpermdq: bit 3 and 7 of imm should be considered Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn> Message-Id: <20230510145222.586487-1-lixinyu20s@ict.ac.cn> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 056d649007bc9fdae9f1d576e77c1316e9a34468) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/i386: fix operand size for VCOMI/VUCOMI instructionsPaolo Bonzini
Compared to other SSE instructions, VUCOMISx and VCOMISx are different: the single and double precision versions are distinguished through a prefix, however they use no-prefix and 0x66 for SS and SD respectively. Scalar values usually are associated with 0xF2 and 0xF3. Because of these, they incorrectly perform a 128-bit memory load instead of a 32- or 64-bit load. Fix this by writing a custom decoding function. I tested that the reproducer is fixed and the test-avx output does not change. Reported-by: Gabriele Svelto <gsvelto@mozilla.com> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637 Fixes: f8d19eec0d53 ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18) Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 2b55e479e6fcbb466585fd25077a50c32e10dc3a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18scsi-generic: fix buffer overflow on block limits inquiryPaolo Bonzini
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic device makes qemu crash. This is caused by a buffer overflow when scsi-generic patches the block limits VPD page. Do the operations on a temporary on-stack buffer that is guaranteed to be large enough. Reported-by: Théo Maillart <tmaillart@freebox.fr> Analyzed-by: Théo Maillart <tmaillart@freebox.fr> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 9bd634b2f5e2f10fe35d7609eb83f30583f2e15a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/arm: Fix vd == vm overlap in sve_ldff1_zRichard Henderson
If vd == vm, copy vm to scratch, so that we can pre-zero the output and still access the gather indicies. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit a6771f2f5cbfbf312e2fb5b1627f38a6bf6321d0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18migration: Attempt disk reactivation in more failure scenariosEric Blake
Commit fe904ea824 added a fail_inactivate label, which tries to reactivate disks on the source after a failure while s->state == MIGRATION_STATUS_ACTIVE, but didn't actually use the label if qemu_savevm_state_complete_precopy() failed. This failure to reactivate is also present in commit 6039dd5b1c (also covering the new s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring s->block_inactive is set more reliably). Consolidate the two labels back into one - no matter HOW migration is failed, if there is any chance we can reach vm_start() after having attempted inactivation, it is essential that we have tried to restart disks before then. This also makes the cleanup more like migrate_fd_cancel(). Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20230502205212.134680-1-eblake@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 6dab4c93ecfae48e2e67b984d1032c1e988d3005) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: minor context tweak near added comment in migration/migration.c)
2023-05-18migration: Minor control flow simplificationEric Blake
No need to declare a temporary variable. Suggested-by: Juan Quintela <quintela@redhat.com> Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better") Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> (cherry picked from commit 5d39f44d7ac5c63f53d4d0900ceba9521bc27e49) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18migration: Handle block device inactivation failures betterEric Blake
Consider what happens when performing a migration between two host machines connected to an NFS server serving multiple block devices to the guest, when the NFS server becomes unavailable. The migration attempts to inactivate all block devices on the source (a necessary step before the destination can take over); but if the NFS server is non-responsive, the attempt to inactivate can itself fail. When that happens, the destination fails to get the migrated guest (good, because the source wasn't able to flush everything properly): (qemu) qemu-kvm: load of migration failed: Input/output error at which point, our only hope for the guest is for the source to take back control. With the current code base, the host outputs a message, but then appears to resume: (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1) (src qemu)info status VM status: running but a second migration attempt now asserts: (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. Whether the guest is recoverable on the source after the first failure is debatable, but what we do not want is to have qemu itself fail due to an assertion. It looks like the problem is as follows: In migration.c:migration_completion(), the source sets 'inactivate' to true (since COLO is not enabled), then tries savevm.c:qemu_savevm_state_complete_precopy() with a request to inactivate block devices. In turn, this calls block.c:bdrv_inactivate_all(), which fails when flushing runs up against the non-responsive NFS server. With savevm failing, we are now left in a state where some, but not all, of the block devices have been inactivated; but migration_completion() then jumps to 'fail' rather than 'fail_invalidate' and skips an attempt to reclaim those those disks by calling bdrv_activate_all(). Even if we do attempt to reclaim disks, we aren't taking note of failure there, either. Thus, we have reached a state where the migration engine has forgotten all state about whether a block device is inactive, because we did not set s->block_inactive in enough places; so migration allows the source to reach vm_start() and resume execution, violating the block layer invariant that the guest CPUs should not be restarted while a device is inactive. Note that the code in migration.c:migrate_fd_cancel() will also try to reactivate all block devices if s->block_inactive was set, but because we failed to set that flag after the first failure, the source assumes it has reclaimed all devices, even though it still has remaining inactivated devices and does not try again. Normally, qmp_cont() will also try to reactivate all disks (or correctly fail if the disks are not reclaimable because NFS is not yet back up), but the auto-resumption of the source after a migration failure does not go through qmp_cont(). And because we have left the block layer in an inconsistent state with devices still inactivated, the later migration attempt is hitting the assertion failure. Since it is important to not resume the source with inactive disks, this patch marks s->block_inactive before attempting inactivation, rather than after succeeding, in order to prevent any vm_start() until it has successfully reactivated all devices. See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com> (cherry picked from commit 403d18ae384239876764bbfa111d6cc5dcb673d1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18linux-user: fix getgroups/setgroups allocationsMichael Tokarev
linux-user getgroups(), setgroups(), getgroups32() and setgroups32() used alloca() to allocate grouplist arrays, with unchecked gidsetsize coming from the "guest". With NGROUPS_MAX being 65536 (linux, and it is common for an application to allocate NGROUPS_MAX for getgroups()), this means a typical allocation is half the megabyte on the stack. Which just overflows stack, which leads to immediate SIGSEGV in actual system getgroups() implementation. An example of such issue is aptitude, eg https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72 Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that), and use heap allocation for grouplist instead of alloca(). While at it, fix coding style and make all 4 implementations identical. Try to not impose random limits - for example, allow gidsetsize to be negative for getgroups() - just do not allocate negative-sized grouplist in this case but still do actual getgroups() call. But do not allow negative gidsetsize for setgroups() since its argument is unsigned. Capping by NGROUPS_MAX seems a bit arbitrary, - we can do more, it is not an error if set size will be NGROUPS_MAX+1. But we should not allow integer overflow for the array being allocated. Maybe it is enough to just call g_try_new() and return ENOMEM if it fails. Maybe there's also no need to convert setgroups() since this one is usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63, - this is apparently a kernel-imposed limit for runtime group set). The patch fixes aptitude segfault mentioned above. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru> Signed-off-by: Laurent Vivier <laurent@vivier.eu> (cherry picked from commit 1e35d327890bdd117a67f79c52e637fb12bb1bf4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18linux-user: Fix mips fp64 executables loadingDaniil Kovalev
If a program requires fr1, we should set the FR bit of CP0 control status register and add F64 hardware flag. The corresponding `else if` branch statement is copied from the linux kernel sources (see `arch_check_elf` function in linux/arch/mips/kernel/elf.c). Signed-off-by: Daniil Kovalev <dkovalev@compiler-toolchain-for.me> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <20230404052153.16617-1-dkovalev@compiler-toolchain-for.me> Signed-off-by: Laurent Vivier <laurent@vivier.eu> (cherry picked from commit a0f8d2701b205d9d7986aa555e0566b13dc18fa0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18tests/docker: bump the xtensa base to debian:11-slimAlex Bennée
Stretch is going out of support so things like security updates will fail. As the toolchain itself is binary it hopefully won't mind the underlying OS being updated. Message-Id: <20230503091244.1450613-3-alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit 3217b84f3cd813a7daffc64b26543c313f3a042a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/ppc: Fix helper_pminsn() prototypeCédric Le Goater
GCC13 reports an error: ../target/ppc/excp_helper.c:2625:6: error: conflicting types for ‘helper_pminsn’ due to enum/integer mismatch; have ‘void(CPUPPCState *, powerpc_pm_insn_t)’ {aka ‘void(struct CPUArchState *, powerpc_pm_insn_t)’} [-Werror=enum-int-mismatch] 2625 | void helper_pminsn(CPUPPCState *env, powerpc_pm_insn_t insn) | ^~~~~~~~~~~~~ In file included from /home/legoater/work/qemu/qemu.git/include/qemu/osdep.h:49, from ../target/ppc/excp_helper.c:19: /home/legoater/work/qemu/qemu.git/include/exec/helper-head.h:23:27: note: previous declaration of ‘helper_pminsn’ with type ‘void(CPUArchState *, uint32_t)’ {aka ‘void(CPUArchState *, unsigned int)’} 23 | #define HELPER(name) glue(helper_, name) | ^~~~~~~ Fixes: 7778a575c7 ("ppc: Add P7/P8 Power Management instructions") Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20230321161609.716474-4-clg@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit 07e4804fcde1559aaa335fd680487ba308d86fb3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18Revert "vhost-user: Introduce nested event loop in vhost_user_read()"Greg Kurz
This reverts commit a7f523c7d114d445c5d83aecdba3efc038e5a692. The nested event loop is broken by design. It's only user was removed. Drop the code as well so that nobody ever tries to use it again. I had to fix a couple of trivial conflicts around return values because of 025faa872bcf ("vhost-user: stick to -errno error return convention"). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-3-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> (cherry picked from commit 4382138f642f69fdbc79ebf4e93d84be8061191f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18Revert "vhost-user: Monitor slave channel in vhost_user_read()"Greg Kurz
This reverts commit db8a3772e300c1a656331a92da0785d81667dc81. Motivation : this is breaking vhost-user with DPDK as reported in [0]. Received unexpected msg type. Expected 22 received 40 Fail to update device iotlb Received unexpected msg type. Expected 40 received 22 Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 1 ring restore failed: -71: Protocol error (71) Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 0 ring restore failed: -71: Protocol error (71) unable to start vhost net: 71: falling back on userspace virtio The failing sequence that leads to the first error is : - QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master socket - QEMU starts a nested event loop in order to wait for the VHOST_USER_GET_STATUS response and to be able to process messages from the slave channel - DPDK sends a couple of legitimate IOTLB miss messages on the slave channel - QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22) updates on the master socket - QEMU assumes to receive a response for the latest VHOST_USER_IOTLB_MSG but it gets the response for the VHOST_USER_GET_STATUS instead The subsequent errors have the same root cause : the nested event loop breaks the order by design. It lures QEMU to expect responses to the latest message sent on the master socket to arrive first. Since this was only needed for DAX enablement which is still not merged upstream, just drop the code for now. A working solution will have to be merged later on. Likely protect the master socket with a mutex and service the slave channel with a separate thread, as discussed with Maxime in the mail thread below. [0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/ Reported-by: Yanghang Liu <yanghliu@redhat.com> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-2-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> (cherry picked from commit f340a59d5a852d75ae34555723694c7e8eafbd0c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18xen/pt: reserve PCI slot 2 for Intel igd-passthruChuck Zmudzinski
Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, as noted in docs/igd-assign.txt in the Qemu source code. Currently, when the xl toolstack is used to configure a Xen HVM guest with Intel IGD passthrough to the guest with the Qemu upstream device model, a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy a different slot. This problem often prevents the guest from booting. The only available workarounds are not good: Configure Xen HVM guests to use the old and no longer maintained Qemu traditional device model available from xenbits.xen.org which does reserve slot 2 for the Intel IGD or use the "pc" machine type instead of the "xenfv" machine type and add the xen platform device at slot 3 using a command line option instead of patching qemu to fix the "xenfv" machine type directly. The second workaround causes some degredation in startup performance such as a longer boot time and reduced resolution of the grub menu that is displayed on the monitor. This patch avoids that reduced startup performance when using the Qemu upstream device model for Xen HVM guests configured with the igd-passthru=on option. To implement this feature in the Qemu upstream device model for Xen HVM guests, introduce the following new functions, types, and macros: * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS * typedef XenPTQdevRealize function pointer * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 * xen_igd_reserve_slot and xen_igd_clear_slot functions Michael Tsirkin: * Introduce XEN_PCI_IGD_DOMAIN, XEN_PCI_IGD_BUS, XEN_PCI_IGD_DEV, and XEN_PCI_IGD_FN - use them to compute the value of XEN_PCI_IGD_SLOT_MASK The new xen_igd_reserve_slot function uses the existing slot_reserved_mask member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using the xl toolstack with the gfx_passthru option enabled, which sets the igd-passthru=on option to Qemu for the Xen HVM machine type. The new xen_igd_reserve_slot function also needs to be implemented in hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, in which case it does nothing. The new xen_igd_clear_slot function overrides qdev->realize of the parent PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was created in hw/i386/pc_piix.c for the case when igd-passthru=on. Move the call to xen_host_pci_device_get, and the associated error handling, from xen_pt_realize to the new xen_igd_clear_slot function to initialize the device class and vendor values which enables the checks for the Intel IGD to succeed. The verification that the host device is an Intel IGD to be passed through is done by checking the domain, bus, slot, and function values as well as by checking that gfx_passthru is enabled, the device class is VGA, and the device vendor in Intel. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <b1b4a21fe9a600b1322742dda55a40e9961daa57.1674346505.git.brchuckz@aol.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> (cherry picked from commit 4f67543bb8c5b031c2ad3785c1a2f3c255d72b25) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-189pfs/xen: Fix segfault on shutdownJason Andryuk
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed out when free is called. Do the teardown in _disconnect(). This matches the setup done in _connect(). trace-events are also added for the XenDevOps functions. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <20230502143722.15613-1-jandryuk@gmail.com> [C.S.: - Remove redundant return in xen_9pfs_free(). - Add comment to trace-events. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> (cherry picked from commit 92e667f6fd5806a6a705a2a43e572bd9ec6819da) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: minor context conflict in hw/9pfs/xen-9p-backend.c)
2023-05-18s390x/tcg: Fix LDER instruction formatIlya Leoshkevich
It's RRE, not RXE. Found by running valgrind's none/tests/s390x/bfp-2. Fixes: 86b59624c4aa ("s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP") Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-Id: <20230511134726.469651-1-iii@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit 970641de01908dd09b569965e78f13842e5854bc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context tweak)
2023-05-18target/s390x: Fix EXECUTE of relative branchesIlya Leoshkevich
Fix a problem similar to the one fixed by commit 703d03a4aaf3 ("target/s390x: Fix EXECUTE of relative long instructions"), but now for relative branches. Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230426235813.198183-2-iii@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit e8ecdfeb30f087574191cde523e846e023911c8d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18block/monitor: Fix crash when executing HMP commitWang Liang
hmp_commit() calls blk_is_available() from a non-coroutine context (and in the main loop). blk_is_available() is a co_wrapper_mixed_bdrv_rdlock function, and in the non-coroutine context it calls AIO_WAIT_WHILE(), which crashes if the aio_context lock is not taken before. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1615 Signed-off-by: Wang Liang <wangliangzz@inspur.com> Message-Id: <20230424103902.45265-1-wangliangzz@126.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 8c1e8fb2e7fc2cbeb57703e143965a4cd3ad301a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18virtio: fix reachable assertion due to stale value of cached region sizeCarlos López
In virtqueue_{split,packed}_get_avail_bytes() descriptors are read in a loop via MemoryRegionCache regions and calls to vring_{split,packed}_desc_read() - these take a region cache and the index of the descriptor to be read. For direct descriptors we use a cache provided by the caller, whose size matches that of the virtqueue vring. We limit the number of descriptors we can read by the size of that vring: max = vq->vring.num; ... MemoryRegionCache *desc_cache = &caches->desc; For indirect descriptors, we initialize a new cache and limit the number of descriptors by the size of the intermediate descriptor: len = address_space_cache_init(&indirect_desc_cache, vdev->dma_as, desc.addr, desc.len, false); desc_cache = &indirect_desc_cache; ... max = desc.len / sizeof(VRingDesc); However, the first initialization of `max` is done outside the loop where we process guest descriptors, while the second one is done inside. This means that a sequence of an indirect descriptor followed by a direct one will leave a stale value in `max`. If the second descriptor's `next` field is smaller than the stale value, but greater than the size of the virtqueue ring (and thus the cached region), a failed assertion will be triggered in address_space_read_cached() down the call chain. Fix this by initializing `max` inside the loop in both functions. Fixes: 9796d0ac8fb0 ("virtio: use address_space_map/unmap to access descriptors") Signed-off-by: Carlos López <clopez@suse.de> Message-Id: <20230302100358.3613-1-clopez@suse.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit bbc1c327d7974261c61566cdb950cc5fa0196b41) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/virtio/vhost-user: avoid using unitialized errpAlbert Esteve
During protocol negotiation, when we the QEMU stub does not support a backend with F_CONFIG, it throws a warning and supresses the VHOST_USER_PROTOCOL_F_CONFIG bit. However, the warning uses warn_reportf_err macro and passes an unitialized errp pointer. However, the macro tries to edit the 'msg' member of the unitialized Error and segfaults. Instead, just use warn_report, which prints a warning message directly to the output. Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Albert Esteve <aesteve@redhat.com> Message-Id: <20230302121719.9390-1-aesteve@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 90e31232cf8fa7f257263dd431ea954a1ae54bff) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18tcg: ppc64: Fix mask generation for vextractdmShivaprasad G Bhat
In function do_extractm() the mask is calculated as dup_const(1 << (element_width - 1)). '1' being signed int works fine for MO_8,16,32. For MO_64, on PPC64 host this ends up becoming 0 on compilation. The vextractdm uses MO_64, and it ends up having mask as 0. Explicitly use 1ULL instead of signed int 1 like its used everywhere else. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1536 Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Lucas Mateus Castro <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-Id: <168319292809.1159309.5817546227121323288.stgit@ltc-boston1.aus.stglabs.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> (cherry picked from commit 6a5d81b17201ab8a95539bad94c8a6c08a42e076) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18async: Suppress GCC13 false positive in aio_bh_poll()Cédric Le Goater
GCC13 reports an error : ../util/async.c: In function ‘aio_bh_poll’: include/qemu/queue.h:303:22: error: storing the address of local variable ‘slice’ in ‘*ctx.bh_slice_list.sqh_last’ [-Werror=dangling-pointer=] 303 | (head)->sqh_last = &(elm)->field.sqe_next; \ | ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ ../util/async.c:169:5: note: in expansion of macro ‘QSIMPLEQ_INSERT_TAIL’ 169 | QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next); | ^~~~~~~~~~~~~~~~~~~~ ../util/async.c:161:17: note: ‘slice’ declared here 161 | BHListSlice slice; | ^~~~~ ../util/async.c:161:17: note: ‘ctx’ declared here But the local variable 'slice' is removed from the global context list in following loop of the same routine. Add a pragma to silent GCC. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20230420202939.1982044-1-clg@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit d66ba6dc1cce914673bd8a89fca30a7715ea70d1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: cherry-picked to stable-7.2 to eliminate CI failures on win*)
2023-05-18ui: Fix pixel colour channel order for PNG screenshotsPeter Maydell
When we take a PNG screenshot the ordering of the colour channels in the data is not correct, resulting in the image having weird colouring compared to the actual display. (Specifically, on a little-endian host the blue and red channels are swapped; on big-endian everything is wrong.) This happens because the pixman idea of the pixel data and the libpng idea differ. PIXMAN_a8r8g8b8 defines that pixels are 32-bit values, with A in bits 24-31, R in bits 16-23, G in bits 8-15 and B in bits 0-7. This means that on little-endian systems the bytes in memory are B G R A and on big-endian systems they are A R G B libpng, on the other hand, thinks of pixels as being a series of values for each channel, so its format PNG_COLOR_TYPE_RGB_ALPHA always wants bytes in the order R G B A This isn't the same as the pixman order for either big or little endian hosts. The alpha channel is also unnecessary bulk in the output PNG file, because there is no alpha information in a screenshot. To handle the endianness issue, we already define in ui/qemu-pixman.h various PIXMAN_BE_* and PIXMAN_LE_* values that give consistent byte-order pixel channel formats. So we can use PIXMAN_BE_r8g8b8 and PNG_COLOR_TYPE_RGB, which both have an in-memory byte order of R G B and 3 bytes per pixel. (PPM format screenshots get this right; they already use the PIXMAN_BE_r8g8b8 format.) Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1622 Fixes: 9a0a119a382867 ("Added parameter to take screenshot with screendump as PNG") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20230502135548.2451309-1-peter.maydell@linaro.org (cherry picked from commit cd22a0f520f471e3bd33bc19cf3b2fa772cdb2a8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18accel/tcg: Fix atomic_mmu_lookup for readsRichard Henderson
A copy-paste bug had us looking at the victim cache for writes. Cc: qemu-stable@nongnu.org Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org> (cherry picked from commit 8c313254e61ed47a1bf4a2db714b25cdd94fbcce) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/riscv: Fix itrigger when icount is usedLIU Zhiwei
When I boot a ubuntu image, QEMU output a "Bad icount read" message and exit. The reason is that when execute helper_mret or helper_sret, it will cause a call to icount_get_raw_locked (), which needs set can_do_io flag on cpustate. Thus we setting this flag when execute these two instructions. Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Acked-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230324064011.976-1-zhiwei_liu@linux.alibaba.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> (cherry picked from commit df3ac6da476e346a17bad5bc843de1135a269229) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18block: Fix use after free in blockdev_mark_auto_del()Kevin Wolf
job_cancel_locked() drops the job list lock temporarily and it may call aio_poll(). We must assume that the list has changed after this call. Also, with unlucky timing, it can end up freeing the job during job_completed_txn_abort_locked(), making the job pointer invalid, too. For both reasons, we can't just continue at block_job_next_locked(job). Instead, start at the head of the list again after job_cancel_locked() and skip those jobs that we already cancelled (or that are completing anyway). Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230503140142.474404-1-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit e2626874a32602d4e52971c786ef5ffb4430629d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18meson: leave unnecessary modules out of the buildPaolo Bonzini
meson.build files choose whether to build modules based on foo.found() expressions. If a feature is enabled (e.g. --enable-gtk), these expressions are true even if the code is not used by any emulator, and this results in an unexpected difference between modular and non-modular builds. For non-modular builds, the files are not included in any binary, and therefore the source files are never processed. For modular builds, however, all .so files are unconditionally built by default, and therefore a normal "make" tries to build them. However, the corresponding trace-*.h files are absent due to this conditional: if have_system trace_events_subdirs += [ ... 'ui', ... ] endif which was added to avoid wasting time running tracetool on unused trace-events files. This causes a compilation failure; fix it by skipping module builds entirely if (depending on the module directory) have_block or have_system are false. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit ef709860ea12ec59c4cd7373bd2fd7a4e50143ee) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18softfloat: Fix the incorrect computation in float32_exp2Shivaprasad G Bhat
The float32_exp2 function is computing wrong exponent of 2. For example, with the following set of values {0.1, 2.0, 2.0, -1.0}, the expected output would be {1.071773, 4.000000, 4.000000, 0.500000}. Instead, the function is computing {1.119102, 3.382044, 3.382044, -0.191022} Looking at the code, the float32_exp2() attempts to do this 2 3 4 5 n x x x x x x x e = 1 + --- + --- + --- + --- + --- + ... + --- + ... 1! 2! 3! 4! 5! n! But because of the typo it ends up doing x x x x x x x e = 1 + --- + --- + --- + --- + --- + ... + --- + ... 1! 2! 3! 4! 5! n! This is because instead of the xnp which holds the numerator, parts_muladd is using the xp which is just 'x'. Commit '572c4d862ff2' refactored this function, and mistakenly used xp instead of xnp. Cc: qemu-stable@nongnu.org Fixes: 572c4d862ff2 "softfloat: Convert float32_exp2 to FloatParts" Partially-Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1623 Reported-By: Luca Barbato (https://gitlab.com/lu-zero) Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Message-Id: <168304110865.537992.13059030916325018670.stgit@localhost.localdomain> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit 1098cc3fcf952763fc9fd72c1c8fda30a18cc8ea) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/net/allwinner-sun8i-emac: Correctly byteswap descriptor fieldsPeter Maydell
In allwinner-sun8i-emac we just read directly from guest memory into a host FrameDescriptor struct and back. This only works on little-endian hosts. Reading and writing of descriptors is already abstracted into functions; make those functions also handle the byte-swapping so that TransferDescriptor structs as seen by the rest of the code are always in host-order, and fix two places that were doing ad-hoc descriptor reading without using the functions. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424165053.1428857-3-peter.maydell@linaro.org (cherry picked from commit a4ae17e5ec512862bf73e40dfbb1e7db71f2c1e7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/sd/allwinner-sdhost: Correctly byteswap descriptor fieldsPeter Maydell
In allwinner_sdhost_process_desc() we just read directly from guest memory into a host TransferDescriptor struct and back. This only works on little-endian hosts. Abstract the reading and writing of descriptors into functions that handle the byte-swapping so that TransferDescriptor structs as seen by the rest of the code are always in host-order. This fixes a failure of one of the avocado tests on s390. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424165053.1428857-2-peter.maydell@linaro.org (cherry picked from commit 3e20d90824c262de6887aa1bc52af94db69e4310) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/arm: Define and use new load_cpu_field_low32()Peter Maydell
In several places in the 32-bit Arm translate.c, we try to use load_cpu_field() to load from a CPUARMState field into a TCGv_i32 where the field is actually 64-bit. This works on little-endian hosts, but gives the wrong half of the register on big-endian. Add a new load_cpu_field_low32() which loads the low 32 bits of a 64-bit field into a TCGv_i32. The new macro includes a compile-time check against accidentally using it on a field of the wrong size. Use it to fix the two places in the code where we were using load_cpu_field() on a 64-bit field. This fixes a bug where on big-endian hosts the guest would crash after executing an ERET instruction, and a more corner case one where some UNDEFs for attempted accesses to MSR banked registers from Secure EL1 might go to the wrong EL. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230424153909.1419369-2-peter.maydell@linaro.org (cherry picked from commit 7f3a3d3dc433dc06c0adb480729af80f9c8e3739) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()Peter Maydell
The Allwinner PIC model uses set_bit() and clear_bit() to update the values in its irq_pending[] array when an interrupt arrives. However it is using these functions wrongly: they work on an array of type 'long', and it is passing an array of type 'uint32_t'. Because the code manually figures out the right array element, this works on little-endian hosts and on 32-bit big-endian hosts, where bits 0..31 in a 'long' are in the same place as they are in a 'uint32_t'. However it breaks on 64-bit big-endian hosts. Remove the use of set_bit() and clear_bit() in favour of using deposit32() on the array element. This fixes a bug where on big-endian 64-bit hosts the guest kernel would hang early on in bootup. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424152833.1334136-1-peter.maydell@linaro.org (cherry picked from commit 2c5fa0778c3b4307f9f3af7f27886c46d129c62f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/raspi: Use arm_write_bootloader() to write boot codePeter Maydell
When writing the secondary-CPU stub boot loader code to the guest, use arm_write_bootloader() instead of directly calling rom_add_blob_fixed(). This fixes a bug on big-endian hosts, because arm_write_bootloader() will correctly byte-swap the host-byte-order array values into the guest-byte-order to write into the guest memory. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424152717.1333930-4-peter.maydell@linaro.org (cherry picked from commit 0acbdb4c4ab6b0a09f159bae4899b0737cf64242) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/aspeed: Use arm_write_bootloader() to write the bootloaderCédric Le Goater
When writing the secondary-CPU stub boot loader code to the guest, use arm_write_bootloader() instead of directly calling rom_add_blob_fixed(). This fixes a bug on big-endian hosts, because arm_write_bootloader() will correctly byte-swap the host-byte-order array values into the guest-byte-order to write into the guest memory. Cc: qemu-stable@nongnu.org Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20230424152717.1333930-3-peter.maydell@linaro.org [PMM: Moved the "make arm_write_bootloader() function public" part to its own patch; updated commit message to note that this fixes an actual bug; adjust to the API changes noted in previous commit] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 902bba549fc386b4b9805320ed1a2e5b68478bdd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/arm/boot: Make write_bootloader() public as arm_write_bootloader()Cédric Le Goater
The arm boot.c code includes a utility function write_bootloader() which assists in writing a boot-code fragment into guest memory, including handling endianness and fixing it up with entry point addresses and similar things. This is useful not just for the boot.c code but also in board model code, so rename it to arm_write_bootloader() and make it globally visible. Since we are making it public, make its API a little neater: move the AddressSpace* argument to be next to the hwaddr argument, and allow the fixupcontext array to be const, since we never modify it in this function. Cc: qemu-stable@nongnu.org Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20230424152717.1333930-2-peter.maydell@linaro.org [PMM: Split out from another patch by Cédric, added doc comment] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 0fe43f0abf19bbe24df3dbf0613bb47ed55f1482) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18hw/net/msf2-emac: Don't modify descriptor in-place in emac_store_desc()Peter Maydell
The msf2-emac ethernet controller has functions emac_load_desc() and emac_store_desc() which read and write the in-memory descriptor blocks and handle conversion between guest and host endianness. As currently written, emac_store_desc() does the endianness conversion in-place; this means that it effectively consumes the input EmacDesc struct, because on a big-endian host the fields will be overwritten with the little-endian versions of their values. Unfortunately, in all the callsites the code continues to access fields in the EmacDesc struct after it has called emac_store_desc() -- specifically, it looks at the d.next field. The effect of this is that on a big-endian host networking doesn't work because the address of the next descriptor is corrupted. We could fix this by making the callsite avoid using the struct; but it's more robust to have emac_store_desc() leave its input alone. (emac_load_desc() also does an in-place conversion, but here this is fine, because the function is supposed to be initializing the struct.) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-id: 20230424151919.1333299-1-peter.maydell@linaro.org (cherry picked from commit d565f58b38424e9a390a7ea33ff7477bab693fda) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18target/arm: Initialize debug capabilities only onceAkihiko Odaki
kvm_arm_init_debug() used to be called several times on a SMP system as kvm_arch_init_vcpu() calls it. Move the call to kvm_arch_init() to make sure it will be called only once; otherwise it will overwrite pointers to memory allocated with the previous call and leak it. Fixes: e4482ab7e3 ("target-arm: kvm - add support for HW assisted debug") Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20230405153644.25300-1-akihiko.odaki@daynix.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit ad5c6ddea327758daa9f0e6edd916be39dce7dca) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18docs/about/deprecated.rst: Add "since 7.1" tag to dtb-kaslr-seed deprecationPeter Maydell
In commit 5242876f37ca we deprecated the dtb-kaslr-seed property of the virt board, but forgot the "since n.n" tag in the documentation of this in deprecated.rst. This deprecation note first appeared in the 7.1 release, so retrospectively add the correct "since 7.1" annotation to it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230420122256.1023709-1-peter.maydell@linaro.org (cherry picked from commit ac64ebbecf80f6bc764d120f85fe9fa28fbd9e85) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18qemu-options: finesse the recommendations around -blockdevAlex Bennée
We are a bit premature in recommending -blockdev/-device as the best way to configure block devices. It seems there are times the more human friendly -drive still makes sense especially when -snapshot is involved. Improve the language to hopefully make things clearer. Suggested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230424092249.58552-7-alex.bennee@linaro.org> (cherry picked from commit c1654c3e37c31fb638597efedcd07d071837b78b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18acpi: pcihp: allow repeating hot-unplug requestsIgor Mammedov
with Q35 using ACPI PCI hotplug by default, user's request to unplug device is ignored when it's issued before guest OS has been booted. And any additional attempt to request device hot-unplug afterwards results in following error: "Device XYZ is already in the process of unplug" arguably it can be considered as a regression introduced by [2], before which it was possible to issue unplug request multiple times. Accept new uplug requests after timeout (1ms). This brings ACPI PCI hotplug on par with native PCIe unplug behavior [1] and allows user to repeat unplug requests at propper times. Set expire timeout to arbitrary 1msec so user won't be able to flood guest with SCI interrupts by calling device_del in tight loop. PS: ACPI spec doesn't mandate what OSPM can do with GPEx.status bits set before it's booted => it's impl. depended. Status bits may be retained (I tested with one Windows version) or cleared (Linux since 2.6 kernel times) during guest's ACPI subsystem initialization. Clearing status bits (though not wrong per se) hides the unplug event from guest, and it's upto user to repeat device_del later when guest is able to handle unplug requests. 1) 18416c62e3 ("pcie: expire pending delete") 2) Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del") Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> CC: mst@redhat.com CC: anisinha@redhat.com CC: jusual@redhat.com CC: kraxel@redhat.com Message-Id: <20230418090449.2155757-1-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> (cherry picked from commit 0f689cf5ada4d5df5ab95c7f7aa9fc221afa855d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-27target/i386: Change wrong XFRM value in SGX CPUID leafYang Zhong
The previous patch wrongly replaced FEAT_XSAVE_XCR0_{LO|HI} with FEAT_XSAVE_XSS_{LO|HI} in CPUID(EAX=12,ECX=1):{ECX,EDX}. As a result, SGX enclaves only supported SSE and x87 feature (xfrm=0x3). Fixes: 301e90675c3f ("target/i386: Enable support for XSAVES based features") Signed-off-by: Yang Zhong <yang.zhong@linux.intel.com> Reviewed-by: Yang Weijiang <weijiang.yang@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-Id: <20230406064041.420039-1-yang.zhong@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 72497cff896fecf74306ed33626c30e43633cdd6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-27vnc: avoid underflow when accessing user-provided addressPaolo Bonzini
If hostlen is zero, there is a possibility that addrstr[hostlen - 1] underflows and, if a closing bracked is there, hostlen - 2 is passed to g_strndup() on the next line. If websocket==false then addrstr[0] would be a colon, but if websocket==true this could in principle happen. Fix it by checking hostlen. Reported by Coverity. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 3f9c41c5df9617510d8533cf6588172efb3df34b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-23Update version for 7.2.2 releasev7.2.2Michael Tokarev