aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)Author
2023-09-01util/async-teardown.c: move to softmmu/, only build it when system build is ↵Michael Tokarev
requested Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20230901101302.3618955-9-mjt@tls.msk.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-30aio-posix: zero out io_uring sqe user_dataStefan Hajnoczi
liburing does not clear sqe->user_data. We must do it ourselves to avoid undefined behavior in process_cqe() when user_data is used. Note that fdmon-io_uring is currently disabled, so this is a latent bug that does not affect users. Let's merge this fix now to make it easier to enable fdmon-io_uring in the future (and I'm working on that). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230426212639.82310-1-stefanha@redhat.com>
2023-08-09util/interval-tree: Check root for null in interval_tree_iter_firstHelge Deller
Fix a crash in qemu-user when running cat /proc/self/maps in a chroot, where /proc isn't mounted. The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the function read_self_maps() is called and which returns NULL if it can't read the hosts /proc/self/maps file. Afterwards that NULL is fed into interval_tree_iter_first() which doesn't check if the root node is NULL. Fix it by adding a check if root is NULL and return NULL in that case. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h") Message-Id: <ZNOsq6Z7t/eyIG/9@p100> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-08-08util/selfmap: Rewrite using qemu/interval-tree.hRichard Henderson
We will want to be able to search the set of mappings. For this patch, the two users iterate the tree in order. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-08-03util/oslib-win32: Fix compiling with Clang from MSYS2Thomas Huth
Clang complains: ../util/oslib-win32.c:483:56: error: omitting the parameter name in a function definition is a C2x extension [-Werror,-Wc2x-extensions] win32_close_exception_handler(struct _EXCEPTION_RECORD*, ^ Fix it by adding parameter names. Message-Id: <20230728142748.305341-4-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-08-01thread-pool: signal "request_cond" while lockedAnthony PERARD
thread_pool_free() might have been called on the `pool`, which would be a reason for worker_thread() to quit. In this case, `pool->request_cond` is been destroyed. If worker_thread() didn't managed to signal `request_cond` before it been destroyed by thread_pool_free(), we got: util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed. One backtrace: __GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198, function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101 qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198 worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129 qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505 start_thread (arg=<optimized out>) at pthread_create.c:486 Reported here: https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u To avoid issue, keep lock while sending a signal to `request_cond`. Fixes: 900fa208f506 ("thread-pool: replace semaphore with condition variable") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230714152720.5077-1-anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-07-31util/interval-tree: Use qatomic_read/set for rb_parent_colorRichard Henderson
While less susceptible to optimization problems than left and right, interval_tree_iter_next also reads rb_parent(), so make sure that stores and loads are atomic. This goes further than technically required, changing all loads to be atomic, rather than simply the ones in the iteration side. But it doesn't really affect the code generation on the rebalance side and is cleaner to handle everything the same. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-31util/interval-tree: Introduce pc_parentRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-31util/interval-tree: Use qatomic_set_mb in rb_link_nodeRichard Henderson
Ensure that the stores to rb_left and rb_right are complete before inserting the new node into the tree. Otherwise a concurrent reader could see garbage in the new leaf. Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-31util/interval-tree: Use qatomic_read for left/right while searchingRichard Henderson
Fixes a race condition (generally without optimization) in which the subtree is re-read after the protecting if condition. Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-10os-posix: Allow 'chroot' via '-run-with' and deprecate the old '-chroot' optionThomas Huth
We recently introduced "-run-with" for options that influence the runtime behavior of QEMU. This option has the big advantage that it can group related options (so that it is easier for the users to spot them) and that the options become introspectable via QMP this way. So let's start moving more switches into this option group, starting with "-chroot" now. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Ján Tomko <jtomko@redhat.com> Message-Id: <20230703074447.17044-1-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-07-08host/include/ppc: Implement aes-round.hRichard Henderson
Detect CRYPTO in cpuinfo; implement the accel hooks. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08host/include/aarch64: Implement aes-round.hRichard Henderson
Detect AES in cpuinfo; implement the accel hooks. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08host/include/i386: Implement aes-round.hRichard Henderson
Detect AES in cpuinfo; implement the accel hooks. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08util: Add cpuinfo-ppc.cRichard Henderson
Move the code from tcg/. Fix a bug in that PPC_FEATURE2_ARCH_3_10 is actually spelled PPC_FEATURE2_ARCH_3_1. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-27console/win32: allocate shareable display surfaceMarc-André Lureau
Introduce qemu_win32_map_alloc() and qemu_win32_map_free() to allocate shared memory mapping. The handle can be used to share the mapping with another process. Teach qemu_create_displaysurface() to allocate shared memory. Following patches will introduce other places for shared memory allocation. Other patches for -display dbus will share the memory when possible with the client, to avoid expensive memory copy between the processes. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20230606115658.677673-10-marcandre.lureau@redhat.com>
2023-06-14Merge tag 'pull-riscv-to-apply-20230614' of ↵Richard Henderson
https://github.com/alistair23/qemu into staging Second RISC-V PR for 8.1 * Skip Vector set tail when vta is zero * Move zc* out of the experimental properties * Mask the implicitly enabled extensions in isa_string based on priv version * Rework CPU extension validation and validate MISA changes * Fixup PMP TLB cacheing errors * Writing to pmpaddr and MML/MMWP correctly triggers TLB flushes * Fixup PMP bypass checks * Deny access if access is partially inside a PMP entry * Correct OpenTitanState parent type/size * Fix QEMU crash when NUMA nodes exceed available CPUs * Fix pointer mask transformation for vector address * Updates and improvements for Smstateen * Support disas for Zcm* extensions * Support disas for Z*inx extensions * Remove unused decomp_rv32/64 value for vector instructions * Enable PC-relative translation * Assume M-mode FW in pflash0 only when "-bios none" * Support using pflash via -blockdev option * Add vector registers to log * Clean up reference of Vector MTYPE * Remove the check for extra Vector tail elements * Smepmp: Return error when access permission not allowed in PMP * Fixes for smsiaddrcfg and smsiaddrcfgh in AIA # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmSJFRoACgkQr3yVEwxT # gBMUkg/8Cuhqpx+zy7MeouVkyhEjUuhtCWyr0WVZBJzDkVEOrlY6TyR0hb5/o1Js # LZf6ZMF6JQDN78bmUct8yFBZBGafey5tyonDCsnD7CNQuLPf2NSjTHhu9n5hKFqF # F8Mpn9iFu6k1pr0iF7FbCccVWuDb3P4h2PaM0iFhmf4uz42BCMYdgJThhvv38xlt # jr6A3dcjTpp8yB+iRCuhL2IU2XVee0XBiDUECqRXd0gmtOtqJNST8L+l8YkLy1VO # WUMe8RCO6NMP7BLJ383WwCDeiFTo0mJebZQ0eR/G1xEhy7c8BBMh/CgQmq2F3wDZ # Q0biaeozADgAaCC7aOAHI+1sAoMhOm1v2WhIVmh+XXUqT9856cKwc7DUPBmzb9Sj # N5Zh+t9WCnZG7qpfxvkDF0Y/aRODMHZ1BW5L/ky9yBtyuRwXOJ6VycZTFyRkSwnN # Gd/s9IClDOP1IP5s4TSMGGdelk4lH97x7fZE/2hxn59lp761JtMxbaEceBtqaBh8 # zNMTNN/KHs8LeiIBI2ZZ+nQav452Y6XYBivQ7OdsI8xkjnjG9gfgXXjvX1TIh0ow # Hy5ZxtAtjXty49Gmjkx5VcBx4auJcnRDlLTzoZjTxq1te+gEWpw6O1EsEKasVLZe # uN6PxTOxS3nHvRvPgQc1xNUdhDRqBaYsju6b9YmMxz1uefAjGM0= # =fOTc # -----END PGP SIGNATURE----- # gpg: Signature made Wed 14 Jun 2023 03:17:14 AM CEST # gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013 # gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013 * tag 'pull-riscv-to-apply-20230614' of https://github.com/alistair23/qemu: (60 commits) hw/intc: If mmsiaddrcfgh.L == 1, smsiaddrcfg and smsiaddrcfgh are read-only. target/riscv: Smepmp: Return error when access permission not allowed in PMP target/riscv/vector_helper.c: Remove the check for extra tail elements target/riscv/vector_helper.c: clean up reference of MTYPE target/riscv: Fix initialized value for cur_pmmask util/log: Add vector registers to log docs/system: riscv: Add pflash usage details riscv/virt: Support using pflash via -blockdev option hw/riscv: virt: Assume M-mode FW in pflash0 only when "-bios none" target/riscv: Remove pc_succ_insn from DisasContext target/riscv: Enable PC-relative translation target/riscv: Use true diff for gen_pc_plus_diff target/riscv: Change gen_set_pc_imm to gen_update_pc target/riscv: Change gen_goto_tb to work on displacements target/riscv: Introduce cur_insn_len into DisasContext target/riscv: Fix target address to update badaddr disas/riscv.c: Remove redundant parentheses disas/riscv.c: Fix lines with over 80 characters disas/riscv.c: Remove unused decomp_rv32/64 value for vector instructions disas/riscv.c: Support disas for Z*inx extensions ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-13util/cacheflush: Avoid possible redundant dcache flush on DarwinPhilippe Mathieu-Daudé
<libkern/OSCacheControl.h> describes sys_icache_invalidate() as "equivalent to sys_cache_control(kCacheFunctionPrepareForExecution)", having kCacheFunctionPrepareForExecution defined as: /* Prepare memory for execution. This should be called * after writing machine instructions to memory, before * executing them. It syncs the dcache and icache. [...] */ Since the dcache is also sync'd, we can avoid the sys_dcache_flush() call when both rx/rw pointers are equal. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230605195911.96033-1-philmd@linaro.org>
2023-06-13util/cacheflush: Use declarations from <OSCacheControl.h> on DarwinPhilippe Mathieu-Daudé
Per the cache(3) man page, sys_icache_invalidate() and sys_dcache_flush() are declared in <libkern/OSCacheControl.h>. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230605175647.88395-2-philmd@linaro.org>
2023-06-13util/log: Add vector registers to logIvan Klokov
Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu. Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-06-06atomics: eliminate mb_read/mb_setPaolo Bonzini
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives introduced for QEMU; their semantics are unclear and they provide a false sense of safety. The last use of qatomic_mb_read() has been removed, so delete it. qatomic_mb_set() instead can survive as an optimized qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but rename it to qatomic_set_mb() to match the order of the two operations. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-05util/iov: Remove qemu_iovec_init_extended()Hanna Czenczek
bdrv_pad_request() was the main user of qemu_iovec_init_extended(). HEAD^ has removed that use, so we can remove qemu_iovec_init_extended() now. The only remaining user is qemu_iovec_init_slice(), which can easily inline the small part it really needs. Note that qemu_iovec_init_extended() offered a memcpy() optimization to initialize the new I/O vector. qemu_iovec_concat_iov(), which is used to replace its functionality, does not, but calls qemu_iovec_add() for every single element. If we decide this optimization was important, we will need to re-implement it in qemu_iovec_concat_iov(), which might also benefit its pre-existing users. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-4-hreitz@redhat.com>
2023-06-05util/iov: Make qiov_slice() publicHanna Czenczek
We want to inline qemu_iovec_init_extended() in block/io.c for padding requests, and having access to qiov_slice() is useful for this. As a public function, it is renamed to qemu_iovec_slice(). (We will need to count the number of I/O vector elements of a slice there, and then later process this slice. Without qiov_slice(), we would need to call qemu_iovec_subvec_niov(), and all further IOV-processing functions may need to skip prefixing elements to accomodate for a qiov_offset. Because qemu_iovec_subvec_niov() internally calls qiov_slice(), we can just have the block/io.c code call qiov_slice() itself, thus get the number of elements, and also create an iovec array with the superfluous prefixing elements stripped, so the following processing functions no longer need to skip them.) Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-2-hreitz@redhat.com>
2023-06-02cutils: Improve qemu_strtosz handling of fractionsEric Blake
We have several limitations and bugs worth fixing; they are inter-related enough that it is not worth splitting this patch into smaller pieces: * ".5k" should work to specify 512, just as "0.5k" does * "1.9999k" and "1." + "9"*50 + "k" should both produce the same result of 2048 after rounding * "1." + "0"*350 + "1B" should not be treated the same as "1.0B"; underflow in the fraction should not be lost * "7.99e99" and "7.99e999" look similar, but our code was doing a read-out-of-bounds on the latter because it was not expecting ERANGE due to overflow. While we document that scientific notation is not supported, and the previous patch actually fixed qemu_strtod_finite() to no longer return ERANGE overflows, it is easier to pre-filter than to try and determine after the fact if strtod() consumed more than we wanted. Note that this is a low-level semantic change (when endptr is not NULL, we can now successfully parse with a scale of 'E' and then report trailing junk, instead of failing outright with EINVAL); but an earlier commit already argued that this is not a high-level semantic change since the only caller passing in a non-NULL endptr also checks that the tail is whitespace-only. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1629 Fixes: cf923b78 ("utils: Improve qemu_strtosz() to have 64 bits of precision", 6.0.0) Fixes: 7625a1ed ("utils: Use fixed-point arithmetic in qemu_strtosz", 6.0.0) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-20-eblake@redhat.com> [eblake: tweak function comment for accuracy]
2023-06-02cutils: Improve qemu_strtod* error pathsEric Blake
Previous patches changed all integral qemu_strto*() error paths to guarantee that *value is never left uninitialized. Do likewise for qemu_strtod. Also, tighten qemu_strtod_finite() to never return a non-finite value (prior to this patch, we were rejecting "inf" with -EINVAL and unspecified result 0.0, but failing "9e999" with -ERANGE and HUGE_VAL - which is infinite on IEEE machines - despite our function claiming to recognize only finite values). Auditing callers, we have no external callers of qemu_strtod, and among the callers of qemu_strtod_finite: - qapi/qobject-input-visitor.c:qobject_input_type_number_keyval() and qapi/string-input-visitor.c:parse_type_number() which reject all errors (does not matter what we store) - utils/cutils.c:do_strtosz() incorrectly assumes that *endptr points to '.' on all failures (that is, it is not distinguishing between EINVAL and ERANGE; and therefore still does the WRONG THING for "9.9e999". The change here does not entirely fix that (a later patch will tackle this more systematically), but at least it fixes the read-out-of-bounds first diagnosed in https://gitlab.com/qemu-project/qemu/-/issues/1629 - our testsuite, which we can update to match what we document Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> CC: qemu-stable@nongnu.org Message-Id: <20230522190441.64278-19-eblake@redhat.com>
2023-06-02cutils: Use parse_uint in qemu_strtosz for negative rejectionEric Blake
Rather than open-coding two different ways to check for an unwanted negative sign, reuse the same code in both functions. That way, if we decide down the road to accept "-0" instead of rejecting it, we have fewer places to change. Also, it means we now get ERANGE instead of EINVAL for negative values in qemu_strtosz, which is reasonable for what it represents. This in turn changes the expected output of a couple of iotests. The change is not quite complete: negative fractional scaled values can trip us up. This will be fixed in a later patch addressing other issues with fractional scaled values. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-18-eblake@redhat.com>
2023-06-02cutils: Set value in all integral qemu_strto* error pathsEric Blake
Our goal in writing qemu_strtoi() and friends is to have an interface harder to abuse than libc's strtol(). Leaving the return value uninitialized on some but not all error paths does not lend itself well to this goal; and our documentation wasn't helpful on what to expect. Note that the previous patch changed all qemu_strtosz() EINVAL error paths to slam value to 0 rather than stay uninitialized, even when the EINVAL eror occurs because of trailing junk. But for the remaining integral qemu_strto*, it's easier to return the parsed value than to force things back to zero, in part because of how check_strtox_error works; in part because people expect that from libc strto* (while there is no libc strtosz to compare to), and in part because doing so creates less churn in the testsuite. Here, the list of affected callers is much longer ('git grep "qemu_strto[ui]" "*.c" "**/*.c" | grep -v tests/ |wc -l' outputs 107, although a few of those are the implementation in in cutils.c), so touching as little as possible is the wisest course of action. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-17-eblake@redhat.com>
2023-06-02cutils: Set value in all qemu_strtosz* error pathsEric Blake
Making callers determine whether or not *value was populated on error is not nice for usability. Pre-patch, we have unit tests that check that *result is left unchanged on most EINVAL errors and set to 0 on many ERANGE errors. This is subtly different from libc strtoumax() behavior which returns UINT64_MAX on ERANGE errors, as well as different from our parse_uint() which slams to 0 on EINVAL on the grounds that we want our functions to be harder to mis-use than strtoumax(). Let's audit callers: - hw/core/numa.c:parse_numa() fixed in the previous patch to check for errors - migration/migration-hmp-cmds.c:hmp_migrate_set_parameter(), monitor/hmp.c:monitor_parse_arguments(), qapi/opts-visitor.c:opts_type_size(), qapi/qobject-input-visitor.c:qobject_input_type_size_keyval(), qemu-img.c:cvtnum_full(), qemu-io-cmds.c:cvtnum(), target/i386/cpu.c:x86_cpu_parse_featurestr(), and util/qemu-option.c:parse_option_size() appear to reject all failures (although some with distinct messages for ERANGE as opposed to EINVAL), so it doesn't matter what is in the value parameter on error. - All remaining callers are in the testsuite, where we can tweak our expectations to match our new desired behavior. Advancing to the end of the string parsed on overflow (ERANGE), while still returning 0, makes sense (UINT64_MAX as a size is unlikely to be useful); likewise, our size parsing code is complex enough that it's easier to always return 0 when endptr is NULL but trailing garbage was found, rather than trying to return the value of the prefix actually parsed (no current caller cared about the value of the prefix). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-16-eblake@redhat.com>
2023-06-02cutils: Allow NULL str in qemu_strtoszEric Blake
All the other qemu_strto* and parse_uint allow a NULL str. Having qemu_strtosz not crash on qemu_strtosz(NULL, NULL, &value) is an easy fix that adds some consistency between our string parsers. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-13-eblake@redhat.com>
2023-06-02cutils: Allow NULL endptr in parse_uint()Eric Blake
All the qemu_strto*() functions permit a NULL endptr, just like their libc counterparts, leaving parse_uint() as the oddball that caused SEGFAULT on NULL and required the user to call parse_uint_full() instead. Relax things for consistency, even though the testsuite is the only impacted caller. Add one more unit test to ensure even parse_uint_full(NULL, 0, &value) works. This also fixes our code to uniformly favor EINVAL over ERANGE when both apply. Also fixes a doc mismatch @v vs. a parameter named value. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-9-eblake@redhat.com>
2023-06-02cutils: Adjust signature of parse_uint[_full]Eric Blake
It's already confusing that we have two very similar functions for wrapping the parse of a 64-bit unsigned value, differing mainly on whether they permit leading '-'. Adjust the signature of parse_uint() and parse_uint_full() to be like all of qemu_strto*(): put the result parameter last, use the same types (uint64_t and unsigned long long have the same width, but are not always the same type), and mark endptr const (this latter change only affects the rare caller of parse_uint). Adjust all callers in the tree. While at it, note that since cutils.c already includes: QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long)); we are guaranteed that the result of parse_uint* cannot exceed UINT64_MAX (or the build would have failed), so we can drop pre-existing dead comparisons in opts-visitor.c that were never false. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-8-eblake@redhat.com> [eblake: Drop dead code spotted by Markus] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-06-02cutils: Document differences between parse_uint and qemu_strtou64Eric Blake
These two functions are subtly different, and not just because of swapped parameter order. It took me adding better unit tests to figure out why. Document the differences to make it more obvious to developers trying to pick which one to use, as well as to aid in upcoming semantic changes. While touching the documentation, adjust a mis-statement: parse_uint does not return -EINVAL on invalid base, but assert()s, like all the other qemu_strto* functions that take a base argument. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-7-eblake@redhat.com>
2023-06-02cutils: Fix wraparound parsing in qemu_strtouiEric Blake
While we were matching 32-bit strtol in qemu_strtoi, our use of a 64-bit parse was leaking through for some inaccurate answers in qemu_strtoui in comparison to a 32-bit strtoul (see the unit test for examples). The comment for that function even described what we have to do for a correct parse, but didn't implement it correctly: since strtoull checks for overflow against the wrong values and then negates, we have to temporarily undo negation before checking for overflow against our desired value. Our int wrappers would be a lot easier to write if libc had a guaranteed 32-bit parser even on platforms with 64-bit long. Whether we parse C2x binary strings like "0b1000" is currently up to what libc does; our unit tests intentionally don't cover that at the moment, though. Fixes: 473a2a331e ("cutils: add qemu_strtoi & qemu_strtoui parsers for int/unsigned int types", v2.12.0) Signed-off-by: Eric Blake <eblake@redhat.com> CC: qemu-stable@nongnu.org Message-Id: <20230522190441.64278-6-eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-01accel/tcg: include cs_base in our hash calculationsAlex Bennée
We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-30aio: remove aio_disable_external() APIStefan Hajnoczi
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30block/export: stop using is_external in vhost-user-blk serverStefan Hajnoczi
vhost-user activity must be suspended during bdrv_drained_begin/end(). This prevents new requests from interfering with whatever is happening in the drained section. Previously this was done using aio_set_fd_handler()'s is_external argument. In a multi-queue block layer world the aio_disable_external() API cannot be used since multiple AioContext may be processing I/O, not just one. Switch to BlockDevOps->drained_begin/end() callbacks. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-8-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30block/export: wait for vhost-user-blk requests when drainingStefan Hajnoczi
Each vhost-user-blk request runs in a coroutine. When the BlockBackend enters a drained section we need to enter a quiescent state. Currently any in-flight requests race with bdrv_drained_begin() because it is unaware of vhost-user-blk requests. When blk_co_preadv/pwritev()/etc returns it wakes the bdrv_drained_begin() thread but vhost-user-blk request processing has not yet finished. The request coroutine continues executing while the main loop thread thinks it is in a drained section. One example where this is unsafe is for blk_set_aio_context() where bdrv_drained_begin() is called before .aio_context_detached() and .aio_context_attach(). If request coroutines are still running after bdrv_drained_begin(), then the AioContext could change underneath them and they race with new requests processed in the new AioContext. This could lead to virtqueue corruption, for example. (This example is theoretical, I came across this while reading the code and have not tried to reproduce it.) It's easy to make bdrv_drained_begin() wait for in-flight requests: add a .drained_poll() callback that checks the VuServer's in-flight counter. VuServer just needs an API that returns true when there are requests in flight. The in-flight counter needs to be atomic. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30util/vhost-user-server: rename refcount to in_flight counterStefan Hajnoczi
The VuServer object has a refcount field and ref/unref APIs. The name is confusing because it's actually an in-flight request counter instead of a refcount. Normally a refcount destroys the object upon reaching zero. The VuServer counter is used to wake up the vhost-user coroutine when there are no more requests. Avoid confusing by renaming refcount and ref/unref to in_flight and inc/dec. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-28win32: wrap socket close() with an exception handlerMarc-André Lureau
Since commit abe34282 ("win32: avoid mixing SOCKET and file descriptor space"), we set HANDLE_FLAG_PROTECT_FROM_CLOSE on the socket FD, to prevent closing the HANDLE with CloseHandle. This raises an exception which under gdb is fatal, and qemu exits. Let's catch the expected error instead. Note: this appears to work, but the mingw64 macro is not well documented or tested, and it's not obvious how it is meant to be used. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20230515132440.1025315-1-marcandre.lureau@redhat.com>
2023-05-24Merge tag 'pull-vfio-20230524' of https://github.com/legoater/qemu into stagingRichard Henderson
vfio queue: * Fix for a memory corruption due to an extra free * Fix for a compile breakage # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmRtyB4ACgkQUaNDx8/7 # 7KFQvRAAhexL/Q8rWM8og+VESL5gPlpxDhWCI+l76+YJqQzZkgebwZ5rw920f8EG # bRs5AAk8fPTX/qKKq/JkYMmQwpM2jo8W4elcNumm44WAG7hDwd1LQ3nAZeOcvgU0 # jQ1IwRYcgNo+oOTN9b7GhePQK27OraliLUrf/sBGUWvbdAttVc2pcB91CMur0Dxb # 9KK2vEA4MJ9B8zf2/ZkaK6Z+28GsratR7803Nvv25rm5sP3VBb9w0TnKZAOmaHLv # X5Tz8yjNvQxxzB9SzgOK6yMtnrp42ArVC5u2aDa33uzSWUeFiTF1HEFeGAps2nJg # 8tSNo0fTKhznrVR3q2pyxC05Dp+jmKicrmivc26iBdAWAUxQYX44UQoLYD5ISdti # nlSE+Is+0ZE5E2tHE9yAOPa4rrXHNBqpueu+VMPbYMyVEqzblP7twYe6HkGPYhrD # zbx/ABZAAGOf+3YmyL1yQrCc0WyJ2lHDySQt/llMrhkBTCHGEF8yjfWFypluZFWX # X7Mb0YZP0qPpFsV3TDcrqV3onaFSNehp2EJs2EJAa/DeUNbnKlz4LiYBzZE95egb # 9PGrLnB5w1Vlp44H+ctrnYj55TnspHT+Qqwvhkr/vOMupZukbGus0VFIU2IDrh2g # qEqhaigwxfVyZ1Eqwti4IgX8RVX8bW43slR33aD6vsO7jpiP2Pk= # =TA2V # -----END PGP SIGNATURE----- # gpg: Signature made Wed 24 May 2023 01:17:34 AM PDT # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1 # gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1 * tag 'pull-vfio-20230524' of https://github.com/legoater/qemu: util/vfio-helpers: Use g_file_read_link() vfio/pci: Fix a use-after-free issue Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-24util/vfio-helpers: Use g_file_read_link()Akihiko Odaki
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is 12.1.0, the compiler complains as follows: In file included from /usr/include/features.h:490, from /usr/include/bits/libc-header-start.h:33, from /usr/include/stdint.h:26, from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9, from /home/alarm/q/var/qemu/include/qemu/osdep.h:94, from ../util/vfio-helpers.c:13: In function 'readlink', inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9, inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18, inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9: /usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull] 119 | return __glibc_fortify (readlink, __len, sizeof (char), | ^~~~~~~~~~~~~~~ This error implies the allocated buffer can be NULL. Use g_file_read_link(), which allocates buffer automatically to avoid the error. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-05-23util: Add cpuinfo-aarch64.cRichard Henderson
Move the code from tcg/. The only use of these bits so far is with respect to the atomicity of tcg operations. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23util/bufferiszero: Use i386 host/cpuinfo.hRichard Henderson
Use cpuinfo_init() during init_accel(), and the variable cpuinfo during test_buffer_is_zero_next_accel(). Adjust the logic that cycles through the set of accelerators for testing. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23util: Add i386 CPUINFO_ATOMIC_VMOVDQURichard Henderson
Add a bit to indicate when VMOVDQU is also atomic if aligned. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23util: Add cpuinfo-i386.cRichard Henderson
Add cpuinfo.h for i386 and x86_64, and the initialization for that in util/. Populate that with a slightly altered copy of the tcg host probing code. Other uses of cpuid.h will be adjusted one patch at a time. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23igb: Implement Rx SCTP CSOAkihiko Odaki
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-19aio-posix: do not nest poll handlersStefan Hajnoczi
QEMU's event loop supports nesting, which means that event handler functions may themselves call aio_poll(). The condition that triggered a handler must be reset before the nested aio_poll() call, otherwise the same handler will be called and immediately re-enter aio_poll. This leads to an infinite loop and stack exhaustion. Poll handlers are especially prone to this issue, because they typically reset their condition by finishing the processing of pending work. Unfortunately it is during the processing of pending work that nested aio_poll() calls typically occur and the condition has not yet been reset. Disable a poll handler during ->io_poll_ready() so that a nested aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the disabled poll handler and its associated fd handler do not run during the nested aio_poll(). Calling aio_set_fd_handler() from inside nested aio_poll() could cause it to run again. If the fd handler is pending inside nested aio_poll(), then it will also run again. In theory fd handlers can be affected by the same issue, but they are more likely to reset the condition before calling nested aio_poll(). This is a special case and it's somewhat complex, but I don't see a way around it as long as nested aio_poll() is supported. Cc: qemu-stable@nongnu.org Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181 Fixes: c38270692593 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK") Cc: Kevin Wolf <kwolf@redhat.com> Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230502184134.534703-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-18build: move coroutine backend selection to mesonPaolo Bonzini
To simplify the code, rename coroutine-win32.c to match the option passed to configure. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18build: move glib detection and workarounds to mesonPaolo Bonzini
QEMU adds the path to glib.h to all compilation commands. This is simpler due to the pervasive use of static_library, and was grandfathered in from the previous Make-based build system. Until Meson 0.63 the only way to do this was to detect glib in configure and use add_project_arguments, but now it is possible to use add_project_dependencies instead. gmodule is detected in a separate variable, with export enabled for modules and disabled for plugin. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-16util/async-teardown: wire up query-command-line-optionsClaudio Imbrenda
Add new -run-with option with an async-teardown=on|off parameter. It is visible in the output of query-command-line-options QMP command, so it can be discovered and used by libvirt. The option -async-teardown is now redundant, deprecate it. Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Fixes: c891c24b1a ("os-posix: asynchronous teardown for shutdown on Linux") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-Id: <20230505120051.36605-2-imbrenda@linux.ibm.com> [thuth: Add curly braces to fix error with GCC 8.5, fix bug in deprecated.rst] Signed-off-by: Thomas Huth <thuth@redhat.com>