aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)Author
2023-01-23util/aio: Defer disabling poll mode as long as possibleChao Gao
When we measure FIO read performance (cache=writethrough, bs=4k, iodepth=64) in VMs, ~80K/s notifications (e.g., EPT_MISCONFIG) are observed from guest to qemu. It turns out those frequent notificatons are caused by interference from worker threads. Worker threads queue bottom halves after completing IO requests. Pending bottom halves may lead to either aio_compute_timeout() zeros timeout and pass it to try_poll_mode() or run_poll_handlers() returns no progress after noticing pending aio_notify() events. Both cause run_poll_handlers() to call poll_set_started(false) to disable poll mode. However, for both cases, as timeout is already zeroed, the event loop (i.e., aio_poll()) just processes bottom halves and then starts the next event loop iteration. So, disabling poll mode has no value but leads to unnecessary notifications from guest. To minimize unnecessary notifications from guest, defer disabling poll mode to when the event loop is about to be blocked. With this patch applied, FIO seq-read performance (bs=4k, iodepth=64, cache=writethrough) in VMs increases from 330K/s to 413K/s IOPS. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Chao Gao <chao.gao@intel.com> Message-id: 20220710120849.63086-1-chao.gao@intel.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-01-20coroutine: Use Coroutine typedef name instead of structure tagMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221221131435.3851212-6-armbru@redhat.com>
2023-01-19coroutine: Clean up superfluous inclusion of qemu/coroutine.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20221221131435.3851212-2-armbru@redhat.com>
2023-01-16util/bufferiszero: Use __attribute__((target)) for avx2/avx512Richard Henderson
Use the attribute, which is supported by clang, instead of the #pragma, which is not supported and, for some reason, also not detected by the meson probe, so we fail by -Werror. Include only <immintrin.h> as that is the outermost "official" header for these intrinsics -- emmintrin.h and smmintrin -- are older SSE2 and SSE4 specific headers, while the immintrin.h includes all of the Intel intrinsics. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-11util/error: add G_GNUC_PRINTF for various functionsDaniel P. Berrangé
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221219130205.687815-5-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-11accel: introduce accelerator blocker APIEmanuele Giuseppe Esposito
This API allows the accelerators to prevent vcpus from issuing new ioctls while execting a critical section marked with the accel_ioctl_inhibit_begin/end functions. Note that all functions submitting ioctls must mark where the ioctl is being called with accel_{cpu_}ioctl_begin/end(). This API requires the caller to always hold the BQL. API documentation is in sysemu/accel-blocker.h Internally, it uses a QemuLockCnt together with a per-CPU QemuLockCnt (to minimize cache line bouncing) to keep avoid that new ioctls run when the critical section starts, and a QemuEvent to wait that all running ioctls finish. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221111154758.1372674-2-eesposit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-09Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into ↵Peter Maydell
staging * s390x header clean-ups from Philippe * Rework and improvements of the EINTR handling by Nikita * Deprecate the -no-hpet command line option * Disable the qtests in the 32-bit Windows CI job again * Some other misc fixes here and there # gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu: .gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job error handling: Use RETRY_ON_EINTR() macro where applicable Refactoring: refactor TFR() macro to RETRY_ON_EINTR() docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst i386: Deprecate the -no-hpet QEMU command line option tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter tests/readconfig: spice doesn't support unix socket on windows yet target/s390x: Restrict sysemu/reset.h to system emulation target/s390x/tcg/excp_helper: Restrict system headers to sysemu target/s390x/tcg/misc_helper: Remove unused "memory.h" include hw/s390x/pv: Restrict Protected Virtualization to sysemu exec/memory: Expose memory_region_access_valid() MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts qemu-iotests/stream-under-throttle: do not shutdown QEMU Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-09error handling: Use RETRY_ON_EINTR() macro where applicableNikita Ivanov
There is a defined RETRY_ON_EINTR() macro in qemu/osdep.h which handles the same while loop. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415 Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com> Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [thuth: Dropped the hunk that changed socket_accept() in libqtest.c] Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-06util: remove support for hex numbers with a scaling suffixPaolo Bonzini
This was deprecated in 6.0 and can now be removed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-06util/log: Always send errors to logfile when daemonizedGreg Kurz
When QEMU is started with `-daemonize`, all stdio descriptors get redirected to `/dev/null`. This basically means that anything printed with error_report() and friends is lost. Current logging code allows to redirect to a file with `-D` but this requires to enable some logging item with `-d` as well to be functional. Relax the check on the log flags when QEMU is daemonized, so that other users of stderr can benefit from the redirection, without the need to enable unwanted debug logs. Previous behaviour is retained for the non-daemonized case. The logic is unrolled as an `if` for better readability. The qemu_log_level and log_per_thread globals reflect the state we want to transition to at this point : use them instead of the intermediary locals for correctness. qemu_set_log_internal() is adapted to open a per-thread log file when '-d tid' is passed. This is done by hijacking qemu_try_lock() which seems simpler that refactoring the code. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20221108140032.1460307-3-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-06util/log: do not close and reopen log files when flags are turned offPaolo Bonzini
log_append makes sure that if you turn off the logging (which clears log_flags and makes need_to_open_file false) the old log is not overwritten. The usecase is that if you remove or move the file QEMU will not keep writing to the old file. However, this is not always the desited behavior, in particular having log_append==1 after changing the file name makes little sense. When qemu_set_log_internal is called from the logfile monitor command, filename must be non-NULL and therefore changed_name must be true. Therefore, the only case where the file is closed and need_to_open_file == false is indeed when log_flags becomes zero. In this case, just flush the file and do not bother closing it, thus faking the same append behavior as previously. The behavioral change is that changing the logfile twice, for example log1 -> log2 -> log1, will cause log1 to be overwritten. This can simply be documented, since it is not a particularly surprising behavior. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20221025092119.236224-1-pbonzini@redhat.com> [groug: nullify global_file before actually closing the file] Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20221108140032.1460307-2-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-20util: Add interval-tree.cRichard Henderson
Copy and simplify the Linux kernel's interval_tree_generic.h, instantiating for uint64_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-12-16Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into stagingPeter Maydell
Block layer patches - Code cleanups around block graph modification - Simplify drain - coroutine_fn correctness fixes, including splitting generated coroutine wrappers into co_wrapper (to be called only from non-coroutine context) and co_wrapper_mixed (both coroutine and non-coroutine context) - Introduce a block graph rwlock # gpg: Signature made Thu 15 Dec 2022 15:08:34 GMT # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (50 commits) block: GRAPH_RDLOCK for functions only called by co_wrappers block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock block-coroutine-wrapper.py: introduce annotations that take the graph rdlock Mark assert_bdrv_graph_readable/writable() GRAPH_RD/WRLOCK graph-lock: TSA annotations for lock/unlock functions block: assert that graph read and writes are performed correctly block: remove unnecessary assert_bdrv_graph_writable() block: wrlock in bdrv_replace_child_noperm block: Fix locking in external_snapshot_prepare() test-bdrv-drain: Fix incorrrect drain assumptions clang-tsa: Add macros for shared locks clang-tsa: Add TSA_ASSERT() macro Import clang-tsa.h async: Register/unregister aiocontext in graph lock list graph-lock: Implement guard macros graph-lock: Introduce a lock to protect block graph operations block: Factor out bdrv_drain_all_begin_nopoll() block/dirty-bitmap: convert coroutine-only functions to co_wrapper block: convert bdrv_create to co_wrapper block-coroutine-wrapper.py: support also basic return types ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-12-15async: Register/unregister aiocontext in graph lock listEmanuele Giuseppe Esposito
Add/remove the AioContext in aio_context_list in graph-lock.c when it is created/destroyed. This allows using the graph locking operations from this AioContext. In order to allow linking util/async.c with binaries that don't include the block layer, introduce stubs for (un)register_aiocontext(). Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221207131838.239125-5-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15util/oslib-win32: Remove obsolete reference to g_poll codeThomas Huth
The comment about g_poll is not required here anymore since the corresponding code has been removed a while ago already. Fixes: b4c6036faa ("configure: bump min required glib version to 2.56") Message-Id: <20221208133257.95673-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-12-15util/qemu-config: Fix "query-command-line-options" to provide the right valuesThomas Huth
The "query-command-line-options" command uses a hand-crafted list of options that should be returned for the "machine" parameter. This is pretty much out of sync with reality, for example settings like "kvm_shadow_mem" or "accel" are not parameters for the machine anymore. Also, there is no distinction between the targets here, so e.g. the s390x-specific values like "loadparm" in this list also show up with the other targets like x86_64. Let's fix this now by geting rid of the hand-crafted list and by querying the properties of the machine classes instead to assemble the list. Fixes: 0a7cf217d8 ("fix regression of qmp_query_command_line_options") Message-Id: <20221111141323.246267-1-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-12-15Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into stagingPeter Maydell
Miscellaneous patches for 2022-12-14 # gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru: ppc4xx_sdram: Simplify sdram_ddr_size() to return block/vmdk: Simplify vmdk_co_create() to return directly cleanup: Tweak and re-run return_directly.cocci io: Tidy up fat-fingered parameter name qapi: Use returned bool to check for failure (again) sockets: Use ERRP_GUARD() where obviously appropriate qemu-config: Use ERRP_GUARD() where obviously appropriate qemu-config: Make config_parse_qdict() return bool monitor: Use ERRP_GUARD() in monitor_init() monitor: Simplify monitor_fd_param()'s error handling error: Move ERRP_GUARD() to the beginning of the function error: Drop a few superfluous ERRP_GUARD() error: Drop some obviously superfluous error_propagate() Drop more useless casts from void * to pointer Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-12-14qapi misc: Elide redundant has_FOO in generated CMarkus Armbruster
The has_FOO for pointer-valued FOO are redundant, except for arrays. They are also a nuisance to work with. Recent commit "qapi: Start to elide redundant has_FOO in generated C" provided the means to elide them step by step. This is the step for qapi/misc.json. Said commit explains the transformation in more detail. The invariant violations mentioned there do not occur here. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20221104160712.3005652-18-armbru@redhat.com>
2022-12-14qapi: Use returned bool to check for failure (again)Markus Armbruster
Commit 012d4c96e2 changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit 012d4c96e2 applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2022-12-14sockets: Use ERRP_GUARD() where obviously appropriateMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-9-armbru@redhat.com>
2022-12-14qemu-config: Use ERRP_GUARD() where obviously appropriateMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-8-armbru@redhat.com>
2022-12-14qemu-config: Make config_parse_qdict() return boolMarkus Armbruster
This simplifies error checking. Cc: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-7-armbru@redhat.com>
2022-12-14Drop more useless casts from void * to pointerMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20221123133811.1398562-1-armbru@redhat.com>
2022-11-21migration: Use non-atomic ops for clear log bitmapPeter Xu
Since we already have bitmap_mutex to protect either the dirty bitmap or the clear log bitmap, we don't need atomic operations to set/clear/test on the clear log bitmap. Switching all ops from atomic to non-atomic versions, meanwhile touch up the comments to show which lock is in charge. Introduced non-atomic version of bitmap_test_and_clear_atomic(), mostly the same as the atomic version but simplified a few places, e.g. dropped the "old_bits" variable, and also the explicit memory barriers. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-11-11qga: Allow building of the guest agent without system emulators or toolsThomas Huth
If configuring with "--disable-system --disable-user --enable-guest-agent" the linking currently fails with: qga/qemu-ga.p/commands.c.o: In function `qmp_command_info': build/../../home/thuth/devel/qemu/qga/commands.c:70: undefined reference to `qmp_command_name' build/../../home/thuth/devel/qemu/qga/commands.c:71: undefined reference to `qmp_command_is_enabled' build/../../home/thuth/devel/qemu/qga/commands.c:72: undefined reference to `qmp_has_success_response' qga/qemu-ga.p/commands.c.o: In function `qmp_guest_info': build/../../home/thuth/devel/qemu/qga/commands.c:82: undefined reference to `qmp_for_each_command' qga/qemu-ga.p/commands.c.o: In function `qmp_guest_exec': build/../../home/thuth/devel/qemu/qga/commands.c:410: undefined reference to `qbase64_decode' qga/qemu-ga.p/channel-posix.c.o: In function `ga_channel_open': build/../../home/thuth/devel/qemu/qga/channel-posix.c:214: undefined reference to `unix_listen' build/../../home/thuth/devel/qemu/qga/channel-posix.c:228: undefined reference to `socket_parse' build/../../home/thuth/devel/qemu/qga/channel-posix.c:234: undefined reference to `socket_listen' qga/qemu-ga.p/commands-posix.c.o: In function `qmp_guest_file_write': build/../../home/thuth/devel/qemu/qga/commands-posix.c:527: undefined reference to `qbase64_decode' Let's make sure that we also compile and link the required files if the system emulators have not been enabled. Message-Id: <20221110083626.31899-1-thuth@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-11-08Revert "s390x/s390-virtio-ccw: add zpcii-disable machine property"Cédric Le Goater
This reverts commit 59d1ce44396e3ad2330dc3261ff3da7ad3a16184. The "zpcii-disable" machine property is redundant with the "interpret" zPCI device property. Remove it for clarification. Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20221107161349.1032730-2-clg@kaod.org> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-11-07util/log: Ignore per-thread flag if global file already thereGreg Kurz
If QEMU is started with `-D qemu.log.%d` without any `-d` option, doing `log all` in the monitor fails with: Filename template with '%d' required for 'tid' It is confusing since '%d' was actually passed. This happens because QEMU caches the log file name with %d converted to getpid() since `tid` wasn't required. This name isn't suitable for a subsequent enablement of per-thread logs. There's little cause to change the behavior as `-d tid` is mostly used at user-only startup. Drop the per-thread from the requested flags in this case : `log all` will thus enable everything except `tid` instead of failing. This is preferable over forcing the user to enable each log item individually. With this patch, `tid` is now truely immutable : it can only be set or unset from the command line and never changed afterwards. Fixes: 4e51069d6793 ("util/log: Support per-thread log files") Cc: richard.henderson@linaro.org Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221104120059.678470-3-groug@kaod.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-11-07util/log: Make the per-thread flag immutableGreg Kurz
Per-thread logging was implemented under the assumption that once enabled, it is not possible to switch back to single file logging. This isn't enforced though and it is possible to go through the global file opening sequence in per-thread mode. The code isn't ready for this and produces unexpected results as detailed below. Start QEMU in system emulation mode with `-D ./qemu.log.%d -d tid` and then change the log level from the monitor to something that doesn't have tid, e.g. `log cpu_reset`. The value of log_flags is zero and per_thread is set to false : the rest of the code then assumes it is running in the global log case and opens a file named `qemu.log.%d`, which is obviously not an expected behavior. Enforce the immutability of the flag early in qemu_set_log_internal() so that its value is correct for all subsequent users. Fixes: 4e51069d6793 ("util/log: Support per-thread log files") Cc: richard.henderson@linaro.org Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20221104120059.678470-2-groug@kaod.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-11-06module: add Error arguments to module_load and module_load_qomClaudio Fontana
improve error handling during module load, by changing: bool module_load(const char *prefix, const char *lib_name); void module_load_qom(const char *type); to: int module_load(const char *prefix, const char *name, Error **errp); int module_load_qom(const char *type, Error **errp); where the return value is: -1 on module load error, and errp is set with the error 0 on module or one of its dependencies are not installed 1 on module load success 2 on module load success (module already loaded or built-in) module_load_qom_one has been introduced in: commit 28457744c345 ("module: qom module support"), which built on top of module_load_one, but discarded the bool return value. Restore it. Adapt all callers to emit errors, or ignore them, or fail hard, as appropriate in each context. Replace the previous emission of errors via fprintf in _some_ error conditions with Error and error_report, so as to emit to the appropriate target. A memory leak is also fixed as part of the module_load changes. audio: when attempting to load an audio module, report module load errors. Note that still for some callers, a single issue may generate multiple error reports, and this could be improved further. Regarding the audio code itself, audio_add() seems to ignore errors, and this should probably be improved. block: when attempting to load a block module, report module load errors. For the code paths that already use the Error API, take advantage of those to report module load errors into the Error parameter. For the other code paths, we currently emit the error, but this could be improved further by adding Error parameters to all possible code paths. console: when attempting to load a display module, report module load errors. qdev: when creating a new qdev Device object (DeviceState), report load errors. If a module cannot be loaded to create that device, now abort execution (if no CONFIG_MODULE) or exit (if CONFIG_MODULE). qom/object.c: when initializing a QOM object, or looking up class_by_name, report module load errors. qtest: when processing the "module_load" qtest command, report errors in the load of the module. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220929093035.4231-4-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06module: rename module_load_one to module_loadClaudio Fontana
Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220929093035.4231-3-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06module: removed unused function argument "mayfail"Claudio Fontana
mayfail is always passed as false for every invocation throughout the program. It controls whether to printf or not to printf an error on g_module_open failure. Remove this unused argument. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220929093035.4231-2-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06util/aio-win32: Correct the event array size in aio_poll()Bin Meng
WaitForMultipleObjects() can only wait for MAXIMUM_WAIT_OBJECTS object handles. Correct the event array size in aio_poll() and add a assert() to ensure it does not cause out of bound access. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221019102015.2441622-3-bmeng.cn@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06util/main-loop: Avoid adding the same HANDLE twiceBin Meng
Fix the logic in qemu_add_wait_object() to avoid adding the same HANDLE twice, as the behavior is undefined when passing an array that contains same HANDLEs to WaitForMultipleObjects() API. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20221019102015.2441622-2-bmeng.cn@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06util/main-loop: Fix maximum number of wait objects for win32Bin Meng
The maximum number of wait objects for win32 should be MAXIMUM_WAIT_OBJECTS, not MAXIMUM_WAIT_OBJECTS + 1. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20221019102015.2441622-1-bmeng.cn@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-03Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingStefan Hajnoczi
* bug fixes * reduced memory footprint for IPI virtualization on Intel processors * asynchronous teardown support (Linux only) # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm # 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw # 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf # 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW # CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya # 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q== # =CgrO # -----END PGP SIGNATURE----- # gpg: Signature made Wed 02 Nov 2022 07:40:25 EDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: target/i386: Fix test for paging enabled util/log: Close per-thread log file on thread termination target/i386: Set maximum APIC ID to KVM prior to vCPU creation os-posix: asynchronous teardown for shutdown on Linux target/i386: Fix calculation of LOCK NEG eflags Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-31Merge tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu into ↵Stefan Hajnoczi
staging * Fix and test the VISTR instruction on s390x * Some more small s390x fixes and maintainer updates * Make sure to remove all temporary files from qtests * OpenBSD VM test update to version 7.2 * Add sndio to FreeBSD tests * More patches to enable the qtests on Windows # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmNb1x8RHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbXmcA//TCliiFkhprVxzIqy7zb9uz2Odu+sS4dT # azUSlXvC14fECm/Rb/rd2VLqCu5x2er8CYauxKQ4VhRImzcDta4kvpt/HKIppN2t # sqw5tipJL0DYcWBwYL1llvfutM26M+Oh0igwR8uV7b+W1FjojEZdcOr9IZ6E6V55 # wQCE5OHm0VCr61QeI5IBfZTsiPo+DFomUCpj7w66j6i0CVDvmpoe36tCmvGgrcpZ # SP7ep7/Iq+dnGh2YnJyoUOPlXeeiBCxAygOVnIRXptDeniGoliCFn7ksLdKDQ9qY # 69pSPR/W7mTZB/HkCRalAbYuYrI9Rcqxdu6c9vcyB8Pr0snQLTf8qThY+BJ2oC4w # JSGgWVniAk5MmrDazwNRkSbgngYLYf+CcT1h5AANuU5Kt50Bdy9Y3TuL5YVmofEp # N4bypV0ICImQyDECz76+i5/iJOcWiRyjMfLT6y00dspeuy983xHakrsHGD8xj0U/ # 3IVxnF9bDnUSVg6lFhYrgCB3dRG1TNPJoYQOM7raS5MAPRrDtIuSabwtyn84jo4+ # 9kZRPJBriMBHNsCjGVlJ9CATmaK1SKVAbRcabjgOKoIwhZTpAe6JalykREUJlTys # hB2V//lWWYPaSpzwY+OkvxoOmJIziixEskOmx6hPcoxID5v/bqlR69W15aUlKuLq # VWFb+/yMvaE= # =h0Ep # -----END PGP SIGNATURE----- # gpg: Signature made Fri 28 Oct 2022 09:20:31 EDT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu: (21 commits) tests/qtest: libqtest: Correct the timeout unit of blocking receive calls for win32 tests/qtest: libqos: Do not build virtio-9p unconditionally tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled tests/qtest: libqtest: Introduce qtest_wait_qemu() tests/qtest: Use EXIT_FAILURE instead of magic number tests/qtest: device-plug-test: Reverse the usage of double/single quotes tests/qtest: Support libqtest to build and run on Windows tests/qtest: Use send/recv for socket communication accel/qtest: Support qtest accelerator for Windows tests: Add sndio to the FreeBSD CI containers / VM tests/vm: update openbsd to release 7.2 tests/qtest/libqos/e1000e: Use e1000_regs.h tests/qtest/cxl-test: Remove temporary directories after testing tests/qtest/tpm: Clean up remainders of swtpm MAINTAINERS: target/s390x/: add Ilya as reviewer tests/tcg/s390x: Add a test for the vistr instruction target/s390x: Fix emulation of the VISTR instruction tests/tcg/s390x: Test compiler flags only once, not every time s390x/tod-kvm: don't save/restore the TOD in PV guests s390x: step down as general arch maintainer ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-31util/log: Close per-thread log file on thread terminationGreg Kurz
When `-D ${logfile} -d tid` is passed, qemu_log_trylock() creates a dedicated log file for the current thread and opens it. The corresponding file descriptor is cached in a __thread variable. Nothing is done to close the corresponding file descriptor when the thread terminates though and the file descriptor is leaked. The issue was found during code inspection and reproduced manually. Fix that with an atexit notifier. Fixes: 4e51069d6793 ("util/log: Support per-thread log files") Cc: richard.henderson@linaro.org Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20221021105734.555797-1-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-31os-posix: asynchronous teardown for shutdown on LinuxClaudio Imbrenda
This patch adds support for asynchronously tearing down a VM on Linux. When qemu terminates, either naturally or because of a fatal signal, the VM is torn down. If the VM is huge, it can take a considerable amount of time for it to be cleaned up. In case of a protected VM, it might take even longer than a non-protected VM (this is the case on s390x, for example). Some users might want to shut down a VM and restart it immediately, without having to wait. This is especially true if management infrastructure like libvirt is used. This patch implements a simple trick on Linux to allow qemu to return immediately, with the teardown of the VM being performed asynchronously. If the new commandline option -async-teardown is used, a new process is spawned from qemu at startup, using the clone syscall, in such way that it will share its address space with qemu.The new process will have the name "cleanup/<QEMU_PID>". It will wait until qemu terminates completely, and then it will exit itself. This allows qemu to terminate quickly, without having to wait for the whole address space to be torn down. The cleanup process will exit after qemu, so it will be the last user of the address space, and therefore it will take care of the actual teardown. The cleanup process will share the same cgroups as qemu, so both memory usage and cpu time will be accounted properly. If possible, close_range will be used in the cleanup process to close all open file descriptors. If it is not available or if it fails, /proc will be used to determine which file descriptors to close. If the cleanup process is forcefully killed with SIGKILL before the main qemu process has terminated completely, the mechanism is defeated and the teardown will not be asynchronous. This feature can already be used with libvirt by adding the following to the XML domain definition to pass the parameter to qemu directly: <commandline xmlns="http://libvirt.org/schemas/domain/qemu/1.0"> <arg value='-async-teardown'/> </commandline> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Tested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Message-Id: <20220812133453.82671-1-imbrenda@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-30Merge tag 'mem-2022-10-28' of https://github.com/davidhildenbrand/qemu into ↵Stefan Hajnoczi
staging Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Fix NVDIMM error message - Add ThreadContext user-creatable object and wire it up for NUMA-aware hostmem preallocation # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmNbpHARHGRhdmlkQHJl # ZGhhdC5jb20ACgkQTd4Q9wD/g1pDpw//bG9cyIlzTzDnU5pbQiXyLm0nF9tW/tli # npGPSbFFYz/72XD9VJSVLhbNHoQSmFcMK5m/DA4WAMdOc5zF7lP3XdZcj72pDyxu # 31hJRvuRhxNb09jhEdWRfX5+Jg9UyYXuIvtKXHSWgrtaYDtHBdTXq/ojZlvlo/rr # 36v0jaVaTNRs7dKQL2oaN+DSMiPXHxBzA6FABqYmJNNwuMJT0kkX8pfz0OFwkRn+ # iqf9uRhM6b/fNNB0+ReA7FfGL+hzU6Uv8AvAL3orXUqjwPMRe9Fz2gE7HpFnE6DD # dOP4Xk2iSSJ5XQA8HwtvrQfrGPh4gPYE80ziK/+8boy3alVeGYbYbvWVtdsNju41 # Cq9kM1wDyjZf6SSUIAbjOrNPdbhwyK4GviVBR1zh+/gA3uF5MhrDtZh4h3mWX2if # ijmT9mfte4NwF3K1MvckAl7IHRb8nxmr7wjjhJ26JwpD+76lfAcmXC2YOlFGHCMi # 028mjvThf3HW7BD2LjlQSX4UkHmM2vUBrgMGQKyeMham1VmMfSK32wzvUNfF7xSz # o9k0loBh7unGcUsv3EbqUGswV5F6AgjK3vWRkDql8dNrdIoapDfaejPCd58kVM98 # 5N/aEoha4bAeJ6NGIKzD+4saiMxUqJ0y2NjSrE8iO4HszXgZW5e1Gbkn4Ae6d37D # QSSqyfasVHY= # =bLuc # -----END PGP SIGNATURE----- # gpg: Signature made Fri 28 Oct 2022 05:44:16 EDT # gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A # gpg: issuer "david@redhat.com" # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown] # gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full] # gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A * tag 'mem-2022-10-28' of https://github.com/davidhildenbrand/qemu: vl: Allow ThreadContext objects to be created before the sandbox option hostmem: Allow for specifying a ThreadContext for preallocation util: Make qemu_prealloc_mem() optionally consume a ThreadContext util: Add write-only "node-affinity" property for ThreadContext util: Introduce ThreadContext user-creatable object util: Introduce qemu_thread_set_affinity() and qemu_thread_get_affinity() util: Cleanup and rename os_mem_prealloc() hw/mem/nvdimm: fix error message for 'unarmed' flag Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-30Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into stagingStefan Hajnoczi
# -----BEGIN PGP SIGNATURE----- # Version: GnuPG v1 # # iQEcBAABAgAGBQJjW2i1AAoJEO8Ells5jWIR5HMIAIvDEmWQ2eZ1R+CfsefXkD5H # W3RSZbMrOHR6sb9cbYpqK/vWmH8E/jZkKY4n/q7vQ3QerFMeDPgxu0Qn43iElLXS # iGHhC51fa5IwJNDomjUGI8oJzyk0sxAbgwOjXZ4qbAkk9KeQYWU+JqYghvOrBYbd # VaIwEiBlECuBy0DKx2gBTfxgeuw3V3WvtjpKeZVRlR+4vLQtI9Goga78qIPfjpH2 # sN/lFhZGaX1FT8DSft0oCCBxCK8ZaNzmMpmr39a8+e4g/EJZC9xbgNkl3fN0yYa5 # P0nluv/Z6e1TZ4FBDaiVFysTS5WnS0KRjUrodzctiGECE3sQiAvkWbKZ+QdGrlw= # =0/b/ # -----END PGP SIGNATURE----- # gpg: Signature made Fri 28 Oct 2022 01:29:25 EDT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [full] # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * tag 'net-pull-request' of https://github.com/jasowang/qemu: (26 commits) net: stream: add QAPI events to report connection state net: stream: move to QIO to enable additional parameters qemu-sockets: update socket_uri() and socket_parse() to be consistent qemu-sockets: move and rename SocketAddress_to_str() net: dgram: add unix socket net: dgram: move mcast specific code from net_socket_fd_init_dgram() net: dgram: make dgram_dst generic net: stream: add unix socket net: stream: Don't ignore EINVAL on netdev socket connection net: socket: Don't ignore EINVAL on netdev socket connection qapi: net: add stream and dgram netdevs net: introduce qemu_set_info_str() function qapi: net: introduce a way to bypass qemu_opts_parse_noisily() net: simplify net_client_parse() error management net: remove the @errp argument of net_client_inits() net: introduce convert_host_port() vhost: Accept event idx flag vhost: use avail event idx on vhost_svq_kick vhost: toggle device callbacks using used event idx vhost: allocate event_idx fields on vring ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-30Merge tag 'misc-next-pull-request' of https://gitlab.com/berrange/qemu into ↵Stefan Hajnoczi
staging pull: crypto and io queue * Many LUKS header robustness checks * Fix TLS PSK error reporting * Enable LUKS creation on macOS * Report useful errnos from seccomp * I/O chanel Windows portability fix # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmNawAcACgkQvobrtBUQ # T9/pWA/9FXE6kvkv9YQhb/h1rMALO1aLKqUG/jWKP/mzqqLpDKHxxPin/nw8RYff # xyHt5mC7t1g7a8FFMlXxFHw1WE9o46j3tQg2IokWlX2ossYaZQx+BVv4s1zjTxcK # KPVKWoEqN5sfa2T7gUGbfZ+dH9LSZ29DRT+GrO9YEvjdSg0yUKHXPetjw6iw5OVT # GuI22xOVKbuCBf7PW/nvUe/6prxAfc7IavvAusrdkMFXymcys87q7ZCxGYEsDxyC # vUkLdAoB9kcjwvmU+sZl9WhjasRQkUxW8zCToKea4TSS1fp5pgVL0TT4x7yq7ts4 # nqnaqiSTBfRda62lF64A9lM91K7hbDqPC33FkCNKWJGsQAYIFvdVJdqJsvZHUr1/ # 3KyHkXMsyzRfGnT7MHK+GpwcgvTupBP8ceiyYq28CLNAKXpXb6vmJIsIAdF3UaYi # N320ogiU3iRmkqdbbbGTpBB40UQvQvdbmqKTTDmigLdpDL2TLzAqfpu1zepg+7xE # wcXoPM9ZcRSwM7i9QyPMtjharCTeVR/QPlUN9agDGOlzNpUahIC5YrmCVKXNunnE # M259Ytyb6ymaMrsHgshW1gJP3327N/lIOp5yLLHEzgLM1xAGOaDP83FsF8JA/Zsd # f1he75N3KbDPYhgrdfFfitcO8F8zvhK3AqyqNDPCpJKVSeKKqFE= # =qrzm # -----END PGP SIGNATURE----- # gpg: Signature made Thu 27 Oct 2022 13:29:43 EDT # gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full] # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full] # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * tag 'misc-next-pull-request' of https://gitlab.com/berrange/qemu: crypto: add test cases for many malformed LUKS header scenarios crypto: ensure LUKS tests run with GNUTLS crypto provider crypto: quote algorithm names in error messages crypto: split off helpers for converting LUKS header endianess crypto: split LUKS header definitions off into file crypto: check that LUKS PBKDF2 iterations count is non-zero crypto: strengthen the check for key slots overlapping with LUKS header crypto: validate that LUKS payload doesn't overlap with header crypto: enforce that key material doesn't overlap with LUKS header crypto: enforce that LUKS stripes is always a fixed value crypto: sanity check that LUKS header strings are NUL-terminated tests: avoid DOS line endings in PSK file crypto: check for and report errors setting PSK credentials scripts: check if .git exists before checking submodule status seccomp: Get actual errno value from failed seccomp functions io/channel-watch: Fix socket watch on Windows io/channel-watch: Drop the unnecessary cast io/channel-watch: Drop a superfluous '#ifdef WIN32' util/qemu-sockets: Use g_get_tmp_dir() to get the directory for temporary files crypto/luks: Support creating LUKS image on Darwin Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-28tests/qtest: Use send/recv for socket communicationXuzhou Cheng
Socket communication in the libqtest and libqmp codes uses read() and write() which work on any file descriptor on *nix, and sockets in *nix are an example of a file descriptor. However sockets on Windows do not use *nix-style file descriptors, so read() and write() cannot be used on sockets on Windows. Switch over to use send() and recv() instead which work on both Windows and *nix. Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20221028045736.679903-3-bin.meng@windriver.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-10-28qemu-sockets: update socket_uri() and socket_parse() to be consistentLaurent Vivier
To be consistent with socket_uri(), add 'tcp:' prefix for inet type in socket_parse(), by default socket_parse() use tcp when no prefix is provided (format is host:port). In socket_uri(), use 'vsock:' prefix for vsock type rather than 'tcp:' because it makes a vsock address look like an inet address with CID misinterpreted as host. Goes back to commit 9aca82ba31 "migration: Create socket-address parameter" Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28qemu-sockets: move and rename SocketAddress_to_str()Laurent Vivier
Rename SocketAddress_to_str() to socket_uri() and move it to util/qemu-sockets.c close to socket_parse(). socket_uri() generates a string from a SocketAddress while socket_parse() generates a SocketAddress from a string. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-27util: Make qemu_prealloc_mem() optionally consume a ThreadContextDavid Hildenbrand
... and implement it under POSIX. When a ThreadContext is provided, create new threads via the context such that these new threads obtain a properly configured CPU affinity. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-6-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-27util: Add write-only "node-affinity" property for ThreadContextDavid Hildenbrand
Let's make it easier to pin threads created via a ThreadContext to all host CPUs currently belonging to a given set of host NUMA nodes -- which is the common case. "node-affinity" is simply a shortcut for setting "cpu-affinity" manually to the list of host CPUs belonging to the set of host nodes. This property can only be written. A simple QEMU example to set the CPU affinity to host node 1 on a system with two nodes, 24 CPUs each, whereby odd-numbered host CPUs belong to host node 1: qemu-system-x86_64 -S \ -object thread-context,id=tc1,node-affinity=1 And we can query the cpu-affinity via HMP/QMP: (qemu) qom-get tc1 cpu-affinity [ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47 ] We cannot query the node-affinity: (qemu) qom-get tc1 node-affinity Error: Insufficient permission to perform this operation But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work. Note that if the host CPUs for a host node change due do CPU hot(un)plug CPU onlining/offlining (i.e., lscpu output changes) after the ThreadContext was started, the CPU affinity will not get updated. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221014134720.168738-5-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-27util: Introduce ThreadContext user-creatable objectDavid Hildenbrand
Setting the CPU affinity of QEMU threads is a bit problematic, because QEMU doesn't always have permissions to set the CPU affinity itself, for example, with seccomp after initialized by QEMU: -sandbox enable=on,resourcecontrol=deny General information about CPU affinities can be found in the man page of taskset: CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. While upper layers are already aware of how to handle CPU affinities for long-lived threads like iothreads or vcpu threads, especially short-lived threads, as used for memory-backend preallocation, are more involved to handle. These threads are created on demand and upper layers are not even able to identify and configure them. Introduce the concept of a ThreadContext, that is essentially a thread used for creating new threads. All threads created via that context thread inherit the configured CPU affinity. Consequently, it's sufficient to create a ThreadContext and configure it once, and have all threads created via that ThreadContext inherit the same CPU affinity. The CPU affinity of a ThreadContext can be configured two ways: (1) Obtaining the thread id via the "thread-id" property and setting the CPU affinity manually (e.g., via taskset). (2) Setting the "cpu-affinity" property and letting QEMU try set the CPU affinity itself. This will fail if QEMU doesn't have permissions to do so anymore after seccomp was initialized. A simple QEMU example to set the CPU affinity to host CPU 0,1,6,7 would be: qemu-system-x86_64 -S \ -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7 And we can query it via HMP/QMP: (qemu) qom-get tc1 cpu-affinity [ 0, 1, 6, 7 ] But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work. In general, the interface behaves like pthread_setaffinity_np(): host CPU numbers that are currently not available are ignored; only host CPU numbers that are impossible with the current kernel will fail. If the list of host CPU numbers does not include a single CPU that is available, setting the CPU affinity will fail. A ThreadContext can be reused, simply by reconfiguring the CPU affinity. Note that the CPU affinity of previously created threads will not get adjusted. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221014134720.168738-4-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-27util: Introduce qemu_thread_set_affinity() and qemu_thread_get_affinity()David Hildenbrand
Usually, we let upper layers handle CPU pinning, because pthread_setaffinity_np() (-> sched_setaffinity()) is blocked via seccomp when starting QEMU with -sandbox enable=on,resourcecontrol=deny However, we want to configure and observe the CPU affinity of threads from QEMU directly in some cases when the sandbox option is either not enabled or not active yet. So let's add a way to configure CPU pinning via qemu_thread_set_affinity() and obtain CPU affinity via qemu_thread_get_affinity() and implement them under POSIX using pthread_setaffinity_np() + pthread_getaffinity_np(). Implementation under Windows is possible using SetProcessAffinityMask() + GetProcessAffinityMask(), however, that is left as future work. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-3-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-27util: Cleanup and rename os_mem_prealloc()David Hildenbrand
Let's * give the function a "qemu_*" style name * make sure the parameters in the implementation match the prototype * rename smp_cpus to max_threads, which makes the semantics of that parameter clearer ... and add a function documentation. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-2-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-26numa: call ->ram_block_removed() in ram_block_notifer_remove()Stefan Hajnoczi
When a RAMBlockNotifier is added, ->ram_block_added() is called with all existing RAMBlocks. There is no equivalent ->ram_block_removed() call when a RAMBlockNotifier is removed. The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine with this asymmetry because it does not rely on RAMBlockNotifier for cleanup. It walks its internal list of DMA mappings and unmaps them by itself. Future users of RAMBlockNotifier may not have an internal data structure that records added RAMBlocks so they will need ->ram_block_removed() callbacks. This patch makes ram_block_notifier_remove() symmetric with respect to callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings after ram_block_notifier_remove() has been called. This is necessary since users like block/nvme.c may create additional DMA mappings that do not originate from the RAMBlockNotifier. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20221013185908.1297568-4-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>