aboutsummaryrefslogtreecommitdiff
path: root/iothread.c
AgeCommit message (Collapse)Author
2022-05-09util/event-loop-base: Introduce options to set the thread pool sizeNicolas Saenz Julienne
The thread pool regulates itself: when idle, it kills threads until empty, when in demand, it creates new threads until full. This behaviour doesn't play well with latency sensitive workloads where the price of creating a new thread is too high. For example, when paired with qemu's '-mlock', or using safety features like SafeStack, creating a new thread has been measured take multiple milliseconds. In order to mitigate this let's introduce a new 'EventLoopBase' property to set the thread pool size. The threads will be created during the pool's initialization or upon updating the property's value, remain available during its lifetime regardless of demand, and destroyed upon freeing it. A properly characterized workload will then be able to configure the pool to avoid any latency spikes. Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20220425075723.20019-4-nsaenzju@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-05-09Introduce event-loop-base abstract classNicolas Saenz Julienne
Introduce the 'event-loop-base' abstract class, it'll hold the properties common to all event loops and provide the necessary hooks for their creation and maintenance. Then have iothread inherit from it. EventLoopBaseClass is defined as user creatable and provides a hook for its children to attach themselves to the user creatable class 'complete' function. It also provides an update_params() callback to propagate property changes onto its children. The new 'event-loop-base' class will live in the root directory. It is built on its own using the 'link_whole' option (there are no direct function dependencies between the class and its children, it all happens trough 'constructor' magic). And also imposes new compilation dependencies: qom <- event-loop-base <- blockdev (iothread.c) And in subsequent patches: qom <- event-loop-base <- qemuutil (util/main-loop.c) All this forced some amount of reordering in meson.build: - Moved qom build definition before qemuutil. Doing it the other way around (i.e. moving qemuutil after qom) isn't possible as a lot of core libraries that live in between the two depend on it. - Process the 'hw' subdir earlier, as it introduces files into the 'qom' source set. No functional changes intended. Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20220425075723.20019-2-nsaenzju@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-10-07iothread: use IOThreadParamInfo in iothread_[set|get]_param()Stefano Garzarella
Commit 0445409d74 ("iothread: generalize iothread_set_param/iothread_get_param") moved common code to set and get IOThread parameters in two new functions. These functions are called inside callbacks, so we don't need to use an opaque pointer. Let's replace `void *opaque` parameter with `IOThreadParamInfo *info`. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210727145936.147032-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-10-07iothread: rename PollParamInfo to IOThreadParamInfoStefano Garzarella
Commit 1793ad0247 ("iothread: add aio-max-batch parameter") added a new parameter (aio-max-batch) to IOThread and used PollParamInfo structure to handle it. Since it is not a parameter of the polling mechanism, we rename the structure to a more generic IOThreadParamInfo. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210727145936.147032-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-07-21iothread: add aio-max-batch parameterStefano Garzarella
The `aio-max-batch` parameter will be propagated to AIO engines and it will be used to control the maximum number of queued requests. When there are in queue a number of requests equal to `aio-max-batch`, the engine invokes the system call to forward the requests to the kernel. This parameter allows us to control the maximum batch size to reduce the latency that requests might accumulate while queued in the AIO engine queue. If `aio-max-batch` is equal to 0 (default value), the AIO engine will use its default maximum batch size value. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20210721094211.69853-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-07-21iothread: generalize iothread_set_param/iothread_get_paramStefano Garzarella
Changes in preparation for next patches where we add a new parameter not related to the poll mechanism. Let's add two new generic functions (iothread_set_param and iothread_get_param) that we use to set and get IOThread parameters. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20210721094211.69853-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-06-18async: the main AioContext is only "current" if under the BQLPaolo Bonzini
If we want to wake up a coroutine from a worker thread, aio_co_wake() currently does not work. In that scenario, aio_co_wake() calls aio_co_enter(), but there is no current AioContext and therefore qemu_get_current_aio_context() returns the main thread. aio_co_wake() then attempts to call aio_context_acquire() instead of going through aio_co_schedule(). The default case of qemu_get_current_aio_context() was added to cover synchronous I/O started from the vCPU thread, but the main and vCPU threads are quite different. The main thread is an I/O thread itself, only running a more complicated event loop; the vCPU thread instead is essentially a worker thread that occasionally calls qemu_mutex_lock_iothread(). It is only in those critical sections that it acts as if it were the home thread of the main AioContext. Therefore, this patch detaches qemu_get_current_aio_context() from iothreads, which is a useless complication. The AioContext pointer is stored directly in the thread-local variable, including for the main loop. Worker threads (including vCPU threads) optionally behave as temporary home threads if they have taken the big QEMU lock, but if that is not the case they will always schedule coroutines on remote threads via aio_co_schedule(). With this change, the stub qemu_mutex_iothread_locked() must be changed from true to false. The previous value of true was needed because the main thread did not have an AioContext in the thread-local variable, but now it does have one. Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210609122234.544153-1-pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: tweak commit message per Vladimir's review] Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-10multi-process: define MPQemuMsg format and transmission functionsElena Ufimtseva
Defines MPQemuMsg, which is the message that is sent to the remote process. This message is sent over QIOChannel and is used to command the remote process to perform various tasks. Define transmission functions used by proxy and by remote. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 56ca8bcf95195b2b195b08f6b9565b6d7410bce5.1611938319.git.jag.raman@oracle.com [Replace struct iovec send[2] = {0} with {} to make clang happy as suggested by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-01-28qapi: Use QAPI_LIST_APPEND in trivial casesEric Blake
The easiest spots to use QAPI_LIST_APPEND are where we already have an obvious pointer to the tail of a list. While at it, consistently use the variable name 'tail' for that purpose. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210113221013.390592-5-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-09-23qemu/atomic.h: rename atomic_ to qatomic_Stefan Hajnoczi
clang's C11 atomic_fetch_*() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
2020-09-09Use DECLARE_*CHECKER* macrosEduardo Habkost
Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-07-21qom: Change object_get_canonical_path_component() not to mallocMarkus Armbruster
object_get_canonical_path_component() returns a malloced copy of a property name on success, null on failure. 19 of its 25 callers immediately free the returned copy. Change object_get_canonical_path_component() to return the property name directly. Since modifying the name would be wrong, adjust the return type to const char *. Drop the free from the 19 callers become simpler, add the g_strdup() to the other six. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200714160202.3121879-4-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com>
2020-07-10error: Eliminate error_propagate() with Coccinelle, part 1Markus Armbruster
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) | - !fun(args, &err, args2) + !fun(args, errp, args2) | - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var | !var | var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; | return var; ) } @depends on rule1 || rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10error: Avoid unnecessary error_propagate() after error_setg()Markus Armbruster
Replace error_setg(&err, ...); error_propagate(errp, err); by error_setg(errp, ...); Related pattern: if (...) { error_setg(&err, ...); goto out; } ... out: error_propagate(errp, err); return; When all paths to label out are that way, replace by if (...) { error_setg(errp, ...); return; } and delete the label along with the error_propagate(). When we have at most one other path that actually needs to propagate, and maybe one at the end that where propagation is unnecessary, e.g. foo(..., &err); if (err) { goto out; } ... bar(..., &err); out: error_propagate(errp, err); return; move the error_propagate() to where it's needed, like if (...) { foo(..., &err); error_propagate(errp, err); return; } ... bar(..., errp); return; and transform the error_setg() as above. In some places, the transformation results in obviously unnecessary error_propagate(). The next few commits will eliminate them. Bonus: the elimination of gotos will make later patches in this series easier to review. Candidates for conversion tracked down with this Coccinelle script: @@ identifier err, errp; expression list args; @@ - error_setg(&err, args); + error_setg(errp, args); ... when != err error_propagate(errp, err); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-34-armbru@redhat.com>
2020-07-10qapi: Use returned bool to check for failure, Coccinelle partMarkus Armbruster
The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*"; expression list args; typedef Error; Error *err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>
2020-05-15qom: Drop parameter @errp of object_property_add() & friendsMarkus Armbruster
The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]
2019-03-08iothread: document about why we need explicit aio_poll()Peter Xu
After consulting Paolo I know why we'd better keep the explicit aio_poll() in iothread_run(). Document it directly into the code so that future readers will know the answer from day one. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190306115532.23025-6-peterx@redhat.com Message-Id: <20190306115532.23025-6-peterx@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-08iothread: push gcontext earlier in the thread_fnPeter Xu
We were pushing the context until right before running the gmainloop. Now since we have everything unconditionally, we can move this earlier. One benefit is that now it's done even before init_done_sem, so as long as the iothread user calls iothread_create() and completes, we know that the thread stack is ready. Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20190306115532.23025-5-peterx@redhat.com Message-Id: <20190306115532.23025-5-peterx@redhat.com> [Tweaked comment wording as discussed with Peter Xu. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-08iothread: create main loop unconditionallyPeter Xu
Since we've have the gcontext always there, create the main loop altogether. The iothread_run() is even cleaner. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20190306115532.23025-4-peterx@redhat.com Message-Id: <20190306115532.23025-4-peterx@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-08iothread: create the gcontext unconditionallyPeter Xu
In existing code we create the gcontext dynamically at the first access of the gcontext from caller. That can bring some complexity and potential races during using iothread. Since the context itself is not that big a resource, and we won't have millions of iothread, let's simply create the gcontext unconditionally. This will also be a preparation work further to move the thread context push operation earlier than before (now it's only pushed right before we want to start running the gmainloop). Removing the g_once since it's not necessary, while introducing a new run_gcontext boolean to show whether we want to run the gcontext. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20190306115532.23025-3-peterx@redhat.com Message-Id: <20190306115532.23025-3-peterx@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-08iothread: replace init_done_cond with a semaphorePeter Xu
Only sending an init-done message using lock+cond seems an overkill to me. Replacing it with a simpler semaphore. Meanwhile, init the semaphore unconditionally, then we can destroy it unconditionally too in finalize which seems cleaner. Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20190306115532.23025-2-peterx@redhat.com Message-Id: <20190306115532.23025-2-peterx@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-12iothread: fix iothread hang when stop too soonPeter Xu
Lukas reported an hard to reproduce QMP iothread hang on s390 that QEMU might hang at pthread_join() of the QMP monitor iothread before quitting: Thread 1 #0 0x000003ffad10932c in pthread_join #1 0x0000000109e95750 in qemu_thread_join at /home/thuth/devel/qemu/util/qemu-thread-posix.c:570 #2 0x0000000109c95a1c in iothread_stop #3 0x0000000109bb0874 in monitor_cleanup #4 0x0000000109b55042 in main While the iothread is still in the main loop: Thread 4 #0 0x000003ffad0010e4 in ?? #1 0x000003ffad553958 in g_main_context_iterate.isra.19 #2 0x000003ffad553d90 in g_main_loop_run #3 0x0000000109c9585a in iothread_run at /home/thuth/devel/qemu/iothread.c:74 #4 0x0000000109e94752 in qemu_thread_start at /home/thuth/devel/qemu/util/qemu-thread-posix.c:502 #5 0x000003ffad10825a in start_thread #6 0x000003ffad00dcf2 in thread_start IMHO it's because there's a race between the main thread and iothread when stopping the thread in following sequence: main thread iothread =========== ============== aio_poll() iothread_get_g_main_context set iothread->worker_context iothread_stop schedule iothread_stop_bh execute iothread_stop_bh [1] set iothread->running=false (since main_loop==NULL so skip to quit main loop. Note: although main_loop is NULL but worker_context is not!) atomic_read(&iothread->worker_context) [2] create main_loop object g_main_loop_run() [3] pthread_join() [4] We can see that when execute iothread_stop_bh() at [1] it's possible that main_loop is still NULL because it's only created until the first check of the worker_context later at [2]. Then the iothread will hang in the main loop [3] and it'll starve the main thread too [4]. Here the simple solution should be that we check again the "running" variable before check against worker_context. CC: Thomas Huth <thuth@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Lukáš Doktor <ldoktor@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Message-id: 20190129051432.22023-1-peterx@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-10-12iothread: fix crash with invalid propertiesMarc-André Lureau
-object iothread,id=foo,? will crash qemu: qemu-system-x86_64:qemu-thread-posix.c:128: qemu_cond_destroy: Assertion `cond->initialized' failed. Use thread_id != -1 to check if iothread_complete() finished successfully and the mutex/cond have been initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180821100716.13803-1-marcandre.lureau@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-04-10iothread: workaround glib bug which hangs qmp-testPeter Xu
Free the AIO context earlier than the GMainContext (if we have) to workaround a glib2 bug that GSource context pointer is not cleared even if the context has already been destroyed (while it should). The patch itself only changed the order to destroy the objects, no functional change at all. Without this workaround, we can encounter qmp-test hang with oob (and possibly any other use case when iothread is used with GMainContexts): #0 0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685 #4 0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497 #5 0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129 #6 0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462 #7 0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476 #8 0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924 #9 0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436 #10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436 #11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455 #12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365 #13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663 #14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749 The glib2 bug is fixed in commit 26056558b ("gmain: allow g_source_get_context() on destroyed sources", 2012-07-30), so the first good version is glib2 2.33.10. But we still support building with glib as old as 2.28, so we need the workaround. Let's make sure we destroy the GSources first before its owner context until we drop support for glib older than 2.33.10. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180409083956.1780-1-peterx@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-26iothread: fix breakage on windowsPeter Xu
OOB can enable iothread for parsing even on Windows. We need some tunes to enable that on Windows otherwise it'll break Windows users. This patch fixes the breakage on Windows with qemu-system-ppc.exe. Reported-by: Howard Spoelstra <hsp.cat7@gmail.com> Tested-by: Howard Spoelstra <hsp.cat7@gmail.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180322085630.23654-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-08vl: introduce vm_shutdown()Stefan Hajnoczi
Commit 00d09fdbbae5f7864ce754913efc84c12fdf9f1a ("vl: pause vcpus before stopping iothreads") and commit dce8921b2baaf95974af8176406881872067adfa ("iothread: Stop threads before main() quits") tried to work around the fact that emulation was still active during termination by stopping iothreads. They suffer from race conditions: 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the virtio_scsi_ctx_check() assertion failure because the BDS AioContext has been modified by iothread_stop_all(). 2. Guest vq kick racing with main loop termination leaves a readable ioeventfd that is handled by the next aio_poll() when external clients are enabled again, resulting in unwanted emulation activity. This patch obsoletes those commits by fully disabling emulation activity when vcpus are stopped. Use the new vm_shutdown() function instead of pause_all_vcpus() so that vm change state handlers are invoked too. Virtio devices will now stop their ioeventfds, preventing further emulation activity after vm_stop(). Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a QMP STOP event that may affect existing clients. It is no longer necessary to call replay_disable_events() directly since vm_shutdown() does so already. Drop iothread_stop_all() since it is no longer used. Cc: Fam Zheng <famz@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20180307144205.20619-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-02qapi: Empty out qapi-schema.jsonMarkus Armbruster
The previous commit improved compile time by including less of the generated QAPI headers. This is impossible for stuff defined directly in qapi-schema.json, because that ends up in headers that that pull in everything. Move everything but include directives from qapi-schema.json to new sub-module qapi/misc.json, then include just the "misc" shard where possible. It's possible everywhere, except: * monitor.c needs qmp-command.h to get qmp_init_marshal() * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need qapi-event.h to get enum QAPIEvent Perhaps we'll get rid of those some other day. Adding a type to qapi/migration.json now recompiles some 120 instead of 2300 out of 5100 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-25-armbru@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-02-09Include qapi/error.h exactly where neededMarkus Armbruster
This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2017-12-19iothread: fix iothread_stop() race conditionStefan Hajnoczi
There is a small chance that iothread_stop() hangs as follows: Thread 3 (Thread 0x7f63eba5f700 (LWP 16105)): #0 0x00007f64012c09b6 in ppoll () at /lib64/libc.so.6 #1 0x000055959992eac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055959992eac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322 #3 0x0000559599930711 in aio_poll (ctx=0x55959bdb83c0, blocking=blocking@entry=true) at util/aio-posix.c:629 #4 0x00005595996806fe in iothread_run (opaque=0x55959bd78400) at iothread.c:59 #5 0x00007f640159f609 in start_thread () at /lib64/libpthread.so.0 #6 0x00007f64012cce6f in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7f640b45b280 (LWP 16103)): #0 0x00007f64015a0b6d in pthread_join () at /lib64/libpthread.so.0 #1 0x00005595999332ef in qemu_thread_join (thread=<optimized out>) at util/qemu-thread-posix.c:547 #2 0x00005595996808ae in iothread_stop (iothread=<optimized out>) at iothread.c:91 #3 0x000055959968094d in iothread_stop_iter (object=<optimized out>, opaque=<optimized out>) at iothread.c:102 #4 0x0000559599857d97 in do_object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0, recurse=recurse@entry=false) at qom/object.c:852 #5 0x0000559599859477 in object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0) at qom/object.c:867 #6 0x0000559599680a6e in iothread_stop_all () at iothread.c:341 #7 0x000055959955b1d5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4913 The relevant code from iothread_run() is: while (!atomic_read(&iothread->stopping)) { aio_poll(iothread->ctx, true); and iothread_stop(): iothread->stopping = true; aio_notify(iothread->ctx); ... qemu_thread_join(&iothread->thread); The following scenario can occur: 1. IOThread: while (!atomic_read(&iothread->stopping)) -> stopping=false 2. Main loop: iothread->stopping = true; aio_notify(iothread->ctx); 3. IOThread: aio_poll(iothread->ctx, true); -> hang The bug is explained by the AioContext->notify_me doc comments: "If this field is 0, everything (file descriptors, bottom halves, timers) will be re-evaluated before the next blocking poll(), thus the event_notifier_set call can be skipped." The problem is that "everything" does not include checking iothread->stopping. This means iothread_run() will block in aio_poll() if aio_notify() was called just before aio_poll(). This patch fixes the hang by replacing aio_notify() with aio_bh_schedule_oneshot(). This makes aio_poll() or g_main_loop_run() to return. Implementing this properly required a new bool running flag. The new flag prevents races that are tricky if we try to use iothread->stopping. Now iothread->stopping is purely for iothread_stop() and iothread->running is purely for the iothread_run() thread. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171207201320.19284-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-12-19iothread: add iothread_by_id() APIStefan Hajnoczi
Encapsulate IOThread QOM object lookup so that callers don't need to know how and where IOThread objects live. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171206144550.22295-8-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-10-03iothread: delay the context release to finalizePeter Xu
When gcontext is used with iothread, the context will be destroyed during iothread_stop(). That's not good since sometimes we would like to keep the resources until iothread is destroyed, but we may want to stop the thread before that point. Delay the destruction of gcontext to iothread finalize. Then we can do: iothread_stop(thread); some_cleanup_on_resources(); iothread_destroy(thread); We may need this patch if we want to run chardev IOs in iothreads and hopefully clean them up correctly. For more specific information, please see 2b316774f6 ("qemu-char: do not operate on sources from finalize callbacks"). Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20170928025958.1420-5-peterx@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-10-03iothread: export iothread_stop()Peter Xu
So that internal iothread users can explicitly stop one iothread without destroying it. Since at it, fix iothread_stop() to allow it to be called multiple times. Before this patch we may call iothread_stop() more than once on single iothread, while that may not be correct since qemu_thread_join() is not allowed to run twice. From manual of pthread_join(): Joining with a thread that has previously been joined results in undefined behavior. Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20170928025958.1420-4-peterx@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-10-03iothread: provide helpers for internal usePeter Xu
IOThread is a general framework that contains IO loop environment and a real thread behind. It's also good to be used internally inside qemu. Provide some helpers for it to create iothreads to be used internally. Put all the internal used iothreads into the internal object container. Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20170928025958.1420-3-peterx@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-09-29iothread: Make iothread_stop() idempotentEduardo Habkost
Currently, iothread_stop_all() makes all iothread objects unsafe to be destroyed, because qemu_thread_join() ends up being called twice. To fix this, make iothread_stop() idempotent by checking thread->stopped. Fixes the following crash: qemu-system-x86_64 -object iothread,id=iothread0 -monitor stdio -display none QEMU 2.10.50 monitor - type 'help' for more information (qemu) quit qemu: qemu_thread_join: No such process Aborted (core dumped) Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170926130028.12471-1-ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-08qemu-iothread: IOThread supports the GMainContext event loopWang Yong
IOThread uses AioContext event loop and does not run a GMainContext. Therefore,chardev cannot work in IOThread,such as the chardev is used for colo-compare packets reception. This patch makes the IOThread run the GMainContext event loop, chardev and IOThread can work together. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Wang Yong <wang.yong155@zte.com.cn> Signed-off-by: Wang Guang <wang.guang55@zte.com.cn> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-02-21monitor: add poll-* properties into query-iothreads resultPavel Hrdina
IOthreads were recently extended by new properties that can enable/disable and configure aio polling. This will also allow other tools that uses QEMU to probe for existence of those new properties via query-qmp-schema. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Message-Id: <3163c16d6ab4257f7be9ad44fe9cc0ce8c359e5a.1486718555.git.phrdina@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-03iothread: enable AioContext polling by defaultStefan Hajnoczi
IOThread AioContexts are likely to consist only of event sources like virtqueue ioeventfds and LinuxAIO completion eventfds that are pollable from userspace (without system calls). We recently merged the AioContext polling feature but didn't enable it by default yet. I have gone back over the performance data on the mailing list and picked a default polling value that gave good results. Let's enable AioContext polling by default so users don't have another switch they need to set manually. If performance regressions are found we can still disable this for the QEMU 2.9 release. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Karl Rister <krister@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170126170119.27876-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03iothread: add poll-grow and poll-shrink parametersStefan Hajnoczi
These parameters control the poll time self-tuning algorithm. They are optional and will default to sane values if omitted. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20161201192652.9509-14-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03aio: self-tune polling timeStefan Hajnoczi
This patch is based on the algorithm for the kvm.ko halt_poll_ns parameter in Linux. The initial polling time is zero. If the event loop is woken up within the maximum polling time it means polling could be effective, so grow polling time. If the event loop is woken up beyond the maximum polling time it means polling is not effective, so shrink polling time. If the event loop makes progress within the current polling time then the sweet spot has been reached. This algorithm adjusts the polling time so it can adapt to variations in workloads. The goal is to reach the sweet spot while also recognizing when polling would hurt more than help. Two new trace events, poll_grow and poll_shrink, are added for observing polling time adjustment. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20161201192652.9509-13-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03iothread: add polling parametersStefan Hajnoczi
Poll mode can be configured with -object iothread,poll-max-ns=NUM. Polling is disabled with a value of 0 nanoseconds. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20161201192652.9509-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-10-28iothread: release AioContext around aio_pollPaolo Bonzini
This is the first step towards having fine-grained critical sections in dataplane threads, which will resolve lock ordering problems between address_space_* functions (which need the BQL when doing MMIO, even after we complete RCU-based dispatch) and the AioContext. Because AioContext does not use contention callbacks anymore, the unit test has to be changed. Previously applied as a0710f7995f914e3044e5899bd8ff6c43c62f916 and then reverted. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-19-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28iothread: detach all block devices before stopping themPaolo Bonzini
Soon bdrv_drain will not call aio_poll itself on iothreads. If block devices are left hanging off the iothread's AioContext, there will be no one to do I/O for those poor devices. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-13-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28aio: introduce qemu_get_current_aio_contextPaolo Bonzini
This will be used by BDRV_POLL_WHILE (and thus by bdrv_drain) to choose how to wait for I/O completion. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1477565348-5458-12-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2016-09-28iothread: check iothread->ctx before aio_context_unref to avoid assertionLin Ma
if iothread->ctx is set to NULL, aio_context_unref triggers the assertion: g_source_unref: assertion 'source != NULL' failed. The patch fixes it. Signed-off-by: Lin Ma <lma@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20160926052958.10716-1-lma@suse.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13iothread: Stop threads before main() quitsFam Zheng
Right after main_loop ends, we release various things but keep iothread alive. The latter is not prepared to the sudden change of resources. Specifically, after bdrv_close_all(), virtio-scsi dataplane get a surprise at the empty BlockBackend: (gdb) bt at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543 at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577 It is because the d->conf.blk->root is set to NULL, then blk_get_aio_context() returns qemu_aio_context, whereas s->ctx is still pointing to the iothread: hw/scsi/virtio-scsi.c:543: if (s->dataplane_started) { assert(blk_get_aio_context(d->conf.blk) == s->ctx); } To fix this, let's stop iothreads before doing bdrv_close_all(). Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-04all: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
2015-12-03iothread: include id in thread namePaolo Bonzini
This makes it easier to find the desired thread. Use "IO" plus the id; even with the 14 character limit on the thread name, enough of the id should be readable (e.g. "IO iothreadNNN" with three characters for the number). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 1448372804-5034-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-24rcu: actually register threads that have RCU read-side critical sectionsPaolo Bonzini
Otherwise, grace periods are detected too early! Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19qom: Add helper function for getting user objects rootDaniel P. Berrange
Add object_get_objects_root() function which is a convenience for obtaining the Object * located at /objects in the object composition tree. Convert existing code over to use the new API where appropriate. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-06-12Revert "iothread: release iothread around aio_poll"Stefan Hajnoczi
This reverts commit a0710f7995f914e3044e5899bd8ff6c43c62f916. In qemu-devel email message <556DBF87.2020908@de.ibm.com>, Christian Borntraeger writes: Having many guests all with a kernel/ramdisk (via -kernel) and several null block devices will result in hangs. All hanging guests are in partition detection code waiting for an I/O to return so very early maybe even the first I/O. Reverting that commit "fixes" the hangs. Reverting this commit for the 2.4 release. More time is needed to investigate and correct this patch. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>