slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2023-09-07	io: follow coroutine AioContext in qio_channel_yield()	Stefan Hajnoczi
	The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding: qio_channel_attach_aio_context(ioc, my_ctx); ... qio_channel_yield(ioc); // my_ctx is used here ... qio_channel_detach_aio_context(ioc); This API design has limitations: reading and writing must be done in the same AioContext and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. While the API is has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true\|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-05-30	aio: remove aio_disable_external() API	Stefan Hajnoczi
	All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-06	io: Add support for MSG_PEEK for socket channel	manish.mishra
	MSG_PEEK peeks at the channel, The data is treated as unread and the next read shall still return this data. This support is currently added only for socket class. Extra parameter 'flags' is added to io_readv calls to pass extra read flags like MSG_PEEK. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: manish.mishra <manish.mishra@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-10-12	io/command: implement support for win32	Marc-André Lureau
	The initial implementation was changing the pipe state created by GLib to PIPE_NOWAIT, but it turns out it doesn't work (read/write returns an error). Since reading may return less than the requested amount, it seems to be non-blocking already. However, the IO operation may block until the FD is ready, I can't find good sources of information, to be safe we can just poll for readiness before. Alternatively, we could setup the FDs ourself, and use UNIX sockets on Windows, which can be used in blocking/non-blocking mode. I haven't tried it, as I am not sure it is necessary. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221006113657.2656108-6-marcandre.lureau@redhat.com>
2022-10-12	io/command: use glib GSpawn, instead of open-coding fork/exec	Marc-André Lureau
	Simplify qio_channel_command_new_spawn() with GSpawn API. This will allow to build for WIN32 in the following patches. As pointed out by Daniel Berrangé: there is a change in semantics here too. The current code only touches stdin/stdout/stderr. Any other FDs which do NOT have O_CLOEXEC set will be inherited. With the new code, all FDs except stdin/out/err will be explicitly closed, because we don't set the flag G_SPAWN_LEAVE_DESCRIPTORS_OPEN. The only place we use QIOChannelCommand today is the migration exec: protocol, and that is only declared to use stdin/stdout. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20221006113657.2656108-5-marcandre.lureau@redhat.com>
2022-05-16	QIOChannel: Add flags on io_writev and introduce io_flush callback	Leonardo Bras
	Add flags to io_writev and introduce io_flush as optional callback to QIOChannelClass, allowing the implementation of zero copy writes by subclasses. How to use them: - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY), - Wait write completion with qio_channel_flush(). Notes: As some zero copy write implementations work asynchronously, it's recommended to keep the write buffer untouched until the return of qio_channel_flush(), to avoid the risk of sending an updated buffer instead of the buffer state during write. As io_flush callback is optional, if a subclass does not implement it, then: - io_flush will return 0 without changing anything. Also, some functions like qio_channel_writev_full_all() were adapted to receive a flag parameter. That allows shared code between zero copy and non-zero copy writev, and also an easier implementation on new flags. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220513062836.965425-3-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-03	io: replace qemu_set{_non}block()	Marc-André Lureau
	Those calls are non-socket fd, or are POSIX-specific. Use the dedicated GLib API. (qemu_set_nonblock() is for socket-like) (this is a preliminary patch before renaming qemu_set_nonblock()) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-05-03	io: make qio_channel_command_new_pid() static	Marc-André Lureau
	The function isn't used outside of qio_channel_command_new_spawn(), which is !win32-specific. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-05-03	io: replace pipe() with g_unix_open_pipe(CLOEXEC)	Marc-André Lureau
	Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-01-12	aio-posix: split poll check from ready handler	Stefan Hajnoczi
	Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-27	io: Fix Lesser GPL version number	Chetan Pant
	There is no "version 2" of the "Lesser" General Public License. It is either "GPL version 2.0" or "Lesser GPL version 2.1". This patch replaces all occurrences of "Lesser GPL version 2" with "Lesser GPL version 2.1" in comment section. Signed-off-by: Chetan Pant <chetan4windows@gmail.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20201014134033.14095-1-chetan4windows@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-06-12	Include qemu/module.h where needed, drop it from qemu-common.h	Markus Armbruster
	Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2018-02-15	io/channel-command: Do not kill the child process after closing the pipe	Thomas Huth
	We are currently facing some migration failure on s390x when running certain avocado-vt tests, e.g. when running the test type_specific.io-github-autotest-qemu.migrate.with_reboot.exec.gzip_exec. This test is using 'migrate -d "exec:nc localhost 5200"' for the migration. The problem is detected at the receiving side, where the migration stream apparently ends too early. However, the cause for the problem is at the sending side: After writing the migration stream into the pipe to netcat, the source QEMU calls qio_channel_command_close() which closes the pipe and immediately (!) kills the child process afterwards (via the function qio_channel_command_abort()). So if the sending netcat did not read the final bytes from the pipe yet, or if it did not manage to send out all its buffers yet, it is killed before the whole migration stream is passed to the destination side. QEMU can not know how much time is required by the child process to send over all migration data, so we should not kill it, neither directly nor after a delay. Let's simply wait for the child process to exit gracefully instead (this was also the behaviour of pclose() that was used in "exec:" migration before the QIOChannel rework). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2017-02-21	io: add methods to set I/O handlers on AioContext	Paolo Bonzini
	This is in preparation for making qio_channel_yield work on AioContexts other than the main one. Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213135235.12274-6-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-22	include/qemu/osdep.h: Don't include qapi/error.h	Markus Armbruster
	Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10	io: remove checking of EWOULDBLOCK	Daniel P. Berrange
	Since we now canonicalize WSAEWOULDBLOCK into EAGAIN there is no longer any need to explicitly check EWOULDBLOCK for Win32. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-04	io: Clean up includes	Peter Maydell
	Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-14-git-send-email-peter.maydell@linaro.org
2016-01-20	io: some fixes to handling of /dev/null when running commands	Daniel P. Berrange
	The /dev/null file handle was leaked in a couple of places. There is also the possibility that both readfd and writefd point to the same /dev/null file handle, so care must be taken not to close the same file handle twice. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-01-19	io: increment counter when killing off subcommand	Daniel P. Berrange
	When killing the subcommand, it is intended to first send SIGTERM, then SIGKILL and only report an error if it still doesn't die after SIGKILL. The 'step' counter was not being incremented though, so the code never got past the SIGTERM stage. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18	io: add QIOChannelCommand class	Daniel P. Berrange
	Add a QIOChannel subclass that is capable of performing I/O to/from a separate process, via a pair of pipes. The command can be used for unidirectional or bi-directional I/O. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>