aboutsummaryrefslogtreecommitdiff
path: root/hw/xen/xen-bus.c
AgeCommit message (Collapse)Author
2024-02-02hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICsDavid Woodhouse
When instantiating XenBus itself, for each NIC which is configured with either the model unspecified, or set to to "xen" or "xen-net-device", create a corresponding xen-net-device for it. Now we can revert the previous more hackish version which relied on the platform code explicitly registering the NICs on its own XenBus, having returned the BusState* from xen_bus_init() itself. This also fixes the setup for Xen PV guests, which was previously broken in various ways and never actually managed to peer with the netdev. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-11-07hw/i386/pc: support '-nic' for xen-net-deviceDavid Woodhouse
The default NIC creation seems a bit hackish to me. I don't understand why each platform has to call pci_nic_init_nofail() from a point in the code where it actually has a pointer to the PCI bus, and then we have the special cases for things like ne2k_isa. If qmp_device_add() can *find* the appropriate bus and instantiate the device on it, why can't we just do that from generic code for creating the default NICs too? But that isn't a yak I want to shave today. Add a xenbus field to the PCMachineState so that it can make its way from pc_basic_device_init() to pc_nic_init() and be handled as a special case like ne2k_isa is. Now we can launch emulated Xen guests with '-nic user'. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-11-07hw/xen: add support for Xen primary console in emulated modeDavid Woodhouse
The primary console is special because the toolstack maps a page into the guest for its ring, and also allocates the guest-side event channel. The guest's grant table is even primed to export that page using a known grant ref#. Add support for all that in emulated mode, so that we can have a primary console. For reasons unclear, the backends running under real Xen don't just use a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which would also be in the ring-ref node in XenStore). Instead, the toolstack sets the ring-ref node of the primary console to the GFN of the guest page. The backend is expected to handle that special case and map it with foreignmem operations instead. We don't have an implementation of foreignmem ops for emulated Xen mode, so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably work for real Xen too, but we can't work out how to make real Xen create a primary console of type "ioemu" to make QEMU drive it, so we can't test that; might as well leave it as it is for now under Xen. Now at last we can boot the Xen PV shim and run PV kernels in QEMU. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-11-07hw/xen: do not repeatedly try to create a failing backend deviceDavid Woodhouse
If xen_backend_device_create() fails to instantiate a device, the XenBus code will just keep trying over and over again each time the bus is re-enumerated, as long as the backend appears online and in XenbusStateInitialising. The only thing which prevents the XenBus code from recreating duplicates of devices which already exist, is the fact that xen_device_realize() sets the backend state to XenbusStateInitWait. If the attempt to create the device doesn't get *that* far, that's when it will keep getting retried. My first thought was to handle errors by setting the backend state to XenbusStateClosed, but that doesn't work for XenConsole which wants to *ignore* any device of type != "ioemu" completely. So, make xen_backend_device_create() *keep* the XenBackendInstance for a failed device, and provide a new xen_backend_exists() function to allow xen_bus_type_enumerate() to check whether one already exists before creating a new one. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-11-07hw/xen: add get_frontend_path() method to XenDeviceClassDavid Woodhouse
The primary Xen console is special. The guest's side is set up for it by the toolstack automatically and not by the standard PV init sequence. Accordingly, its *frontend* doesn't appear in …/device/console/0 either; instead it appears under …/console in the guest's XenStore node. To allow the Xen console driver to override the frontend path for the primary console, add a method to the XenDeviceClass which can be used instead of the standard xen_device_get_frontend_path() Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-06-07xen-block: fix segv on unrealizeAnthony PERARD
Backtrace: qemu_lockcnt_lock (lockcnt=0xb4) at ../util/lockcnt.c:238 aio_set_fd_handler (ctx=0x0, fd=51, is_external=true, io_read=0x0, io_write=0x0, io_poll=0x0, io_poll_ready=0x0, opaque=0x0) at ../util/aio-posix.c:119 xen_device_unbind_event_channel (xendev=0x55c6da5b5000, channel=0x55c6da6c4c80, errp=0x7fff641ac608) at ../hw/xen/xen-bus.c:926 xen_block_dataplane_stop (dataplane=0x55c6da6ddbe0) at ../hw/block/dataplane/xen-block.c:719 xen_block_disconnect (xendev=0x55c6da5b5000, errp=0x0) at ../hw/block/xen-block.c:48 xen_block_unrealize (xendev=0x55c6da5b5000) at ../hw/block/xen-block.c:154 xen_device_unrealize (dev=0x55c6da5b5000) at ../hw/xen/xen-bus.c:956 xen_device_exit (n=0x55c6da5b50d0, data=0x0) at ../hw/xen/xen-bus.c:985 notifier_list_notify (list=0x55c6d91f9820 <exit_notifiers>, data=0x0) at ../util/notify.c:39 qemu_run_exit_notifiers () at ../softmmu/runstate.c:760 Fixes: f6eac904f682 ("xen-block: implement BlockDevOps->drained_begin()") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230606131605.55596-1-anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-05-30aio: remove aio_disable_external() APIStefan Hajnoczi
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30hw/xen: do not set is_external=true on evtchn fdsStefan Hajnoczi
is_external=true suspends fd handlers between aio_disable_external() and aio_enable_external(). The block layer's drain operation uses this mechanism to prevent new I/O from sneaking in between bdrv_drained_begin() and bdrv_drained_end(). The previous commit converted the xen-block device to use BlockDevOps .drained_begin/end() callbacks. It no longer relies on is_external=true so it is safe to pass is_external=false. This is part of ongoing work to remove the aio_disable_external() API. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-13-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30xen-block: implement BlockDevOps->drained_begin()Stefan Hajnoczi
Detach event channels during drained sections to stop I/O submission from the ring. xen-block is no longer reliant on aio_disable_external() after this patch. This will allow us to remove the aio_disable_external() API once all other code that relies on it is converted. Extend xen_device_set_event_channel_context() to allow ctx=NULL. The event channel still exists but the event loop does not monitor the file descriptor. Event channel processing can resume by calling xen_device_set_event_channel_context() with a non-NULL ctx. Factor out xen_device_set_event_channel_context() calls in hw/block/dataplane/xen-block.c into attach/detach helper functions. Incidentally, these don't require the AioContext lock because aio_set_fd_handler() is thread-safe. It's safer to register BlockDevOps after the dataplane instance has been created. The BlockDevOps .drained_begin/end() callbacks depend on the dataplane instance, so move the blk_set_dev_ops() call after xen_block_dataplane_create(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-12-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-03-07hw/xen: Avoid crash when backend watch fires too earlyPaul Durrant
The xen-block code ends up calling aio_poll() through blkconf_geometry(), which means we see watch events during the indirect call to xendev_class->realize() in xen_device_realize(). Unfortunately this call is made before populating the initial frontend and backend device nodes in xenstore and hence xen_block_frontend_changed() (which is called from a watch event) fails to read the frontend's 'state' node, and hence believes the device is being torn down. This in-turn sets the backend state to XenbusStateClosed and causes the device to be deleted before it is fully set up, leading to the crash. By simply moving the call to xendev_class->realize() after the initial xenstore nodes are populated, this sorry state of affairs is avoided. Reported-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07hw/xen: Add xenstore operations to allow redirection to internal emulationPaul Durrant
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07hw/xen: Pass grant ref to gnttab unmap operationDavid Woodhouse
The previous commit introduced redirectable gnttab operations fairly much like-for-like, with the exception of the extra arguments to the ->open() call which were always NULL/0 anyway. This *changes* the arguments to the ->unmap() operation to include the original ref# that was mapped. Under real Xen it isn't necessary; all we need to do from QEMU is munmap(), then the kernel will release the grant, and Xen does the tracking/refcounting for the guest. When we have emulated grant tables though, we need to do all that for ourselves. So let's have the back ends keep track of what they mapped and pass it in to the ->unmap() method for us. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07hw/xen: Add gnttab operations to allow redirection to internal emulationDavid Woodhouse
Move the existing code using libxengnttab to xen-operations.c and allow the operations to be redirected so that we can add emulation of grant table mapping for backend drivers. In emulation, mapping more than one grant ref to be virtually contiguous would be fairly difficult. The best way to do it might be to make the ram_block mappings actually backed by a file (shmem or a deleted file, perhaps) so that we can have multiple *shared* mappings of it. But that would be fairly intrusive. Making the backend drivers cope with page *lists* instead of expecting the mapping to be contiguous is also non-trivial, since some structures would actually *cross* page boundaries (e.g. the 32-bit blkif responses which are 12 bytes). So for now, we'll support only single-page mappings in emulation. Add a XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE flag to indicate that the native Xen implementation *does* support multi-page maps, and a helper function to query it. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07hw/xen: Add evtchn operations to allow redirection to internal emulationDavid Woodhouse
The existing implementation calling into the real libxenevtchn moves to a new file hw/xen/xen-operations.c, and is called via a function table which in a subsequent commit will also be able to invoke the emulated event channel support. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-01-11hw/xen: use G_GNUC_PRINTF/SCANF for various functionsDaniel P. Berrangé
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20221219130205.687815-3-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-12aio-posix: split poll check from ready handlerStefan Hajnoczi
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-30qbus: Rename qbus_create() to qbus_new()Peter Maydell
Rename the "allocate and return" qbus creation function to qbus_new(), to bring it into line with our _init vs _new convention. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Corey Minyard <cminyard@mvista.com> Message-id: 20210923121153.23754-6-peter.maydell@linaro.org
2020-10-19xen-bus: reduce scope of backend watchPaul Durrant
Currently a single watch on /local/domain/X/backend is registered by each QEMU process running in service domain X (where X is usually 0). The purpose of this watch is to ensure that QEMU is notified when the Xen toolstack creates a new device backend area. Such a backend area is specific to a single frontend area created for a specific guest domain and, since each QEMU process is also created to service a specfic guest domain, it is unnecessary and inefficient to notify all QEMU processes. Only the QEMU process associated with the same guest domain need receive the notification. This patch re-factors the watch registration code such that notifications are targetted appropriately. Reported-by: Jerome Leseinne <jerome.leseinne@gmail.com> Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20201001081500.1026-1-paul@xen.org> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2020-07-10xen: Use ERRP_GUARD()Vladimir Sementsov-Ogievskiy
If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). No such cases are being fixed here. This commit is generated by command sed -n '/^X86 Xen CPUs$/,/^$/{s/^F: //p}' MAINTAINERS | \ xargs git ls-files | grep '\.[hc]$' | \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-9-armbru@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]
2020-07-02qdev: Drop qbus_set_bus_hotplug_handler() parameter @errpMarkus Armbruster
All callers pass &error_abort. Drop the parameter. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-14-armbru@redhat.com>
2020-06-15sysbus: Convert to sysbus_realize() etc. with CoccinelleMarkus Armbruster
Convert from qdev_realize(), qdev_realize_and_unref() with null @bus argument to sysbus_realize(), sysbus_realize_and_unref(). Coccinelle script: @@ expression dev, errp; @@ - qdev_realize(DEVICE(dev), NULL, errp); + sysbus_realize(SYS_BUS_DEVICE(dev), errp); @@ expression sysbus_dev, dev, errp; @@ + sysbus_dev = SYS_BUS_DEVICE(dev); - qdev_realize_and_unref(dev, NULL, errp); + sysbus_realize_and_unref(sysbus_dev, errp); - sysbus_dev = SYS_BUS_DEVICE(dev); @@ expression sysbus_dev, dev, errp; expression expr; @@ sysbus_dev = SYS_BUS_DEVICE(dev); ... when != dev = expr; - qdev_realize_and_unref(dev, NULL, errp); + sysbus_realize_and_unref(sysbus_dev, errp); @@ expression dev, errp; @@ - qdev_realize_and_unref(DEVICE(dev), NULL, errp); + sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), errp); @@ expression dev, errp; @@ - qdev_realize_and_unref(dev, NULL, errp); + sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), errp); Whitespace changes minimized manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-46-armbru@redhat.com> [Conflicts in hw/misc/empty_slot.c and hw/sparc/leon3.c resolved]
2020-06-15qdev: Convert uses of qdev_create() with CoccinelleMarkus Armbruster
This is the transformation explained in the commit before previous. Takes care of just one pattern that needs conversion. More to come in this series. Coccinelle script: @ depends on !(file in "hw/arm/highbank.c")@ expression bus, type_name, dev, expr; @@ - dev = qdev_create(bus, type_name); + dev = qdev_new(type_name); ... when != dev = expr - qdev_init_nofail(dev); + qdev_realize_and_unref(dev, bus, &error_fatal); @@ expression bus, type_name, dev, expr; identifier DOWN; @@ - dev = DOWN(qdev_create(bus, type_name)); + dev = DOWN(qdev_new(type_name)); ... when != dev = expr - qdev_init_nofail(DEVICE(dev)); + qdev_realize_and_unref(DEVICE(dev), bus, &error_fatal); @@ expression bus, type_name, expr; identifier dev; @@ - DeviceState *dev = qdev_create(bus, type_name); + DeviceState *dev = qdev_new(type_name); ... when != dev = expr - qdev_init_nofail(dev); + qdev_realize_and_unref(dev, bus, &error_fatal); @@ expression bus, type_name, dev, expr, errp; symbol true; @@ - dev = qdev_create(bus, type_name); + dev = qdev_new(type_name); ... when != dev = expr - object_property_set_bool(OBJECT(dev), true, "realized", errp); + qdev_realize_and_unref(dev, bus, errp); @@ expression bus, type_name, expr, errp; identifier dev; symbol true; @@ - DeviceState *dev = qdev_create(bus, type_name); + DeviceState *dev = qdev_new(type_name); ... when != dev = expr - object_property_set_bool(OBJECT(dev), true, "realized", errp); + qdev_realize_and_unref(dev, bus, errp); The first rule exempts hw/arm/highbank.c, because it matches along two control flow paths there, with different @type_name. Covered by the next commit's manual conversions. Missing #include "qapi/error.h" added manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-10-armbru@redhat.com> [Conflicts in hw/misc/empty_slot.c and hw/sparc/leon3.c resolved]
2020-05-15qdev: Unrealize must not failMarkus Armbruster
Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-02-27xen-bus/block: explicitly assign event channels to an AioContextPaul Durrant
It is not safe to close an event channel from the QEMU main thread when that channel's poller is running in IOThread context. This patch adds a new xen_device_set_event_channel_context() function to explicitly assign the channel AioContext, and modifies xen_device_bind_event_channel() to initially assign the channel's poller to the QEMU main thread context. The code in xen-block's dataplane is then modified to assign the channel to IOThread context during xen_block_dataplane_start() and de-assign it during in xen_block_dataplane_stop(), such that the channel is always assigned back to main thread context before it is closed. aio_set_fd_handler() already deals with all the necessary synchronization when moving an fd between AioContext-s so no extra code is needed to manage this. Reported-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20191216143451.19024-1-pdurrant@amazon.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2020-01-24qdev: set properties with device_class_set_props()Marc-André Lureau
The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24xen-bus: only set the xen device frontend state if it is missingMark Syms
Some toolstack implementations will set the frontend xenstore keys to Initialising which will then trigger the in guest PV drivers to begin initialising and some implementations will then set their state to Closing. If this has occurred then device realize must not overwrite the frontend keys as then the handshake will stall. Signed-off-by: Mark Syms <mark.syms@citrix.com> Also avoid creating the frontend area if it already exists. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190918115745.39006-1-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-09-24xen: perform XenDevice clean-up in XenBus watch handlerPaul Durrant
Cleaning up offline XenDevice objects directly in xen_device_backend_changed() is dangerous as xen_device_unrealize() will modify the watch list that is being walked. Even the QLIST_FOREACH_SAFE() used in notifier_list_notify() is insufficient as *two* notifiers (for the frontend and backend watches) are removed, thus potentially rendering the 'next' pointer unsafe. The solution is to use the XenBus backend_watch handler to do the clean-up instead, as it is invoked whilst walking a separate watch list. This patch therefore adds a new 'inactive_devices' list to XenBus, to which offline devices are added by xen_device_backend_changed(). The XenBus backend_watch registration is also changed to not only invoke xen_bus_enumerate() but also a new xen_bus_cleanup() function, which will walk 'inactive_devices' and perform the necessary actions. For safety an extra 'online' check is also added to xen_bus_type_enumerate() to make sure that no attempt is made to create a new XenDevice object for a backend that is offline. NOTE: This patch also includes some cosmetic changes: - substitute the local variable name 'backend_state' in xen_bus_type_enumerate() with 'state', since there is no ambiguity with any other state in that context. - change xen_device_state_is_active() to xen_device_frontend_is_active() (and pass a XenDevice directly) since the state tests contained therein only apply to a frontend. - use 'state' rather then 'xendev->backend_state' in xen_device_backend_changed() to shorten the code. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190913082159.31338-4-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-09-24xen: introduce separate XenWatchList for XenDevice objectsPaul Durrant
This patch uses the XenWatchList abstraction to add a separate watch list for each device. This is more scalable than walking a single notifier list for all watches and is also necessary to implement a bug-fix in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Message-Id: <20190913082159.31338-3-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-09-24xen / notify: introduce a new XenWatchList abstractionPaul Durrant
Xenstore watch call-backs are already abstracted away from XenBus using the XenWatch data structure but the associated NotifierList manipulation and file handle registration is still open coded in various xen_bus_...() functions. This patch creates a new XenWatchList data structure to allow these interactions to be abstracted away from XenBus as well. This is in preparation for a subsequent patch which will introduce separate watch lists for XenBus and XenDevice objects. NOTE: This patch also introduces a new notifier_list_empty() helper function for the purposes of adding an assertion that a XenWatchList is not freed whilst its associated NotifierList is still occupied. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Message-Id: <20190913082159.31338-2-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-09-24xen-bus: check whether the frontend is active during device reset...Paul Durrant
...not the backend Commit cb323146 "xen-bus: Fix backend state transition on device reset" contained a subtle mistake. The hunk @@ -539,11 +556,11 @@ static void xen_device_backend_changed(void *opaque) /* * If the toolstack (or unplug request callback) has set the backend - * state to Closing, but there is no active frontend (i.e. the - * state is not Connected) then set the backend state to Closed. + * state to Closing, but there is no active frontend then set the + * backend state to Closed. */ if (xendev->backend_state == XenbusStateClosing && - xendev->frontend_state != XenbusStateConnected) { + !xen_device_state_is_active(state)) { xen_device_backend_set_state(xendev, XenbusStateClosed); } mistakenly replaced the check of 'xendev->frontend_state' with a check (now in a helper function) of 'state', which actually equates to 'xendev->backend_state'. This patch fixes the mistake. Fixes: cb3231460747552d70af9d546dc53d8195bcb796 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190910171753.3775-1-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-08-27xen-bus: Avoid rewriting identical values to xenstoreAnthony PERARD
When QEMU receives a xenstore watch event suggesting that the "state" of the frontend changed, it records this in its own state but it also re-write the value back into xenstore even so there were no change. This triggers an unnecessary xenstore watch event which QEMU will process again (and maybe the frontend as well). Also QEMU could potentially write an already old value. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190823101534.465-3-anthony.perard@citrix.com>
2019-08-27xen-bus: Fix backend state transition on device resetAnthony PERARD
When a frontend wants to reset its state and the backend one, it starts with setting "Closing", then waits for the backend (QEMU) to do the same. But when QEMU is setting "Closing" to its state, it triggers an event (xenstore watch) that re-execute xen_device_backend_changed() and set the backend state to "Closed". QEMU should wait for the frontend to set "Closed" before doing the same. Before setting "Closed" to the backend_state, we are also going to check if there is a frontend. If that the case, when the backend state is set to "Closing" the frontend should react and sets its state to "Closing" then "Closed". The backend should wait for that to happen. Fixes: b6af8926fb858c4f1426e5acb2cfc1f0580ec98a Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190823101534.465-2-anthony.perard@citrix.com>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-06-24xen-bus / xen-block: add support for event channel pollingPaul Durrant
This patch introduces a poll callback for event channel fd-s and uses this to invoke the channel callback function. To properly support polling, it is necessary for the event channel callback function to return a boolean saying whether it has done any useful work or not. Thus xen_block_dataplane_event() is modified to directly invoke xen_block_handle_requests() and the latter only returns true if it actually processes any requests. This also means that the call to qemu_bh_schedule() is moved into xen_block_complete_aio(), which is more intuitive since the only reason for doing a deferred poll of the shared ring should be because there were previously insufficient resources to fully complete a previous poll. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-4-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-bus: allow AioContext to be specified for each event channelPaul Durrant
This patch adds an AioContext parameter to xen_device_bind_event_channel() and then uses aio_set_fd_handler() to set the callback rather than qemu_set_fd_handler(). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-3-paul.durrant@citrix.com> [Call aio_set_fd_handler() with is_external=true] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-bus: use a separate fd for each event channelPaul Durrant
To better support use of IOThread-s it will be necessary to be able to set the AioContext for each XenEventChannel and hence it is necessary to open a separate handle to libxenevtchan for each channel. This patch stops using NotifierList for event channel callbacks, replacing that construct by a list of complete XenEventChannel structures. Each of these now has a xenevtchn_handle pointer in place of the single pointer previously held in the XenDevice structure. The individual handles are opened/closed in xen_device_bind/unbind_event_channel(), replacing the single open/close in xen_device_realize/unrealize(). NOTE: This patch does not add an AioContext parameter to xen_device_bind_event_channel(). That will be done in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-2-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-02-04xen: fix xen-bus state model to allow frontend re-connectionPaul Durrant
There is a flaw in the xen-bus state model. To allow a frontend to re- connect the backend state of an online XenDevice is transitioned from Closed to InitWait, but this is currently done unilaterally which is incorrect. The backend state should remain Closed until the frontend state transitions to Initialising. This patch removes the automatic backend state transition from xen_device_backend_state_changed() and, instead, adds an extra check in xen_device_frontend_state_changed() to determine whether a frontend is trying to re-connect to a previously Closed XenDevice. Only if this is found to be the case is the backend state transitioned from Closed to InitWait. Note that this transition will be common amongst all XenDevice classes and hence xen_device_frontend_state_changed() returns immediately afterwards without calling into the XenDeviceClass frontend_changed() method. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: add a mechanism to automatically create XenDevice-s...Paul Durrant
...that maintains compatibility with existing Xen toolstacks. Xen toolstacks instantiate PV backends by simply writing information into xenstore and expecting a backend implementation to be watching for this. This patch adds a new 'xen-backend' module to allow individual XenDevice implementations to register create and destroy functions. The creator will be called when a tool-stack instantiates a new backend in this way, and the destructor will then be called after the resulting XenDevice object is unrealized. To support this it is also necessary to add new watchers into the XenBus implementation to handle enumeration of new backends and also destruction of XenDevice-s when the toolstack sets the backend 'online' key to 0. NOTE: This patch only adds the framework. A subsequent patch will add a creator function for xen-block devices. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: add implementations of xen-block connect and disconnect functions...Paul Durrant
...and wire in the dataplane. This patch adds the remaining code to make the xen-block XenDevice functional. The parameters that a block frontend expects to find are populated in the backend xenstore area, and the 'ring-ref' and 'event-channel' values specified in the frontend xenstore area are mapped/bound and used to set up the dataplane. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: add event channel interface for XenDevice-sPaul Durrant
The legacy PV backend infrastructure provides functions to bind, unbind and send notifications to event channnels. Similar functionality will be required by XenDevice implementations so this patch adds the necessary support. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Patch squashed with: Patch "xen: add event channel interface for XenDevice-s" makes use of the type xenevtchn_port_or_error_t, but this isn't avaiable before Xen 4.7. Also the function xen_device_bind_event_channel assign the return value of xenevtchn_bind_interdomain to channel->local_port but check the result for error with xendev->local_port. Fix by: - removing local_port from struct XenDevice as it isn't use anywere. - adding a compatibility typedef for xenevtchn_port_or_error_t for Xen 4.6 and earlier. As extra, replace the type of XenEventChannel->local_port by evtchn_port_t. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
2019-01-14xen: add grant table interface for XenDevice-sPaul Durrant
The legacy PV backend infrastructure provides functions to map, unmap and copy pages granted by frontends. Similar functionality will be required by XenDevice implementations so this patch adds the necessary support. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: add xenstore watcher infrastructurePaul Durrant
A Xen PV frontend communicates its state to the PV backend by writing to the 'state' key in the frontend area in xenstore. It is therefore necessary for a XenDevice implementation to be notified whenever the value of this key changes. This patch adds code to do this as follows: - an 'fd handler' is registered on the libxenstore handle which will be triggered whenever a 'watch' event occurs - primitives are added to xen-bus-helper to add or remove watch events - a list of Notifier objects is added to XenBus to provide a mechanism to call the appropriate 'watch handler' when its associated event occurs The xen-block implementation is extended with a 'frontend_changed' method, which calls as-yet stub 'connect' and 'disconnect' functions when the relevant frontend state transitions occur. A subsequent patch will supply a full implementation for these functions. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: create xenstore areas for XenDevice-sPaul Durrant
This patch adds a new source module, xen-bus-helper.c, which builds on basic libxenstore primitives to provide functions to create (setting permissions appropriately) and destroy xenstore areas, and functions to 'printf' and 'scanf' nodes therein. The main xen-bus code then uses these primitives [1] to initialize and destroy the frontend and backend areas for a XenDevice during realize and unrealize respectively. The 'xen-block' implementation is extended with a 'get_name' method that returns the VBD number. This number is required to 'name' the xenstore areas. NOTE: An exit handler is also added to make sure the xenstore areas are cleaned up if QEMU terminates without devices being unrealized. [1] The 'scanf' functions are actually not yet needed, but they will be needed by code delivered in subsequent patches. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: introduce new 'XenBus' and 'XenDevice' object hierarchyPaul Durrant
This patch adds the basic boilerplate for a 'XenBus' object that will act as a parent to 'XenDevice' PV backends. A new 'XenBridge' object is also added to connect XenBus to the system bus. The XenBus object is instantiated by a new xen_bus_init() function called from the same sites as the legacy xen_be_init() function. Subsequent patches will flesh-out the functionality of these objects. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>