aboutsummaryrefslogtreecommitdiff
path: root/hw/s390x/s390-pci-bus.c
AgeCommit message (Collapse)Author
2022-09-26s390x/pci: let intercept devices have separate PCI groupsMatthew Rosato
Let's use the reserved pool of simulated PCI groups to allow intercept devices to have separate groups from interpreted devices as some group values may be different. If we run out of simulated PCI groups, subsequent intercept devices just get the default group. Furthermore, if we encounter any PCI groups from hostdevs that are marked as simulated, let's just assign them to the default group to avoid conflicts between host simulated groups and our own simulated groups. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <20220902172737.170349-7-mjrosato@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-09-26s390x/pci: enable adapter event notification for interpreted devicesMatthew Rosato
Use the associated kvm ioctl operation to enable adapter event notification and forwarding for devices when requested. This feature will be set up with or without firmware assist based upon the 'forwarding_assist' setting. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20220902172737.170349-6-mjrosato@linux.ibm.com> [thuth: Rename "forwarding_assist" property to "forwarding-assist"] Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-09-26s390x/pci: don't fence interpreted devices without MSI-XMatthew Rosato
Lack of MSI-X support is not an issue for interpreted passthrough devices, so let's let these in. This will allow, for example, ISM devices to be passed through -- but only when interpretation is available and being used. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <20220902172737.170349-5-mjrosato@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-09-26s390x/pci: enable for load/store interpretationMatthew Rosato
If the ZPCI_OP ioctl reports that is is available and usable, then the underlying KVM host will enable load/store intepretation for any guest device without a SHM bit in the guest function handle. For a device that will be using interpretation support, ensure the guest function handle matches the host function handle; this value is re-checked every time the guest issues a SET PCI FN to enable the guest device as it is the only opportunity to reflect function handle changes. By default, unless interpret=off is specified, interpretation support will always be assumed and exploited if the necessary ioctl and features are available on the host kernel. When these are unavailable, we will silently revert to the interception model; this allows existing guest configurations to work unmodified on hosts with and without zPCI interpretation support, allowing QEMU to choose the best support model available. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220902172737.170349-4-mjrosato@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-02-21Mark remaining global TypeInfo instances as constBernhard Beschow
More than 1k of TypeInfo instances are already marked as const. Mark the remaining ones, too. This commit was created with: git grep -z -l 'static TypeInfo' -- '*.c' | \ xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/' Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-id: 20220117145805.173070-2-shentey@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-12-17s390x/pci: add supported DT information to clp responseMatthew Rosato
The DTSM is a mask that specifies which I/O Address Translation designation types are supported. Today QEMU only supports DT=1. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <20211203142706.427279-5-mjrosato@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-11-01pci: Export pci_for_each_device_under_bus*()Peter Xu
They're actually more commonly used than the helper without _under_bus, because most callers do have the pci bus on hand. After exporting we can switch a lot of the call sites to use these two helpers. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20211028043129.38871-3-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30qbus: Rename qbus_create() to qbus_new()Peter Maydell
Rename the "allocate and return" qbus creation function to qbus_new(), to bring it into line with our _init vs _new convention. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Corey Minyard <cminyard@mvista.com> Message-id: 20210923121153.23754-6-peter.maydell@linaro.org
2021-09-06s390x: Replace PAGE_SIZE, PAGE_SHIFT and PAGE_MASKThomas Huth
The PAGE_SIZE macro is causing trouble on Alpine Linux since it clashes with a macro from a system header there. We already have the TARGET_PAGE_SIZE, TARGET_PAGE_MASK and TARGET_PAGE_BITS macros in QEMU anyway, so let's simply replace the PAGE_SIZE, PAGE_MASK and PAGE_SHIFT macro with their TARGET_* counterparts. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/572 Message-Id: <20210901125800.611183-1-thuth@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-02Do not include cpu.h if it's not really necessaryThomas Huth
Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-12-18qdev: Rename qdev_get_prop_ptr() to object_field_prop_ptr()Eduardo Habkost
The function will be moved to common QOM code, as it is not specific to TYPE_DEVICE anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-31-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-12-18qdev: Move dev->realized check to qdev_property_set()Eduardo Habkost
Every single qdev property setter function manually checks dev->realized. We can just check dev->realized inside qdev_property_set() instead. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-24-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-12-15qdev: Make qdev_get_prop_ptr() get Object* argEduardo Habkost
Make the code more generic and not specific to TYPE_DEVICE. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> #s390 parts Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-10-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-11-18s390x/pci: fix endianness issuesCornelia Huck
The zPCI group and function structures are big endian. However, we do not consistently store them as big endian locally, and are missing some conversions. Let's just store the structures as host endian instead and convert to big endian when actually handling the instructions retrieving the data. Also fix the layout of ClpReqQueryPciGrp: g is actually only 8 bit. This also fixes accesses on little endian hosts, and makes accesses on big endian hosts consistent. Fixes: 28dc86a07299 ("s390x/pci: use a PCI Group structure") Fixes: 9670ee752727 ("s390x/pci: use a PCI Function structure") Fixes: 1e7552ff5c34 ("s390x/pci: get zPCI function info from host") Signed-off-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20201118104202.1301363-1-cohuck@redhat.com>
2020-11-18s390x/pci: Unregister listeners before destroying IOMMU address spaceMatthew Rosato
Hot-unplugging a vfio-pci device on s390x causes a QEMU crash: qemu-system-s390x: ../softmmu/memory.c:2772: do_address_space_destroy: Assertion `QTAILQ_EMPTY(&as->listeners)' failed. In s390, the IOMMU address space is freed during device unplug but the associated vfio-pci device may not yet be finalized and therefore may still have a listener registered to the IOMMU address space. Commit a2166410ad74 ("spapr_pci: Unregister listeners before destroying the IOMMU address space") previously resolved this issue for spapr_pci. We are now seeing this in s390x; it would seem the possibility for this issue was already present but based on a bisect commit 2d24a6466154 ("device-core: use RCU for list of children of a bus") has now changed the timing such that it is now readily reproducible. Add logic to ensure listeners are removed before destroying the address space. Reported-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1605562955-21152-1-git-send-email-mjrosato@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-11-01s390x/pci: get zPCI function info from hostMatthew Rosato
We use the capability chains of the VFIO_DEVICE_GET_INFO ioctl to retrieve the CLP information that the kernel exports. To be compatible with previous kernel versions we fall back on previous predefined values, same as the emulation values, when the ioctl is found to not support capability chains. If individual CLP capabilities are not found, we fall back on default values for only those capabilities missing from the chain. This patch is based on work previously done by Pierre Morel. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: non-Linux build fixes] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01s390x/pci: use a PCI Function structurePierre Morel
We use a ClpRspQueryPci structure to hold the information related to a zPCI Function. This allows us to be ready to support different zPCI functions and to retrieve the zPCI function information from the host. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01s390x/pci: clean up s390 PCI groupsMatthew Rosato
Add a step to remove all stashed PCI groups to avoid stale data between machine resets. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01s390x/pci: use a PCI Group structurePierre Morel
We use a S390PCIGroup structure to hold the information related to a zPCI Function group. This allows us to be ready to support multiple groups and to retrieve the group information from the host. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01s390x/pci: Honor DMA limits set by vfioMatthew Rosato
When an s390 guest is using lazy unmapping, it can result in a very large number of oustanding DMA requests, far beyond the default limit configured for vfio. Let's track DMA usage similar to vfio in the host, and trigger the guest to flush their DMA mappings before vfio runs out. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: non-Linux build fixes] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01s390x/pci: Move header files to include/hw/s390xMatthew Rosato
Seems a more appropriate location for them. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-09-23qemu/atomic.h: rename atomic_ to qatomic_Stefan Hajnoczi
clang's C11 atomic_fetch_*() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
2020-07-10error: Eliminate error_propagate() manuallyMarkus Armbruster
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous two commits did that for sufficiently simple cases with Coccinelle. Do it for several more manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-37-armbru@redhat.com>
2020-07-10qom: Use returned bool to check for failure, Coccinelle partMarkus Armbruster
The previous commit enables conversion of foo(..., &err); if (err) { ... } to if (!foo(..., errp)) { ... } for QOM functions that now return true / false on success / error. Coccinelle script: @@ identifier fun = { object_apply_global_props, object_initialize_child_with_props, object_initialize_child_with_propsv, object_property_get, object_property_get_bool, object_property_parse, object_property_set, object_property_set_bool, object_property_set_int, object_property_set_link, object_property_set_qobject, object_property_set_str, object_property_set_uint, object_set_props, object_set_propv, user_creatable_add_dict, user_creatable_complete, user_creatable_del }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Convert manually. Line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-29-armbru@redhat.com>
2020-07-10qom: Put name parameter before value / visitor parameterMarkus Armbruster
The object_property_set_FOO() setters take property name and value in an unusual order: void object_property_set_FOO(Object *obj, FOO_TYPE value, const char *name, Error **errp) Having to pass value before name feels grating. Swap them. Same for object_property_set(), object_property_get(), and object_property_parse(). Convert callers with this Coccinelle script: @@ identifier fun = { object_property_get, object_property_parse, object_property_set_str, object_property_set_link, object_property_set_bool, object_property_set_int, object_property_set_uint, object_property_set, object_property_set_qobject }; expression obj, v, name, errp; @@ - fun(obj, v, name, errp) + fun(obj, name, v, errp) Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error message "no position information". Convert that one manually. Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Convert manually. Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused by RXCPU being used both as typedef and function-like macro there. Convert manually. The other files using RXCPU that way don't need conversion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-27-armbru@redhat.com> [Straightforwad conflict with commit 2336172d9b "audio: set default value for pcspk.iobase property" resolved]
2020-07-10s390x/pci: Fix harmless mistake in zpci's property fid's setterMarkus Armbruster
s390_pci_set_fid() sets zpci->fid_defined to true even when visit_type_uint32() failed. Reproducer: "-device zpci,fid=junk". Harmless in practice, because qdev_device_add() then fails, throwing away @zpci. Fix it anyway. Cc: Matthew Rosato <mjrosato@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200707160613.848843-21-armbru@redhat.com>
2020-07-10qdev: Use returned bool to check for qdev_realize() etc. failureMarkus Armbruster
Convert foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their wrappers isa_realize_and_unref(), pci_realize_and_unref(), sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref(). Coccinelle script: @@ identifier fun = { isa_realize_and_unref, pci_realize_and_unref, qbus_realize, qdev_realize, qdev_realize_and_unref, sysbus_realize, sysbus_realize_and_unref, usb_realize_and_unref }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error message "no position information". Nothing to convert there; skipped. Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200707160613.848843-5-armbru@redhat.com>
2020-07-03s390x/pci: fix set_ind_atomicHalil Pasic
The atomic_cmpxchg() loop is broken because we occasionally end up with old and _old having different values (a legit compiler can generate code that accessed *ind_addr again to pick up a value for _old instead of using the value of old that was already fetched according to the rules of the abstract machine). This means the underlying CS instruction may use a different old (_old) than the one we intended to use if atomic_cmpxchg() performed the xchg part. Let us use volatile to force the rules of the abstract machine for accesses to *ind_addr. Let us also rewrite the loop so, we that the new old is used to compute the new desired value if the xchg part is not performed. Fixes: 8cba80c3a0 ("s390: Add PCI bus support") Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20200616045035.51641-3-pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-07-02qdev: Drop qbus_set_hotplug_handler() parameter @errpMarkus Armbruster
qbus_set_hotplug_handler() is a simple wrapper around object_property_set_link(). object_property_set_link() fails when the property doesn't exist, is not settable, or its .check() method fails. These are all programming errors here, so passing &error_abort to qbus_set_hotplug_handler() is appropriate. Most of its callers do. Exceptions: * pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL, i.e. they ignore errors. * spapr_machine_init() passes &error_fatal. * s390_pcihost_realize(), virtio_serial_device_realize(), s390_pcihost_plug() pass the error to their callers. The latter two keep going after the error, which looks wrong. Drop the @errp parameter, and instead pass &error_abort to object_property_set_link(). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
2020-06-15qdev: Convert uses of qdev_create() manuallyMarkus Armbruster
Same transformation as in the previous commit. Manual, because convincing Coccinelle to transform these cases is somewhere between not worthwhile and infeasible (at least for me). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-11-armbru@redhat.com>
2020-06-15qdev: Convert to qdev_unrealize() with CoccinelleMarkus Armbruster
For readability, and consistency with qbus_realize(). Coccinelle script: @ depends on !(file in "hw/core/qdev.c")@ typedef DeviceState; DeviceState *dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(dev); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(DEVICE(dev)); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-8-armbru@redhat.com>
2020-05-15qdev: Unrealize must not failMarkus Armbruster
Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-02-20Let cpu_[physical]_memory() calls pass a boolean 'is_write' argumentPhilippe Mathieu-Daudé
Use an explicit boolean type. This commit was produced with the included Coccinelle script scripts/coccinelle/exec_rw_const. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-01-24qdev: set properties with device_class_set_props()Marc-André Lureau
The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-30s390: PCI: fix IOMMU region initMatthew Rosato
The fix in dbe9cf606c shrinks the IOMMU memory region to a size that seems reasonable on the surface, however is actually too small as it is based against a 0-mapped address space. This causes breakage with small guests as they can overrun the IOMMU window. Let's go back to the prior method of initializing iommu for now. Fixes: dbe9cf606c ("s390x/pci: Set the iommu region size mpcifc request") Cc: qemu-stable@nongnu.org Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Tested-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reported-by: Stefan Zimmerman <stzi@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <1569507036-15314-1-git-send-email-mjrosato@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-03-06qdev: Let the hotplug_handler_unplug() caller delete the deviceDavid Hildenbrand
When unplugging a device, at one point the device will be destroyed via object_unparent(). This will, one the one hand, unrealize the removed device hierarchy, and on the other hand, destroy/free the device hierarchy. When chaining hotplug handlers, we want to overwrite a bus hotplug handler by the machine hotplug handler, to be able to perform some part of the plug/unplug and to forward the calls to the bus hotplug handler. For now, the bus hotplug handler would trigger an object_unparent(), not allowing us to perform some unplug action on a device after we forwarded the call to the bus hotplug handler. The device would be gone at that point. machine_unplug_handler(dev) /* eventually do unplug stuff */ bus_unplug_handler(dev) /* dev is gone, we can't do more unplug stuff */ So move the object_unparent() to the original caller of the unplug. For now, keep the unrealize() at the original places of the object_unparent(). For implicitly chained hotplug handlers (e.g. pc code calling acpi hotplug handlers), the object_unparent() has to be done by the outermost caller. So when calling hotplug_handler_unplug() from inside an unplug handler, nothing is to be done. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> calls unrealize(dev) /* we can do more unplug stuff but device already unrealized */ } object_unparent(dev) In the long run, every unplug action should be factored out of the unrealize() function into the unplug handler (especially for PCI). Then we can get rid of the additonal unrealize() calls and object_unparent() will properly unrealize the device hierarchy after the device has been unplugged. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> only unplugs, does not unrealize /* we can do more unplug stuff */ } object_unparent(dev) -> will unrealize The original approach was suggested by Igor Mammedov for the PCI part, but I extended it to all hotplug handlers. I consider this one step into the right direction. To summarize: - object_unparent() on synchronous unplugs is done by common code -- "Caller of hotplug_handler_unplug" - object_unparent() on asynchronous unplugs ("unplug requests") has to be done manually -- "Caller of hotplug_handler_unplug" Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-2-david@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-02-17qdev: pass an Object * to qbus_set_hotplug_handler()Michael Roth
Certain devices types, like memory/CPU, are now being handled using a hotplug interface provided by a top-level MachineClass. Hotpluggable host bridges are another such device where it makes sense to use a machine-level hotplug handler. However, unlike those devices, host-bridges have a parent bus (the main system bus), and devices with a parent bus use a different mechanism for registering their hotplug handlers: qbus_set_hotplug_handler(). This interface currently expects a handler to be a subclass of DeviceClass, but this is not the case for MachineClass, which derives directly from ObjectClass. Internally, the interface only requires an ObjectClass, so expose that in qbus_set_hotplug_handler(). Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-05s390x/pci: Unplug remaining requested devices on pcihost resetDavid Hildenbrand
When resetting the guest we should unplug and remove all devices that are still pending. With this patch, the requested device will be unplugged on reboot (S390_RESET_EXTERNAL and S390_RESET_REIPL, which reset the pcihost bridge via qemu_devices_reset()). This approach is similar to what's done for acpi PCI hotplug in acpi_pcihp_reset() -> acpi_pcihp_update() -> acpi_pcihp_update_hotplug_bus() -> acpi_pcihp_eject_slot(). s390_pci_generate_plug_event()'s will still be generated, I guess this is not an issue. The same thing would happen right now when unplugging a device just before starting the guest. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-7-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-05s390x/pci: Warn when adding PCI devices without the 'zpci' featureDavid Hildenbrand
We decided to always create the PCI host bridge, even if 'zpci' is not enabled (due to migration compatibility). This however right now allows to add zPCI/PCI devices to a VM although the guest will never actually see them, confusing people that are using a simple CPU model that has no 'zpci' enabled - "Why isn't this working" (David Hildenbrand) Let's check for 'zpci' and at least print a warning that this will not work as expected. We could also bail out, however that might break existing QEMU commandlines. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-4-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-05s390x/pci: Fix hotplugging of PCI bridgesDavid Hildenbrand
When hotplugging a PCI bridge right now to the root port, we resolve pci_get_bus(pdev)->parent_dev, which results in a SEGFAULT. Hotplugging really only works right now when hotplugging to another bridge. Instead, we have to properly check if we are already at the root. Let's cleanup the code while at it a bit and factor out updating the subordinate bus number into a separate function. The check for "old_nr < nr" is right now not strictly necessary, but makes it more obvious what is actually going on. Most probably fixing up the topology is not our responsibility when hotplugging. The guest has to sort this out. But let's keep it for now and only fix current code to not crash. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-3-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-05s390x/pci: Fix primary bus number for PCI bridgesDavid Hildenbrand
The primary bus number corresponds always to the bus number of the bus the bridge is attached to. Right now, if we have two bridges attached to the same bus (e.g. root bus) this is however not the case. The first bridge will have primary bus 0, the second bridge primary bus 1, which is wrong. Fix the assignment. While at it, drop setting the PCI_SUBORDINATE_BUS temporarily to 0xff. Setting it temporarily to that value (as discussed e.g. in [1]), is only relevant for a running system that probes the buses. The value is effectively unused for us just doing a DFS. Also add a comment why we have to reassign during every reset (which I found to be surprising. Please note that hotplugging of bridges is in general still broken, will be fixed next. [1] http://www.science.unitn.it/~fiorella/guidelinux/tlk/node76.html Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-2-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-04s390x/pci: mark zpci devices as unmigratableCornelia Huck
We currently don't migrate any state for zpci devices, which are coupled with standard pci devices. This means funny things happen when we e.g. try to migrate with a virtio-pci device but the s390x- specific zpci state is not migrated (vfio-pci is not affected, as it is not migratable anyway.) Until this is fixed, mark zpci devices as unmigratable. Reported-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-04s390x/pci: Drop release timer and replace it with a flagDavid Hildenbrand
Let's handle it similar to x86 ACPI PCI code and don't use a timer. Instead, remember if an unplug request is pending and keep it pending for eternity. (a follow up patch will process the request on reboot). We expect that a guest that is up and running, will process the unplug request and trigger the unplug. This is normal operation, no timer needed. If the guest does not react, this usually means something in the guest is going wrong. Simply removing the device after 30 seconds does not really sound like a good idea. It might sometimes be wanted, but I consider this rather an "opt-in" decision as it might harm a guest not prepared for it. If we ever actually want a "forced/surprise removal", we will have to implement something on top of the existing "device_del" framework. E.g. also x86 might want to do a forced/surprise removal of PCI devices under some conditions. "device_del X, forced=true" could be an option and will require changes to the hotplug handler infrastructure. This will then move the responsibility on when to do a forced removal to a higher level. Doing a forced removal right now over-complicates things and doesn't really seem to be required. Let's allow to send multiple requests. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-6-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-02-04s390x/pci: Introduce unplug requests and split unplug handlerDavid Hildenbrand
PCI on s390x is really weird and how it was modeled in QEMU might not have been the right choice. Anyhow, right now it is the case that: - Hotplugging a PCI device will silently create a zPCI device (if none is provided) - Hotunplugging a zPCI device will unplug the PCI device (if any) - Hotunplugging a PCI device will unplug also the zPCI device As far as I can see, we can no longer change this behavior. But we should fix it. Both device types are handled via a single hotplug handler call. This is problematic for various reasons: 1. Unplugging via the zPCI device allows to unplug devices that are not hot removable. (check performed in qdev_unplug()) - bad. 2. Hotplug handler chains are not possible for the unplug case. In the future, the machine might want to override hotplug handlers, to process device specific stuff and to then branch off to the actual hotplug handler. We need separate hotplug handler calls for both the PCI and zPCI device to make this work reliably. All other PCI implementations are already prepared to handle this correctly, only s390x is missing. Therefore, introduce the unplug_request handler and properly perform unplug checks by redirecting to the separate unplug_request handlers. When finally unplugging, perform two separate hotplug_handler_unplug() calls, first for the PCI device, followed by the zPCI device. This now nicely splits unplugging paths for both devices. The redirect part is a little hairy, as the user is allowed to trigger unplug either via the PCI or the zPCI device. So redirect always to the PCI unplug request handler first and remember if that check has been performed in the zPCI device. Redirect then to the zPCI device unplug request handler to perform the magic. Remembering that we already checked the PCI device breaks the redirect loop. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190130155733.32742-5-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-01-18s390x/pci: add common function measurement blockYi Min Zhao
Common function measurement block is used to report zPCI internal counters of successful pcilg/stg/stb and rpcit instructions to a memory location provided by the program. This patch introduces a new ZpciFmb structure and schedules a timer callback to copy the zPCI measures to the FMB in the guest memory at an interval time set to 4s. An error while attemping to update the FMB, would generate an error event to the guest. The pcilg/stg/stb and rpcit interception handlers increase the related counter on a successful call. The guest shall pass a null FMBA (FMB address) in the FIB (Function Information Block) when it issues a Modify PCI Function Control instruction to switch off FMB and stop the corresponding timer. Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <1546969050-8884-2-git-send-email-pmorel@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-01-18s390x/pci: Ignore the unplug call if we already have a release_timerDavid Hildenbrand
... otherwise two successive calls to qdev_unplug() (e.g. by an impatient user) will effectively overwrite pbdev->release_timer, resulting in a memory leak. We are already processing the unplug. If there is already a release_timer, the unplug will be performed after the timeout. Can be easily triggered by (hmp) device_add virtio-mouse-pci,id=test (hmp) stop (hmp) device_del test (hmp) device_del test Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190114103110.10909-5-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-01-18s390x/pci: Always delete and free the release_timerDavid Hildenbrand
We should always get rid of it. I don't see a reason to keep the timer alive if the devices are going away. This looks like a memory leak. (hmp) device_add virtio-mouse-pci,id=test (hmp) device_del test -> guest notified, timer pending. -> guest does not react for some reason (e.g. crash) -> s390_pcihost_timer_cb(). Timer not pending anymore. qmp_unplug(). -> Device deleted. Timer expired (not pending) but not freed. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190114103110.10909-4-david@redhat.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-01-18s390x/pci: Move some hotplug checks to the pre_plug handlerDavid Hildenbrand
Let's move most of the checks to the new pre_plug handler. As a PCI bridge is just a PCI device, we can simplify the code. Notes: We cannot yet move the MSIX check or device ID creation + zPCI device creation to the pre_plug handler as both parts are not fixed before actual device realization (and therefore after pre_plug and before plug). Once that part is factored out, we can move these parts to the pre_plug handler, too and therefore remove all possible errors from the plug handler. Reviewed-by: Collin Walling <walling@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190114103110.10909-3-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>