aboutsummaryrefslogtreecommitdiff
path: root/hw/intc/xics_kvm.c
AgeCommit message (Collapse)Author
2022-04-06Remove qemu-common.h include from most unitsMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-02Do not include cpu.h if it's not really necessaryThomas Huth
Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-12-14spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()Greg Kurz
Never used from the start. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201120174646.619395-6-groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-07-10error: Eliminate error_propagate() with Coccinelle, part 1Markus Armbruster
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) | - !fun(args, &err, args2) + !fun(args, errp, args2) | - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var | !var | var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; | return var; ) } @depends on rule1 || rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2019-12-17spapr/xics: Configure number of servers in KVMGreg Kurz
The XICS-on-XIVE KVM devices now has an attribute to configure the number of interrupt servers. This allows to greatly optimize the usage of the VP space in the XIVE HW, and thus to start a lot more VMs. Only set this attribute if available in order to support older POWER9 KVM and pre-POWER9 XICS KVM devices. The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the connection of the first VCPU. Check that in order to have a chance to provide a hint to the user. ` Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157478678846.67101.9660531022460517710.stgit@bahia.tlslab.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17spapr: Pass the maximum number of vCPUs to the KVM interrupt controllerGreg Kurz
The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce their consumption of some scarce HW resources, namely Virtual Presenter identifiers, if they know the maximum number of vCPUs that may run in the VM. Prepare ground for this by passing the value down to xics_kvm_connect() and kvmppc_xive_connect(). This is purely mechanical, no functional change. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157478678301.67101.2717368060417156338.stgit@bahia.tlslab.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24spapr, xics, xive: Match signatures for XICS and XIVE KVM connect routinesDavid Gibson
Both XICS and XIVE have routines to connect and disconnect KVM with similar but not identical signatures. This adjusts them to match exactly, which will be useful for further cleanups later. While we're there, we add an explicit return value to the connect path to streamline error reporting in the callers. We remove error reporting the disconnect path. In the XICS case this wasn't used at all. In the XIVE case the only error case was if the KVM device was set up, but KVM didn't have the capability to do so which is pretty obviously impossible. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-04spapr/irq: Only claim VALID interrupts at the KVM levelCédric Le Goater
A typical pseries VM with 16 vCPUs, one disk, one network adapater uses less than 100 interrupts but the whole IRQ number space of the QEMU machine is allocated at reset time and it is 8K wide. This is wasting a considerable amount of interrupt numbers in the global IRQ space which has 1M interrupts per socket on a POWER9. To optimise the HW resources, only request at the KVM level interrupts which have been claimed by the guest. This will help to increase the maximum number of VMs per system and also help supporting nested guests using the XIVE interrupt mode. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190911133937.2716-3-clg@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156942766014.1274533.10792048853177121231.stgit@bahia.lan> [dwg: Folded in fix up from Greg Kurz] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-07-28xics/kvm: Fix fallback to emulated XICSGreg Kurz
Commit 4812f2615288 tried to fix rollback path of xics_kvm_connect() but it isn't enough. If we fail to create the KVM device, the guest fails to boot later on with: [ 0.010817] pci 0000:00:00.0: Adding to iommu group 0 [ 0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping (rc=-22) [ 0.010923] pci 0000:00:01.0: Adding to iommu group 0 [ 0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping (rc=-22) [ 0.011543] EEH: No capable adapters found [ 0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping (rc=-22) [ 0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized audit_enabled=0 res=1 [ 0.011703] ------------[ cut here ]------------ [ 0.011729] event-sources: Unable to allocate interrupt number for /event-sources/epow-events [ 0.011776] WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/pseries/event_sources.c:34 request_event_sources_irqs+0xbc/0x150 [ 0.011828] Modules linked in: [ 0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.17-300.fc30.ppc64le #1 [ 0.011886] NIP: c0000000000d4fac LR: c0000000000d4fa8 CTR: c0000000018f0000 [ 0.011923] REGS: c00000001e4c38d0 TRAP: 0700 Not tainted (5.1.17-300.fc30.ppc64le) [ 0.011966] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000284 XER: 20040000 [ 0.012012] CFAR: c00000000011b42c IRQMASK: 0 [ 0.012012] GPR00: c0000000000d4fa8 c00000001e4c3b60 c0000000015fc400 0000000000000051 [ 0.012012] GPR04: 0000000000000001 0000000000000000 0000000000000081 772d6576656e7473 [ 0.012012] GPR08: 000000001edf0000 c0000000014d4830 c0000000014d4830 6e6576652f20726f [ 0.012012] GPR12: 0000000000000000 c0000000018f0000 c000000000010bf0 0000000000000000 [ 0.012012] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.012012] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.012012] GPR24: 0000000000000000 0000000000000000 c000000000ebbf00 c0000000000d5570 [ 0.012012] GPR28: c000000000ebc008 c00000001fff8248 0000000000000000 0000000000000000 [ 0.012372] NIP [c0000000000d4fac] request_event_sources_irqs+0xbc/0x150 [ 0.012409] LR [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 [ 0.012445] Call Trace: [ 0.012462] [c00000001e4c3b60] [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 (unreliable) [ 0.012513] [c00000001e4c3bf0] [c000000001042848] __machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8 [ 0.012563] [c00000001e4c3c20] [c000000000010810] do_one_initcall+0x60/0x254 [ 0.012611] [c00000001e4c3cf0] [c000000001024538] kernel_init_freeable+0x35c/0x444 [ 0.012655] [c00000001e4c3db0] [c000000000010c14] kernel_init+0x2c/0x148 [ 0.012693] [c00000001e4c3e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78 [ 0.012736] Instruction dump: [ 0.012759] 38a00000 7c7f1b78 7f64db78 2c1f0000 2fbf0000 78630020 4180002c 409effa8 [ 0.012805] 7fa4eb78 7f43d378 48046421 60000000 <0fe00000> 3bde0001 2c1e0010 7fde07b4 [ 0.012851] ---[ end trace aa5785707323fad3 ]--- This happens because QEMU fell back on XICS emulation but didn't unregister the RTAS calls from KVM. The emulated RTAS calls are hence never called and the KVM ones return an error to the guest since the KVM device is absent. The sanity checks in xics_kvm_disconnect() are abusive since we're freeing the KVM device. Simply drop them. Fixes: 4812f2615288 "xics/kvm: Add proper rollback to xics_kvm_init()" Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156398744035.546975.7029414194633598474.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-12xics/kvm: Always set the MASKED bit if interrupt is maskedGreg Kurz
The ics_set_kvm_state_one() function is called either to restore the state of an interrupt source during migration or to set the interrupt source to a default state during reset. Since always, ie. 2013, the code only sets the MASKED bit if the 'current priority' and the 'saved priority' are different. This is likely true when restoring an interrupt that had been previously masked with the ibm,int-off RTAS call. However this is always false in the case of reset since both 'current priority' and 'saved priority' are equal to 0xff, and the MASKED bit is never set. The legacy KVM XICS device gets away with that because it ends updating its internal structure the same way, whether the MASKED bit is set or the priority is 0xff. The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM documentation [1] and _really_ relies on the MASKED bit to correctly set. If not, it will configure the interrupt source in the XIVE HW, even though the guest hasn't configured the interrupt yet. This disturbs the complex logic implemented in XICS-on-XIVE and may result in the loss of subsequent queued events. Always set the MASKED bit if interrupt is masked as expected by the KVM XICS-on-XIVE device. This has no impact on the legacy KVM XICS. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virtual/kvm/devices/xics.txt Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156217454083.559957.7359208229523652842.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/kvm: Add proper rollback to xics_kvm_init()Greg Kurz
Make xics_kvm_disconnect() able to undo the changes of a partial execution of xics_kvm_connect() and use it to perform rollback. Note that kvmppc_define_rtas_kernel_token(0) never fails, no matter the RTAS call has been defined or not. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156077922319.433243.609897156640506891.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/kvm: Add error propagation to ic*_set_kvm_state() functionsGreg Kurz
This allows errors happening there to be propagated up to spapr_irq, just like XIVE already does. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156077921763.433243.4614327010172954196.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/kvm: Always use local_err in xics_kvm_init()Greg Kurz
Passing both errp and &local_err to functions is a recipe for messing things up. Since we must use &local_err for icp_kvm_realize(), use &local_err everywhere where rollback must happen and have a single call to error_propagate() them all. While here, add errno to the error message. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156077921212.433243.11716701611944816815.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/kvm: Skip rollback when KVM XICS is absentGreg Kurz
There is no need to rollback anything at this point, so just return an error. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156077920657.433243.13541093940589972734.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/spapr: Rename xics_kvm_init()Greg Kurz
Switch to using the connect/disconnect terminology like we already do for XIVE. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156077920102.433243.6605099291134598170.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/spapr: Detect old KVM XICS on POWER9 hostsGreg Kurz
Older KVMs on POWER9 don't support destroying/recreating a KVM XICS device, which is required by 'dual' interrupt controller mode. This causes QEMU to emit a warning when the guest is rebooted and to fall back on XICS emulation: qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: Error on KVM_CREATE_DEVICE for XICS: File exists If kernel irqchip is required, QEMU will thus exit when the guest is first rebooted. Failing QEMU this late may be a painful experience for the user. Detect that and exit at machine init instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156044430517.125694.6207865998817342638.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02xics/spapr: Register RTAS/hypercalls once at machine initGreg Kurz
QEMU may crash when running a spapr machine in 'dual' interrupt controller mode on some older (but not that old, eg. ubuntu 18.04.2) KVMs with partial XIVE support: qemu-system-ppc64: hw/ppc/spapr_rtas.c:411: spapr_rtas_register: Assertion `!name || !rtas_table[token].name' failed. XICS is controlled by the guest thanks to a set of RTAS calls. Depending on whether KVM XICS is used or not, the RTAS calls are handled by KVM or QEMU. In both cases, QEMU needs to expose the RTAS calls to the guest through the "rtas" node of the device tree. The spapr_rtas_register() helper takes care of all of that: it adds the RTAS call token to the "rtas" node and registers a QEMU callback to be invoked when the guest issues the RTAS call. In the KVM XICS case, QEMU registers a dummy callback that just prints an error since it isn't supposed to be invoked, ever. Historically, the XICS controller was setup during machine init and released during final teardown. This changed when the 'dual' interrupt controller mode was added to the spapr machine: in this case we need to tear the XICS down and set it up again during machine reset. The crash happens because we indeed have an incompatibility with older KVMs that forces QEMU to fallback on emulated XICS, which tries to re-registers the same RTAS calls. This could be fixed by adding proper rollback that would unregister RTAS calls on error. But since the emulated RTAS calls in QEMU can now detect when they are mistakenly called while KVM XICS is in use, it seems simpler to register them once and for all at machine init. This fixes the crash and allows to remove some now useless lines of code. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156044429963.125694.13710679451927268758.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/irq: add KVM support to the 'dual' machineCédric Le Goater
The interrupt mode is chosen by the CAS negotiation process and activated after a reset to take into account the required changes in the machine. This brings new constraints on how the associated KVM IRQ device is initialized. Currently, each model takes care of the initialization of the KVM device in their realize method but this is not possible anymore as the initialization needs to be done globaly when the interrupt mode is known, i.e. when machine is reseted. It also means that we need a way to delete a KVM device when another mode is chosen. Also, to support migration, the QEMU objects holding the state to transfer should always be available but not necessarily activated. The overall approach of this proposal is to initialize both interrupt mode at the QEMU level to keep the IRQ number space in sync and to allow switching from one mode to another. For the KVM side of things, the whole initialization of the KVM device, sources and presenters, is grouped in a single routine. The XICS and XIVE sPAPR IRQ reset handlers are modified accordingly to handle the init and the delete sequences of the KVM device. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-15-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: check for the activation of the KVM IRQ deviceCédric Le Goater
The activation of the KVM IRQ device depends on the interrupt mode chosen at CAS time by the machine and some methods used at reset or by the migration need to be protected. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-11-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: introduce routines to delete the KVM IRQ deviceCédric Le Goater
If a new interrupt mode is chosen by CAS, the machine generates a reset to reconfigure. At this point, the connection with the previous KVM device needs to be closed and a new connection needs to opened with the KVM device operating the chosen interrupt mode. New routines are introduced to destroy the XICS and the XIVE KVM devices. They make use of a new KVM device ioctl which destroys the device and also disconnects the IRQ presenters from the vCPUs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-10-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12spapr: Use CamelCase properlyDavid Gibson
The qemu coding standard is to use CamelCase for type and structure names, and the pseries code follows that... sort of. There are quite a lot of places where we bend the rules in order to preserve the capitalization of internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR". That was a bad idea - it frequently leads to names ending up with hard to read clusters of capital letters, and means they don't catch the eye as type identifiers, which is kind of the point of the CamelCase convention in the first place. In short, keeping type identifiers look like CamelCase is more important than preserving standard capitalization of internal "words". So, this patch renames a heap of spapr internal type names to a more standard CamelCase. In addition to case changes, we also make some other identifier renames: VIOsPAPR* -> SpaprVio* The reverse word ordering was only ever used to mitigate the capital cluster, so revert to the natural ordering. VIOsPAPRVTYDevice -> SpaprVioVty VIOsPAPRVLANDevice -> SpaprVioVlan Brevity, since the "Device" didn't add useful information sPAPRDRConnector -> SpaprDrc sPAPRDRConnectorClass -> SpaprDrcClass Brevity, and makes it clearer this is the same thing as a "DRC" mentioned in many other places in the code This is 100% a mechanical search-and-replace patch. It will, however, conflict with essentially any and all outstanding patches touching the spapr code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26xics: Write source state to KVM at claim timeGreg Kurz
The pseries machine only uses LSIs to support legacy PCI devices. Every PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming in-kernel XIVE), QEMU synchronizes the state of all irqs, including these LSIs, later on at machine reset. In order to support PHB hotplug, we need a way to tell KVM about the LSIs that doesn't require a machine reset. An easy way to do that is to always inform KVM when an interrupt is claimed, which really isn't a performance path. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155059668360.1466090.5969630516627776426.stgit@bahia.lab.toulouse-stg.fr.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Drop the KVM ICS classGreg Kurz
The KVM ICS class isn't used anymore. Drop it. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023084177.1011724.14693955932559990358.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Handle KVM interrupt presentation from "simple" ICS codeGreg Kurz
We want to use the "simple" ICS type in both KVM and non-KVM setups. Teach the "simple" ICS how to present interrupts to KVM and adapt sPAPR accordingly. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023082996.1011724.16237920586343905010.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Handle KVM ICS reset from the "simple" ICS codeGreg Kurz
The KVM ICS reset handler simply writes the ICS state to KVM. This doesn't need the overkill parent_reset logic we have today. Also we want to use the same ICS type for the KVM and non-KVM case with pseries. Call icp_set_kvm_state() from the "simple" ICS reset function. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023082407.1011724.1983100830860273401.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Explicitely call KVM ICS methods from the common codeGreg Kurz
The pre_save(), post_load() and synchronize_state() methods of the ICSStateClass type are really KVM only things. Make that obvious by dropping the indirections and directly calling the KVM functions instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023081817.1011724.14078777320394028836.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Drop the KVM ICP classGreg Kurz
The KVM ICP class isn't used anymore. Drop it. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023081228.1011724.12474992370439652538.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Handle KVM ICP realize from the common codeGreg Kurz
The realization of KVM ICP currently follows the parent_realize logic, which is a bit overkill here. Also we want to get rid of the KVM ICP class. Explicitely call icp_kvm_realize() from the base ICP realize function. Note that ICPStateClass::parent_realize is retained because powernv needs it. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023080049.1011724.15423463482790260696.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Handle KVM ICP reset from the common codeGreg Kurz
The KVM ICP reset handler simply writes the ICP state to KVM. This doesn't need the overkill parent_reset logic we have today. Call icp_set_kvm_state() from the base ICP reset function instead. Since there are no other users for ICPStateClass::parent_reset, and it isn't currently expected to change, drop it as well. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023079461.1011724.12644984391500635645.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18xics: Explicitely call KVM ICP methods from the common codeGreg Kurz
The pre_save(), post_load() and synchronize_state() methods of the ICPStateClass type are really KVM only things. Make that obvious by dropping the indirections and directly calling the KVM functions instead. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155023078871.1011724.3083923389814185598.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-22ppc: Move spapr-related prototypes from xics.h into a seperate header fileThomas Huth
When compiling with Clang in -std=gnu99 mode, there is a warning/error: CC ppc64-softmmu/hw/intc/xics_spapr.o In file included from /home/thuth/devel/qemu/hw/intc/xics_spapr.c:34: /home/thuth/devel/qemu/include/hw/ppc/xics.h:203:34: error: redefinition of typedef 'sPAPRMachineState' is a C11 feature [-Werror,-Wtypedef-redefinition] typedef struct sPAPRMachineState sPAPRMachineState; ^ /home/thuth/devel/qemu/include/hw/ppc/spapr_irq.h:25:34: note: previous definition is here typedef struct sPAPRMachineState sPAPRMachineState; ^ We have to remove the duplicated typedef here and include "spapr.h" instead. But "spapr.h" should not be included for the pnv machine files. So move the spapr-related prototypes into a new file called "xics_spapr.h" instead. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2019-01-09spapr: move the qemu_irq array under the machineCédric Le Goater
The qemu_irq array is now allocated at the machine level using a sPAPR IRQ set_irq handler depending on the chosen interrupt mode. The use of this handler is slightly inefficient today but it will become necessary when the 'dual' interrupt mode is introduced. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09ppc: export the XICS and XIVE set_irq handlersCédric Le Goater
To support the 'dual' interrupt mode, XICS and XIVE, we plan to move the qemu_irq array of each interrupt controller under the machine and do the allocation under the sPAPR IRQ init method. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-10-19Use error_fatal to simplify obvious fatal errors (again)Markus Armbruster
Add a slight improvement of the Coccinelle semantic patch from commit 007b06578ab, and use it to clean up. It leaves dead Error * variables behind, cleaned up manually. Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Eric Blake <eblake@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20181017082702.5581-3-armbru@redhat.com>
2018-07-03ppc/xics: rework the ICS classes inheritance treeCédric Le Goater
With the previous changes, we can now let the ICS_KVM class inherit directly from ICS_BASE class and not from the intermediate ICS_SIMPLE. It makes the class hierarchy much cleaner. What is left in the top classes is the low level interface to access the KVM XICS device in ICS_KVM and the XICS emulating handlers in ICS_SIMPLE. This should not break migration compatibility. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03ppx/xics: introduce a parent_reset in ICSStateClassCédric Le Goater
Just like for the realize handlers, this makes possible to move the common ICSState code of the reset handlers in the ics-base class. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03ppc/xics: introduce a parent_realize in ICSStateClassCédric Le Goater
This makes possible to move the common ICSState code of the realize handlers in the ics-base class. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03ppc/xics: introduce ICP DeviceRealize and DeviceReset handlersCédric Le Goater
This changes the ICP realize and reset handlers in DeviceRealize and DeviceReset handlers. parent handlers are now called from the inheriting classes which is a cleaner object pattern. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-16xics_kvm: fix a build breakCédric Le Goater
On CentOS 7.5, gcc-4.8.5-28.el7_5.1.ppc64le fails to build QEMU due to : hw/intc/xics_kvm.c: In function ‘ics_set_kvm_state’: hw/intc/xics_kvm.c:281:13: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized] return ret; Fix the breakage and also remove the extra error reporting as kvm_device_access() already provides a substantial error message. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12xics_kvm: use KVM helpersCédric Le Goater
The KVM helpers hide the low level interface used to communicate to the XICS KVM device and provide a good cleanup to the XICS KVM models. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-14xics/kvm: synchonize state before 'info pic'Greg Kurz
When using the emulated XICS, the 'info pic' monitor command shows: CPU 0 XIRR=ff000000 ((nil)) PP=ff MFRR=ff ICS 1000..13ff 0x10040060340 1000 MSI 05 00 1001 MSI 05 00 1002 MSI 05 00 1003 MSI ff 00 1004 LSI ff 00 1005 LSI ff 00 1006 LSI ff 00 1007 LSI ff 00 1008 MSI 05 00 1009 MSI 05 00 100a MSI 05 00 100b MSI 05 00 100c MSI 05 00 but when using the in-kernel XICS with the very same guest, we get: CPU 0 XIRR=00000000 ((nil)) PP=ff MFRR=ff ICS 1000..13ff 0x10032e00340 1000 MSI ff 00 1001 MSI ff 00 1002 MSI ff 00 1003 MSI ff 00 1004 LSI ff 00 1005 LSI ff 00 1006 LSI ff 00 1007 LSI ff 00 1008 MSI ff 00 1009 MSI ff 00 100a MSI ff 00 100b MSI ff 00 100c MSI ff 00 ie, all irqs are masked and XIRR is null, while we should get the same output as with the emulated XICS. If the guest is then migrated, 'info pic' shows the expected values on both source and destination. The problem is that QEMU doesn't synchronize with KVM before printing the XICS state. Migration happens to fix the output because it enforces synchronization with KVM. To fix the invalid output of 'info pic', this patch introduces a new synchronize_state operation for both ICPStateClass and ICSStateClass. The ICP operation relies on run_on_cpu() in order to kick the vCPU and avoid sleeping on KVM_GET_ONE_REG. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09xics: drop ICPStateClass::cpu_setup() handlerGreg Kurz
The cpu_setup() handler is only implemented by xics_kvm, where it really does a typical "realize" job. Moreover, the realize() handler is called shortly after cpu_setup(), on the same path. This patch converts xics_kvm to implement realize() instead of cpu_setup(). Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09xics: pass appropriate types to realize() handlers.Greg Kurz
It makes more sense to pass an IPCState * to handlers of ICPStateClass instead of a DeviceState *, if only to benefit from compile time type checking. The same goes with ICSStateClass. While here, we also change the declaration of ICPStateClass in xics.h for consistency. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08xics: add reset() handler to ICPStateClassGreg Kurz
Taking into account that qemu_set_irq() returns immediatly if its first argument is NULL, icp_kvm_reset() largely duplicates icp_reset(). This patch introduces a reset() handler, so that the common logic can be implemented in icp_reset() only. While there we can also drop icp_kvm_realize() and icp_kvm_unrealize(). This causes icp-kvm to be realized in icp_realize(), which sets icp->xics, but it has no impact. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25xics: add unrealize handlerGreg Kurz
Now that ICPState objects get finalized on CPU unplug, we should unregister reset handlers as well to avoid a QEMU crash at machine reset time. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24xics_kvm: cache already enabled vCPU idsGreg Kurz
Since commit a45863bda90d ("xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled"), we were able to re-hotplug a vCPU that had been hot- unplugged ealier, thanks to a boolean flag in ICPState that we set when enabling KVM_CAP_IRQ_XICS. This could work because the lifecycle of all ICPState objects was the same as the machine. Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") broke this assumption and now we always pass a freshly allocated ICPState object (ie, with the flag unset) to icp_kvm_cpu_setup(). This cause re-hotplug to fail with: Unable to connect CPU8 to kernel XICS: Device or resource busy Let's fix this by caching all the vCPU ids for which KVM_CAP_IRQ_XICS was enabled. This also drops the now useless boolean flag from ICPState. Reported-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Tested-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11ppc/xics: preserve P and Q bits for KVM IRQsSam Bobroff
Kernel commit 17d48610ae0f ("KVM: PPC: Book 3S: XICS: Implement ICS P/Q states") added new bits to the state used by KVM IRQs. Currently, QEMU does not preserve these bits, so migrating (or otherwise saving and restoring) the guest state causes the P and Q bits to be cleared. Clearing the P bit has no effect, because the kernel will set it based on other data, but the loss of a set Q bit will cause a lost interrupt. This patch preserves the P and Q bits, correcting the problem. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11ppc/xics: Fix stale irq->status bits after getSam Bobroff
ics_get_kvm_state() "or"s set bits into irq->status but does not mask out clear bits. Correct this by initializing the IRQ status to zero before adding bits to it. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06ppc/xics: register reset handlers for the ICP and ICS objectsCédric Le Goater
The recent changes on the XICS layer removed the XICSState object to let the sPAPR machine handle the ICP and ICS directly. The reset of these objects was previously handled by XICSState, which was a SysBus device, and to keep the same behavior, the ICP and ICS were assigned to SysbBus. But that broke the 'info qtree' command in the monitor. 'qtree' performs a loop on the children of a bus to print their properties and SysBus devices are expected to be found under SysBus, which is not the case anymore. The fix for this problem is to register reset handlers for the ICP and ICS objects and stop using SysBus for such devices. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>