aboutsummaryrefslogtreecommitdiff
path: root/hw/intc
AgeCommit message (Collapse)Author
2019-10-24spapr, xics, xive: Move irq claim and free from SpaprIrq to ↵David Gibson
SpaprInterruptController These methods, like cpu_intc_create, really belong to the interrupt controller, but need to be called on all possible intcs. Like cpu_intc_create, therefore, make them methods on the intc and always call it for all existing intcs. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24spapr, xics, xive: Move cpu_intc_create from SpaprIrq to ↵David Gibson
SpaprInterruptController This method essentially represents code which belongs to the interrupt controller, but needs to be called on all possible intcs, rather than just the currently active one. The "dual" version therefore calls into the xics and xive versions confusingly. Handle this more directly, by making it instead a method on the intc backend, and always calling it on every backend that exists. While we're there, streamline the error reporting a bit. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24spapr, xics, xive: Introduce SpaprInterruptController QOM interfaceDavid Gibson
The SpaprIrq structure is used to represent ths spapr machine's irq backend. Except that it kind of conflates two concepts: one is the backend proper - a specific interrupt controller that we might or might not be using, the other is the irq configuration which covers the layout of irq space and which interrupt controllers are allowed. This leads to some pretty confusing code paths for the "dual" configuration where its hooks redirect to other SpaprIrq structures depending on the currently active irq controller. To clean this up, we start by introducing a new SpaprInterruptController QOM interface to represent strictly an interrupt controller backend, not counting anything configuration related. We implement this interface in the XICs and XIVE interrupt controllers, and in future we'll move relevant methods from SpaprIrq into it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24ppc/pnv: Improve trigger data definitionCédric Le Goater
The trigger data is used for both triggers of a HW source interrupts, PHB, PSI, and triggers for rerouting interrupts between interrupt controllers. When an interrupt is rerouted, the trigger data follows an "END trigger" format. In that case, the remote IC needs EAS containing an END index to perform a lookup of an END. An END trigger, bit0 of word0 set to '1', is defined as : |0123|4567|0123|4567|0123|4567|0123|4567| W0 E=1 |1P--|BLOC| END IDX | W1 E=1 |M | END DATA | An EAS is defined as : |0123|4567|0123|4567|0123|4567|0123|4567| W0 |V---|BLOC| END IDX | W1 |M | END DATA | The END trigger adds an extra 'PQ' bit, bit1 of word0 set to '1', signaling that the PQ bits have been checked. That bit is unused in the initial EAS definition. When a HW device performs the trigger, the trigger data follows an "EAS trigger" format because the trigger data in that case contains an EAS index which the IC needs to look for. An EAS trigger, bit0 of word0 set to '0', is defined as : |0123|4567|0123|4567|0123|4567|0123|4567| W0 E=0 |0P--|---- ---- ---- ---- ---- ---- ----| W1 E=0 |BLOC| EAS INDEX | There is also a 'PQ' bit, bit1 of word0 to '1', signaling that the PQ bits have been checked. Introduce these new trigger bits and rename the XIVE_SRCNO macros in XIVE_EAS to reflect better the nature of the data. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191007084102.29776-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24xics: Make some device types not user creatableGreg Kurz
Some device types of the XICS model are exposed to the QEMU command line: $ ppc64-softmmu/qemu-system-ppc64 -device help | grep ic[sp] name "icp" name "ics" name "ics-spapr" name "pnv-icp", desc "PowerNV ICP" These are internal devices that shouldn't be instantiable by the user. By the way, they can't be because their respective realize functions expect link properties that can't be set from the command line: qemu-system-ppc64: -device icp: required link 'xics' not found: Property '.xics' not found qemu-system-ppc64: -device ics: required link 'xics' not found: Property '.xics' not found qemu-system-ppc64: -device ics-spapr: required link 'xics' not found: Property '.xics' not found qemu-system-ppc64: -device pnv-icp: required link 'xics' not found: Property '.xics' not found Hide them by setting dc->user_creatable to false in the base class "icp" and "ics" init functions. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157017826724.337875.14822177178282524024.stgit@bahia.lan> Message-Id: <157045578962.865784.8551555523533955113.stgit@bahia.lan> [dwg: Folded reason comment into base patch] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24xive: Make some device types not user creatableGreg Kurz
Some device types of the XIVE model are exposed to the QEMU command line: $ ppc64-softmmu/qemu-system-ppc64 -device help | grep xive name "xive-end-source", desc "XIVE END Source" name "xive-source", desc "XIVE Interrupt Source" name "xive-tctx", desc "XIVE Interrupt Thread Context" These are internal devices that shouldn't be instantiable by the user. By the way, they can't be because their respective realize functions expect link properties that can't be set from the command line: qemu-system-ppc64: -device xive-source: required link 'xive' not found: Property '.xive' not found qemu-system-ppc64: -device xive-end-source: required link 'xive' not found: Property '.xive' not found qemu-system-ppc64: -device xive-tctx: required link 'cpu' not found: Property '.cpu' not found Hide them by setting dc->user_creatable to false in their respective class init functions. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157017473006.331610.2983143972519884544.stgit@bahia.lan> Message-Id: <157045578401.865784.6058183726552779559.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> [dwg: Folded comment update into base patch] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-22hw/intc/apic: reject pic ints if isa_pic == NULLSergio Lopez
In apic_accept_pic_intr(), reject PIC interruptions if a i8259 PIC has not been instantiated (isa_pic == NULL). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-22hw/i386: split PCMachineState deriving X86MachineState from itPaolo Bonzini
Split up PCMachineState and PCMachineClass and derive X86MachineState and X86MachineClass from them. This allows sharing code with non-PC x86 machine types. Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-15hw/arm/bcm2835_peripherals: Improve loggingPhilippe Mathieu-Daudé
Various logging improvements as once: - Use 0x prefix for hex numbers - Display value written during write accesses - Move some logs from GUEST_ERROR to UNIMP Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Cleber Rosa <crosa@redhat.com> Message-id: 20190926173428.10713-3-f4bug@amsat.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-15intc/arm_gic: Support IRQ injection for more than 256 vpusEric Auger
Host kernels that expose the KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 capability allow injection of interrupts along with vcpu ids larger than 255. Let's encode the vpcu id on 12 bits according to the upgraded KVM_IRQ_LINE ABI when needed. Given that we have two callsites that need to assemble the value for kvm_set_irq(), a new helper routine, kvm_arm_set_irq is introduced. Without that patch qemu exits with "kvm_set_irq: Invalid argument" message. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-id: 20191003154640.22451-3-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-04xive: Improve irq claim/free pathDavid Gibson
spapr_xive_irq_claim() returns a bool to indicate if it succeeded. But most of the callers and one callee use int return values and/or an Error * with more information instead. In any case, ints are a more common idiom for success/failure states than bools (one never knows what sense they'll be in). So instead change to an int return value to indicate presence of error + an Error * to describe the details through that call chain. It also didn't actually check if the irq was already claimed, which is one of the primary purposes of the claim path, so do that. spapr_xive_irq_free() also returned a bool... which no callers checked and was always true, so just drop it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04spapr, xics, xive: Better use of assert()s on irq claim/free pathsDavid Gibson
The irq claim and free paths for both XICS and XIVE check for some validity conditions. Some of these represent genuine runtime failures, however others - particularly checking that the basic irq number is in a sane range - could only fail in the case of bugs in the callin code. Therefore use assert()s instead of runtime failures for those. In addition the non backend-specific part of the claim/free paths should only be used for PAPR external irqs, that is in the range SPAPR_XIRQ_BASE to the maximum irq number. Put assert()s for that into the top level dispatchers as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04spapr: Eliminate SpaprIrq:get_nodename methodDavid Gibson
This method is used to determine the name of the irq backend's node in the device tree, so that we can find its phandle (after SLOF may have modified it from the phandle we initially gave it). But, in the two cases the only difference between the node name is the presence of a unit address. Searching for a node name without considering unit address is standard practice for the device tree, and fdt_subnode_offset() will do exactly that, making this method unecessary. While we're there, remove the XICS_NODENAME define. The name "interrupt-controller" is required by PAPR (and IEEE1275), and a bunch of places assume it already. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04xics: Create sPAPR specific ICS subtypeDavid Gibson
We create a subtype of TYPE_ICS specifically for sPAPR. For now all this does is move the setup of the PAPR specific hcalls and RTAS calls to the realize() function for this, rather than requiring the PAPR code to explicitly call xics_spapr_init(). In future it will have some more function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classesDavid Gibson
TYPE_ICS_SIMPLE is the only subtype of TYPE_ICS_BASE that's ever instantiated. The existence of different classes is mostly a hang over from when we (misguidedly) had separate subtypes for the KVM and non-KVM version of the device. There could be some call for an abstract base type for ICS variants that use a different representation of their state (PowerNV PHB3 might want this). The current split isn't really in the right place for that though. If we need this in future, we can re-implement it more in line with what we actually need. So, collapse the two classes together into just TYPE_ICS. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04xics: Eliminate reset hookDavid Gibson
Currently TYPE_XICS_BASE and TYPE_XICS_SIMPLE have their own reset methods, using the standard technique for having the subtype call the supertype's methods before doing its own thing. But TYPE_XICS_SIMPLE is the only subtype of TYPE_XICS_BASE ever instantiated, so there's no point having the split here. Merge them together into just an ics_reset() function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04xics: Rename misleading ics_simple_*() functionsDavid Gibson
There are a number of ics_simple_*() functions that aren't actually specific to TYPE_XICS_SIMPLE at all, and are equally valid on TYPE_XICS_BASE. Rename them to ics_*() accordingly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04xics: Eliminate 'reject', 'resend' and 'eoi' class hooksDavid Gibson
Currently ics_reject(), ics_resend() and ics_eoi() indirect through class methods. But there's only one implementation of each method, the one in TYPE_ICS_SIMPLE. TYPE_ICS_BASE has no implementation, but it's never instantiated, and has no other subtypes. So clean up by eliminating the method and just having ics_reject(), ics_resend() and ics_eoi() contain the logic directly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04spapr/xive: skip partially initialized vCPUs in presenterCédric Le Goater
When vCPUs are hotplugged, they are added to the QEMU CPU list before being fully realized. This can crash the XIVE presenter because the 'tctx' pointer is not necessarily initialized when looking for a matching target. These vCPUs are not valid targets for the presenter. Skip them. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191001085722.32755-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04spapr/irq: Only claim VALID interrupts at the KVM levelCédric Le Goater
A typical pseries VM with 16 vCPUs, one disk, one network adapater uses less than 100 interrupts but the whole IRQ number space of the QEMU machine is allocated at reset time and it is 8K wide. This is wasting a considerable amount of interrupt numbers in the global IRQ space which has 1M interrupts per socket on a POWER9. To optimise the HW resources, only request at the KVM level interrupts which have been claimed by the guest. This will help to increase the maximum number of VMs per system and also help supporting nested guests using the XIVE interrupt mode. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190911133937.2716-3-clg@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156942766014.1274533.10792048853177121231.stgit@bahia.lan> [dwg: Folded in fix up from Greg Kurz] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-09-23s390x/kvm: Officially require at least kernel 3.15Thomas Huth
Since QEMU v2.10, the KVM acceleration does not work on older kernels anymore since the code accidentally requires the KVM_CAP_DEVICE_CTRL capability now - it should have been optional instead. Instead of fixing the bug, we asked in the ChangeLog of QEMU 2.11 - 3.0 that people should speak up if they still need support of QEMU running with KVM on older kernels, but seems like nobody really complained. Thus let's make this official now and turn it into a proper error message, telling the users to use at least kernel 3.15 now. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190913091443.27565-1-thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-09-03memory: Access MemoryRegion with endiannessTony Nguyen
Preparation for collapsing the two byte swaps adjust_endianness and handle_bswap into the former. Call memory_region_dispatch_{read|write} with endianness encoded into the "MemOp op" operand. This patch does not change any behaviour as memory_region_dispatch_{read|write} is yet to handle the endianness. Once it does handle endianness, callers with byte swaps can collapse them into adjust_endianness. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Message-Id: <8066ab3eb037c0388dfadfe53c5118429dd1de3a.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-09-03hw/intc/armv7m_nic: Access MemoryRegion with MemOpTony Nguyen
The memory_region_dispatch_{read|write} operand "unsigned size" is being converted into a "MemOp op". Convert interfaces by using no-op size_memop. After all interfaces are converted, size_memop will be implemented and the memory_region_dispatch_{read|write} operand "unsigned size" will be converted into a "MemOp op". As size_memop is a no-op, this patch does not change any behaviour. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <21113bae2f54b45176701e0bf595937031368ae6.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-08-22Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2019-08-21' ↵Peter Maydell
into staging Monitor patches for 2019-08-21 # gpg: Signature made Wed 21 Aug 2019 16:35:07 BST # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-monitor-2019-08-21: monitor/qmp: Update comment for commit 4eaca8de268 qdev: Collect HMP handlers command handlers in qdev-monitor.c qapi: Move query-target from misc.json to machine.json hw/core: Move cpu.c, cpu.h from qom/ to hw/core/ Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-21hw/core: Move cpu.c, cpu.h from qom/ to hw/core/Markus Armbruster
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190709152053.16670-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h in comments replaced]
2019-08-21spapr/xive: Mask the EAS when allocating an IRQCédric Le Goater
If an IRQ is allocated and not configured, such as a MSI requested by a PCI driver, it can be saved in its default state and possibly later on restored using the same state. If not initially MASKED, KVM will try to find a matching priority/target tuple for the interrupt and fail to restore the VM because 0/0 is not a valid target. When allocating a IRQ number, the EAS should be set to a sane default : VALID and MASKED. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190813164420.9829-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Improve 'info pic' supportCédric Le Goater
Provide a better output of the XIVE END structures including the escalation information and extend the PowerNV machine 'info pic' command with a dump of the END EAS table used for escalations. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Provide silent escalation supportCédric Le Goater
When the 's' bit is set the escalation is said to be 'silent' or 'silent/gather'. In such configuration, the notification sequence is skipped and only the escalation sequence is performed. This is used to configure all the EQs of a vCPU to escalate on a single EQ which will then target the hypervisor. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Provide unconditional escalation supportCédric Le Goater
When the 'u' bit is set the escalation is said to be 'unconditional' which means that the ESe PQ bits are not used. Introduce a xive_router_end_es_notify() routine to share code with the ESn notification. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Provide escalation supportCédric Le Goater
If the XIVE presenter can not find the NVT dispatched on any of the HW threads, it can not deliver the interrupt. XIVE offers an escalation mechanism to handle such scenarios and inform the hypervisor that an action should be taken. Escalation is configured by setting the 'e' bit and the EAS in word 4 & 5 to let the HW look for the escalation END on which to trigger a new event. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Provide backlog supportCédric Le Goater
If backlog is activated ('b' bit) on the END, the pending priority of a missed event is recorded in the IPB field of the NVT for a later resend. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc/xive: Implement TM_PULL_OS_CTX special commandCédric Le Goater
When a vCPU is not dispatched anymore on a HW thread, the Hypervisor (KVM on Linux) invalidates the OS interrupt context of a vCPU with this special command. It returns the OS CAM line value and resets the VO bit. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-19hw/intc: Only build the xlnx-iomod-intc device for the MicroBlaze PMUPhilippe Mathieu-Daudé
The Xilinx I/O Module Interrupt Controller is only used by the MicroBlaze PMU, not by the AArch64 machine. Move it from the generic ZynqMP object list to the PMU specific. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190427141459.19728-3-philmd@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2019-08-16sysemu: Split sysemu/runstate.h off sysemu/sysemu.hMarkus Armbruster
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]
2019-08-16Include sysemu/sysemu.h a lot lessMarkus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/qdev-core.h includes sysemu/sysemu.h since recent commit e965ffa70a "qdev: add qdev_add_vm_change_state_handler()". This is a bad idea: hw/qdev-core.h is widely included. Move the declaration of qdev_add_vm_change_state_handler() to sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h. Touching sysemu/sysemu.h now recompiles some 1800 objects. qemu/uuid.h also drops from 5400 to 1800. A few more headers show smaller improvement: qemu/notify.h drops from 5600 to 5200, qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from 5500 to 5000. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190812052359.30071-28-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16Clean up inclusion of sysemu/sysemu.hMarkus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Almost a third of its inclusions are actually superfluous. Delete them. Downgrade two more to qapi/qapi-types-run-state.h, and move one from char/serial.h to char/serial.c. hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and stubs/semihost.c define variables declared in sysemu/sysemu.h without including it. The compiler is cool with that, but include it anyway. This doesn't reduce actual use much, as it's still included into widely included headers. The next commit will tackle that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-27-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include migration/vmstate.h lessMarkus Armbruster
In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include hw/irq.h a lot lessMarkus Armbruster
In my "build everything" tree, changing hw/irq.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get qemu_irq and.or qemu_irq_handler. Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to qemu/typedefs.h, and then include hw/irq.h only where it's still needed. Touching it now recompiles only some 500 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-13-armbru@redhat.com>
2019-08-16Include migration/qemu-file-types.h a lot lessMarkus Armbruster
In my "build everything" tree, changing migration/qemu-file-types.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The culprit is again hw/hw.h, which supposedly includes it for convenience. Include migration/qemu-file-types.h only where it's needed. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include sysemu/reset.h a lot lessMarkus Armbruster
In my "build everything" tree, changing sysemu/reset.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The main culprit is hw/hw.h, which supposedly includes it for convenience. Include sysemu/reset.h only where it's needed. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-13spapr/xive: Fix migration of hot-plugged CPUsCédric Le Goater
The migration sequence of a guest using the XIVE exploitation mode relies on the fact that the states of all devices are restored before the machine is. This is not true for hot-plug devices such as CPUs which state come after the machine. This breaks migration because the thread interrupt context registers are not correctly set. Fix migration of hotplugged CPUs by restoring their context in the 'post_load' handler of the XiveTCTX model. Fixes: 277dd3d7712a ("spapr/xive: add migration support for KVM") Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190813064853.29310-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-28xics/kvm: Fix fallback to emulated XICSGreg Kurz
Commit 4812f2615288 tried to fix rollback path of xics_kvm_connect() but it isn't enough. If we fail to create the KVM device, the guest fails to boot later on with: [ 0.010817] pci 0000:00:00.0: Adding to iommu group 0 [ 0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping (rc=-22) [ 0.010923] pci 0000:00:01.0: Adding to iommu group 0 [ 0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping (rc=-22) [ 0.011543] EEH: No capable adapters found [ 0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping (rc=-22) [ 0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized audit_enabled=0 res=1 [ 0.011703] ------------[ cut here ]------------ [ 0.011729] event-sources: Unable to allocate interrupt number for /event-sources/epow-events [ 0.011776] WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/pseries/event_sources.c:34 request_event_sources_irqs+0xbc/0x150 [ 0.011828] Modules linked in: [ 0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.17-300.fc30.ppc64le #1 [ 0.011886] NIP: c0000000000d4fac LR: c0000000000d4fa8 CTR: c0000000018f0000 [ 0.011923] REGS: c00000001e4c38d0 TRAP: 0700 Not tainted (5.1.17-300.fc30.ppc64le) [ 0.011966] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000284 XER: 20040000 [ 0.012012] CFAR: c00000000011b42c IRQMASK: 0 [ 0.012012] GPR00: c0000000000d4fa8 c00000001e4c3b60 c0000000015fc400 0000000000000051 [ 0.012012] GPR04: 0000000000000001 0000000000000000 0000000000000081 772d6576656e7473 [ 0.012012] GPR08: 000000001edf0000 c0000000014d4830 c0000000014d4830 6e6576652f20726f [ 0.012012] GPR12: 0000000000000000 c0000000018f0000 c000000000010bf0 0000000000000000 [ 0.012012] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.012012] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.012012] GPR24: 0000000000000000 0000000000000000 c000000000ebbf00 c0000000000d5570 [ 0.012012] GPR28: c000000000ebc008 c00000001fff8248 0000000000000000 0000000000000000 [ 0.012372] NIP [c0000000000d4fac] request_event_sources_irqs+0xbc/0x150 [ 0.012409] LR [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 [ 0.012445] Call Trace: [ 0.012462] [c00000001e4c3b60] [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 (unreliable) [ 0.012513] [c00000001e4c3bf0] [c000000001042848] __machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8 [ 0.012563] [c00000001e4c3c20] [c000000000010810] do_one_initcall+0x60/0x254 [ 0.012611] [c00000001e4c3cf0] [c000000001024538] kernel_init_freeable+0x35c/0x444 [ 0.012655] [c00000001e4c3db0] [c000000000010c14] kernel_init+0x2c/0x148 [ 0.012693] [c00000001e4c3e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78 [ 0.012736] Instruction dump: [ 0.012759] 38a00000 7c7f1b78 7f64db78 2c1f0000 2fbf0000 78630020 4180002c 409effa8 [ 0.012805] 7fa4eb78 7f43d378 48046421 60000000 <0fe00000> 3bde0001 2c1e0010 7fde07b4 [ 0.012851] ---[ end trace aa5785707323fad3 ]--- This happens because QEMU fell back on XICS emulation but didn't unregister the RTAS calls from KVM. The emulated RTAS calls are hence never called and the KVM ones return an error to the guest since the KVM device is absent. The sanity checks in xics_kvm_disconnect() are abusive since we're freeing the KVM device. Simply drop them. Fixes: 4812f2615288 "xics/kvm: Add proper rollback to xics_kvm_init()" Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156398744035.546975.7029414194633598474.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-25ioapic: kvm: Skip route updates for masked pinsJan Kiszka
Masked entries will not generate interrupt messages, thus do no need to be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of the kind qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100) if the masked entry happens to reference a non-present IRTE. Cc: qemu-stable@nongnu.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Message-Id: <a84b7e03-f9a8-b577-be27-4d93d1caa1c9@siemens.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2019-07-12xics/kvm: Always set the MASKED bit if interrupt is maskedGreg Kurz
The ics_set_kvm_state_one() function is called either to restore the state of an interrupt source during migration or to set the interrupt source to a default state during reset. Since always, ie. 2013, the code only sets the MASKED bit if the 'current priority' and the 'saved priority' are different. This is likely true when restoring an interrupt that had been previously masked with the ibm,int-off RTAS call. However this is always false in the case of reset since both 'current priority' and 'saved priority' are equal to 0xff, and the MASKED bit is never set. The legacy KVM XICS device gets away with that because it ends updating its internal structure the same way, whether the MASKED bit is set or the priority is 0xff. The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM documentation [1] and _really_ relies on the MASKED bit to correctly set. If not, it will configure the interrupt source in the XIVE HW, even though the guest hasn't configured the interrupt yet. This disturbs the complex logic implemented in XICS-on-XIVE and may result in the loss of subsequent queued events. Always set the MASKED bit if interrupt is masked as expected by the KVM XICS-on-XIVE device. This has no impact on the legacy KVM XICS. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virtual/kvm/devices/xics.txt Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <156217454083.559957.7359208229523652842.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-08Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
Bugfixes. # gpg: Signature made Fri 05 Jul 2019 21:21:52 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: ioapic: use irq number instead of vector in ioapic_eoi_broadcast hw/i386: Fix linker error when ISAPC is disabled Makefile: generate header file with the list of devices enabled target/i386: kvm: Fix when nested state is needed for migration minikconf: do not include variables from MINIKCONF_ARGS in config-all-devices.mak target/i386: fix feature check in hyperv-stub.c ioapic: clear irq_eoi when updating the ioapic redirect table entry intel_iommu: Fix unexpected unmaps during global unmap intel_iommu: Fix incorrect "end" for vtd_address_space_unmap i386/kvm: Fix build with -m32 checkpatch: do not warn for multiline parenthesized returned value pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-05ioapic: use irq number instead of vector in ioapic_eoi_broadcastLi Qiang
When emulating irqchip in qemu, such as following command: x86_64-softmmu/qemu-system-x86_64 -m 1024 -smp 4 -hda /home/test/test.img -machine kernel-irqchip=off --enable-kvm -vnc :0 -device edu -monitor stdio We will get a crash with following asan output: (qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]' ================================================================= ==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00 WRITE of size 4 at 0x61b000003114 thread T4 #0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266 #1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428 #2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802 #3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503 #4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569 #5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497 #6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323 #7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362 #8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452 #9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463 #10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045 #11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287 #12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502 #13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) #14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e This is because in ioapic_eoi_broadcast function, we uses 'vector' to index the 's->irq_eoi'. To fix this, we should uses the irq number. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190622002119.126834-1-liq3ea@163.com>
2019-07-05ioapic: clear irq_eoi when updating the ioapic redirect table entryLi Qiang
irq_eoi is used to count the number of irq injected during eoi broadcast. It should be set to 0 when updating the ioapic's redirect table entry. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190624151635.22494-1-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>