aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc
AgeCommit message (Collapse)Author
2018-01-19ppc: e500: Allow only supported dynamic sysbus devicesEduardo Habkost
platform_bus_create_devtree() already rejects all dynamic sysbus devices except TYPE_ETSEC_COMMON, so register it as the only allowed dynamic sysbus device for the ppce500 machine-type. Cc: Alexander Graf <agraf@suse.de> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: qemu-ppc@nongnu.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20171125151610.20547-4-ehabkost@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-19machine: Replace has_dynamic_sysbus with list of allowed devicesEduardo Habkost
The existing has_dynamic_sysbus flag makes the machine accept every user-creatable sysbus device type on the command-line. Replace it with a list of allowed device types, so machines can easily accept some sysbus devices while rejecting others. To keep exactly the same behavior as before, the existing has_dynamic_sysbus=true assignments are replaced with a TYPE_SYS_BUS_DEVICE entry on the allowed list. Other patches will replace the TYPE_SYS_BUS_DEVICE entries with more specific lists of devices. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Graf <agraf@suse.de> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: qemu-arm@nongnu.org Cc: qemu-ppc@nongnu.org Cc: xen-devel@lists.xenproject.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20171125151610.20547-2-ehabkost@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17ppc/pnv: change initrd addressCédric Le Goater
When skiboot starts, it first clears the CPU structs for all possible CPUs on a system : for (i = 0; i <= cpu_max_pir; i++) memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread)); On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks array overlaps with the memory region in which QEMU maps the initramfs file. Move it upwards in memory to keep it safe. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: fix XSCOM core addressing on POWER9Cédric Le Goater
The XSCOM base address of the core chiplet was wrongly calculated. Use the OPAL macros to fix that and do a couple of renames. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: introduce pnv*_is_power9() helpersCédric Le Goater
These are useful when instantiating device models which are shared between the POWER8 and the POWER9 processor families. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: change core mask for POWER9Cédric Le Goater
When addressed by XSCOM, the first core has the 0x20 chiplet ID but the CPU PIR can start at 0x0. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: use POWER9 DD2 processorCédric Le Goater
commit 1ed9c8af501f ("target/ppc: Add POWER9 DD2.0 model information") deprecated the POWER9 model v1.0. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17spapr: Adjust default VSMT value for better migration compatibilityDavid Gibson
fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the "vsmt" parameter for the pseries machine type, which controls the spacing of the vcpu ids of thread 0 for each virtual core. This was done to bring some consistency and stability to how that was done, while still allowing backwards compatibility for migration and otherwise. The default value we used for vsmt was set to the max of the host's advertised default number of threads and the number of vthreads per vcore in the guest. This was done to continue running without extra parameters on older KVM versions which don't allow the VSMT value to be changed. Unfortunately, even that smaller than before leakage of host configuration into guest visible configuration still breaks things. Specifically a guest with 4 (or less) vthread/vcore will get a different vsmt value when running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host. That means the vcpu ids don't line up so you can't migrate between them, though you should be able to. Long term we really want to make vsmt == smp_threads for sufficiently new machine types. However, that means that qemu will then require a sufficiently recent KVM (one which supports changing VSMT) - that's still not widely enough deployed to be really comfortable to do. In the meantime we need some default that will work as often as possible. This patch changes that default to 8 in all circumstances. This does change guest visible behaviour (including for existing machine versions) for many cases - just not the most common/important case. Following is case by case justification for why this is still the least worst option. Note that any of the old behaviours can still be duplicated after this patch, it's just that it requires manual intervention by setting the vsmt property on the command line. KVM HV on POWER8 host: This is the overwhelmingly common case in production setups, and is unchanged by design. POWER8 hosts will advertise a default VSMT mode of 8, and > 8 vthreads/vcore isn't permitted KVM HV on POWER7 host: Will break, but POWER7s allowing KVM were never released to the public. KVM HV on POWER9 host: Not yet released to the public, breaking this now will reduce other breakage later. KVM HV on PowerPC 970: Will theoretically break it, but it was barely supported to begin with and already required various user visible hacks to work. Also so old that I just don't care. TCG: This is the nastiest one; it means migration of TCG guests (without manual vsmt setting) will break. Since TCG is rarely used in production I think this is worth it for the other benefits. It does also remove one more barrier to TCG<->KVM migration which could be interesting for debugging applications. KVM PR: As with TCG, this will break migration of existing configurations, without adding extra manual vsmt options. As with TCG, it is rare in production so I think the benefits outweigh breakages. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Allow some cases where we can't set VSMT mode in the kernelDavid Gibson
At present if we require a vsmt mode that's not equal to the kernel's default, and the kernel doesn't let us change it (e.g. because it's an old kernel without support) then we always fail. But in fact we can cope with the kernel having a different vsmt as long as a) it's >= the actual number of vthreads/vcore (so that guest threads that are supposed to be on the same core act like it) b) it's a submultiple of the requested vsmt mode (so that guest threads spaced by the vsmt value will act like they're on different cores) Allowing this case gives us a bit more freedom to adjust the vsmt behaviour without breaking existing cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17target/ppc: Clarify compat mode max_threads valueDavid Gibson
We recently had some discussions that were sidetracked for a while, because nearly everyone misapprehended the purpose of the 'max_threads' field in the compatiblity modes table. It's all about guest expectations, not host expectations or support (that's handled elsewhere). In an attempt to avoid a repeat of that confusion, rename the field to 'max_vthreads' and add an explanatory comment. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
2018-01-17spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfoDavid Gibson
The options field here is intended to list the available values for the capability. It's not used yet, because the existing capabilities are boolean. We're going to add capabilities that aren't, but in that case the info on the possible values can be folded into the .description field. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representationSuraj Jitindar Singh
Currently spapr_caps are tied to boolean values (on or off). This patch reworks the caps so that they can have any uint8 value. This allows more capabilities with various values to be represented in the same way internally. Capabilities are numbered in ascending order. The internal representation of capability values is an array of uint8s in the sPAPRMachineState, indexed by capability number. Capabilities can have their own name, description, options, getter and setter functions, type and allow functions. They also each have their own section in the migration stream. Capabilities are only migrated if they were explictly set on the command line, with the assumption that otherwise the default will match. On migration we ensure that the capability value on the destination is greater than or equal to the capability value from the source. So long at this remains the case then the migration is considered compatible and allowed to continue. This patch implements generic getter and setter functions for boolean capabilities. It also converts the existings cap-htm, cap-vsx and cap-dfp capabilities to this new format. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17spapr: Handle Decimal Floating Point (DFP) as an optional capabilityDavid Gibson
Decimal Floating Point has been available on POWER7 and later (server) cpus. However, it can be disabled on the hypervisor, meaning that it's not available to guests. We currently handle this by conditionally advertising DFP support in the device tree depending on whether the guest CPU model supports it - which can also depend on what's allowed in the host for -cpu host. That can lead to confusion on migration, since host properties are silently affecting guest visible properties. This patch handles it by treating it as an optional capability for the pseries machine type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Handle VMX/VSX presence as an spapr capability flagDavid Gibson
We currently have some conditionals in the spapr device tree code to decide whether or not to advertise the availability of the VMX (aka Altivec) and VSX vector extensions to the guest, based on whether the guest cpu has those features. This can lead to confusion and subtle failures on migration, since it makes a guest visible change based only on host capabilities. We now have a better mechanism for this, in spapr capabilities flags, which explicitly depend on user options rather than host capabilities. Rework the advertisement of VSX and VMX based on a new VSX capability. We no longer bother with a conditional for VMX support, because every CPU that's ever been supported by the pseries machine type supports VMX. NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on availability of VSX in libc, so using cap-vsx=off may lead to a fatal SIGILL in init. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Validate capabilities on migrationDavid Gibson
Now that the "pseries" machine type implements optional capabilities (well, one so far) there's the possibility of having different capabilities available at either end of a migration. Although arguably a user error, it would be nice to catch this situation and fail as gracefully as we can. This adds code to migrate the capabilities flags. These aren't pulled directly into the destination's configuration since what the user has specified on the destination command line should take precedence. However, they are checked against the destination capabilities. If the source was using a capability which is absent on the destination, we fail the migration, since that could easily cause a guest crash or other bad behaviour. If the source lacked a capability which is present on the destination we warn, but allow the migration to proceed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Treat Hardware Transactional Memory (HTM) as an optional capabilityDavid Gibson
This adds an spapr capability bit for Hardware Transactional Memory. It is enabled by default for pseries-2.11 and earlier machine types. with POWER8 or later CPUs (as it must be, since earlier qemu versions would implicitly allow it). However it is disabled by default for the latest pseries-2.12 machine type. This means that with the latest machine type, HTM will not be available, regardless of CPU, unless it is explicitly enabled on the command line. That change is made on the basis that: * This way running with -M pseries,accel=tcg will start with whatever cpu and will provide the same guest visible model as with accel=kvm. - More specifically, this means existing make check tests don't have to be modified to use cap-htm=off in order to run with TCG * We hope to add a new "HTM without suspend" feature in the not too distant future which could work on both POWER8 and POWER9 cpus, and could be enabled by default. * Best guesses suggest that future POWER cpus may well only support the HTM-without-suspend model, not the (frankly, horribly overcomplicated) POWER8 style HTM with suspend. * Anecdotal evidence suggests problems with HTM being enabled when it wasn't wanted are more common than being missing when it was. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Capabilities infrastructureDavid Gibson
Because PAPR is a paravirtual environment access to certain CPU (or other) facilities can be blocked by the hypervisor. PAPR provides ways to advertise in the device tree whether or not those features are available to the guest. In some places we automatically determine whether to make a feature available based on whether our host can support it, in most cases this is based on limitations in the available KVM implementation. Although we correctly advertise this to the guest, it means that host factors might make changes to the guest visible environment which is bad: as well as generaly reducing reproducibility, it means that a migration between different host environments can easily go bad. We've mostly gotten away with it because the environments considered mature enough to be well supported (basically, KVM on POWER8) have had consistent feature availability. But, it's still not right and some limitations on POWER9 is going to make it more of an issue in future. This introduces an infrastructure for defining "sPAPR capabilities". These are set by default based on the machine version, masked by the capabilities of the chosen cpu, but can be overriden with machine properties. The intention is at reset time we verify that the requested capabilities can be supported on the host (considering TCG, KVM and/or host cpu limitations). If not we simply fail, rather than silently modifying the advertised featureset to the guest. This does mean that certain configurations that "worked" may now fail, but such configurations were already more subtly broken. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-11Merge remote-tracking branch 'origin/master' into HEADMichael S. Tsirkin
Resolve conflicts around apb. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-10spapr: Correct compatibility mode setting for hotplugged CPUsDavid Gibson
Currently the pseries machine sets the compatibility mode for the guest's cpus in two places: 1) at machine reset and 2) after CAS negotiation. This means that if we set or negotiate a compatiblity mode, then hotplug a cpu, the hotplugged cpu doesn't get the right mode set and will incorrectly have the full native features. To correct this, we set the compatibility mode on a cpu when it is brought online with the 'start-cpu' RTAS call. Given that we no longer need to set the compatibility mode on all CPUs at machine reset, so we change that to only set the mode for the boot cpu. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2018-01-10hw/ppc: Remove the deprecated spapr-pci-vfio-host-bridge deviceThomas Huth
It's a deprecated dummy device since QEMU v2.6.0. That should have been enough time to allow the users to update their scripts in case they still use it, so let's remove this legacy code now. Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-10target/ppc: more use of the PPC_*() macrosCédric Le Goater
Also introduce utilities to manipulate bitmasks (originaly from OPAL) which be will be used in the model of the XIVE interrupt controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-10ppc/pnv: change powernv_ prefix to pnv_ for overall naming consistencyCédric Le Goater
The 'pnv' prefix is now used for all and the routines populating the device tree start with 'pnv_dt'. The handler of the PnvXScomInterface is also renamed to 'dt_xscom' which should reflect that it is populating the device tree under the 'xscom@' node of the chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-10spapr_pci: use warn_report()Greg Kurz
These two are definitely warnings. Let's use the appropriate API. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-18hw/net/ne2000: extract ne2k-isa code from i386/pc to ne2000-isa.cPhilippe Mathieu-Daudé
- add "hw/net/ne2000-isa.h" - remove the old i386 dependency Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Hervé Poussineau <hpoussin@reactos.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> [PPC] Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18hw/timer/mc146818: rename rtc_init() -> mc146818_rtc_init()Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18ppc: remove duplicated includesPhilippe Mathieu-Daudé
applied using ./scripts/clean-includes not needed since 7ebaf795560 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18hw: use "qemu/osdep.h" as first #include in source filesPhilippe Mathieu-Daudé
applied using ./scripts/clean-includes Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-15spapr: don't initialize PATB entry if max-cpu-compat < power9Laurent Vivier
if KVM is enabled and KVM capabilities MMU radix is available, the partition table entry (patb_entry) for the radix mode is initialized by default in ppc_spapr_reset(). It's a problem if we want to migrate the guest to a POWER8 host while the kernel is not started to set the value to the one expected for a POWER8 CPU. The "-machine max-cpu-compat=power8" should allow to migrate a POWER9 KVM host to a POWER8 KVM host, but because patb_entry is set, the destination QEMU tries to enable radix mode on the POWER8 host. This fails and cancels the migration: Process table config unsupported by the host error while loading state for instance 0x0 of device 'spapr' load of migration failed: Invalid argument This patch doesn't set the PATB entry if the user provides a CPU compatibility mode that doesn't support radix mode. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: Assume msi_nonbrokenDavid Gibson
We conditionally adjust part of the guest device tree based on the global msi_nonbroken flag. However, the main machine type code initializes msi_nonbroken to true and there's nothing that would set it to false again. So replace the test with an assert(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2017-12-15spapr: Rename machine init functions for clarityDavid Gibson
Machine objects have two init functions - the generic QOM level instance_init which should only do static object initialization, and the Machine specific MachineClass::init which does the actual construction of the machine. In spapr the functions implementing these two have names - ppc_machine_initfn() and ppc_spapr_init() - which don't correspond closely to either of those. To prevent people (read, me) from confusing which is which, rename them spapr_instance_init() and spapr_machine_init() to make it clearer which is which. While we're there rename ppc_spapr_reset() to spapr_machine_reset() to match. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
2017-12-15spapr_events: drop bogus cell from "interrupt-ranges" propertyGreg Kurz
According to LoPAPR 1.1 B.6.12, the "/event-sources" node has an "interrupt- ranges" property, the format of which is described in B.6.9.1.2 as follows: “interrupt-ranges” Standard property name that defines the interrupt number(s) and range(s) handled by this unit. prop-encoded-array: List of (int-number, range) specifications. Int-number is encoded as with encode-int. Range is encoded as with encode-int. The first entry in this list shall contain the int-number associated with the first “reg” property entry. The int-num-ber is the value representing the interrupt source as would appear in the PowerPC External Interrupt Architecture XISR. The range shall be the number of sequential interrupt numbers which this unit can generate. There's no such thing as a cell count at the end of the array, like the one introduced by commit ffbb1705a33d in QEMU 2.8. It doesn't seem it had any impact on existing guests and I couldn't find any related workaround in linux. So, let's just drop the bogus lines. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: fix LSI interrupt specifiers in the device treeGreg Kurz
LoPAPR 1.1 B.6.9.1.2 describes the "#interrupt-cells" property of the PowerPC External Interrupt Source Controller node as follows: “#interrupt-cells” Standard property name to define the number of cells in an interrupt- specifier within an interrupt domain. prop-encoded-array: An integer, encoded as with encode-int, that denotes the number of cells required to represent an interrupt specifier in its child nodes. The value of this property for the PowerPC External Interrupt option shall be 2. Thus all interrupt specifiers (as used in the standard “interrupts” property) shall consist of two cells, each containing an integer encoded as with encode-int. The first integer represents the interrupt number the second integer is the trigger code: 0 for edge triggered, 1 for level triggered. This patch fixes the interrupt specifiers in the "interrupt-map" property of the PHB node, that were setting the second cell to 8 (confusion with IRQ_TYPE_LEVEL_LOW ?) instead of 1. VIO devices and RTAS event sources use the same format for interrupt specifiers: while here, we introduce a common helper to handle the encoding details. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> -- v3: - reference public LoPAPR instead of internal PAPR+ in changelog - change helper name to spapr_dt_xics_irq() v2: - drop the erroneous changes to the "interrupts" prop in PCI device nodes - introduce a common helper to encode interrupt specifiers Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: replace numa_get_node() with lookup in pc-dimm listIgor Mammedov
SPAPR is the last user of numa_get_node() and a bunch of supporting code to maintain numa_info[x].addr list. Get LMB node id from pc-dimm list, which allows to remove ~80LOC maintaining dynamic address range lookup list. It also removes pc-dimm dependency on numa_[un]set_mem_node_id() and makes pc-dimms a sole source of information about which node it belongs to and removes duplicate data from global numa_info. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: introduce a spapr_qirq() helperCédric Le Goater
xics_get_qirq() is only used by the sPAPR machine. Let's move it there and change its name to reflect its scope. It will be useful for XIVE support which will use its own set of qirqs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: introduce a spapr_irq_set_lsi() helperCédric Le Goater
It will make synchronisation easier with the XIVE interrupt mode when available. The 'irq' parameter refers to the global IRQ number space. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: move the IRQ allocation routines under the machineCédric Le Goater
Also change the prototype to use a sPAPRMachineState and prefix them with spapr_irq_. It will let us synchronise the IRQ allocation with the XIVE interrupt mode when available. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15ppc/xics: assign of the CPU 'intc' pointer under the coreCédric Le Goater
The 'intc' pointer of the CPU references the interrupt presenter in the XICS interrupt mode. When the XIVE interrupt mode is available and activated, the machine will need to reassign this pointer to reflect the change. Moving this assignment under the realize routine of the CPU will ease the process when the interrupt mode is toggled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15ppc/xics: introduce an icp_create() helperCédric Le Goater
The sPAPR and the PowerNV core objects create the interrupt presenter object of the CPUs in a very similar way. Let's provide a common routine in which we use the presenter 'type' as a child identifier. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr/rtas: do not reset the MSR in stop-self commandCédric Le Goater
When a CPU is stopped with the 'stop-self' RTAS call, its state 'halted' is switched to 1 and, in this case, the MSR is not taken into account anymore in the cpu_has_work() routine. Only the pending hardware interrupts are checked with their LPCR:PECE* enablement bit. The CPU is now also protected from the decrementer interrupt by the LPCR:PECE* bits which are disabled in the 'stop-self' RTAS call. Reseting the MSR is pointless. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr/rtas: fix reboot of a a SMP TCG guestCédric Le Goater
Just like for hot unplug CPUs, when a guest is rebooted, the secondary CPUs can be awaken by the decrementer and start entering SLOF at the same time the boot CPU is. To be safe, let's disable on the secondaries all the exceptions which can cause an exit while the CPU is in power-saving mode. Based on previous work from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr/rtas: disable the decrementer interrupt when a CPU is unpluggedCédric Le Goater
When a CPU is stopped with the 'stop-self' RTAS call, its state 'halted' is switched to 1 and, in this case, the MSR is not taken into account anymore in the cpu_has_work() routine. Only the pending hardware interrupts are checked with their LPCR:PECE* enablement bit. If the DECR timer fires after 'stop-self' is called and before the CPU 'stop' state is reached, the nearly-dead CPU will have some work to do and the guest will crash. This case happens very frequently with the not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is occasionally fired but after 'stop' state, so no work is to be done and the guest survives. I suspect there is a race between the QEMU mainloop triggering the timers and the TCG CPU thread but I could not quite identify the root cause. To be safe, let's disable in the LPCR all the exceptions which can cause an exit while the CPU is in power-saving mode and reenable them when the CPU is started. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15e500: name openpic and pci host bridgeMichael Davidsaver
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr_cpu_core: instantiate CPUs separatelyGreg Kurz
The current code assumes that only the CPU core object holds a reference on each individual CPU object, and happily frees their allocated memory when the core is unrealized. This is dangerous as some other code can legitimely keep a pointer to a CPU if it calls object_ref(), but it would end up with a dangling pointer. Let's allocate all CPUs with object_new() and let QOM free them when their reference count reaches zero. This greatly simplify the code as we don't have to fiddle with the instance size anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15spapr: Add pseries-2.12 machine typeDavid Gibson
While we're at it fix a couple of small errors in the 2.11 and 2.10 models (they didn't have any real effect, but don't quite match the template). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-05pci: Eliminate redundant PCIDevice::bus pointerDavid Gibson
The bus pointer in PCIDevice is basically redundant with QOM information. It's always initialized to the qdev_get_parent_bus(), the only difference is the type. Therefore this patch eliminates the field, instead creating a pci_get_bus() helper to do the type mangling to derive it conveniently from the QOM Device object underneath. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2017-12-05pci: Rename root bus initialization functions for clarityDavid Gibson
pci_bus_init(), pci_bus_new_inplace(), pci_bus_new() and pci_register_bus() are misleadingly named. They're not used for initializing *any* PCI bus, but only for a root PCI bus. Non-root buses - i.e. ones under a logical PCI to PCI bridge - are instead created with a direct qbus_create_inplace() (see pci_bridge_initfn()). This patch renames the functions to make it clear they're only used for a root bus. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2017-12-04spapr: Include "pre-plugged" DIMMS in ram size calculation at resetDavid Gibson
At guest reset time, we allocate a hash page table (HPT) for the guest based on the guest's RAM size. If dynamic HPT resizing is not available we use the maximum RAM size, if it is we use the current RAM size. But the "current RAM size" calculation is incorrect - we just use the "base" ram_size from the machine structure. This doesn't include any pluggable DIMMs that are already plugged at reset time. This means that if you try to start a 'pseries' machine with a DIMM specified on the command line that's much larger than the "base" RAM size, then the guest will get a woefully inadequate HPT. This can lead to a guest freeze during boot as it runs out of HPT space during initial MMU setup. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org>
2017-11-30pseries: fix TCG migrationLaurent Vivier
Migration of pseries is broken with TCG because QEMU tries to restore KVM MMU state unconditionally. The result is a SIGSEGV in kvm_vm_ioctl(): #0 kvm_vm_ioctl (s=0x0, type=-2146390353) at qemu/accel/kvm/kvm-all.c:2032 #1 0x00000001003e3e2c in kvmppc_configure_v3_mmu (cpu=<optimized out>, radix=<optimized out>, gtse=<optimized out>, proc_tbl=<optimized out>) at qemu/target/ppc/kvm.c:396 #2 0x00000001002f8b88 in spapr_post_load (opaque=0x1019103c0, version_id=<optimized out>) at qemu/hw/ppc/spapr.c:1578 #3 0x000000010059e4cc in vmstate_load_state (f=0x106230000, vmsd=0x1009479e0 <vmstate_spapr>, opaque=0x1019103c0, version_id=<optimized out>) at qemu/migration/vmstate.c:165 #4 0x00000001005987e0 in vmstate_load (f=<optimized out>, se=<optimized out>) at qemu/migration/savevm.c:748 This patch fixes the problem by not calling the KVM function with the TCG mode. Fixes: d39c90f5f3 ("spapr: Fix migration of Radix guests") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-27target/ppc: Move setting of patb_entry on hash table initSuraj Jitindar Singh
The patb_entry is used to store the location of the process table in guest memory. The msb is also used to indicate the mmu mode of the guest, that is patb_entry & 1 << 63 ? radix_mode : hash_mode. Currently we set this to zero in spapr_setup_hpt_and_vrma() since if this function gets called then we know we're hash. However some code paths, such as setting up the hpt on incoming migration of a hash guest, call spapr_reallocate_hpt() directly bypassing this higher level function. Since we assume radix if the host is capable this results in the msb in patb_entry being left set so in spapr_post_load() we call kvmppc_configure_v3_mmu() and tell the host we're radix which as expected means addresses cannot be translated once we actually run the cpu. To fix this move the zeroing of patb_entry into spapr_reallocate_hpt(). Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-22hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256Thomas Huth
LUNs >= 256 have to be encoded with the so-called "flat space addressing method" for virtio-scsi, where an additional bit has to be set. SLOF already took care of this with the following commit: https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da (see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details) But QEMU does not use this encoding yet for device tree paths that have to be handed over to SLOF to deal with the "bootindex" property, so SLOF currently fails to boot from virtio-scsi devices with LUNs >= 256 in the right boot order. Fix it by using the bit to indicate the "flat space addressing method" for LUNs >= 256. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>