aboutsummaryrefslogtreecommitdiff
path: root/include/hw/ppc
AgeCommit message (Collapse)Author
2016-10-28ppc/pnv: add a LPC controllerBenjamin Herrenschmidt
The LPC (Low Pin Count) interface on a POWER8 is made accessible to the system through the ADU (XSCOM interface). This interface is part of set of units connected together via a local OPB (On-Chip Peripheral Bus) which act as a bridge between the ADU and the off chip LPC endpoints, like external flash modules. The most important units of this OPB are : - OPB Master: contains the ADU slave logic, a set of internal registers and the logic to control the OPB. - LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of internal registers and the LPC HOST Controller to control the LPC interface. Four address spaces are provided to the ADU : - LPC Bus Firmware Memory - LPC Bus Memory - LPC Bus I/O (ISA bus) - and the registers for the OPB Master and the LPC Host Controller On POWER8, an intermediate hop is necessary to reach the OPB, through a unit called the ECCB. OPB commands are simply mangled in ECCB write commands. On POWER9, the OPB master address space can be accessed via MMIO. The logic is same but the code will be simpler as the XSCOM and ECCB hops are not necessary anymore. This version of the LPC controller model doesn't yet implement support for the SerIRQ deserializer present in the Naples version of the chip though some preliminary work is there. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: - updated for qemu-2.7 - ported on latest PowerNV patchset - changed the XSCOM interface to fit new model - QOMified the model - moved the ISA hunks in another patch - removed printf logging - added a couple of UNIMP logging - rewrote commit log ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add XSCOM handlers to PnvCoreCédric Le Goater
Now that we are using real HW ids for the cores in PowerNV chips, we can route the XSCOM accesses to them. We just need to attach a specific XSCOM memory region to each core in the appropriate window for the core number. To start with, let's install the DTS (Digital Thermal Sensor) handlers which should return 38°C for each core. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add XSCOM infrastructureCédric Le Goater
On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves as a backbone to connect different units of the system. The host firmware connects to the PIB through a bridge unit, the Alter-Display-Unit (ADU), which gives him access to all the chiplets on the PCB network (Pervasive Connect Bus), the PIB acting as the root of this network. XSCOM (serial communication) is the interface to the sideband bus provided by the POWER8 pervasive unit to read and write to chiplets resources. This is needed by the host firmware, OPAL and to a lesser extent, Linux. This is among others how the PCI Host bridges get configured at boot or how the LPC bus is accessed. To represent the ADU of a real system, we introduce a specific AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The translation of an XSCOM address into a PCB register address is slightly different between the P9 and the P8. This is handled before the dispatch using a 8byte alignment for all. To customize the device tree, a QOM InterfaceClass, PnvXScomInterface, is provided with a populate() handler. The chip populates the device tree by simply looping on its children. Therefore, each model needing custom nodes should not forget to declare itself as a child at instantiation time. Based on previous work done by : Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Added cpu parameter to xscom_complete()] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add a PnvCore objectCédric Le Goater
This is largy inspired by sPAPRCPUCore with some simplification, no hotplug for instance. A set of PnvCore objects is added to the PnvChip and the device tree is populated looping on these cores. Real HW cpu ids are now generated depending on the chip cpu model, the chip id and a core mask. The id is propagated to the CPU object, using properties, to set the SPR_PIR (Processor Identification Register) Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add a PIR handler to PnvChipCédric Le Goater
The Processor Identification Register (PIR) is a register that holds a processor identifier which is used for bus transactions (XSCOM) and for processor differentiation in multiprocessor systems. It also used in the interrupt vector entries (IVE) to identify the thread serving the interrupts. P9 and P8 have some differences in the CPU PIR encoding. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add a core mask to PnvChipCédric Le Goater
This will be used to build real HW ids for the cores and enforce some limits on the available cores per chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add a PnvChip objectCédric Le Goater
This is is an abstraction of a POWER8 chip which is a set of cores plus other 'units', like the pervasive unit, the interrupt controller, the memory controller, the on-chip microcontroller, etc. The whole can be seen as a socket. It depends on a cpu model and its characteristics: max cores and specific inits are defined in a PnvChipClass. We start with an near empty PnvChip with only a few cpu constants which we will grow in the subsequent patches with the controllers required to run the system. The Chip CFAM (Common FRU Access Module) ID gives the model of the chip and its version number. It is generally the first thing firmwares fetch, available at XSCOM PCB address 0xf000f, to start initialization. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/pnv: add skeleton PowerNV platformBenjamin Herrenschmidt
The goal is to emulate a PowerNV system at the level of the skiboot firmware, which loads the OS and provides some runtime services. Power Systems have a lower firmware (HostBoot) that does low level system initialization, like DRAM training. This is beyond the scope of what qemu will address in a PowerNV guest. No devices yet, not even an interrupt controller. Just to get started, some RAM to load the skiboot firmware, the kernel and initrd. The device tree is fully created in the machine reset op. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: - updated for qemu-2.7 - replaced fprintf by error_report - used a common definition of _FDT macro - removed VMStateDescription as migration is not yet supported - added IBM Copyright statements - reworked kernel_filename handling - merged PnvSystem and sPowerNVMachineState - removed PHANDLE_XICP - added ppc_create_page_sizes_prop helper - removed nmi support - removed kvm support - updated powernv machine to version 2.8 - removed chips and cpus, They will be provided in another patches - added a machine reset routine to initialize the device tree (also) - french has a squelette and english a skeleton. - improved commit log. - reworked prototypes parameters - added a check on the ram size (thanks to Michael Ellerman) - fixed chip-id cell - changed MAX_CPUS to 2048 - simplified memory node creation to one node only - removed machine version - rewrote the device tree creation with the fdt "rw" routines - s/sPowerNVMachineState/PnvMachineState/ - etc.] Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28pseries: Remove unused callbacks from sPAPR VIO bus stateDavid Gibson
The original QOMification of the spapr VIO devices in 3954d33 "spapr: convert to QEMU Object Model (v2)" moved some callbacks from the VIOsPAPRBus structure to the VIOsPAPRDeviceClass. Except, that it forgot to actually remove them from the VIOsPAPRBus structure (which still exists, though it doesn't fulfill quite the same function as it did pre-QOM). This patch removes those now unused callback fields. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-10-28ppc/xics: change the icp_ routines API to use an 'ICPState *' argumentCédric Le Goater
The routines : void icp_set_cppr(ICPState *icp, uint8_t cppr); void icp_set_mfrr(ICPState *icp, uint8_t mfrr); void icp_eoi(ICPState *icp, uint32_t xirr); now use one 'ICPState *icp' argument instead of a 'XICSState *' and a server arguments. The backlink on XICSState* is used whenever needed. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/xics: add a XICSState backlink in ICPStateCédric Le Goater
The link will be used to change the API of the icp_* routines which are still using an XICSState as an argument. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28ppc/xics: add a xics_set_nr_servers common routineCédric Le Goater
xics_spapr and xics_kvm nearly define the same 'set_nr_servers' handler. Only the type of the ICP differs. So let's make a common one to remove some duplicated code. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-16spapr_pci: Add a 64-bit MMIO windowDavid Gibson
On real hardware, and under pHyp, the PCI host bridges on Power machines typically advertise two outbound MMIO windows from the guest's physical memory space to PCI memory space: - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space - A 64-bit window which maps onto a large region somewhere high in PCI address space (traditionally this used an identity mapping from guest physical address to PCI address, but that's not always the case) The qemu implementation in spapr-pci-host-bridge, however, only supports a single outbound MMIO window, however. At least some Linux versions expect the two windows however, so we arranged this window to map onto the PCI memory space from 2 GiB..~64 GiB, then advertised it as two contiguous windows, the "32-bit" window from 2G..4G and the "64-bit" window from 4G..~64G. This approach means, however, that the 64G window is not naturally aligned. In turn this limits the size of the largest BAR we can map (which does have to be naturally aligned) to roughly half of the total window. With some large nVidia GPGPU cards which have huge memory BARs, this is starting to be a problem. This patch adds true support for separate 32-bit and 64-bit outbound MMIO windows to the spapr-pci-host-bridge implementation, each of which can be independently configured. The 32-bit window always maps to 2G.. in PCI space, but the PCI address of the 64-bit window can be configured (it defaults to the same as the guest physical address). So as not to break possible existing configurations, as long as a 64-bit window is not specified, a large single window can be specified. This will appear the same way to the guest as the old approach, although it's now implemented by two contiguous memory regions rather than a single one. For now, this only adds the possibility of 64-bit windows. The default configuration still uses the legacy mode. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16spapr_pci: Delegate placement of PCI host bridges to machine typeDavid Gibson
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB) for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal and PAPR guests) to have numerous independent PHBs, each controlling a separate PCI domain. There are two ways of configuring the spapr-pci-host-bridge device: first it can be done fully manually, specifying the locations and sizes of all the IO windows. This gives the most control, but is very awkward with 6 mandatory parameters. Alternatively just an "index" can be specified which essentially selects from an array of predefined PHB locations. The PHB at index 0 is automatically created as the default PHB. The current set of default locations causes some problems for guests with large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia GPGPU cards via VFIO). Obviously, for migration we can only change the locations on a new machine type, however. This is awkward, because the placement is currently decided within the spapr-pci-host-bridge code, so it breaks abstraction to look inside the machine type version. So, this patch delegates the "default mode" PHB placement from the spapr-pci-host-bridge device back to the machine type via a public method in sPAPRMachineClass. It's still a bit ugly, but it's about the best we can do. For now, this just changes where the calculation is done. It doesn't change the actual location of the host bridges, or any other behaviour. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-14ppc/xics: Split ICS into ics-base and ics classBenjamin Herrenschmidt
The existing implementation remains same and ics-base is introduced. The type name "ics" is retained, and all the related functions renamed as ics_simple_* This will allow different implementations for the source controllers such as the MSI support of PHB3 on Power8 which uses in-memory state tables for example. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ clg: added ICS_BASE_GET_CLASS and related fixes, based on : http://patchwork.ozlabs.org/patch/646010/ ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-14ppc/xics: Make the ICSState a listBenjamin Herrenschmidt
Instead of an array of fixed sized blocks, use a list, as we will need to have sources with variable number of interrupts. SPAPR only uses a single entry. Native will create more. If performance becomes an issue we can add some hashed lookup but for now this will do fine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [ move the initialization of list to xics_common_initfn, restore xirr_owner after migration and move restoring to icp_post_load] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [ clg: removed the icp_post_load() changes from nikunj patchset v3: http://patchwork.ozlabs.org/patch/646008/ ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-06hw/ppc/spapr: Use POWER8 by default for the pseries-2.8 machineThomas Huth
A couple of distributors are compiling their distributions with "-mcpu=power8" for ppc64le these days, so the user sooner or later runs into a crash there when not explicitely specifying the "-cpu POWER8" option to QEMU (which is currently using POWER7 for the "pseries" machine by default). Due to this reason, the linux-user target already switched to POWER8 a while ago (see commit de3f1b98410e0d5b406a0df3a48547b559d18602). Since the softmmu target of course has the same problem, we should switch there to POWER8 for the newer machine types, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23ppc/xics: An ICS with offset 0 is assumed to be uninitializedBenjamin Herrenschmidt
This will make life easier for dealing with dynamically configured ICSes such as PHB3 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23spapr: Introduce sPAPRCPUCoreClassBharata B Rao
Each spapr cpu core type defines an instance_init routine which just populates the CPU class name. This can be done in the class_init commonly for all core types which simplifies the registration. This is inspired by how PowerNV core types are registered. Certain types of spapr cpu cores ('host' and generic type based on host CPU) are initialized in target-ppc/kvm.c. To convert these type registrations to use class_init, we need to expose spapr_cpu_core_class_init() outside of spapr_cpu_core.c. Commit d11b268e1765 added a generic sPAPR CPU core family type to support cases like POWER8 CPU type on POWER8E host CPU. Switching to class_init would fix such scenarios to use the right CPU thread type instead of defaulting to host-powerpc64-cpu. In an unrelated cleanup, fix a typo in .get_hotplug_handler routine. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23tests: add RTAS command in the protocolLaurent Vivier
Add a first test to validate the protocol: - rtas/get-time-of-day compares the time from the guest with the time from the host. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-15Remove unused function declarationsLadi Prosek
Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-13ppc: do not redefine CPUPPCStatePaolo Bonzini
Just include the file that is supposed to bring it in. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-07hw/ppc: add a ppc_create_page_sizes_prop() helper routineCédric Le Goater
The exact same routine will be used in PowerNV. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07hw/ppc: use error_report instead of fprintfCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07hw/ppc: include fdt helper routine in a common fileCédric Le Goater
spapr_pci would also be a good candidate but the macro _FDT is slightly different. It returns and does not exit. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-13ppc: parse cpu features onceGreg Kurz
Considering that features are converted to global properties and global properties are automatically applied to every new instance of created CPU (at object_new() time), there is no point in parsing cpu_model string every time a CPU created. So move parsing outside CPU creation loop and do it only once. Parsing also should be done before any CPU is created so that features would affect the first CPU a well. This patch does that for all PowerPC machine types. It is based on previous work from Bharata: https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07564.html Signed-off-by: Greg Kurz <groug@kaod.org> [clg: only kept the fix for the spapr platform. support for other platform will be added in 2.8 ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-08spapr: Correctly set query_hotpluggable_cpus hook based on machine versionDavid Gibson
Prior to c8721d3 "spapr: Error out when CPU hotplug is attempted on older pseries machines", attempting to use query-hotpluggable-cpus on pseries-2.6 and earlier machine types would SEGV. That change fixed that, but due to some unexpected interactions in init order and a brown-paper-bag worthy failure to test, it accidentally disabled query-hotpluggable-cpus for all pseries machine types, including the current one which should allow it. In fact, query_hotpluggable_cpus needs to be non-NULL when and only when the dr_cpu_enabled flag in sPAPRMachineClass is set, which makes dr_cpu_enabled itself redundant. This patch removes dr_cpu_enabled, instead directly setting query_hotpluggable_cpus from the machine class_init functions, and using that to determine the availability of CPU hotplug when necessary. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-12Clean up decorations and whitespace around header guardsMarkus Armbruster
Cleaned up with scripts/clean-header-guards.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12Clean up ill-advised or unusual header guardsMarkus Armbruster
Cleaned up with scripts/clean-header-guards.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12Clean up header guards that don't match their file nameMarkus Armbruster
Header guard symbols should match their file name to make guard collisions less likely. Offenders found with scripts/clean-header-guards.pl -vn. Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12Use #include "..." for our own headers, <...> for othersMarkus Armbruster
Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-05spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)Alexey Kardashevskiy
This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.6 machine and older disable it. This also creates a single DMA window for the older machines to maintain backward migration. This implements DDW for PHB with emulated and VFIO devices. The host kernel support is required. The advertised IOMMU page sizes are 4K and 64K; 16M pages are supported but not advertised by default, in order to enable them, the user has to specify "pgsz" property for PHB and enable huge pages for RAM. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs and updates all references to dma_liobn. However this does not add 64bit LIOBN to the migration stream as in fact even 32bit LIOBN is rather pointless there (as it is a PHB property and the management software can/should pass LIOBNs via CLI) but we keep it for the backward migration support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Replace "icp" with "xics" in most placesBenjamin Herrenschmidt
The "ICP" is a different object than the "XICS". For historical reasons, we have a number of places where we name a variable "icp" while it contains a XICSState pointer. There *is* an ICPState structure too so this makes the code really confusing. This is a mechanical replacement of all those instances to use the name "xics" instead. There should be no functional change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [spapr_cpu_init has been moved to spapr_cpu_core.c, change there] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Implement H_IPOLL using an accessorBenjamin Herrenschmidt
None of the other presenter functions directly mucks with the internal state, so don't do it there either. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Move SPAPR specific code to a separate fileBenjamin Herrenschmidt
Leave the core ICP/ICS logic in xics.c and move the top level class wrapper, hypercall and RTAS handlers to xics_spapr.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to xics_spapr.c] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Rename existing xics to xics_spaprBenjamin Herrenschmidt
The common class doesn't change, the KVM one is sPAPR specific. Rename variables and functions to xics_spapr. Retain the type name as "xics" to preserve migration for existing sPAPR guests. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27ppc/xics: Remove unused xics_set_irq_type()Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [dwg: Adjusted for context to apply without original series] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hot unplug supportBharata B Rao
Remove the CPU core device by removing the underlying CPU thread devices. Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug notification to the guest. Release the vCPU object after CPU hot unplug so that vCPU fd can be parked and reused. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hotplug supportBharata B Rao
Set up device tree entries for the hotplugged CPU core and use the exising RTAS event logging infrastructure to send CPU hotplug notification to the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: convert boot CPUs into CPU core devicesBharata B Rao
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for CPU core hotplug. Initialize boot time CPUs as core deivces and prevent topologies that result in partially filled cores. Both of these are done only if CPU core hotplug is supported. Note: An unrelated change in the call to xics_system_init() is done in this patch as it makes sense to use the local variable smt introduced in this patch instead of kvmppc_smt_threads() call here. TODO: We derive sPAPR core type by looking at -cpu <model>. However we don't take care of "compat=" feature yet for boot time as well as hotplug CPUs. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Move spapr_cpu_init() to spapr_cpu_core.cBharata B Rao
Start consolidating CPU init related routines in spapr_cpu_core.c. As part of this, move spapr_cpu_init() and its dependencies from spapr.c to spapr_cpu_core.c No functionality change in this patch. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> [dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a public(ish) header] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Abstract CPU core device and type specific core devicesBharata B Rao
Add sPAPR specific abastract CPU core device that is based on generic CPU core device. Use this as base type to create sPAPR CPU specific core devices. TODO: - Add core types for other remaining CPU types - Handle CPU model alias correctly Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr_drc: Prevent detach racing against attach for CPU DRBharata B Rao
If a CPU is hot removed while hotplug of the same is still in progress, the guest crashes. Prevent this by ensuring that detach is done only after attach has completed. The existing code already prevents such race for PCI hotplug. However given that CPU is a logical DR unlike PCI and starts with ISOLATED state, we need a logic that works for CPU too. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [Don't set awaiting_attach for PCI devices] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17xics,xics_kvm: Handle CPU unplug correctlyBharata B Rao
XICS is setup for each CPU during initialization. Provide a routine to undo the same when CPU is unplugged. While here, move ss->cs management into xics from xics_kvm since there is nothing KVM specific in it. Also ensure xics reset doesn't set irq for CPUs that are already unplugged. This allows reboot of a VM that has undergone CPU hotplug and unplug to work correctly. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14spapr: Ensure all LMBs are represented in ibm,dynamic-memoryBharata B Rao
Memory hotplug can fail for some combinations of RAM and maxmem when DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends on maximum addressable memory returned by guest and this value is currently being calculated wrongly by the guest kernel routine memory_hotplug_max(). While there is an attempt to fix the guest kernel, this patch works around the problem within QEMU itself. memory_hotplug_max() routine in the guest kernel arrives at max addressable memory by multiplying lmb-size with the lmb-count obtained from ibm,dynamic-memory property. There are two assumptions here: - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM where only hot-pluggable LMBs are present in this property. - The memory area comprising of RAM and hotplug region is contiguous: This needn't be true always for PowerKVM as there can be gap between boot time RAM and hotplug region. To work around this guest kernel bug, ensure that ibm,dynamic-memory has information about all the LMBs (RMA, boot-time LMBs, future hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and hotpluggable region). RMA is represented separately by memory@0 node. Hence mark RMA LMBs and also the LMBs for the gap b/n RAM and hotpluggable region as reserved and as having no valid DRC so that these LMBs are not considered by the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14macio: call dma_memory_unmap() at the end of each DMA transferMark Cave-Ayland
This ensures that the underlying memory is marked dirty once the transfer is complete and resolves cache coherency problems under MacOS 9. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Add root memory regionAlexey Kardashevskiy
We are going to have multiple DMA windows at different offsets on a PCI bus. For the sake of migration, we will have as many TCE table objects pre-created as many windows supported. So we need a way to map windows dynamically onto a PCI bus when migration of a table is completed but at this stage a TCE table object does not have access to a PHB to ask it to map a DMA window backed by just migrated TCE table. This adds a "root" memory region (UINT64_MAX long) to the TCE object. This new region is mapped on a PCI bus with enabled overlapping as there will be one root MR per TCE table, each of them mapped at 0. The actual IOMMU memory region is a subregion of the root region and a TCE table enables/disables this subregion and maps it at the specific offset inside the root MR which is 1:1 mapping of a PCI address space. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Migrate full stateAlexey Kardashevskiy
The source guest could have reallocated the default TCE table and migrate bigger/smaller table. This adds reallocation in post_load() if the default table size is different on source and destination. This adds @bus_offset, @page_shift to the migration stream as a subsection so when DDW is added, migration to older machines will still be possible. As @bus_offset and @page_shift are not used yet, this makes no change in behavior. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Introduce "enabled" state for TCE tableAlexey Kardashevskiy
Currently TCE tables are created once at start and their sizes never change. We are going to change that by introducing a Dynamic DMA windows support where DMA configuration may change during the guest execution. This changes spapr_tce_new_table() to create an empty zero-size IOMMU memory region (IOMMU MR). Only LIOBN is assigned by the time of creation. It still will be called once at the owner object (VIO or PHB) creation. This introduces an "enabled" state for TCE table objects, some helper functions are added: - spapr_tce_table_enable() receives TCE table parameters, stores in sPAPRTCETable and allocates a guest view of the TCE table (in the user space or KVM) and sets the correct size on the IOMMU MR; - spapr_tce_table_disable() disposes the table and resets the IOMMU MR size; it is made public as the following DDW code will be using it. This changes the PHB reset handler to do the default DMA initialization instead of spapr_phb_realize(). This does not make differenct now but later with more than just one DMA window, we will have to remove them all and create the default one on a system reset. No visible change in behaviour is expected except the actual table will be reallocated every reset. We might optimize this later. The other way to implement this would be dynamically create/remove the TCE table QOM objects but this would make migration impossible as the migration code expects all QOM objects to exist at the receiver so we have to have TCE table objects created when migration begins. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-19hw: cannot include hw/hw.h from user emulationPaolo Bonzini
All qdev definitions are available from other headers, user-mode emulation does not need hw/hw.h. By considering system emulation only, it is simpler to disentangle hw/hw.h from NEED_CPU_H. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>