aboutsummaryrefslogtreecommitdiff
path: root/include/hw/ppc
AgeCommit message (Collapse)Author
2016-07-12Use #include "..." for our own headers, <...> for othersMarkus Armbruster
Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-05spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)Alexey Kardashevskiy
This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.6 machine and older disable it. This also creates a single DMA window for the older machines to maintain backward migration. This implements DDW for PHB with emulated and VFIO devices. The host kernel support is required. The advertised IOMMU page sizes are 4K and 64K; 16M pages are supported but not advertised by default, in order to enable them, the user has to specify "pgsz" property for PHB and enable huge pages for RAM. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs and updates all references to dma_liobn. However this does not add 64bit LIOBN to the migration stream as in fact even 32bit LIOBN is rather pointless there (as it is a PHB property and the management software can/should pass LIOBNs via CLI) but we keep it for the backward migration support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Replace "icp" with "xics" in most placesBenjamin Herrenschmidt
The "ICP" is a different object than the "XICS". For historical reasons, we have a number of places where we name a variable "icp" while it contains a XICSState pointer. There *is* an ICPState structure too so this makes the code really confusing. This is a mechanical replacement of all those instances to use the name "xics" instead. There should be no functional change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [spapr_cpu_init has been moved to spapr_cpu_core.c, change there] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Implement H_IPOLL using an accessorBenjamin Herrenschmidt
None of the other presenter functions directly mucks with the internal state, so don't do it there either. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Move SPAPR specific code to a separate fileBenjamin Herrenschmidt
Leave the core ICP/ICS logic in xics.c and move the top level class wrapper, hypercall and RTAS handlers to xics_spapr.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to xics_spapr.c] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Rename existing xics to xics_spaprBenjamin Herrenschmidt
The common class doesn't change, the KVM one is sPAPR specific. Rename variables and functions to xics_spapr. Retain the type name as "xics" to preserve migration for existing sPAPR guests. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27ppc/xics: Remove unused xics_set_irq_type()Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [dwg: Adjusted for context to apply without original series] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hot unplug supportBharata B Rao
Remove the CPU core device by removing the underlying CPU thread devices. Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug notification to the guest. Release the vCPU object after CPU hot unplug so that vCPU fd can be parked and reused. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hotplug supportBharata B Rao
Set up device tree entries for the hotplugged CPU core and use the exising RTAS event logging infrastructure to send CPU hotplug notification to the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: convert boot CPUs into CPU core devicesBharata B Rao
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for CPU core hotplug. Initialize boot time CPUs as core deivces and prevent topologies that result in partially filled cores. Both of these are done only if CPU core hotplug is supported. Note: An unrelated change in the call to xics_system_init() is done in this patch as it makes sense to use the local variable smt introduced in this patch instead of kvmppc_smt_threads() call here. TODO: We derive sPAPR core type by looking at -cpu <model>. However we don't take care of "compat=" feature yet for boot time as well as hotplug CPUs. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Move spapr_cpu_init() to spapr_cpu_core.cBharata B Rao
Start consolidating CPU init related routines in spapr_cpu_core.c. As part of this, move spapr_cpu_init() and its dependencies from spapr.c to spapr_cpu_core.c No functionality change in this patch. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> [dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a public(ish) header] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Abstract CPU core device and type specific core devicesBharata B Rao
Add sPAPR specific abastract CPU core device that is based on generic CPU core device. Use this as base type to create sPAPR CPU specific core devices. TODO: - Add core types for other remaining CPU types - Handle CPU model alias correctly Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr_drc: Prevent detach racing against attach for CPU DRBharata B Rao
If a CPU is hot removed while hotplug of the same is still in progress, the guest crashes. Prevent this by ensuring that detach is done only after attach has completed. The existing code already prevents such race for PCI hotplug. However given that CPU is a logical DR unlike PCI and starts with ISOLATED state, we need a logic that works for CPU too. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [Don't set awaiting_attach for PCI devices] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17xics,xics_kvm: Handle CPU unplug correctlyBharata B Rao
XICS is setup for each CPU during initialization. Provide a routine to undo the same when CPU is unplugged. While here, move ss->cs management into xics from xics_kvm since there is nothing KVM specific in it. Also ensure xics reset doesn't set irq for CPUs that are already unplugged. This allows reboot of a VM that has undergone CPU hotplug and unplug to work correctly. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14spapr: Ensure all LMBs are represented in ibm,dynamic-memoryBharata B Rao
Memory hotplug can fail for some combinations of RAM and maxmem when DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends on maximum addressable memory returned by guest and this value is currently being calculated wrongly by the guest kernel routine memory_hotplug_max(). While there is an attempt to fix the guest kernel, this patch works around the problem within QEMU itself. memory_hotplug_max() routine in the guest kernel arrives at max addressable memory by multiplying lmb-size with the lmb-count obtained from ibm,dynamic-memory property. There are two assumptions here: - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM where only hot-pluggable LMBs are present in this property. - The memory area comprising of RAM and hotplug region is contiguous: This needn't be true always for PowerKVM as there can be gap between boot time RAM and hotplug region. To work around this guest kernel bug, ensure that ibm,dynamic-memory has information about all the LMBs (RMA, boot-time LMBs, future hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and hotpluggable region). RMA is represented separately by memory@0 node. Hence mark RMA LMBs and also the LMBs for the gap b/n RAM and hotpluggable region as reserved and as having no valid DRC so that these LMBs are not considered by the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14macio: call dma_memory_unmap() at the end of each DMA transferMark Cave-Ayland
This ensures that the underlying memory is marked dirty once the transfer is complete and resolves cache coherency problems under MacOS 9. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Add root memory regionAlexey Kardashevskiy
We are going to have multiple DMA windows at different offsets on a PCI bus. For the sake of migration, we will have as many TCE table objects pre-created as many windows supported. So we need a way to map windows dynamically onto a PCI bus when migration of a table is completed but at this stage a TCE table object does not have access to a PHB to ask it to map a DMA window backed by just migrated TCE table. This adds a "root" memory region (UINT64_MAX long) to the TCE object. This new region is mapped on a PCI bus with enabled overlapping as there will be one root MR per TCE table, each of them mapped at 0. The actual IOMMU memory region is a subregion of the root region and a TCE table enables/disables this subregion and maps it at the specific offset inside the root MR which is 1:1 mapping of a PCI address space. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Migrate full stateAlexey Kardashevskiy
The source guest could have reallocated the default TCE table and migrate bigger/smaller table. This adds reallocation in post_load() if the default table size is different on source and destination. This adds @bus_offset, @page_shift to the migration stream as a subsection so when DDW is added, migration to older machines will still be possible. As @bus_offset and @page_shift are not used yet, this makes no change in behavior. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Introduce "enabled" state for TCE tableAlexey Kardashevskiy
Currently TCE tables are created once at start and their sizes never change. We are going to change that by introducing a Dynamic DMA windows support where DMA configuration may change during the guest execution. This changes spapr_tce_new_table() to create an empty zero-size IOMMU memory region (IOMMU MR). Only LIOBN is assigned by the time of creation. It still will be called once at the owner object (VIO or PHB) creation. This introduces an "enabled" state for TCE table objects, some helper functions are added: - spapr_tce_table_enable() receives TCE table parameters, stores in sPAPRTCETable and allocates a guest view of the TCE table (in the user space or KVM) and sets the correct size on the IOMMU MR; - spapr_tce_table_disable() disposes the table and resets the IOMMU MR size; it is made public as the following DDW code will be using it. This changes the PHB reset handler to do the default DMA initialization instead of spapr_phb_realize(). This does not make differenct now but later with more than just one DMA window, we will have to remove them all and create the default one on a system reset. No visible change in behaviour is expected except the actual table will be reallocated every reset. We might optimize this later. The other way to implement this would be dynamically create/remove the TCE table QOM objects but this would make migration impossible as the migration code expects all QOM objects to exist at the receiver so we have to have TCE table objects created when migration begins. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-19hw: cannot include hw/hw.h from user emulationPaolo Bonzini
All qdev definitions are available from other headers, user-mode emulation does not need hw/hw.h. By considering system emulation only, it is simpler to disentangle hw/hw.h from NEED_CPU_H. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19hw: do not use VMSTATE_*TLPaolo Bonzini
Reserve this to CPU state serialization. Luckily, they were only used by sPAPR devices and these are ppc64 only. So there is no change to migration format. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19explicitly include qom/cpu.hPaolo Bonzini
exec/cpu-all.h includes qom/cpu.h. Explicit inclusion will keep things working when cpu.h will not be included indirectly almost everywhere (either directly or through qemu-common.h). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19ppc: use PowerPCCPU instead of CPUPPCStatePaolo Bonzini
This changes a cpu.h dependency for hw/ppc/ppc.h into a cpu-qom.h dependency. For it to compile we also need to clean up a few unused definitions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05spapr_drc: enable immediate detach for unsignalled devicesMichael Roth
Currently spapr doesn't support "aborting" hotplug of PCI devices by allowing device_del to immediately remove the device if we haven't signalled the presence of the device to the guest. In the past this wasn't an issue, since we always immediately signalled device attach and simply relied on full guest-aware add->remove path for device removal. However, as of 788d259, we now defer signalling for PCI functions until function 0 is attached, so now we need to deal with these "abort" operations for cases where a user hotplugs a non-0 function, then opts to remove it prior hotplugging function 0. Currently they'd have to reboot before the unplug completed. PCIe multifunction hotplug does not have this requirement however, so from a management implementation perspective it would be good to address this within the same release as 788d259. We accomplish this by simply adding a 'signalled' flag to track whether a device hotplug event has been sent to the guest. If it hasn't, we allow immediate removal under the assumption that the guest will not be using the device. Devices present at boot/reset time are also assumed to be 'signalled'. For CPU/memory/etc, signalling will still happen immediately as part of device_add, so only PCI functions should be affected. Cc: bharata@linux.vnet.ibm.com Cc: david@gibson.dropbear.id.au Cc: sbhat@linux.vnet.ibm.com Cc: qemu-ppc@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [dwg: This fixes a regression where an incorrect hot-add of a non-zero function can no longer be backed out until function 0 is added] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05ppc: Rework POWER7 & POWER8 exception modelCédric Le Goater
From: Benjamin Herrenschmidt <benh@kernel.crashing.org> This patch fixes the current AIL implementation for POWER8. The interrupt vector address can be calculated directly from LPCR when the exception is handled. The excp_prefix update becomes useless and we can cleanup the H_SET_MODE hcall. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: Removed LPES0/1 handling for HV vs. !HV Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ] Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> [dwg: This was written as a cleanup, but it also fixes a real bug where setting an alternative interrupt location would not be correctly migrated] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-22include/qemu/iov.h: Don't include qemu-common.hMarkus Armbruster
qemu-common.h should only be included by .c files. Its file comment explains why: "No header file should depend on qemu-common.h, as this would easily lead to circular header dependencies." qemu/iov.h includes qemu-common.h for QEMUIOVector stuff. Move all that to qemu/iov.h and drop the ill-advised include. Include qemu/iov.h where the QEMUIOVector stuff is now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-28xics: report errors with the QEMU Error APIGreg Kurz
Using the return value to report errors is error prone: - xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors on 0 - xics_alloc_block() returns the unclear value of ics->offset - 1 on error but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0 This patch adds an errp argument to xics_alloc() and xics_alloc_block() to report errors. The return value of these functions is a valid IRQ number if errp is NULL. It is undefined otherwise. The corresponding error traces get promotted to error messages. Note that the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also moves to xics_alloc(). Similar error message consolidation isn't really applicable to xics_alloc_block() because callers have extra context (device config address, MSI or MSIX). This fixes the issues mentioned above. Based on previous work from Brian W. Hart. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17pseries: Simplify handling of the hash page table fdDavid Gibson
When migrating the 'pseries' machine type with KVM, we use a special fd to access the hash page table stored within KVM. Usually, this fd is opened at the beginning of migration, and kept open until the migration is complete. However, if there is a guest reset during the migration, the fd can become stale and we need to re-open it. At the moment we use an 'htab_fd_stale' flag in sPAPRMachineState to signal this, which is checked in the migration iterators. But that's rather ugly. It's simpler to just close and invalidate the fd on reset, and lazily re-open it in migration if necessary. This patch implements that change. This requires a small addition to the machine state's instance_init, so that htab_fd is initialized to -1 (telling the migration code it needs to open it) instead of 0, which could be a valid fd. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30spapr: Remove rtas_st_buffer_direct()David Gibson
rtas_st_buffer_direct() is a not particularly useful wrapper around cpu_physical_memory_write(). All the callers are in rtas_ibm_configure_connector, where it's better handled by local helper. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30spapr: Small fixes to rtas_ibm_get_system_parameter, remove rtas_st_bufferDavid Gibson
rtas_st_buffer() appears in spapr.h as though it were a widely used helper, but in fact it is only used for saving data in a format used by rtas_ibm_get_system_parameter(). This changes it to a local helper more specifically for that function. While we're there fix a couple of small defects in rtas_ibm_get_system_parameter: - For the string value SPLPAR_CHARACTERISTICS, it wasn't including the terminating \0 in the length which it should according to LoPAPR 7.3.16.1 - It now checks that the supplied buffer has at least enough space for the length of the returned data, and returns an error if it does not. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11spapr vio: fix to incomplete QOMifyCao jin
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11hw/ppc/spapr: Use XHCI as host controller for new spapr machinesThomas Huth
The OHCI has some bugs and performance issues, so for newer machines it's preferable to use XHCI instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23spapr_iommu: Provide a function to switch a TCE table to allowing VFIODavid Gibson
Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not all TCE tables (guest IOMMU contexts) can support VFIO devices. Currently, this is decided at creation time. To support hotplug of VFIO devices, we need to allow a TCE table which previously didn't allow VFIO devices to be switched so that it can. This patch adds an spapr_tce_set_need_vfio() function to do this, by reallocating the table in userspace if necessary. Currently this doesn't allow the KVM acceleration to be re-enabled if all the VFIO devices are removed. That's an optimization for another time. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23spapr_iommu: Rename vfio_accel parameterDavid Gibson
The vfio_accel parameter used when creating a new TCE table (guest IOMMU context) has a confusing name. What it really means is whether we need the TCE table created to be able to support VFIO devices. VFIO is relevant, because when available we use in-kernel acceleration of the TCE table, but that may not work with VFIO devices because updates to the table are handled in kernel, bypass qemu and so don't hit qemu's infrastructure for keeping the VFIO host IOMMU state in sync with the guest IOMMU state. Rename the parameter to "need_vfio" throughout. This is a cosmetic change, with no impact on the logic. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-09-23ppc/spapr: Implement H_RANDOM hypercall in QEMUThomas Huth
The PAPR interface defines a hypercall to pass high-quality hardware generated random numbers to guests. Recent kernels can already provide this hypercall to the guest if the right hardware random number generator is available. But in case the user wants to use another source like EGD, or QEMU is running with an older kernel, we should also have this call in QEMU, so that guests that do not support virtio-rng yet can get good random numbers, too. This patch now adds a new pseudo-device to QEMU that either directly provides this hypercall to the guest or is able to enable the in-kernel hypercall if available. The in-kernel hypercall can be enabled with the use-kvm property, e.g.: qemu-system-ppc64 -device spapr-rng,use-kvm=true For handling the hypercall in QEMU instead, a "RngBackend" is required since the hypercall should provide "good" random data instead of pseudo-random (like from a "simple" library function like rand() or g_random_int()). Since there are multiple RngBackends available, the user must select an appropriate back-end via the "rng" property of the device, e.g.: qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \ -device spapr-rng,rng=gid0 ... See http://wiki.qemu-project.org/Features-Done/VirtIORNG for other example of specifying RngBackends. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr: Support hotplug by specifying DRC countBharata B Rao
Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows hotplugging of DRCs by specifying the DRC count. While we are here, rename spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index() spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index() so that they match with spapr_hotplug_req_add_by_count(). Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr: Support ibm,dynamic-reconfiguration-memoryBharata B Rao
Parse ibm,architecture.vec table obtained from the guest and enable memory node configuration via ibm,dynamic-reconfiguration-memory if guest supports it. This is in preparation to support memory hotplug for sPAPR guests. This changes the way memory node configuration is done. Currently all memory nodes are built upfront. But after this patch, only memory@0 node for RMA is built upfront. Guest kernel boots with just that and rest of the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory) are built when guest does ibm,client-architecture-support call. Note: This patch needs a SLOF enhancement which is already part of SLOF binary in QEMU. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr: Add LMB DR connectorsDavid Gibson
Enable memory hotplug for pseries 2.4 and add LMB DR connectors. With memory hotplug, enforce RAM size, NUMA node memory size and maxmem to be a multiple of SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the granularity in which LMBs are represented and hot-added. LMB DR connectors will be used by the memory hotplug code. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [spapr_drc_reset implementation] [since this missed the 2.4 cutoff, changing to only enable for 2.5] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr_drc: use RTAS return codes for methods called by RTASMichael Roth
Certain methods in sPAPRDRConnector objects are only ever called by RTAS and in many cases are responsible for the logic that determines the RTAS return codes. Rather than having a level of indirection requiring RTAS code to re-interpret return values from such methods to determine the appropriate return code, just pass them through directly. This requires changing method return types to uint32_t to match the type of values currently passed to RTAS helpers. In the case of read accesses like drc->entity_sense() where we weren't previously reporting any errors, just the read value, we modify the function to return RTAS return code, and pass the read value back via reference. Suggested-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Suggested-by: David Gibson <david@gibson.dropbear.id.au> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr: Initialize hotplug memory address spaceBharata B Rao
Initialize a hotplug memory region under which all the hotplugged memory is accommodated. Also enable memory hotplug by setting CONFIG_MEM_HOTPLUG. Modelled on i386 memory hotplug. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr_drc: don't allow 'empty' DRCs to be unisolated or allocatedMichael Roth
Logical resources start with allocation-state:UNUSABLE / isolation-state:ISOLATED. During hotplug, guests will transition them to allocation-state:USABLE, and then to isolation-state:UNISOLATED. For cases where we cannot transition to allocation-state:USABLE, in this case due to no device/resource being association with the logical DRC, we should return an error -3. For physical DRCs, we default to allocation-state:USABLE and stay there, so in this case we should report an error -3 when the guest attempts to make the isolation-state:ISOLATED transition for a DRC with no device associated. These are as documented in PAPR 2.7, 13.5.3.4. We also ensure allocation-state:USABLE when the guest attempts transition to isolation-state:UNISOLATED to deal with misbehaving guests attempting to bring online an unallocated logical resource. This is as documented in PAPR 2.7, 13.7. Currently we implement no such error logic. Fix this by handling these error cases as PAPR defines. Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23sPAPR: Introduce rtas_ldq()Gavin Shan
This introduces rtas_ldq() to load 64-bits parameter from continuous two 4-bytes memory chunk of RTAS parameter buffer, to simplify the code. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23spapr_rtas: Prevent QEMU crash during hotplug without a prior device_addBharata B Rao
If drmgr is used in the guest to hotplug a device before a device_add has been issued via the QEMU monitor, QEMU segfaults in configure_connector call. This occurs due to accessing of NULL FDT which otherwise would have been created and associated with the DRC during device_add command. Check for NULL FDT and return failure from configure_connector call. As per PAPR+, an error value of -9003 seems appropriate for this failure. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23ppc/spapr: Use qemu_log_mask() for hcall_dprintf()Thomas Huth
To see the output of the hcall_dprintf statements, you currently have to enable the DEBUG_SPAPR_HCALLS macro in include/hw/ppc/spapr.h. This is ugly because a) not every user who wants to debug guest problems can or wants to recompile QEMU to be able to see such issues, and b) since this macro is disabled by default, the code in the hcall_dprintf() brackets tends to bitrot until somebody temporarily enables that macro again. Since the hcall_dprintf statements except one indicate guest problems, let's always use qemu_log_mask(LOG_GUEST_ERROR, ...) for this macro instead. One spot indicated an unimplemented host feature, so this is changed into qemu_log_mask(LOG_UNIMP, ...) instead. Now it's possible to see all those messages by simply adding the CLI parameter "-d guest_errors,unimp", without the need to re-compile the binary. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-07-07xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabledBharata B Rao
When supporting CPU hot removal by parking the vCPU fd and reusing it during hotplug again, there can be cases where we try to reenable KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled. Introduce a boolean member in ICPState to track this and don't reenable the CAP if it was already enabled earlier. Re-enabling this CAP should ideally work, but currently it results in kernel trying to create and associate ICP with this vCPU and that fails since there is already an ICP associated with it. Hence this patch is needed to work around this problem in the kernel. This change allows CPU hot removal to work for sPAPR. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Support ibm, lrdr-capacity device tree propertyBharata B Rao
Add support for ibm,lrdr-capacity since this is needed by the guest kernel to know about the possible hot-pluggable CPUs and Memory. With this, pseries kernels will start reporting correct maxcpus in /sys/devices/system/cpu/possible. Also define the minimum hotpluggable memory size as 256MB. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: Fix compile error on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Add sPAPRMachineClassDavid Gibson
Currently although we have an sPAPRMachineState descended from MachineState we don't have an sPAPRMAchineClass descended from MachineClass. So far it hasn't been needed, but several upcoming features are going to want it, so this patch creates a stub implementation. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Remove obsolete entry_point field from sPAPRMachineStateDavid Gibson
The sPAPRMachineState structure includes an entry_point field containing the initial PC value for starting the machine, even though this always has the value 0x100. I think this is a hangover from very early versions which bypassed the firmware when using -kernel. In any case it has no function now, so remove it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Remove obsolete ram_limit field from sPAPRMachineStateDavid Gibson
The ram_limit field was imported from sPAPREnvironment where it predates the machine's ram size being available generically from machine->ram_size. Worse, the existing code was inconsistent about where it got the ram size from. Sometimes it used spapr->ram_limit, sometimes the global 'ram_size' and sometimes a local 'ram_size' masking the global. This cleans up the code to consistently use machine->ram_size, eliminating spapr->ram_limit in the process. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Merge sPAPREnvironment into sPAPRMachineStateDavid Gibson
The code for -machine pseries maintains a global sPAPREnvironment structure which keeps track of general state information about the guest platform. This predates the existence of the MachineState structure, but performs basically the same function. Now that we have the generic MachineState, fold sPAPREnvironment into sPAPRMachineState, the pseries specific subclass of MachineState. This is mostly a matter of search and replace, although a few places which relied on the global spapr variable are changed to find the structure via qdev_get_machine(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>