aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc
AgeCommit message (Collapse)Author
2016-07-12Use #include "..." for our own headers, <...> for othersMarkus Armbruster
Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-06qapi: Add parameter to visit_end_*Eric Blake
Rather than making the dealloc visitor track of stack of pointers remembered during visit_start_* in order to free them during visit_end_*, it's a lot easier to just make all callers pass the same pointer to visit_end_*. The generated code has access to the same pointer, while all other users are doing virtual walks and can pass NULL. The dealloc visitor is then greatly simplified. All three visit_end_*() functions intentionally take a void**, even though the visit_start_*() functions differ between void**, GenericList**, and GenericAlternate**. This is done for several reasons: when doing a virtual walk, passing NULL doesn't care what the type is, but when doing a generated walk, we already have to cast the caller's specific FOO* to call visit_start, while using void** lets us use visit_end without a cast. Also, an upcoming patch will add a clone visitor that wants to use the same implementation for all three visit_end callbacks, which is made easier if all three share the same signature. For visitors with already track per-object state (the QMP visitors via a stack, and the string visitors which do not allow nesting), add an assertion that the caller is indeed passing the same pointer to paired calls. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-07-05Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, pci, virtio: new features, cleanups, fixes iommus can not be added with -device. cleanups and fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 05 Jul 2016 11:18:32 BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (30 commits) vmw_pvscsi: remove unnecessary internal msi state flag e1000e: remove unnecessary internal msi state flag vmxnet3: remove unnecessary internal msi state flag mptsas: remove unnecessary internal msi state flag megasas: remove unnecessary megasas_use_msi() pci: Convert msi_init() to Error and fix callers to check it pci bridge dev: change msi property type megasas: change msi/msix property type mptsas: change msi property type intel-hda: change msi property type usb xhci: change msi/msix property type change pvscsi_init_msi() type to void tests: add APIC.cphp and DSDT.cphp blobs tests: acpi: add CPU hotplug testcase log: Permit -dfilter 0..0xffffffffffffffff range: Replace internal representation of Range range: Eliminate direct Range member access log: Clean up misuse of Range for -dfilter pci_register_bar: cleanup Revert "virtio-net: unbreak self announcement and guest offloads after migration" ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05ppc/hash64: Add proper real mode translation supportBenjamin Herrenschmidt
This adds proper support for translating real mode addresses based on the combination of HV and LPCR bits. This handles HRMOR offset for hypervisor real mode, and both RMA and VRMA modes for guest real mode. PAPR mode adjusts the offsets appropriately to match the RMA used in TCG, but we need to limit to the max supported by the implementation (16G). This includes some fixes by Cédric Le Goater <clg@kaod.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [dwg: Adjusted for differences in my version of the prereq patches] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05ppc: simplify ppc_hash64_hpte_page_shift_noslb()Cédric Le Goater
The segment page shift parameter is never used. Let's remove it. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)Alexey Kardashevskiy
This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.6 machine and older disable it. This also creates a single DMA window for the older machines to maintain backward migration. This implements DDW for PHB with emulated and VFIO devices. The host kernel support is required. The advertised IOMMU page sizes are 4K and 64K; 16M pages are supported but not advertised by default, in order to enable them, the user has to specify "pgsz" property for PHB and enable huge pages for RAM. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs and updates all references to dma_liobn. However this does not add 64bit LIOBN to the migration stream as in fact even 32bit LIOBN is rather pointless there (as it is a PHB property and the management software can/should pass LIOBNs via CLI) but we keep it for the backward migration support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05spapr_iommu: Realloc guest visible TCE table when starting/stopping listeningAlexey Kardashevskiy
The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU - a guest view of the table and a hardware TCE table. If there is no VFIO presense in the address space, then just the guest view is used, if this is the case, it is allocated in the KVM. However since there is no support yet for VFIO in KVM TCE hypercalls, when we start using VFIO, we need to move the guest view from KVM to the userspace; and we need to do this for every IOMMU on a bus with VFIO devices. This implements the callbacks for the sPAPR IOMMU - notify_started() reallocated the guest view to the user space, notify_stopped() does the opposite. This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug path as the new callbacks do this better - they notify IOMMU at the exact moment when the configuration is changed, and this also includes the case of PCI hot unplug. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05spapr: Ensure thread0 of CPU core is always realized firstBharata B Rao
During CPU core realization, we create all the thread objects and parent them to the core object in a loop. However, the realization of thread objects is done separately by walking the threads of a core using object_child_foreach(). With this, there is no guarantee on the order in which the child thread objects get realized. Since CPU device tree properties are currently derived from the CPU thread object, we assume thread0 of the core to be the representative thread of the core when creating device tree properties for the core. If thread0 is not the first thread that gets realized, then we would end up having an incorrect dt_id for the core and this causes hotplug failures from the guest. Fix this by realizing each thread object by walking the core's thread object list thereby ensuring that thread0 and other threads are always realized in the correct order. Future TODO: CPU DT nodes are per-core properties and we should ideally base the creation of CPU DT nodes on core objects rather than the thread objects. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-04hw/ppc: realize the PCI root bus as part of mac99 initMarcel Apfelbaum
Mac99's PCI root bus is not part of a host bridge, realize it manually. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-01spapr: drop duplicate variable in spapr_core_release()Greg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01spapr: do proper error propagation in spapr_cpu_core_realize_child()Greg Kurz
This patch changes spapr_cpu_core_realize_child() to have a local error pointer and use error_propagate() as it is supposed to be done. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01spapr: drop reference on child object during core realizationGreg Kurz
When a core is being realized, we create a child object for each thread of the core. The child is first initialized with object_initialize() which sets its ref count to 1, and then added to the core with object_property_add_child() which bumps the ref count to 2. When the core gets released, object_unparent() decreases the ref count to 1, and we g_free() the object: we hence loose the reference on an unfinalized object. This is likely to cause random crashes. Let's drop the extra reference as soon as we don't need it, after the thread is added to the core. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01spapr: Restore support for 970MP and POWER8NVL CPU coresBharata B Rao
Introduction of core based CPU hotplug for PowerPC sPAPR didn't add support for 970MP and POWER8NVL based core types. Add support for the same. While we are here, add support for explicit specification of POWER5+_v2.1 core type. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Replace "icp" with "xics" in most placesBenjamin Herrenschmidt
The "ICP" is a different object than the "XICS". For historical reasons, we have a number of places where we name a variable "icp" while it contains a XICSState pointer. There *is* an ICPState structure too so this makes the code really confusing. This is a mechanical replacement of all those instances to use the name "xics" instead. There should be no functional change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [spapr_cpu_init has been moved to spapr_cpu_core.c, change there] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc/xics: Rename existing xics to xics_spaprBenjamin Herrenschmidt
The common class doesn't change, the KVM one is sPAPR specific. Rename variables and functions to xics_spapr. Retain the type name as "xics" to preserve migration for existing sPAPR guests. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlbAaron Larson
Eliminate redundant and incorrect booke206_page_size_to_tlb function from ppce500_spin.c in preference to previously existing but newly exported definition from e500.c Defect analysis: The booke206_page_size_to_tlb function in e500.c was updated in commit 2bd9543 "ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages" to reflect a change in the definition of MAS1_TSIZE_SHIFT from 8 (corresponding to a min TLB page size of 4kb) to a value of 7 (TLB page size 2k). The booke206_page_size_to_tlb() function defined in ppce500_spin.c was never updated to reflect the change in MAS1_TSIZE_SHIFT. In http://lists.nongnu.org/archive/html/qemu-ppc/2016-06/msg00533.html, Scott Wood suggested this "root cause" explanation: SW> The patch that changed MAS1_TSIZE_SHIFT from 8 to 7 was around the SW> same time as the patch that added this code, which is probably why SW> adjusting it got missed. Commit 2bd9543cd3 did update the SW> equivalent code in ppce500_mpc8544ds.c, which now resides in SW> hw/ppc/e500.c and has been changed to not assume a power-of-2 SW> size. The ppce500_spin version should be eliminated. Signed-off-by: Aaron Larson <alarson@ddci.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01spapr: Restore support for older PowerPC CPU coresBharata B Rao
Introduction of core based CPU hotplug for PowerPC sPAPR didn't add support for 970 and POWER5+ based core types. Add support for the same. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01spapr: fix write-past-end-of-array error in cpu core device init codeGreg Kurz
This fixes a potential QEMU crash introduced by commit 3b542549661. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01hw/ppc/spapr: Add some missing hcall function set stringsThomas Huth
Add "hcall-sprg0" (for H_SET_SPRG0), "hcall-copy" (for H_PAGE_INIT) and "hcall-debug" (for H_LOGICAL_CI_LOAD/STORE) to the property "ibm,hypertas-functions" to indicate that we support these hypercalls. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01ppc: Initial HDEC supportBenjamin Herrenschmidt
The current behaviour isn't completely right, as for the DEC, we don't properly re-arm when wrapping around, but I will fix this in a separate patch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: fixed checkpatch.pl errors ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCorePeter Krempa
struct CPUCore uses 'id' suffix in the property name. As docs for query-hotpluggable-cpus state that the cpu core properties should be passed back to device_add by management in case new members are added and thus the names for the fields should be kept in sync. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> [dwg: Removed a duplicated word in comment] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27target-ppc: ppce500_spin.c uses SPR_PIR, should use SPR_BOOKE_PIRAaron Larson
ppce500_spin.c uses SPR_PIR to initialize the spin table, however on Book E processors the correct SPR is SPR_BOOKE_PIR. Signed-off-by: Aaron Larson <alarson@ddci.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22memory: Add reporting of supported page sizesAlexey Kardashevskiy
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating, however this information is not available outside the translate context for various checks. This adds a get_min_page_size callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the minimum actual page size supported by an IOMMU. As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE as fallback. This removes vfio_container_granularity() and uses new helper in memory_region_iommu_replay() when replaying IOMMU mappings on added IOMMU memory region. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> [dwg: Removed an unnecessary calculation] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22powerpc/mm: Update the WIMG check during H_ENTERAneesh Kumar K.V
Support for 0 value for memeory coherence is optional and with ppc64 we can always enable memory coherence. Linux kernel did that during the development of 4.7 kernel. But that resulted in failure in Qemu in H_ENTER hcall due to below check. The mentioned change was reverted in the kernel and kernel right now enable memory coherence only if cache inhibited is not set. Nevertheless update qemu WIMG flag check to cover the case where we enable memory coherence along with cache inhibited flag. In order to handle older and newer kernel version consider both Cache inhibitted and (cache inhibitted | memory conference) as valid values for wimg flags. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-20Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' ↵Peter Maydell
into staging # gpg: Signature made Mon 20 Jun 2016 21:29:27 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/tracing-pull-request: (42 commits) trace: split out trace events for linux-user/ directory trace: split out trace events for qom/ directory trace: split out trace events for target-ppc/ directory trace: split out trace events for target-s390x/ directory trace: split out trace events for target-sparc/ directory trace: split out trace events for net/ directory trace: split out trace events for audio/ directory trace: split out trace events for ui/ directory trace: split out trace events for hw/alpha/ directory trace: split out trace events for hw/arm/ directory trace: split out trace events for hw/acpi/ directory trace: split out trace events for hw/vfio/ directory trace: split out trace events for hw/s390x/ directory trace: split out trace events for hw/pci/ directory trace: split out trace events for hw/ppc/ directory trace: split out trace events for hw/9pfs/ directory trace: split out trace events for hw/i386/ directory trace: split out trace events for hw/isa/ directory trace: split out trace events for hw/sd/ directory trace: split out trace events for hw/sparc/ directory ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20trace: split out trace events for hw/ppc/ directoryDaniel P. Berrange
Move all trace-events for files in the hw/ppc/ directory to their own file. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-id: 1466066426-16657-27-git-send-email-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20coccinelle: Remove unnecessary variables for function return valueEduardo Habkost
Use Coccinelle script to replace 'ret = E; return ret' with 'return E'. The script will do the substitution only when the function return type and variable type are the same. Manual fixups: * audio/audio.c: coding style of "read (...)" and "write (...)" * block/qcow2-cluster.c: wrap line to make it shorter * block/qcow2-refcount.c: change indentation of wrapped line * target-tricore/op_helper.c: fix coding style of "remainder|quotient" * target-mips/dsp_helper.c: reverted changes because I don't want to argue about checkpatch.pl * ui/qemu-pixman.c: fix line indentation * block/rbd.c: restore blank line between declarations and statements Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Unused Coccinelle rule name dropped along with a redundant comment; whitespace touched up in block/qcow2-cluster.c; stale commit message paragraph deleted] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-17spapr: implement query-hotpluggable-cpus callbackIgor Mammedov
It returns a list of present/possible to hotplug CPU objects with a list of properties to use with device_add. in spapr case returned list would looks like: -> { "execute": "query-hotpluggable-cpus" } <- {"return": [ { "props": { "core": 8 }, "type": "POWER8-spapr-cpu-core", "vcpus-count": 2 }, { "props": { "core": 0 }, "type": "POWER8-spapr-cpu-core", "vcpus-count": 2, "qom-path": "/machine/unattached/device[0]"} ]}' TODO: add 'node' property for core <-> numa node mapping Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hot unplug supportBharata B Rao
Remove the CPU core device by removing the underlying CPU thread devices. Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug notification to the guest. Release the vCPU object after CPU hot unplug so that vCPU fd can be parked and reused. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: CPU hotplug supportBharata B Rao
Set up device tree entries for the hotplugged CPU core and use the exising RTAS event logging infrastructure to send CPU hotplug notification to the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: convert boot CPUs into CPU core devicesBharata B Rao
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for CPU core hotplug. Initialize boot time CPUs as core deivces and prevent topologies that result in partially filled cores. Both of these are done only if CPU core hotplug is supported. Note: An unrelated change in the call to xics_system_init() is done in this patch as it makes sense to use the local variable smt introduced in this patch instead of kvmppc_smt_threads() call here. TODO: We derive sPAPR core type by looking at -cpu <model>. However we don't take care of "compat=" feature yet for boot time as well as hotplug CPUs. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Move spapr_cpu_init() to spapr_cpu_core.cBharata B Rao
Start consolidating CPU init related routines in spapr_cpu_core.c. As part of this, move spapr_cpu_init() and its dependencies from spapr.c to spapr_cpu_core.c No functionality change in this patch. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> [dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a public(ish) header] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr: Abstract CPU core device and type specific core devicesBharata B Rao
Add sPAPR specific abastract CPU core device that is based on generic CPU core device. Use this as base type to create sPAPR CPU specific core devices. TODO: - Add core types for other remaining CPU types - Handle CPU model alias correctly Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17spapr_drc: Prevent detach racing against attach for CPU DRBharata B Rao
If a CPU is hot removed while hotplug of the same is still in progress, the guest crashes. Prevent this by ensuring that detach is done only after attach has completed. The existing code already prevents such race for PCI hotplug. However given that CPU is a logical DR unlike PCI and starts with ISOLATED state, we need a logic that works for CPU too. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [Don't set awaiting_attach for PCI devices] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17hw/ppc/spapr: Silence deprecation message in qtest modeThomas Huth
When running "make check", there is currently always an error message saying "spapr-pci-vfio-host-bridge is deprecated". This happens because the QOM tests are instantiating all possible devices, and the error message is currently located in the instance_init() function of the device. Since it is legal for the tests to instantiate a device without using it, the error message should be silenced when we're running in test mode. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14spapr: Ensure all LMBs are represented in ibm,dynamic-memoryBharata B Rao
Memory hotplug can fail for some combinations of RAM and maxmem when DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends on maximum addressable memory returned by guest and this value is currently being calculated wrongly by the guest kernel routine memory_hotplug_max(). While there is an attempt to fix the guest kernel, this patch works around the problem within QEMU itself. memory_hotplug_max() routine in the guest kernel arrives at max addressable memory by multiplying lmb-size with the lmb-count obtained from ibm,dynamic-memory property. There are two assumptions here: - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM where only hot-pluggable LMBs are present in this property. - The memory area comprising of RAM and hotplug region is contiguous: This needn't be true always for PowerKVM as there can be gap between boot time RAM and hotplug region. To work around this guest kernel bug, ensure that ibm,dynamic-memory has information about all the LMBs (RMA, boot-time LMBs, future hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and hotpluggable region). RMA is represented separately by memory@0 node. Hence mark RMA LMBs and also the LMBs for the gap b/n RAM and hotpluggable region as reserved and as having no valid DRC so that these LMBs are not considered by the guest. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14ppc: Add PowerISA 2.07 compatibility modeThomas Huth
Make sure that guests can use the PowerISA 2.07 CPU sPAPR compatibility mode when they request it and the target CPU supports it. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14ppc: Split pcr_mask settings into supported bits and the register maskThomas Huth
The current pcr_mask values are ambiguous: Should these be the mask that defines valid bits in the PCR register? Or should these rather indicate which compatibility levels are possible? Anyway, POWER6 and POWER7 should certainly not use the same values here. So let's introduce an additional variable "pcr_supported" here which is used to indicate the valid compatibility levels, and use pcr_mask to signal the valid bits in the PCR register. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14ppc/spapr: Refactor h_client_architecture_support() CPU parsing codeThomas Huth
The h_client_architecture_support() function has become quite big and nested already. So factor out the code that takes care of the sPAPR compatibility PVRs (which will be modified by the following patches). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-13vl: Eliminate usb_enabled()Eduardo Habkost
This wrapper for machine_usb(current_machine) is not necessary, replace all usages of usb_enabled() with machine_usb(). Cc: Peter Maydell <peter.maydell@linaro.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Graf <agraf@suse.de> Cc: qemu-arm@nongnu.org Cc: qemu-ppc@nongnu.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-id: 1465419025-21519-3-git-send-email-ehabkost@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-07ppc: Remove a potential overflow in muldiv64()Laurent Vivier
The coccinelle script: scripts/coccinelle/overflow_muldiv64.cocci gives us a list of potential overflows in muldiv64() (the two first parameters are 64bit values). This patch fixes one, as the fix seems obvious: replace muldiv64(a, b, c) by muldiv64(b, a, c) as "a" and "b" are 64bit values but a <= NANOSECONDS_PER_SECOND. (10^9 -> 30bit value). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07spapr_pci: Drop cannot_instantiate_with_device_add_yet=falseMarkus Armbruster
It's become redundant since it was added in commit 09aa9a5 "spapr-pci: enable adding PHB via -device". Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr: Introduce pseries-2.7 machine typeBharata B Rao
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr: Increase hotpluggable memory slots to 256Bharata B Rao
KVM now supports 512 memslots on PowerPC (earlier it was 32). Allow half of it (256) to be used as hotpluggable memory slots. Instead of hard coding the max value, use the KVM supplied value if KVM is enabled. Otherwise resort to the default value of 32. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_pci: Add and export DMA resetting helperAlexey Kardashevskiy
This will be later used by the "ibm,reset-pe-dma-window" RTAS handler which resets the DMA configuration to the defaults. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_pci: Reset DMA config on PHB resetAlexey Kardashevskiy
LoPAPR dictates that during system reset all DMA windows must be removed and the default DMA32 window must be created so does the patch. At the moment there is just one window supported so no change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Add root memory regionAlexey Kardashevskiy
We are going to have multiple DMA windows at different offsets on a PCI bus. For the sake of migration, we will have as many TCE table objects pre-created as many windows supported. So we need a way to map windows dynamically onto a PCI bus when migration of a table is completed but at this stage a TCE table object does not have access to a PHB to ask it to map a DMA window backed by just migrated TCE table. This adds a "root" memory region (UINT64_MAX long) to the TCE object. This new region is mapped on a PCI bus with enabled overlapping as there will be one root MR per TCE table, each of them mapped at 0. The actual IOMMU memory region is a subregion of the root region and a TCE table enables/disables this subregion and maps it at the specific offset inside the root MR which is 1:1 mapping of a PCI address space. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Migrate full stateAlexey Kardashevskiy
The source guest could have reallocated the default TCE table and migrate bigger/smaller table. This adds reallocation in post_load() if the default table size is different on source and destination. This adds @bus_offset, @page_shift to the migration stream as a subsection so when DDW is added, migration to older machines will still be possible. As @bus_offset and @page_shift are not used yet, this makes no change in behavior. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Introduce "enabled" state for TCE tableAlexey Kardashevskiy
Currently TCE tables are created once at start and their sizes never change. We are going to change that by introducing a Dynamic DMA windows support where DMA configuration may change during the guest execution. This changes spapr_tce_new_table() to create an empty zero-size IOMMU memory region (IOMMU MR). Only LIOBN is assigned by the time of creation. It still will be called once at the owner object (VIO or PHB) creation. This introduces an "enabled" state for TCE table objects, some helper functions are added: - spapr_tce_table_enable() receives TCE table parameters, stores in sPAPRTCETable and allocates a guest view of the TCE table (in the user space or KVM) and sets the correct size on the IOMMU MR; - spapr_tce_table_disable() disposes the table and resets the IOMMU MR size; it is made public as the following DDW code will be using it. This changes the PHB reset handler to do the default DMA initialization instead of spapr_phb_realize(). This does not make differenct now but later with more than just one DMA window, we will have to remove them all and create the default one on a system reset. No visible change in behaviour is expected except the actual table will be reallocated every reset. We might optimize this later. The other way to implement this would be dynamically create/remove the TCE table QOM objects but this would make migration impossible as the migration code expects all QOM objects to exist at the receiver so we have to have TCE table objects created when migration begins. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30ppc: Do some batching of TCG tlb flushesBenjamin Herrenschmidt
On ppc64 especially, we flush the tlb on any slbie or tlbie instruction. However, those instructions often come in bursts of 3 or more (context switch will favor a series of slbie's for example to an slbia if the SLB has less than a certain number of entries in it, and tlbie's can happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries at a time. Doing a tlb_flush() each time is a waste of time. We end up doing a memset of the whole TLB, reloading it for the next instruction, memset'ing again, etc... Those instructions don't have to take effect immediately. For slbie, they can wait for the next context synchronizing event. For tlbie, the next tlbsync. This implements batching by keeping a flag that indicates that we have a TLB in need of flushing. We check it on interrupts, rfi's, isync's and tlbsync and flush the TLB if needed. This reduces the number of tlb_flush() on a boot to a ubuntu installer first dialog screen from roughly 360K down to 36K. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: added a 'CPUPPCState *' variable in h_remove() and h_bulk_remove() ] Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: removed spurious whitespace change, use 0/1 not true/false consistently, since tlb_need_flush has int type] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>