aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc/spapr_iommu.c
AgeCommit message (Collapse)Author
2018-02-06spapr/iommu: Enable in-kernel TCE acceleration via VFIO KVM deviceAlexey Kardashevskiy
In order to enable TCE operations support in KVM, we have to inform the KVM about VFIO groups being attached to specific LIOBNs; the necessary bits are implemented already by IOMMU MR and VFIO. This defines get_attr() for the SPAPR TCE IOMMU MR which makes VFIO call the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl and establish LIOBN-to-IOMMU link. This changes spapr_tce_set_need_vfio() to avoid TCE table reallocation if the kernel supports the TCE acceleration. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> [aw - remove unnecessary sys/ioctl.h include] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-09-27migration: pre_save return intDr. David Alan Gilbert
Modify the pre_save method on VMStateDescription to return an int rather than void so that it potentially can fail. Changed zillions of devices to make them return 0; the only case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already had an error_report/return case. Note: If you add an error exit in your pre_save you must emit an error_report to say why. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170925112917.21340-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-09-08spapr_iommu: Realloc guest visible TCE table when hot(un)plugging vfio-pciAlexey Kardashevskiy
This replaces g_malloc() with spapr_tce_alloc_table() as this is the standard way of allocating tables and this allows moving the table back to KVM when unplugging a VFIO PCI device and VFIO TCE acceleration support is not present in the KVM. Although spapr_tce_alloc_table() is expected to fail with EBUSY if called when previous fd is not closed yet, in practice we will not see it because cap_spapr_vfio is false at the moment. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_iommu: unregister vmstate at unrealize timeGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_iommu: pass object ownership to parent/ownerMichael Roth
TCE table objects attach themselves to an owner as a child property. unref afterward to allow them to be finalized when their owner is finalized. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_iommu: convert TCE table object to realize()Greg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-09-08spapr_iommu: use g_strdup_printf() instead of snprintf()Greg Kurz
Passing a stack allocated buffer of arbitrary length to snprintf() without checking the return value can cause the resultant strings to be silently truncated. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" deviceThomas Huth
QEMU currently aborts unexpectedly when the user tries to add and remove a "spapr-tce-table" device: $ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio QEMU 2.9.92 monitor - type 'help' for more information (qemu) device_add spapr-tce-table,id=x (qemu) device_del x ** ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl) Aborted (core dumped) The device should not be accessable for the users at all, it's just used internally, so mark it with user_creatable = false. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-14Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) # gpg: Signature made Fri 14 Jul 2017 11:06:27 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (55 commits) spapr_rng: Convert to DEFINE_PROP_LINK cpu: Convert to DEFINE_PROP_LINK mips_cmgcr: Convert to DEFINE_PROP_LINK ivshmem: Convert to DEFINE_PROP_LINK dimm: Convert to DEFINE_PROP_LINK virtio-crypto: Convert to DEFINE_PROP_LINK virtio-rng: Convert to DEFINE_PROP_LINK virtio-scsi: Convert to DEFINE_PROP_LINK virtio-blk: Convert to DEFINE_PROP_LINK qdev: Add const qualifier to PropertyInfo definitions qmp: Use ObjectProperty.type if present qdev: Introduce DEFINE_PROP_LINK qdev: Introduce PropertyInfo.create qom: enforce readonly nature of link's check callback translate-all: remove redundant !tcg_enabled check in dump_exec_info vl: fix breakage of -tb-size nbd: Implement NBD_INFO_BLOCK_SIZE on client nbd: Implement NBD_INFO_BLOCK_SIZE on server nbd: Implement NBD_OPT_GO on client nbd: Implement NBD_OPT_GO on server ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-14memory/iommu: introduce IOMMUMemoryRegionClassAlexey Kardashevskiy
This finishes QOM'fication of IOMMUMemoryRegion by introducing a IOMMUMemoryRegionClass. This also provides a fastpath analog for IOMMU_MEMORY_REGION_GET_CLASS(). This makes IOMMUMemoryRegion an abstract class. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170711035620.4232-3-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14memory/iommu: QOM'fy IOMMU MemoryRegionAlexey Kardashevskiy
This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion as a parent. This moves IOMMU-related fields from MR to IOMMU MR. However to avoid dymanic QOM casting in fast path (address_space_translate, etc), this adds an @is_iommu boolean flag to MR and provides new helper to do simple cast to IOMMU MR - memory_region_get_iommu. The flag is set in the instance init callback. This defines memory_region_is_iommu as memory_region_get_iommu()!=NULL. This switches MemoryRegion to IOMMUMemoryRegion in most places except the ones where MemoryRegion may be an alias. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20170711035620.4232-2-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-13Convert error_report() to warn_report()Alistair Francis
Convert all uses of error_report("warning:"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these two commands: find ./* -type f -exec sed -i \ 's|error_report(".*warning[,:] |warn_report("|Ig' {} + Indentation fixed up manually afterwards. The test-qdev-global-props test case was manually updated to ensure that this patch passes make check (as the test cases are case sensitive). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Suggested-by: Thomas Huth <thuth@redhat.com> Cc: Jeff Cody <jcody@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Lieven <pl@kamp.de> Cc: Josh Durgin <jdurgin@redhat.com> Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Greg Kurz <groug@kaod.org> Cc: Rob Herring <robh@kernel.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Peter Chubb <peter.chubb@nicta.com.au> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Greg Kurz <groug@kaod.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-28vmstate: error hint for failed equal checksHalil Pasic
In some cases a failing VMSTATE_*_EQUAL does not mean we detected a bug, but it's actually the best we can do. Especially in these cases a verbose error message is required. Let's introduce infrastructure for specifying a error hint to be used if equal check fails. Let's do this by adding a parameter to the _EQUAL macros called _err_hint. Also change all current users to pass NULL as last parameter so nothing changes for them. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Message-Id: <20170623144823.42936-1-pasic@linux.vnet.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-25memory: tune last param of iommu_ops.translate()Peter Xu
This patch converts the old "is_write" bool into IOMMUAccessFlags. The difference is that "is_write" can only express either read/write, but sometimes what we really want is "none" here (neither read nor write). Replay is an good example - during replay, we should not check any RW permission bits since thats not an actual IO at all. CC: Paolo Bonzini <pbonzini@redhat.com> CC: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-04-26target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64Alexey Kardashevskiy
KVM_CAP_SPAPR_TCE capability allows creating TCE tables in KVM which allows having in-kernel acceleration for H_PUT_TCE_xxx hypercalls. However it only supports 32bit DMA windows at zero bus offset. There is a new KVM_CAP_SPAPR_TCE_64 capability which supports 64bit window size, variable page size and bus offset. This makes use of the new capability. The kernel headers are already updated as the kernel support went in to v4.6. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-27memory: introduce IOMMUOps.notify_flag_changedPeter Xu
The new interface can be used to replace the old notify_started() and notify_stopped(). Meanwhile it provides explicit flags so that IOMMUs can know what kind of notifications it is requested for. Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1474606948-14391-3-git-send-email-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-07hw/ppc: use error_report instead of fprintfCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-08spapr: Fix undefined behaviour in spapr_tce_reset()David Gibson
When a TCE table (sPAPR IOMMU context) is in disabled state (which is true by default for the 64-bit window), it has tcet->nb_table == 0 and tcet->table == NULL. However, on system reset, spapr_tce_reset() executes, which unconditionally calls memset(tcet->table, 0, table_size); We get away with this in practice, because it's a zero length memset(), but memset() on a NULL pointer is undefined behaviour, so we should not call it in this case. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05spapr_iommu: Realloc guest visible TCE table when starting/stopping listeningAlexey Kardashevskiy
The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU - a guest view of the table and a hardware TCE table. If there is no VFIO presense in the address space, then just the guest view is used, if this is the case, it is allocated in the KVM. However since there is no support yet for VFIO in KVM TCE hypercalls, when we start using VFIO, we need to move the guest view from KVM to the userspace; and we need to do this for every IOMMU on a bus with VFIO devices. This implements the callbacks for the sPAPR IOMMU - notify_started() reallocated the guest view to the user space, notify_stopped() does the opposite. This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug path as the new callbacks do this better - they notify IOMMU at the exact moment when the configuration is changed, and this also includes the case of PCI hot unplug. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22memory: Add reporting of supported page sizesAlexey Kardashevskiy
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating, however this information is not available outside the translate context for various checks. This adds a get_min_page_size callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the minimum actual page size supported by an IOMMU. As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE as fallback. This removes vfio_container_granularity() and uses new helper in memory_region_iommu_replay() when replaying IOMMU mappings on added IOMMU memory region. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> [dwg: Removed an unnecessary calculation] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Add root memory regionAlexey Kardashevskiy
We are going to have multiple DMA windows at different offsets on a PCI bus. For the sake of migration, we will have as many TCE table objects pre-created as many windows supported. So we need a way to map windows dynamically onto a PCI bus when migration of a table is completed but at this stage a TCE table object does not have access to a PHB to ask it to map a DMA window backed by just migrated TCE table. This adds a "root" memory region (UINT64_MAX long) to the TCE object. This new region is mapped on a PCI bus with enabled overlapping as there will be one root MR per TCE table, each of them mapped at 0. The actual IOMMU memory region is a subregion of the root region and a TCE table enables/disables this subregion and maps it at the specific offset inside the root MR which is 1:1 mapping of a PCI address space. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Migrate full stateAlexey Kardashevskiy
The source guest could have reallocated the default TCE table and migrate bigger/smaller table. This adds reallocation in post_load() if the default table size is different on source and destination. This adds @bus_offset, @page_shift to the migration stream as a subsection so when DDW is added, migration to older machines will still be possible. As @bus_offset and @page_shift are not used yet, this makes no change in behavior. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07spapr_iommu: Introduce "enabled" state for TCE tableAlexey Kardashevskiy
Currently TCE tables are created once at start and their sizes never change. We are going to change that by introducing a Dynamic DMA windows support where DMA configuration may change during the guest execution. This changes spapr_tce_new_table() to create an empty zero-size IOMMU memory region (IOMMU MR). Only LIOBN is assigned by the time of creation. It still will be called once at the owner object (VIO or PHB) creation. This introduces an "enabled" state for TCE table objects, some helper functions are added: - spapr_tce_table_enable() receives TCE table parameters, stores in sPAPRTCETable and allocates a guest view of the TCE table (in the user space or KVM) and sets the correct size on the IOMMU MR; - spapr_tce_table_disable() disposes the table and resets the IOMMU MR size; it is made public as the following DDW code will be using it. This changes the PHB reset handler to do the default DMA initialization instead of spapr_phb_realize(). This does not make differenct now but later with more than just one DMA window, we will have to remove them all and create the default one on a system reset. No visible change in behaviour is expected except the actual table will be reallocated every reset. We might optimize this later. The other way to implement this would be dynamically create/remove the TCE table QOM objects but this would make migration impossible as the migration code expects all QOM objects to exist at the receiver so we have to have TCE table objects created when migration begins. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27spapr_iommu: Move table allocation to helpersAlexey Kardashevskiy
At the moment presence of vfio-pci devices on a bus affect the way the guest view table is allocated. If there is no vfio-pci on a PHB and the host kernel supports KVM acceleration of H_PUT_TCE, a table is allocated in KVM. However, if there is vfio-pci and we do yet not KVM acceleration for these, the table has to be allocated by the userspace. At the moment the table is allocated once at boot time but next patches will reallocate it. This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts to helpers. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-26memory: Fix IOMMU replay base addressAlexey Kardashevskiy
Since a788f227 "memory: Allow replay of IOMMU mapping notifications" when new VFIO listener is added, all existing IOMMU mappings are replayed. However there is a problem that the base address of an IOMMU memory region (IOMMU MR) is ignored which is not a problem for the existing user (which is pseries) with its default 32bit DMA window starting at 0 but it is if there is another DMA window. This stores the IOMMU's offset_within_address_space and adjusts the IOVA before calling vfio_dma_map/vfio_dma_unmap. As the IOMMU notifier expects IOVA offset rather than the absolute address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before calling notifier(s). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-19hw: explicitly include qemu/log.hPaolo Bonzini
Move the inclusion out of hw/hw.h, most files do not need it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-29ppc: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-6-git-send-email-peter.maydell@linaro.org
2015-10-23spapr_iommu: Provide a function to switch a TCE table to allowing VFIODavid Gibson
Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not all TCE tables (guest IOMMU contexts) can support VFIO devices. Currently, this is decided at creation time. To support hotplug of VFIO devices, we need to allow a TCE table which previously didn't allow VFIO devices to be switched so that it can. This patch adds an spapr_tce_set_need_vfio() function to do this, by reallocating the table in userspace if necessary. Currently this doesn't allow the KVM acceleration to be re-enabled if all the VFIO devices are removed. That's an optimization for another time. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23spapr_iommu: Rename vfio_accel parameterDavid Gibson
The vfio_accel parameter used when creating a new TCE table (guest IOMMU context) has a confusing name. What it really means is whether we need the TCE table created to be able to support VFIO devices. VFIO is relevant, because when available we use in-kernel acceleration of the TCE table, but that may not work with VFIO devices because updates to the table are handled in kernel, bypass qemu and so don't hit qemu's infrastructure for keeping the VFIO host IOMMU state in sync with the guest IOMMU state. Rename the parameter to "need_vfio" throughout. This is a cosmetic change, with no impact on the logic. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-07-07spapr_iommu: translate sPAPRTCEAccess to IOMMUAccessFlagsGreg Kurz
The fact that these enums have matching values is pure coincidence. We actually need to translate from the PAPR definition to the QEMU one. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_iommu: drop erroneous check in h_put_tce_indirect()Greg Kurz
The tce_list variable is not a TCE but the address to a TCE: we shouldn't clear permission bits as we do now. And this is dead code anyway since we check tce_list is 4K aligned a few lines above. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Merge sPAPREnvironment into sPAPRMachineStateDavid Gibson
The code for -machine pseries maintains a global sPAPREnvironment structure which keeps track of general state information about the guest platform. This predates the existence of the MachineState structure, but performs basically the same function. Now that we have the generic MachineState, fold sPAPREnvironment into sPAPRMachineState, the pseries specific subclass of MachineState. This is mostly a matter of search and replace, although a few places which relied on the global spapr variable are changed to find the structure via qdev_get_machine(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03hw/ppc/spapr_iommu: Fix the check for invalid upper bits in liobnThomas Huth
The check "liobn & 0xFFFFFFFF00000000ULL" in spapr_tce_find_by_liobn() is completely useless since liobn is only declared as an uint32_t parameter. Fix this by using target_ulong instead (this is what most of the callers of this function are using, too). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03spapr_iommu: Give unique QOM name to TCE tableAlexey Kardashevskiy
Useful for debugging. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03spapr_iommu: Make spapr_tce_find_by_liobn() publicAlexey Kardashevskiy
At the moment spapr_tce_find_by_liobn() is used by H_PUT_TCE/... handlers to find an IOMMU by LIOBN. We are going to implement Dynamic DMA windows (DDW), new code will go to a new file and we will use spapr_tce_find_by_liobn() there too so let's make it public. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03spapr_iommu: Add separate trace points for PCI DMA operationsAlexey Kardashevskiy
This is to reduce VIO noise while debugging PCI DMA. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03spapr_iommu: Make H_PUT_TCE_INDIRECT endian-safeAlexey Kardashevskiy
PAPR is defined as big endian so TCEs need an adjustment so does this patch. This changes code to have ldq_be_phys() in one place. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03spapr_iommu: Disable in-kernel IOMMU tables for >4GB windowsAlexey Kardashevskiy
The existing KVM_CREATE_SPAPR_TCE ioctl only support 4G windows max as the window size parameter to the kernel ioctl() is 32-bit so there's no way of expressing a TCE window > 4GB. We are going to add huge DMA windows support so this will create small window and unexpectedly fail later. This disables KVM_CREATE_SPAPR_TCE for windows bigger that 4GB. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09spapr_vio/spapr_iommu: Move VIO bypass where it belongsAlexey Kardashevskiy
Instead of tweaking a TCE table device by adding there a bypass flag, let's add an alias to RAM and IOMMU memory region, and enable/disable those according to the selected bypass mode. This way IOMMU memory region can have size of the actual window rather than ram_size which is essential for upcoming DDW support. This moves bypass logic to VIO layer and keeps @bypass flag in TCE table for migration compatibility only. This replaces spapr_tce_set_bypass() calls with explicit assignment to avoid confusion as the function could do something more that just syncing the @bypass flag. This adds a pointer to VIO device into the sPAPRTCETable struct to provide the sPAPRTCETable device a way to update bypass mode for the VIO device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-16exec: RCUify AddressSpaceDispatchPaolo Bonzini
Note that even after this patch, most callers of address_space_* functions must still be under the big QEMU lock, otherwise the memory region returned by address_space_translate can disappear as soon as address_space_translate returns. This will be fixed in the next part of this series. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-07PPC: Fix crash on spapr_tce_table_finalize()David Gibson
spapr_tce_table_finalize() can SEGV if the object was not previously realized. In particular this can be triggered by running qemu-system-ppc -device spapr-tce-table,? The basic problem is that we have mismatched initialization versus finalization: spapr_tce_table_finalize() is attempting to undo things that are done in spapr_tce_table_realize(), not an instance_init function. Therefore, replace spapr_tce_table_finalize() with spapr_tce_table_unrealize(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Cc: qemu-stable@nongnu.org Signed-off-by: Alexander Graf <agraf@suse.de>
2014-08-28iommu: add is_write as a parameter to the translate function of ↵Le Tan
MemoryRegionIOMMUOps Add a bool variable is_write as a parameter to the translate function of MemoryRegionIOMMUOps to indicate the operation of the access. It can be used for correct fault reporting from within the callback. Change the interface of related functions. Signed-off-by: Le Tan <tamlokveer@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-07-15sPAPR/IOMMU: Fix TCE entry permissionGavin Shan
The permission of TCE entry should exclude physical base address. Otherwise, unmapping TCE entry can be interpreted to mapping TCE entry wrongly for VFIO devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr_iommu: Make in-kernel TCE table optionalAlexey Kardashevskiy
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating TCE tables in the host kernel memory and handle H_PUT_TCE requests targeted to specific LIOBN (logical bus number) right in the host without switching to QEMU. At the moment this is used for emulated devices only and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE handler finds a LIOBN and corresponding table, it will put a TCE to the table and complete hypercall execution. The user space will not be notified. Upcoming VFIO support is going to use the same sPAPRTCETable device class so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE tables for VFIO are going to be allocated in the host as well. However VFIO operates with real IOMMU tables and simple copying of a TCE to the real hardware TCE table will not work as guest physical to host physical address translation is requited. So until the host kernel gets VFIO support for H_PUT_TCE, we better not to register VFIO's TCE in the host. This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not in upstream yet and being discussed so now it is always false which means that in-kernel VFIO acceleration is not supported. This adds a bool @vfio_accel flag to the sPAPRTCETable device telling that sPAPRTCETable should not try allocating TCE table in the host kernel for VFIO. The flag is false now as at the moment there is no VFIO. This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic is the same. Since there is only emulated PCI and VIO now, the flag is set to false. Upcoming VFIO support will set it to true. This is a preparation patch so no change in behaviour is expected Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Introduce bus_offset in sPAPRTCETableAlexey Kardashevskiy
This adds @bus_offset into sPAPRTCETable to tell where TCE table starts from. It is set to 0 for emulated devices. Dynamic DMA windows will use other offset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Introduce page_shift in sPAPRTCETableAlexey Kardashevskiy
At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it where it is possible. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Get rid of window_size in sPAPRTCETableAlexey Kardashevskiy
This removes window_size as it is basically a copy of nb_table shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are going to support windows as big as the entire RAM and this number will be bigger that 32 capacity, we will have to do something about @window_size anyway and removal seems to be the right way to go. This removes dma_window_start/dma_window_size from sPAPRPHBState as they are no longer used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Convert old qdev_init_nofail() to object_property_set_boolAlexey Kardashevskiy
qdev_init_nofail() was replaced by object_property_set_bool("realized") all over the QEMU so do we. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_pci: spapr_iommu: Make DMA window a subregionAlexey Kardashevskiy
Currently the default DMA window is represented by a single MemoryRegion. However there can be more than just one window so we need a "root" memory region to be separated from the actual DMA window(s). This introduces a "root" IOMMU memory region and adds a subregion for the default DMA 32bit window. Following patches will add other subregion(s). This initializes a default DMA window subregion size to the guest RAM size as this window can be switched into "bypass" mode which implements direct DMA mapping. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Enable multiple TCE requestsAlexey Kardashevskiy
Currently only single TCE entry per request is supported (H_PUT_TCE). However PAPR+ specification allows multiple entry requests such as H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host kernel via ioctls, support of these calls can accelerate IOMMU operations. This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT. This advertises "multi-tce" capability to the guest if the host kernel supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>