aboutsummaryrefslogtreecommitdiff
path: root/hw/vfio/common.c
AgeCommit message (Collapse)Author
2015-10-05vfio: Allow hotplug of containers onto existing guest IOMMU mappingsDavid Gibson
At present the memory listener used by vfio to keep host IOMMU mappings in sync with the guest memory image assumes that if a guest IOMMU appears, then it has no existing mappings. This may not be true if a VFIO device is hotplugged onto a guest bus which didn't previously include a VFIO device, and which has existing guest IOMMU mappings. Therefore, use the memory_region_register_iommu_notifier_replay() function in order to fix this case, replaying existing guest IOMMU mappings, bringing the host IOMMU into sync with the guest IOMMU. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Record host IOMMU's available IO page sizesDavid Gibson
Depending on the host IOMMU type we determine and record the available page sizes for IOMMU translation. We'll need this for other validation in future patches. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Check guest IOVA ranges against host IOMMU capabilitiesDavid Gibson
The current vfio core code assumes that the host IOMMU is capable of mapping any IOVA the guest wants to use to where we need. However, real IOMMUs generally only support translating a certain range of IOVAs (the "DMA window") not a full 64-bit address space. The common x86 IOMMUs support a wide enough range that guests are very unlikely to go beyond it in practice, however the IOMMU used on IBM Power machines - in the default configuration - supports only a much more limited IOVA range, usually 0..2GiB. If the guest attempts to set up an IOVA range that the host IOMMU can't map, qemu won't report an error until it actually attempts to map a bad IOVA. If guest RAM is being mapped directly into the IOMMU (i.e. no guest visible IOMMU) then this will show up very quickly. If there is a guest visible IOMMU, however, the problem might not show up until much later when the guest actually attempt to DMA with an IOVA the host can't handle. This patch adds a test so that we will detect earlier if the guest is attempting to use IOVA ranges that the host IOMMU won't be able to deal with. For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is incorrect, but no worse than what we have already. We can't do better for now because the Type1 kernel interface doesn't tell us what IOVA range the IOMMU actually supports. For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported IOVA range and validate guest IOVA ranges against it, and this patch does so. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Generalize vfio_listener_region_add failure pathDavid Gibson
If a DMA mapping operation fails in vfio_listener_region_add() it checks to see if we've already completed initial setup of the container. If so it reports an error so the setup code can fail gracefully, otherwise throws a hw_error(). There are other potential failure cases in vfio_listener_region_add() which could benefit from the same logic, so move it to its own fail: block. Later patches can use this to extend other failure cases to fail as gracefully as possible under the circumstances. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Remove unneeded union from VFIOContainerDavid Gibson
Currently the VFIOContainer iommu_data field contains a union with different information for different host iommu types. However: * It only actually contains information for the x86-like "Type1" iommu * Because we have a common listener the Type1 fields are actually used on all IOMMU types, including the SPAPR TCE type as well In fact we now have a general structure for the listener which is unlikely to ever need per-iommu-type information, so this patch removes the union. In a similar way we can unify the setup of the vfio memory listener in vfio_connect_container() that is currently split across a switch on iommu type, but is effectively the same in both cases. The iommu_data.release pointer was only needed as a cleanup function which would handle potentially different data in the union. With the union gone, it too can be removed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23vfio: Change polarity of our no-mmap optionAlex Williamson
The default should be to allow mmap and new drivers shouldn't need to expose an option or set it to other than the allocation default in their initfn. Take advantage of the experimental flag to change this option to the correct polarity. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-11maint: remove / fix many doubled wordsDaniel P. Berrange
Many source files have doubled words (eg "the the", "to to", and so on). Most of these can simply be removed, but a couple were actual mis-spellings (eg "to to" instead of "to do"). There was even one triple word score "to to to" :-) Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-07-06vfio: Unregister IOMMU notifiers when container is destroyedAlexey Kardashevskiy
On systems with guest visible IOMMU, adding a new memory region onto PCI bus calls vfio_listener_region_add() for every DMA window. This installs a notifier for IOMMU memory regions. The notifier is supposed to be removed vfio_listener_region_del(), however in the case of mixed PHB (emulated + VFIO devices) when last VFIO device is unplugged and container gets destroyed, all existing DMA windows stay alive altogether with the notifiers which are on the linked list which head was in the destroyed container. This unregisters IOMMU memory region notifier when a container is destroyed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-30exec: move rcu_read_lock/unlock to address_space_translate callersPaolo Bonzini
Once address_space_translate will be called outside the BQL, the returned MemoryRegion might disappear as soon as the RCU read-side critical section ends. Avoid this by moving the critical section to the callers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1426684909-95030-3-git-send-email-pbonzini@redhat.com>
2015-03-10vfio: Remove superfluous '\n' around error_report()Gonglei
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-03-09sPAPR: Implement sPAPRPHBClass EEH callbacksGavin Shan
The patch implements sPAPRPHBClass EEH callbacks so that the EEH RTAS requests can be routed to VFIO for further handling. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-02vfio: allow to disable MMAP per device with -x-mmap=off optionSamuel Pitoiset
Disabling MMAP support uses the slower read/write accesses but allows to trace all MMIO accesses, which is not good for performance, but very useful for reverse engineering PCI drivers. This option allows to disable MMAP per device without a compile-time change. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-02vfio: Make type1 listener symbols staticAlexey Kardashevskiy
They are not used from anywhere but common.c which is where these are defined so make them static. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-02vfio: Add ioctl number to error reportAlexey Kardashevskiy
This makes the error report more informative. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10vfio: Use vfio type1 v2 IOMMU interfaceAlex Williamson
The difference between v1 and v2 is fairly subtle, simply more deterministic behavior for unmaps. The v1 interface allows the user to attempt to unmap sub-regions of previous mappings, returning success with zero size if unable to comply. This was a reflection of the underlying IOMMU API. The v2 interface requires that the user may only unmap fully contained mappings, ie. an unmap cannot intersect or bisect a previous mapping, but may cover multiple mappings. QEMU never made use of the sub-region v1 support anyway, so we can support either v1 or v2. We'll favor v2 since it's newer. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10vfio: free dynamically-allocated data in instance_finalizePaolo Bonzini
In order to enable out-of-BQL address space lookup, destruction of devices needs to be split in two phases. Unrealize is the first phase; once it complete no new accesses will be started, but there may still be pending memory accesses can still be completed. The second part is freeing the device, which only happens once all memory accesses are complete. At this point the reference count has dropped to zero, an RCU grace period must have completed (because the RCU-protected FlatViews hold a reference to the device via memory_region_ref). This is when instance_finalize is called. Freeing data belongs in an instance_finalize callback, because the dynamically allocated memory can still be used after unrealize by the pending memory accesses. This starts the process by creating an instance_finalize callback and freeing most of the dynamically-allocated data in instance_finalize. Because instance_finalize is also called on error paths or also when the device is actually not realized, the common code needs some changes to be ready for this. The error path in vfio_initfn can be simplified too. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10vfio: cleanup vfio_get_device error path, remove vfio_populate_device callbackPaolo Bonzini
Now that vfio_put_base_device is called unconditionally at instance_finalize time, it can be called twice if vfio_populate_device fails. This works but it is slightly harder to follow. Change vfio_get_device to not touch the vbasedev struct until it will definitely succeed, moving the vfio_populate_device call back to vfio-pci. This way, vfio_put_base_device will only be called once. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-04vfio: fix wrong initialize vfio_group_listChen Fan
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-12-22vfio: Cleanup error_report()sAlex Williamson
With the conversion to tracepoints, a couple previous DPRINTKs are now quite a bit more visible and are really just informational. Remove these and add a bit more description to another. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-12-22hw/vfio: create common moduleEric Auger
A new common module is created. It implements all functions that have no device specificity (PCI, Platform). This patch only consists in move (no functional changes) Signed-off-by: Kim Phillips <kim.phillips@linaro.org> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>