aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-10-05memory: Allow replay of IOMMU mapping notificationsDavid Gibson
When we have guest visible IOMMUs, we allow notifiers to be registered which will be informed of all changes to IOMMU mappings. This is used by vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings. However, unlike with a memory region listener, an iommu notifier won't be told about any mappings which already exist in the (guest) IOMMU at the time it is registered. This can cause problems if hotplugging a VFIO device onto a guest bus which had existing guest IOMMU mappings, but didn't previously have an VFIO devices (and hence no host IOMMU mappings). This adds a memory_region_iommu_replay() function to handle this case. It replays any existing mappings in an IOMMU memory region to a specified notifier. Because the IOMMU memory region doesn't internally remember the granularity of the guest IOMMU it has a small hack where the caller must specify a granularity at which to replay mappings. If there are finer mappings in the guest IOMMU these will be reported in the iotlb structures passed to the notifier which it must handle (probably causing it to flag an error). This isn't new - the VFIO iommu notifier must already handle notifications about guest IOMMU mappings too short for it to represent in the host IOMMU. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Record host IOMMU's available IO page sizesDavid Gibson
Depending on the host IOMMU type we determine and record the available page sizes for IOMMU translation. We'll need this for other validation in future patches. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Check guest IOVA ranges against host IOMMU capabilitiesDavid Gibson
The current vfio core code assumes that the host IOMMU is capable of mapping any IOVA the guest wants to use to where we need. However, real IOMMUs generally only support translating a certain range of IOVAs (the "DMA window") not a full 64-bit address space. The common x86 IOMMUs support a wide enough range that guests are very unlikely to go beyond it in practice, however the IOMMU used on IBM Power machines - in the default configuration - supports only a much more limited IOVA range, usually 0..2GiB. If the guest attempts to set up an IOVA range that the host IOMMU can't map, qemu won't report an error until it actually attempts to map a bad IOVA. If guest RAM is being mapped directly into the IOMMU (i.e. no guest visible IOMMU) then this will show up very quickly. If there is a guest visible IOMMU, however, the problem might not show up until much later when the guest actually attempt to DMA with an IOVA the host can't handle. This patch adds a test so that we will detect earlier if the guest is attempting to use IOVA ranges that the host IOMMU won't be able to deal with. For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is incorrect, but no worse than what we have already. We can't do better for now because the Type1 kernel interface doesn't tell us what IOVA range the IOMMU actually supports. For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported IOVA range and validate guest IOVA ranges against it, and this patch does so. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Generalize vfio_listener_region_add failure pathDavid Gibson
If a DMA mapping operation fails in vfio_listener_region_add() it checks to see if we've already completed initial setup of the container. If so it reports an error so the setup code can fail gracefully, otherwise throws a hw_error(). There are other potential failure cases in vfio_listener_region_add() which could benefit from the same logic, so move it to its own fail: block. Later patches can use this to extend other failure cases to fail as gracefully as possible under the circumstances. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05vfio: Remove unneeded union from VFIOContainerDavid Gibson
Currently the VFIOContainer iommu_data field contains a union with different information for different host iommu types. However: * It only actually contains information for the x86-like "Type1" iommu * Because we have a common listener the Type1 fields are actually used on all IOMMU types, including the SPAPR TCE type as well In fact we now have a general structure for the listener which is unlikely to ever need per-iommu-type information, so this patch removes the union. In a similar way we can unify the setup of the vfio memory listener in vfio_connect_container() that is currently split across a switch on iommu type, but is effectively the same in both cases. The iommu_data.release pointer was only needed as a cleanup function which would handle potentially different data in the union. With the union gone, it too can be removed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05hw/vfio/platform: do not set resamplefd for edge-sensitive IRQSEric Auger
In irqfd mode, current code attempts to set a resamplefd whatever the type of the IRQ. For an edge-sensitive IRQ this attempt fails and as a consequence, the whole irqfd setup fails and we fall back to the slow mode. This patch bypasses the resamplefd setting for non level-sentive IRQs. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05hw/vfio/platform: change interrupt/unmask fields into pointerEric Auger
unmask EventNotifier might not be initialized in case of edge sensitive irq. Using EventNotifier pointers make life simpler to handle the edge-sensitive irqfd setup. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05hw/vfio/platform: irqfd setup sequence updateEric Auger
With current implementation, eventfd VFIO signaling is first set up and then irqfd is setup, if supported and allowed. This start sequence causes several issues with IRQ forwarding setup which, if supported, is transparently attempted on irqfd setup: IRQ forwarding setup is likely to fail if the IRQ is detected as under injection into the guest (active at irqchip level or VFIO masked). This currently always happens because the current sequence explicitly VFIO-masks the IRQ before setting irqfd. Even if that masking were removed, we couldn't prevent the case where the IRQ is under injection into the guest. So the simpler solution is to remove this 2-step startup and directly attempt irqfd setup. This is what this patch does. Also in case the eventfd setup fails, there is no reason to go farther: let's abort. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-02Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio,pc features, fixes New features: guest RAM buffer overrun mitigation RAM physical address gaps for memory hotplug (except refactoring which got some review comments) Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 02 Oct 2015 15:04:56 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: vhost-user-test: fix predictable filename on tmpfs vhost-user-test: use tmpfs by default pc: memhp: force gaps between DIMM's GPA memhp: extend address auto assignment to support gaps vhost-user: unit test for new messages vhost-user-test: do not reinvent glib-compat.h virtio: Notice when the system doesn't support MSIx at all pc: Add a comment explaining why pc_compat_2_4() doesn't exist exec: allocate PROT_NONE pages on top of RAM oslib: allocate PROT_NONE pages on top of RAM oslib: rework anonimous RAM allocation virtio-net: correctly drop truncated packets virtio: introduce virtqueue_discard() virtio: introduce virtqueue_unmap_sg() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-02Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20151002' ↵Peter Maydell
into staging First set of Linux-user que patches for 2.5 # gpg: Signature made Fri 02 Oct 2015 13:38:00 BST using RSA key ID DE3C9BC0 # gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>" # gpg: aka "Riku Voipio <riku.voipio@linaro.org>" * remotes/riku/tags/pull-linux-user-20151002: linux-user: assert that target_mprotect cannot fail linux-user/signal.c: Use setup_rt_frame() instead of setup_frame() for target openrisc linux-user/syscall.c: Add EAGAIN to host_to_target_errno_table for linux-user: add name_to_handle_at/open_by_handle_at linux-user: Return target error number in do_fork() linux-user: fix cmsg conversion in case of multiple headers linux-user: remove MAX_ARG_PAGES limit linux-user: remove unused image_info members linux-user: Treat --foo options the same as -foo linux-user: use EXIT_SUCCESS and EXIT_FAILURE linux-user: Add proper error messages for bad options linux-user: Add -help linux-user: Exit 0 when -h is used Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-02vhost-user-test: fix predictable filename on tmpfsMichael S. Tsirkin
vhost-user-test uses getpid to create a unique filename. This name is predictable, and a security problem. Instead, use a tmp directory created by mkdtemp, which is a suggested best practice. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2015-10-02vhost-user-test: use tmpfs by defaultMichael S. Tsirkin
Most people don't run make check by default, so they skip vhost-user unit tests. Solve this by using tmpfs instead, unless hugetlbfs is specified (using an environment variable). Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2015-10-02pc: memhp: force gaps between DIMM's GPAIgor Mammedov
mapping DIMMs non contiguously allows to workaround virtio bug reported earlier: http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html in this case guest kernel doesn't allocate buffers that can cross DIMM boundary keeping each buffer local to a DIMM. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-02memhp: extend address auto assignment to support gapsIgor Mammedov
setting gap to TRUE will make sparse DIMM address auto allocation, leaving gaps between a new DIMM address and preceeding existing DIMM. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-02vhost-user: unit test for new messagesMichael S. Tsirkin
Data is empty for now, but do make sure master sets the new feature bit flag. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-02vhost-user-test: do not reinvent glib-compat.hPaolo Bonzini
glib-compat.h has the gunk to support both old-style and new-style gthread functions. Use it instead of reinventing it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2015-10-02Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches # gpg: Signature made Fri 02 Oct 2015 12:49:13 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: block/raw-posix: Open file descriptor O_RDWR to work around glibc posix_fallocate emulation issue. block: disable I/O limits at the beginning of bdrv_close() iotests: Fix test 128 for password-less sudo tests: Fix test 049 fallout from improved HMP error messages raw-win32: Fix write request error handling Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-02block/raw-posix: Open file descriptor O_RDWR to work around glibc ↵Richard W.M. Jones
posix_fallocate emulation issue. https://bugzilla.redhat.com/show_bug.cgi?id=1265196 The following command fails on an NFS mountpoint: $ qemu-img create -f qcow2 -o preallocation=falloc disk.img 262144 Formatting 'disk.img', fmt=qcow2 size=262144 encryption=off cluster_size=65536 preallocation='falloc' lazy_refcounts=off qemu-img: disk.img: Could not preallocate data for the new file: Bad file descriptor The reason turns out to be because NFS doesn't support the posix_fallocate call. glibc emulates it instead. However glibc's emulation involves using the pread(2) syscall. The pread syscall fails with EBADF if the file descriptor is opened without the read open-flag (ie. open (..., O_WRONLY)). I contacted glibc upstream about this, and their response is here: https://bugzilla.redhat.com/show_bug.cgi?id=1265196#c9 There are two possible fixes: Use Linux fallocate directly, or (this fix) work around the problem in qemu by opening the file with O_RDWR instead of O_WRONLY. Signed-off-by: Richard W.M. Jones <rjones@redhat.com> BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1265196 Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-02block: disable I/O limits at the beginning of bdrv_close()Alberto Garcia
Disabling I/O limits from a BDS also drains all pending throttled requests, so it should be done at the beginning of bdrv_close() with the rest of the bdrv_drain() calls before the BlockDriver is closed. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-02iotests: Fix test 128 for password-less sudoMax Reitz
As of 934659c460d46c948cf348822fda1d38556ed9a4, $QEMU_IO is generally no longer a program name, and therefore "sudo -n $QEMU_IO" will no longer work. Fix this by copying the qemu-io invocation function from common.config, making it use $sudo for invoking $QEMU_IO_PROG, and then use that function instead of $QEMU_IO. Reported-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-02tests: Fix test 049 fallout from improved HMP error messagesEric Blake
Commit 50b7b000 improved HMP error messages, but forgot to update qemu-iotests to match. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-02raw-win32: Fix write request error handlingKevin Wolf
aio_worker() wrote the return code to the wrong variable. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Guangmu Zhu <guangmuzhu@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-10-02Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into stagingPeter Maydell
# gpg: Signature made Thu 01 Oct 2015 20:02:33 BST using RSA key ID C0DE3057 # gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>" # gpg: aka "Jeffrey Cody <jeff@codyprime.org>" # gpg: aka "Jeffrey Cody <codyprime@gmail.com>" * remotes/cody/tags/block-pull-request: block: mirror - fix full sync mode when target does not support zero init Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-01target-microblaze: Set the PC in reset instead of realizeAlistair Francis
Set the Microblaze CPU PC in the reset instead of setting it in the realize. This is required as the PC is zeroed in the reset function and causes problems in some situations. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2015-10-01disas/cris: Fix typo in commentStefan Weil
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2015-10-01block: mirror - fix full sync mode when target does not support zero initJeff Cody
During mirror, if the target device does not support zero init, a mirror may result in a corrupted image for sync="full" mode. This is due to how the initial dirty bitmap is set up prior to copying data - we did not mark sectors as dirty that are unallocated. This means those unallocated sectors are skipped over on the target, and for a device without zero init, invalid data may reside in those holes. If both of the following conditions are true, then we will explicitly mark all sectors as dirty: 1.) sync = "full" 2.) bdrv_has_zero_init(target) == false If the target does support zero init, but a target image is passed in with data already present (i.e. an "existing" image), it is assumed the data present in the existing image is valid data for those sectors. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 91ed4bc5bda7e2b09eb508b07c83f4071fe0b3c9.1443705220.git.jcody@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-10-01virtio: Notice when the system doesn't support MSIx at allRichard Henderson
And do not issue an error_report in that case. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01pc: Add a comment explaining why pc_compat_2_4() doesn't existEduardo Habkost
pc_compat_2_4() doesn't exist, and we shouldn't create one. Add a comment explaining why the function doesn't exist and why pc_compat_*() functions are deprecated. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01exec: allocate PROT_NONE pages on top of RAMMichael S. Tsirkin
This inserts a read and write protected page between RAM and QEMU memory, for file-backend RAM. This makes it harder to exploit QEMU bugs resulting from buffer overflows in devices using variants of cpu_physical_memory_map, dma_memory_map etc. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01oslib: allocate PROT_NONE pages on top of RAMMichael S. Tsirkin
This inserts a read and write protected page between RAM and QEMU memory. This makes it harder to exploit QEMU bugs resulting from buffer overflows in devices using variants of cpu_physical_memory_map, dma_memory_map etc. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01oslib: rework anonimous RAM allocationMichael S. Tsirkin
At the moment we first allocate RAM, sometimes more than necessary for alignment reasons. We then free the extra RAM. Rework this to avoid the temporary allocation: reserve the range by mapping it with PROT_NONE, then use just the necessary range with MAP_FIXED. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01virtio-net: correctly drop truncated packetsJason Wang
When packet is truncated during receiving, we drop the packets but neither discard the descriptor nor add and signal used descriptor. This will lead several issues: - sg mappings are leaked - rx will be stalled if a lots of packets were truncated In order to be consistent with vhost, fix by discarding the descriptor in this case. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01virtio: introduce virtqueue_discard()Jason Wang
This patch introduces virtqueue_discard() to discard a descriptor and unmap the sgs. This will be used by the patch that will discard descriptor when packet is truncated. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01virtio: introduce virtqueue_unmap_sg()Jason Wang
Factor out sg unmapping logic. This will be reused by the patch that can discard descriptor. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Andrew James <andrew.james@hpe.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20150930' ↵Peter Maydell
into staging migration/next for 20150930 # gpg: Signature made Wed 30 Sep 2015 09:24:02 BST using RSA key ID 5872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" * remotes/juanquintela/tags/migration/20150930: migration: Disambiguate MAX_THROTTLE qmp/hmp: Add throttle ratio to query-migrate and info migrate migration: Dynamic cpu throttling for auto-converge migration: Parameters for auto-converge cpu throttling cpu: Provide vcpu throttling interface migration: yet more possible state transitions Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-01linux-user: assert that target_mprotect cannot failPaolo Bonzini
All error conditions that target_mprotect checks are also checked by target_mmap. EACCESS cannot happen because we are just removing PROT_WRITE. ENOMEM should not happen because we are modifying a whole VMA (and we have bigger problems anyway if it happens). Fixes a Coverity false positive, where Coverity complains about target_mprotect's return value being passed to tb_invalidate_phys_range. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2015-10-01linux-user/signal.c: Use setup_rt_frame() instead of setup_frame() for ↵Chen Gang
target openrisc qemu has already considered about some targets may have no traditional signals. And openrisc's setup_frame() is dummy, but it can be supported by setup_rt_frame(). Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2015-09-30migration: Disambiguate MAX_THROTTLEJason J. Herne
Migration has a define for MAX_THROTTLE. Update comment to clarify that this is used for throttling transfer speed. Hopefully this will prevent it from being confused with a guest cpu throttling entity. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30qmp/hmp: Add throttle ratio to query-migrate and info migrateJason J. Herne
Report throttle percentage in info migrate and query-migrate responses when cpu throttling is active. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30migration: Dynamic cpu throttling for auto-convergeJason J. Herne
Remove traditional auto-converge static 30ms throttling code and replace it with a dynamic throttling algorithm. Additionally, be more aggressive when deciding when to start throttling. Previously we waited until four unproductive memory passes. Now we begin throttling after only two unproductive memory passes. Four seemed quite arbitrary and only waiting for two passes allows us to complete the migration faster. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30migration: Parameters for auto-converge cpu throttlingJason J. Herne
Add migration parameters to allow the user to adjust the parameters that control cpu throttling when auto-converge is in effect. The added parameters are as follows: x-cpu-throttle-initial : Initial percantage of time guest cpus are throttled when migration auto-converge is activated. x-cpu-throttle-increment: throttle percantage increase each time auto-converge detects that migration is not making progress. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30cpu: Provide vcpu throttling interfaceJason J. Herne
Provide a method to throttle guest cpu execution. CPUState is augmented with timeout controls and throttle start/stop functions. To throttle the guest cpu the caller simply has to call the throttle set function and provide a percentage of throttle time. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30migration: yet more possible state transitionsJuan Quintela
On destination, we move from INMIGRATE to FINISH_MIGRATE. Add that to the list of allowed states. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-29Merge remote-tracking branch 'remotes/amit-migration/tags/for-juan-201509' ↵Peter Maydell
into staging Migration queue # gpg: Signature made Tue 29 Sep 2015 07:13:55 BST using RSA key ID 854083B6 # gpg: Good signature from "Amit Shah <amit@amitshah.net>" # gpg: aka "Amit Shah <amit@kernel.org>" # gpg: aka "Amit Shah <amitshah@gmx.net>" * remotes/amit-migration/tags/for-juan-201509: ram_find_and_save_block: Split out the finding Move dirty page search state into separate structure migration: Use g_new() & friends where that makes obvious sense migration: qemu-file more size_t'ifying migration: size_t'ify some of qemu-file Init page sizes in qtest Split out end of migration code from migration_thread migration/ram.c: Use RAMBlock rather than MemoryRegion vmstate: Remove redefinition of VMSTATE_UINT32_ARRAY Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-29ram_find_and_save_block: Split out the findingDr. David Alan Gilbert
Split out the finding of the dirty page and all the wrap detection into a separate function since it was getting a bit hairy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1443018431-11170-3-git-send-email-dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> [Fix comment -- Amit] Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29Move dirty page search state into separate structureDr. David Alan Gilbert
Pull the search state for one iteration of the dirty page search into a structure. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1443018431-11170-2-git-send-email-dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29migration: Use g_new() & friends where that makes obvious senseMarkus Armbruster
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Same Coccinelle semantic patch as in commit b45c03f. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1442231491-23352-1-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29migration: qemu-file more size_t'ifyingDr. David Alan Gilbert
This time convert the external functions: qemu_get_buffer, qemu_peek_buffer qemu_put_buffer and qemu_put_buffer_async Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1439463094-5394-6-git-send-email-dgilbert@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29migration: size_t'ify some of qemu-fileDr. David Alan Gilbert
This is a start on using size_t more in qemu-file and friends; it fixes up QEMUFilePutBufferFunc and QEMUFileGetBufferFunc to take size_t lengths and return ssize_t return values (like read(2)) and fixes up all the different implementations of them. Note that I've not yet followed this deeply into bdrv_ implementations. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1439463094-5394-5-git-send-email-dgilbert@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29Init page sizes in qtestDr. David Alan Gilbert
One of my patches used a loop that was based on host page size; it dies in qtest since qtest hadn't bothered init'ing it. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1439463094-5394-4-git-send-email-dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>