aboutsummaryrefslogtreecommitdiff
path: root/migration
AgeCommit message (Collapse)Author
2019-08-16Include sysemu/sysemu.h a lot lessMarkus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/qdev-core.h includes sysemu/sysemu.h since recent commit e965ffa70a "qdev: add qdev_add_vm_change_state_handler()". This is a bad idea: hw/qdev-core.h is widely included. Move the declaration of qdev_add_vm_change_state_handler() to sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h. Touching sysemu/sysemu.h now recompiles some 1800 objects. qemu/uuid.h also drops from 5400 to 1800. A few more headers show smaller improvement: qemu/notify.h drops from 5600 to 5200, qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from 5500 to 5000. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190812052359.30071-28-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include exec/memory.h slightly lessMarkus Armbruster
Drop unnecessary inclusions from headers. Downgrade a few more to exec/hwaddr.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-17-armbru@redhat.com>
2019-08-16Include migration/vmstate.h lessMarkus Armbruster
In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Clean up inclusion of exec/cpu-common.hMarkus Armbruster
migration/qemu-file.h neglects to include it even though it needs ram_addr_t. Fix that. Drop a few superfluous inclusions elsewhere. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-14-armbru@redhat.com>
2019-07-24migration: fix migrate_cancel multifd migration leads destination hung foreverIvan Ren
When migrate_cancel a multifd migration, if run sequence like this: [source] [destination] multifd_send_sync_main[finish] multifd_recv_thread wait &p->sem_sync shutdown to_dst_file detect error from_src_file send RAM_SAVE_FLAG_EOS[fail] [no chance to run multifd_recv_sync_main] multifd_load_cleanup join multifd receive thread forever will lead destination qemu hung at following stack: pthread_join qemu_thread_join multifd_load_cleanup process_incoming_migration_co coroutine_trampoline Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1561468699-9819-4-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-24migration: Make explicit that we are quitting multifdJuan Quintela
We add a bool to indicate that. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-24migration: fix migrate_cancel leads live_migration thread hung foreverIvan Ren
When we 'migrate_cancel' a multifd migration, live_migration thread may hung forever at some points, because of multifd_send_thread has already exit for socket error: 1. multifd_send_pages may hung at qemu_sem_wait(&multifd_send_state-> channels_ready) 2. multifd_send_sync_main my hung at qemu_sem_wait(&multifd_send_state-> sem_sync) Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-3-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- Remove spurious not needed bits
2019-07-24migration: fix migrate_cancel leads live_migration thread endless loopIvan Ren
When we 'migrate_cancel' a multifd migration, live_migration thread may go into endless loop in multifd_send_pages functions. Reproduce steps: (qemu) migrate_set_capability multifd on (qemu) migrate -d url (qemu) [wait a while] (qemu) migrate_cancel Then may get live_migration 100% cpu usage in following stack: pthread_mutex_lock qemu_mutex_lock_impl multifd_send_pages multifd_queue_page ram_save_multifd_page ram_save_target_page ram_save_host_page ram_find_and_save_block ram_find_and_save_block ram_save_iterate qemu_savevm_state_iterate migration_iteration_run migration_thread qemu_thread_start start_thread clone Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-2-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration: always initial RAMBlock.bmap to 1 for new migrationIvan Ren
Reproduce the problem: migrate migrate_cancel migrate Error happen for memory migration The reason as follows: 1. qemu start, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] all set to 1 by a series of cpu_physical_memory_set_dirty_range 2. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero 3. migration data... 4. migrate_cancel, will stop log dirty 5. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero Here RAMBlock.bmap only have new logged dirty pages, don't contain the whole guest pages. Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1563115879-2715-1-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/postcopy: remove redundant cpu_synchronize_all_post_initWei Yang
cpu_synchronize_all_post_init() is called twice in loadvm_postcopy_handle_run_bh(), so remove one redundant call. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190715080751.24304-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/postcopy: fix document of postcopy_send_discard_bm_ram()Wei Yang
Commit 6b6712efccd3 ('ram: Split dirty bitmap by RAMBlock') changes the parameter of postcopy_send_discard_bm_ram(), while left the document part untouched. This patch correct the document and fix two typo by hand. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190715020549.15018-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration: allow private destination ram with x-ignore-sharedPeng Tao
By removing the share ram check, qemu is able to migrate to private destination ram when x-ignore-shared capability is on. Then we can create multiple destination VMs based on the same source VM. This changes the x-ignore-shared migration capability to work similar to Lai's original bypass-shared-memory work(https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html) which enables kata containers (https://katacontainers.io) to implement the VM templating feature. An example usage in kata containers(https://katacontainers.io): 1. Start the source VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=on,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 2. Stop the template VM, set migration x-ignore-shared capability, migrate "exec:cat>/tmpfs/state", quit it 3. Start target VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=off,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 \ -incoming defer 4. connect to target VM qmp, set migration x-ignore-shared capability, migrate_incoming "exec:cat /tmpfs/state" 5. create more target VMs repeating 3 and 4 Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Yury Kotov <yury-kotov@yandex-team.ru> Cc: Jiangshan Lai <laijs@hyper.sh> Cc: Xu Wang <xu@hyper.sh> Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1560494113-1141-1-git-send-email-tao.peng@linux.alibaba.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration: Split log_clear() into smaller chunksPeter Xu
Currently we are doing log_clear() right after log_sync() which mostly keeps the old behavior when log_clear() was still part of log_sync(). This patch tries to further optimize the migration log_clear() code path to split huge log_clear()s into smaller chunks. We do this by spliting the whole guest memory region into memory chunks, whose size is decided by MigrationState.clear_bitmap_shift (an example will be given below). With that, we don't do the dirty bitmap clear operation on the remote node (e.g., KVM) when we fetch the dirty bitmap, instead we explicitly clear the dirty bitmap for the memory chunk for each of the first time we send a page in that chunk. Here comes an example. Assuming the guest has 64G memory, then before this patch the KVM ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G memory. If after the patch, let's assume when the clear bitmap shift is 18, then the memory chunk size on x86_64 will be 1UL<<18 * 4K = 1GB. Then instead of sending a big 64G ioctl, we'll send 64 small ioctls, each of the ioctl will cover 1G of the guest memory. For each of the 64 small ioctls, we'll only send if any of the page in that small chunk was going to be sent right away. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190603065056.25211-12-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration: No need to take rcu during sync_dirty_bitmapPeter Xu
cpu_physical_memory_sync_dirty_bitmap() has one RAMBlock* as parameter, which means that it must be with RCU read lock held already. Taking it again inside seems redundant. Removing it. Instead comment on the functions about the RCU read lock. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-2-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/ram.c: reset complete_round when we gets a queued pageWei Yang
In case we gets a queued page, the order of block is interrupted. We may not rely on the complete_round flag to say we have already searched the whole blocks on the list. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190605010828.6969-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/multifd: sync packet_num after all thread are doneWei Yang
Notification from recv thread is not ordered, which means we may be notified by one MultiFDRecvParams but adjust packet_num for another. Move the adjustment after we are sure each recv thread are sync-ed. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190604023540.26532-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/xbzrle: update cache and current_data in one placeWei Yang
When we are not in the last_stage, we need to update the cache if page is not the same. Currently this procedure is scattered in two places and mixed with encoding status check. This patch extract this general step out to make the code a little bit easy to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190610004159.20966-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration/multifd: call multifd_send_sync_main when sending RAM_SAVE_FLAG_EOSWei Yang
On receiving RAM_SAVE_FLAG_EOS, multifd_recv_sync_main() is called to synchronize receive threads. Current synchronization mechanism is to wait for each channel's sem_sync semaphore. This semaphore is triggered by a packet with MULTIFD_FLAG_SYNC flag. While in current implementation, we don't do multifd_send_sync_main() to send such packet when blk_mig_bulk_active() is true. This will leads to the receive threads won't notify multifd_recv_sync_main() by sem_sync. And multifd_recv_sync_main() will always wait there. [Note]: normal migration test works, while didn't test the blk_mig_bulk_active() case. Since not sure how to produce this situation. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190612014337.11255-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15migration: fix multifd_recv event typoJuan Quintela
It uses num in multifd_send(). Make it coherent. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-05general: Replace global smp variables with smp machine propertiesLike Xu
Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-04migration: move port_attr inside CONFIG_LINUXAlex Bennée
Otherwise the FreeBSD compiler complains about an unused variable. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-02migration/colo.c: Add missed filter notify for Xen COLO.Zhang Chen
We need to notify net filter to do checkpoint for Xen COLO, like KVM side. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-06-12Include qemu-common.h exactly where neededMarkus Armbruster
No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-06-06Merge remote-tracking branch ↵Peter Maydell
'remotes/vivier2/tags/trivial-branch-pull-request' into staging Trivial fixes 06/06/2019 # gpg: Signature made Thu 06 Jun 2019 12:05:50 BST # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-pull-request: hw/watchdog/wdt_i6300esb: Use DEVICE() macro to access DeviceState.qdev hw/scsi: Use the QOM BUS() macro to access BusState.qbus hw/sd: Use the QOM BUS() macro to access BusState.qbus hw/audio/ac97: Use the QOM DEVICE() macro to access DeviceState.qdev hw/vfio/pci: Use the QOM DEVICE() macro to access DeviceState.qdev hw/usb-storage: Use the QOM DEVICE() macro to access DeviceState.qdev hw/isa: Use the QOM DEVICE() macro to access DeviceState.qdev hw/s390x/event-facility: Use the QOM BUS() macro to access BusState.qbus hw/pci-bridge: Use the QOM BUS() macro to access BusState.qbus hw/scsi/vmw_pvscsi: Use qbus_reset_all() directly docs/devel/build-system: Update an example test: Fix make target check-report.tap util: Adjust qemu_guest_getrandom_nofail for Coverity vhost: fix incorrect print type migration: fix a typo hw/rdma: Delete unused headers inclusion Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-06migration: fix a typoLi Qiang
'postocpy' should be 'postcopy'. CC: qemu-trivial@nongnu.org Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190525062832.18009-1-liq3ea@163.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-06-05migratioin/ram: leave RAMBlock->bmap blank on allocatingWei Yang
During migration, we would sync bitmap from ram_list.dirty_memory to RAMBlock.bmap in cpu_physical_memory_sync_dirty_bitmap(). Since we set RAMBlock.bmap and ram_list.dirty_memory both to all 1, this means at the first round this sync is meaningless and is a duplicated work. Leaving RAMBlock->bmap blank on allocating would have a side effect on migration_dirty_pages, since it is calculated from the result of cpu_physical_memory_sync_dirty_bitmap(). To keep it right, we need to set migration_dirty_pages to 0 in ram_state_init(). Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05migration: Fix fd protocol for incoming deferYury Kotov
Currently, incoming migration through fd supports only command-line case: E.g. fork(); fd = open(); exec("qemu ... -incoming fd:%d", fd); It's possible to use add-fd commands to pass fd for migration, but it's invalid case. add-fd works with fdset but not with particular fds. To work with getfd in incoming defer it's enough to use monitor_fd_param instead of strtol. monitor_fd_param supports both cases: * fd:123 * fd:fd_name (added by getfd). And also the use of monitor_fd_param improves error messages. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05migration/ram.c: multifd_send_state->count is not really usedWei Yang
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05migration/ram.c: MultiFDSendParams.sem_sync is not really usedWei Yang
Besides init and destroy, MultiFDSendParams.sem_sync is not really used. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-04block: Add BlockBackend.ctxKevin Wolf
This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-28migration/dirty-bitmaps: change bitmap enumeration methodJohn Snow
Shift from looking at every root BDS to *every* BDS. This will migrate bitmaps that are attached to blockdev created nodes instead of just ones attached to emulated storage devices. Note that this will not migrate anonymous or internal-use bitmaps, as those are defined as having no name. This will also fix the Coverity issues Peter Maydell has been asking about for the past several releases, as well as fixing a real bug. Reported-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Coverity 😅 Reported-by: aihua liang <aliang@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190514201926.10407-1-jsnow@redhat.com Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490 Fixes: Coverity CID 1390625 CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
2019-05-22migration: Fix typo in migrate_add_blocker() error messageGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <155800428514.543845.17558475870097990036.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-05-14migration/ram.c: fix typos in commentsWei Yang
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190510233729.15554-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: Fix use-after-free during process exitYury Kotov
It fixes heap-use-after-free which was found by clang's ASAN. Control flow of this use-after-free: main_thread: * Got SIGTERM and completes main loop * Calls migration_shutdown - migrate_fd_cancel (so, migration_thread begins to complete) - object_unref(OBJECT(current_migration)); migration_thread: * migration_iteration_finish -> schedule cleanup bh * object_unref(OBJECT(s)); (Now, current_migration is freed) * exits main_thread: * Calls vm_shutdown -> drain bdrvs -> main loop -> cleanup_bh -> use after free If you want to reproduce, these couple of sleeps will help: vl.c:4613: migration_shutdown(); + sleep(2); migration.c:3269: + sleep(1); trace_migration_thread_after_loop(); migration_iteration_finish(s); Original output: qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>) ================================================================= ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210 at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188 READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0) #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23 #1 0x5555594fde0a in aio_bh_call util/async.c:90:5 #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #3 0x555559524783 in aio_poll util/aio-posix.c:725:17 #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5 #5 0x5555573bddf6 in virtio_blk_data_plane_stop hw/block/dataplane/virtio-blk.c:282:5 #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9 #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5 #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9 #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9 #10 0x555557c36713 in vm_state_notify vl.c:1605:9 #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9 #12 0x55555716eeff in vm_shutdown cpus.c:1092:12 #13 0x555557c4283e in main vl.c:4617:5 #14 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118) 0x61900001d210 is located 144 bytes inside of 952-byte region [0x61900001d180,0x61900001d538) freed by thread T6 (live_migration) here: #0 0x555556f76782 in __interceptor_free /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3 #1 0x555558d5fa94 in object_finalize qom/object.c:618:9 #2 0x555558d57651 in object_unref qom/object.c:1068:9 #3 0x555558a55588 in migration_thread migration/migration.c:3272:5 #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9 #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) previously allocated by thread T0 (qemu-vm-0) here: #0 0x555556f76b03 in __interceptor_malloc /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3 #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8) #2 0x555558d58031 in object_new qom/object.c:640:12 #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25 #4 0x555557c41398 in main vl.c:4320:5 #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) Thread T6 (live_migration) created by T0 (qemu-vm-0) here: #0 0x555556f5f0dd in pthread_create /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3 #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11 #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5 #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5 #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5 #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9 #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5 #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5 #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11 #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11 #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9 #11 0x5555594fde0a in aio_bh_call util/async.c:90:5 #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5 #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5 #15 0x7ffff6ede196 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196) SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23 in migrate_fd_cleanup Shadow bytes around the buggy address: 0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==31958==ABORTING Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190408113343.2370-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up comment formatting
2019-05-14migration/savevm: wrap into qemu_loadvm_state_header()Wei Yang
On source side, we have qemu_savevm_state_header() to send related data, while on the receiving side those steps are scattered in qemu_loadvm_state(). This patch wrap those related steps into qemu_loadvm_state_header() to make it friendly to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-5-richardw.yang@linux.intel.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/savevm: load_header before load_setupWei Yang
In migration_thread() and qemu_savevm_state(), we savevm_state in following sequence: qemu_savevm_state_header(f); qemu_savevm_state_setup(f); Then it would be more proper to loadvm_state in the save sequence. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/savevm: remove duplicate check of migration_is_blockedWei Yang
Current call flow of save_snapshot is: save_snapshot migration_is_blocked qemu_savevm_state migration_is_blocked Since qemu_savevm_state is only called in save_snapshot, this means migration_is_blocked has been already checked. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: update comments of migration bitmapYi Wang
Since the ram bitmap and the unsent bitmap are split by RAMBlock in commit 6b6712e, it's better to update the comments about them. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Message-Id: <1555311089-18610-1-git-send-email-wang.yi59@zte.com.cn> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/ram.c: start of migration_bitmap_sync_range is always 0Wei Yang
We can eliminate to pass 0. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190430034412.12935-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/colo.c: Remove redundant input parameterZhang Chen
The colo_do_failover no need the input parameter. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20190426090730.2691-2-chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: savevm: fix error code with migration blockersCole Robinson
The only caller that checks the error code is looking for != 0, so returning false is incorrect. Fixes: 5aaac467938 "migration: savevm: consult migration blockers" Signed-off-by: Cole Robinson <crobinso@redhat.com> Message-Id: <b991a4d0e6c4253bc08b2794c6084be55fc72e1d.1554851834.git.crobinso@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14vmstate: check subsection_found is enoughWei Yang
subsection_found is true implies vmdesc is not NULL. This patch remove the additional check on vmdesc and rename subsection_found to vmdesc_has_subsections to make it more self-explain. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190403011016.12549-1-richardw.yang@linux.intel.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: remove not used field xfer_limitWei Yang
MigrationState->xfer_limit is only set to 0 in migrate_init(). Remove this unnecessary field. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190326055726.10539-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: not necessary to check ops againWei Yang
During each iteration, se->ops is checked before each loop. So it is not necessary to check it again and simplify the following check a little. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190327013130.26259-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-04-09migration/ram.c: Fix use-after-free in multifd_recv_unfill_packet()Peter Maydell
Coverity points out (CID 1400442) that in this code: if (packet->pages_alloc > p->pages->allocated) { multifd_pages_clear(p->pages); multifd_pages_init(packet->pages_alloc); } we free p->pages in multifd_pages_clear() but continue to use it in the following code. We also leak memory, because multifd_pages_init() returns the pointer to a new MultiFDPages_t struct but we are ignoring its return value. Fix both of these bugs by adding the missing assignment of the newly created struct to p->pages. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-id: 20190409151830.6024-1-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-04-05migration/ram.c: Fix codes conflict about bitmap_mutexZhang Chen
I found upstream codes conflict with COLO and lead to crash, and I located to this patch: commit 386a907b37a9321bc5d699bc37104d6ffba1b34d Author: Wei Wang <wei.w.wang@intel.com> Date: Tue Dec 11 16:24:49 2018 +0800 migration: use bitmap_mutex in migration_bitmap_clear_dirty My colleague Wei's patch add bitmap_mutex in migration_bitmap_clear_dirty, but COLO didn't initialize the bitmap_mutex. So we always get an error when COLO start up. like that: qemu-system-x86_64: util/qemu-thread-posix.c:64: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed. This patch add the bitmap_mutex initialize and destroy in COLO lifecycle. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20190329222951.28945-1-chen.zhang@intel.com> Reviewed-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-04-02migration: Support adding migration blockers earlierMarkus Armbruster
migrate_add_blocker() asserts we have a current_migration object, in migrate_get_current(). We do only after migration_object_init(). This contributes to the following dependency cycle: * configure_blockdev() must run before machine_set_property() so machine properties can refer to block backends * machine_set_property() before configure_accelerator() so machine properties like kvm-irqchip get applied * configure_accelerator() before migration_object_init() so that Xen's accelerator compat properties get applied. * migration_object_init() before configure_blockdev() so configure_blockdev() can add migration blockers The cycle was closed when recent commit cda4aa9a5a0 "Create block backends before setting machine properties" added the first dependency, and satisfied it by violating the last one. Broke block backends that add migration blockers, as demonstrated by qemu-iotests 055. To fix it, break the last dependency: make migrate_add_blocker() usable before migration_object_init(). The previous commit already removed the use of migrate_get_current() from migrate_add_blocker() itself. Didn't quite do the trick, as there's another one hiding in migration_is_idle(). The use there isn't actually necessary: when no migration object has been created yet, migration is surely idle. Make migration_is_idle() return true then. Fixes: cda4aa9a5a08777cf13e164c0543bd4888b8adce Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190401090827.20793-4-armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>