aboutsummaryrefslogtreecommitdiff
path: root/migration/savevm.c
AgeCommit message (Collapse)Author
2019-08-14migration/savevm: flush file for iterable_only caseWei Yang
It would be proper to flush file even for iterable_only case. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14migration: Add error_desc for file channel errorsYury Kotov
Currently, there is no information about error if outgoing migration was failed because of file channel errors. Example (QMP session): -> { "execute": "migrate", "arguments": { "uri": "exec:head -c 1" }} <- { "return": {} } ... -> { "execute": "query-migrate" } <- { "return": { "status": "failed" }} // There is not error's description And even in the QEMU's output there is nothing. This patch 1) Adds errp for the most of QEMUFileOps 2) Adds qemu_file_get_error_obj/qemu_file_set_error_obj 3) And finally using of qemu_file_get_error_obj in migration.c And now, the status for the mentioned fail will be: -> { "execute": "query-migrate" } <- { "return": { "status": "failed", "error-desc": "Unable to write to command: Broken pipe" }} Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190422103420.15686-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-07-15migration/postcopy: remove redundant cpu_synchronize_all_post_initWei Yang
cpu_synchronize_all_post_init() is called twice in loadvm_postcopy_handle_run_bh(), so remove one redundant call. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190715080751.24304-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-05-14migration/savevm: wrap into qemu_loadvm_state_header()Wei Yang
On source side, we have qemu_savevm_state_header() to send related data, while on the receiving side those steps are scattered in qemu_loadvm_state(). This patch wrap those related steps into qemu_loadvm_state_header() to make it friendly to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-5-richardw.yang@linux.intel.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/savevm: load_header before load_setupWei Yang
In migration_thread() and qemu_savevm_state(), we savevm_state in following sequence: qemu_savevm_state_header(f); qemu_savevm_state_setup(f); Then it would be more proper to loadvm_state in the save sequence. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration/savevm: remove duplicate check of migration_is_blockedWei Yang
Current call flow of save_snapshot is: save_snapshot migration_is_blocked qemu_savevm_state migration_is_blocked Since qemu_savevm_state is only called in save_snapshot, this means migration_is_blocked has been already checked. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: savevm: fix error code with migration blockersCole Robinson
The only caller that checks the error code is looking for != 0, so returning false is incorrect. Fixes: 5aaac467938 "migration: savevm: consult migration blockers" Signed-off-by: Cole Robinson <crobinso@redhat.com> Message-Id: <b991a4d0e6c4253bc08b2794c6084be55fc72e1d.1554851834.git.crobinso@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14migration: not necessary to check ops againWei Yang
During each iteration, se->ops is checked before each loop. So it is not necessary to check it again and simplify the following check a little. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190327013130.26259-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-04-02Revert "migration: move only_migratable to MigrationState"Markus Armbruster
This reverts commit 3df663e575f1876d7f3bc684f80e72fca0703d39. This reverts commit b605c47b57b58e61a901a50a0762dccf43d94783. Command line option --only-migratable is for disallowing any configuration that can block migration. Initially, --only-migratable set global variable @only_migratable. Commit 3df663e575 "migration: move only_migratable to MigrationState" replaced it by MigrationState member @only_migratable. That was a mistake. First, it doesn't make sense on the design level. MigrationState captures the state of an individual migration, but --only-migratable isn't a property of an individual migration, it's a restriction on QEMU configuration. With fault tolerance, we could have several migrations at once. --only-migratable would certainly protect all of them. Storing it in MigrationState feels inappropriate. Second, it contributes to a dependency cycle that manifests itself as a bug now. Putting @only_migratable into MigrationState means its available only after migration_object_init(). We can't set it before migration_object_init(), so we delay setting it with a global property (this is fixup commit b605c47b57 "migration: fix handling for --only-migratable"). We can't get it before migration_object_init(), so anything that uses it can only run afterwards. Since migrate_add_blocker() needs to obey --only-migratable, any code adding migration blockers can run only afterwards. This contributes to the following dependency cycle: * configure_blockdev() must run before machine_set_property() so machine properties can refer to block backends * machine_set_property() before configure_accelerator() so machine properties like kvm-irqchip get applied * configure_accelerator() before migration_object_init() so that Xen's accelerator compat properties get applied. * migration_object_init() before configure_blockdev() so configure_blockdev() can add migration blockers The cycle was closed when recent commit cda4aa9a5a0 "Create block backends before setting machine properties" added the first dependency, and satisfied it by violating the last one. Broke block backends that add migration blockers. Moving @only_migratable into MigrationState was a mistake. Revert it. This doesn't quite break the "migration_object_init() before configure_blockdev() dependency, since migrate_add_blocker() still has another dependency on migration_object_init(). To be addressed the next commit. Note that the reverted commit made -only-migratable sugar for -global migration.only-migratable=on below the hood. Documentation has only ever mentioned -only-migratable. This commit removes the arcane & undocumented alternative to -only-migratable again. Nobody should be using it. Conflicts: include/migration/misc.h migration/migration.c migration/migration.h vl.c Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190401090827.20793-3-armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2019-03-06migration/ram.c: add a notifier chain for precopyWei Wang
This patch adds a notifier chain for the memory precopy. This enables various precopy optimizations to be invoked at specific places. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-03-06migration: Add capabilities validationYury Kotov
Currently we don't check which capabilities set in the source QEMU. We just expect that the target QEMU has the same enabled capabilities. Add explicit validation for capabilities to make sure that the target VM has them too. This is enabled for only new capabilities to keep compatibily. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-6-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge
2019-03-05migration: Switch to using announce timerDr. David Alan Gilbert
Switch the announcements to using the new announce timer. Move the code that does it to announce.c rather than savevm because it really has nothing to do with the actual migration. Migration starts the announce from bh's and so they're all in the main thread/bql, and so there's never any racing with the timers themselves. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-01-23vmstate: constify SaveVMHandlersMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181114133139.27346-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-01-23migration: add more error handling for postcopy_ram_enable_notifyFei Li
Call postcopy_ram_incoming_cleanup() to do the cleanup when postcopy_ram_enable_notify fails. Besides, report the error message when qemu_ram_foreach_migratable_block() fails. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Fei Li <fli@suse.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190113140849.38339-5-lifei1214@126.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-11-27vmstate: constify VMStateFieldMarc-André Lureau
Because they are supposed to remain const. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-27migration: savevm: consult migration blockersPaolo Bonzini
There is really no difference between live migration and savevm, except that savevm does not require bdrv_invalidate_cache to be implemented by all disks. However, it is unlikely that savevm is used with anything except qcow2 disks, so the penalty is small and worth the improvement in catching bad usage of savevm. Only one place was taking care of savevm when adding a migration blocker, and it can be removed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-31migration: avoid segmentfault when take a snapshot of a VM which being migratedJia Lina
During an active background migration, snapshot will trigger a segmentfault. As snapshot clears the "current_migration" struct and updates "to_dst_file" before it finds out that there is a migration task, Migration accesses the null pointer in "current_migration" struct and qemu crashes eventually. Signed-off-by: Jia Lina <jialina01@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20181026083620.10172-1-jialina01@baidu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-10-23Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' ↵Peter Maydell
into staging Error reporting patches for 2018-10-22 # gpg: Signature made Mon 22 Oct 2018 13:20:23 BST # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2018-10-22: (40 commits) error: Drop bogus "use error_setg() instead" admonitions vpc: Fail open on bad header checksum block: Clean up bdrv_img_create()'s error reporting vl: Simplify call of parse_name() vl: Fix exit status for -drive format=help blockdev: Convert drive_new() to Error vl: Assert drive_new() does not fail in default_drive() fsdev: Clean up error reporting in qemu_fsdev_add() spice: Clean up error reporting in add_channel() tpm: Clean up error reporting in tpm_init_tpmdev() numa: Clean up error reporting in parse_numa() vnc: Clean up error reporting in vnc_init_func() ui: Convert vnc_display_init(), init_keyboard_layout() to Error ui/keymaps: Fix handling of erroneous include files vl: Clean up error reporting in device_init_func() vl: Clean up error reporting in parse_fw_cfg() vl: Clean up error reporting in mon_init_func() vl: Clean up error reporting in machine_set_property() vl: Clean up error reporting in chardev_init_func() qom: Clean up error reporting in user_creatable_add_opts_foreach() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19migration: Fix !replay_can_snapshot() error handlingMarkus Armbruster
Calling error_report() in a function that takes an Error ** argument is suspicious. save_snapshot() and load_snapshot() do that, and then fail without setting an error. Wrong. The HMP commands survive this unscathed, since hmp_handle_error() does nothing when no error has been set. Callers main() (on behalf of -loadvm) and replay_vmstate_init() crash, but I'm not sure the error is possible there. Screwed up when commit 377b21ccea1 (v2.12.0) added incorrect error handling right next to correct examples. Fix by calling error_setg() instead of error_report(). Fixes: 377b21ccea1755a8b0dae822c29567c58dda6939 Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181017082702.5581-13-armbru@redhat.com>
2018-10-19savevm: split the process of different stages for loadvm/savevmZhang Chen
There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19COLO: Load dirty pages into SVM's RAM cache firstlyZhang Chen
We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receving data, which will break SVM. We need to ensure receving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram cache always the same as PVM's memory at every checkpoint, then we flush this cached ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19COLO: Remove colo_state migration structZhang Chen
We need to know if migration is going into COLO state for incoming side before start normal migration. Instead by using the VMStateDescription to send colo_state from source side to destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-09-26migration: cleanup in error paths in loadvmDr. David Alan Gilbert
There's a couple of error paths in qemu_loadvm_state which happen early on but after we've initialised the load state; that needs to be cleaned up otherwise we can hit asserts if the state gets reinitialised later. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20180914170430.54271-3-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-09-26migration/postcopy: Clear have_listen_threadDr. David Alan Gilbert
Clear have_listen_thread when we exit the thread. The fallout from this was that various things thought there was an ongoing postcopy after the postcopy had finished. The case that failed was postcopy->savevm->loadvm. This corresponds to RH bug https://bugzilla.redhat.com/show_bug.cgi?id=1608765 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20180914170430.54271-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-09-26Add a hint message to loadvm and exits on failureJose Ricardo Ziviani
This patch adds a small hint for the failure case of the load snapshot process. It may be useful for users to remember that the VM configuration has changed between the save and load processes. (qemu) loadvm vm-20180903083641 Unknown savevm section or instance 'cpu_common' 4. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices Error -22 while loading VM state (qemu) device_add host-spapr-cpu-core,core-id=4 (qemu) loadvm vm-20180903083641 (qemu) c (qemu) info status VM status: running It also exits Qemu if the snapshot cannot be loaded before reaching the main loop (-loadvm in the command line). $ qemu-system-ppc64 ... -loadvm vm-20180903083641 qemu-system-ppc64: Unknown savevm section or instance 'cpu_common' 4. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices qemu-system-ppc64: Error -22 while loading VM state $ Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com> Message-Id: <20180903162613.15877-1-joserz@linux.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-08-22migration: implement bi-directional RDMA QIOChannelLidong Chen
This patch implements bi-directional RDMA QIOChannel. Because different threads may access RDMAQIOChannel currently, this patch use RCU to protect it. Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-07-10migration: reorder MIG_CMD_POSTCOPY_RESUMEPeter Xu
It was accidently added before MIG_CMD_PACKAGED so it might break command compatibility when we run postcopy migration between old/new QEMUs. Fix that up quickly before the QEMU 3.0 release. Reported-by: Lukáš Doktor <ldoktor@redhat.com> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180710094424.30754-1-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-07-10migration: loosen recovery check when load vmPeter Xu
We were checking against -EIO, assuming that it will cover all IO failures. But actually it is not. One example is that in qemu_loadvm_section_start_full() we can have tons of places that will return -EINVAL even if the error is caused by IO failures on the network. Let's loosen the recovery check logic here to cover all the error cases happened by removing the explicit check against -EIO. After all we won't lose anything here if any other failure happened. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180710091902.28780-3-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-07-10migration: delay postcopy paused statePeter Xu
Before this patch we firstly setup the postcopy-paused state then we clean up the QEMUFile handles. That can be racy if there is a very fast "migrate-recover" command running in parallel. Fix that up. Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180627132246.5576-2-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-06-04Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180604' ↵Peter Maydell
into staging migration/next for 20180604 # gpg: Signature made Mon 04 Jun 2018 05:14:24 BST # gpg: using RSA key F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration/20180604: migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect migration: remove unnecessary variables len in QIOChannelRDMA migration: Don't activate block devices if using -S migration: discard non-migratable RAMBlocks migration: introduce decompress-error-check Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-04migration: discard non-migratable RAMBlocksCédric Le Goater
On the POWER9 processor, the XIVE interrupt controller can control interrupt sources using MMIO to trigger events, to EOI or to turn off the sources. Priority management and interrupt acknowledgment is also controlled by MMIO in the presenter sub-engine. These MMIO regions are exposed to guests in QEMU with a set of 'ram device' memory mappings, similarly to VFIO, and the VMAs are populated dynamically with the appropriate pages using a fault handler. But, these regions are an issue for migration. We need to discard the associated RAMBlocks from the RAM state on the source VM and let the destination VM rebuild the memory mappings on the new host in the post_load() operation just before resuming the system. To achieve this goal, the following introduces a new RAMBlock flag RAM_MIGRATABLE which is updated in the vmstate_register_ram() and vmstate_unregister_ram() routines. This flag is then used by the migration to identify RAMBlocks to discard on the source. Some checks are also performed on the destination to make sure nothing invalid was sent. This change impacts the boston, malta and jazz mips boards for which migration compatibility is broken. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-06-01migration: drop an unused includeMichael S. Tsirkin
In the vmstate.h file, we just need a struct name. Use a forward declaration instead of an include, then adjust the one affected .c file to include the file that is no longer implicit from the header. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-05-15qmp/migration: new command migrate-recoverPeter Xu
The first allow-oob=true command. It's used on destination side when the postcopy migration is paused and ready for a recovery. After execution, a new migration channel will be established for postcopy to continue. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-21-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- s/2.12/2.13/
2018-05-15migration: introduce SaveVMHandlers.resume_preparePeter Xu
This is hook function to be called when a postcopy migration wants to resume from a failure. For each module, it should provide its own recovery logic before we switch to the postcopy-active state. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-16-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: new message MIG_RP_MSG_RESUME_ACKPeter Xu
Creating new message to reply for MIG_CMD_POSTCOPY_RESUME. One uint32_t is used as payload to let the source know whether destination is ready to continue the migration. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-15-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: new cmd MIG_CMD_POSTCOPY_RESUMEPeter Xu
Introducing this new command to be sent when the source VM is ready to resume the paused migration. What the destination does here is basically release the fault thread to continue service page faults. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-14-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: new message MIG_RP_MSG_RECV_BITMAPPeter Xu
Introducing new return path message MIG_RP_MSG_RECV_BITMAP to send received bitmap of ramblock back to source. This is the reply message of MIG_CMD_RECV_BITMAP, it contains not only the header (including the ramblock name), and it was appended with the whole ramblock received bitmap on the destination side. When the source receives such a reply message (MIG_RP_MSG_RECV_BITMAP), it parses it, convert it to the dirty bitmap by inverting the bits. One thing to mention is that, when we send the recv bitmap, we are doing these things in extra: - converting the bitmap to little endian, to support when hosts are using different endianess on src/dst. - do proper alignment for 8 bytes, to support when hosts are using different word size (32/64 bits) on src/dst. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-13-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: new cmd MIG_CMD_RECV_BITMAPPeter Xu
Add a new vm command MIG_CMD_RECV_BITMAP to request received bitmap for one ramblock. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-12-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: allow fault thread to pausePeter Xu
Allows the fault thread to stop handling page faults temporarily. When network failure happened (and if we expect a recovery afterwards), we should not allow the fault thread to continue sending things to source, instead, it should halt for a while until the connection is rebuilt. When the dest main thread noticed the failure, it kicks the fault thread to switch to pause state. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-7-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-05-15migration: allow dst vm pause on postcopyPeter Xu
When there is IO error on the incoming channel (e.g., network down), instead of bailing out immediately, we allow the dst vm to switch to the new POSTCOPY_PAUSE state. Currently it is still simple - it waits the new semaphore, until someone poke it for another attempt. One note is that here on ram loading thread we cannot detect the POSTCOPY_ACTIVE state, but we need to detect the more specific POSTCOPY_INCOMING_RUNNING state, to make sure we have already loaded all the device states. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180502104740.12123-5-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-03-20Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio,vhost,pci,pc: features, cleanups SRAT tables for DIMM devices new virtio net flags for speed/duplex post-copy migration support in vhost cleanups in pci Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (51 commits) postcopy shared docs libvhost-user: Claim support for postcopy postcopy: Allow shared memory vhost: Huge page align and merge vhost+postcopy: Wire up POSTCOPY_END notify vhost-user: Add VHOST_USER_POSTCOPY_END message libvhost-user: mprotect & madvises for postcopy vhost+postcopy: Call wakeups vhost+postcopy: Add vhost waker postcopy: postcopy_notify_shared_wake postcopy: helper for waking shared vhost+postcopy: Resolve client address postcopy-ram: add a stub for postcopy_request_shared_page vhost+postcopy: Helper to send requests to source for shared pages vhost+postcopy: Stash RAMBlock and offset vhost+postcopy: Send address back to qemu libvhost-user+postcopy: Register new regions with the ufd migration/ram: ramblock_recv_bitmap_test_byte_offset postcopy+vhost-user: Split set_mem_table for postcopy vhost+postcopy: Transmit 'listen' to slave ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # scripts/update-linux-headers.sh
2018-03-20vhost+postcopy: Transmit 'listen' to slaveDr. David Alan Gilbert
Notify the vhost-user slave on reception of the 'postcopy-listen' event from the source. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' messageDr. David Alan Gilbert
Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE message on an incoming advise. Later patches will fill in the behaviour/contents of the message. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-16Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Tue 13 Mar 2018 21:11:43 GMT # gpg: using RSA key 7DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/bitmaps-pull-request: iotests: add dirty bitmap postcopy test iotests: add dirty bitmap migration test migration: add postcopy migration of dirty bitmaps migration: allow qmp command migrate-start-postcopy for any postcopy migration: add is_active_iterate handler migration/qemu-file: add qemu_put_counted_string() migration: include migrate_dirty_bitmaps in migrate_postcopy qapi: add dirty-bitmaps migration capability migration: introduce postcopy-only pending dirty-bitmap: add locked state block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13migration: add postcopy migration of dirty bitmapsVladimir Sementsov-Ogievskiy
Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated. If destination qemu is already containing a dirty bitmap with the same name as a migrated bitmap (for the same node), then, if their granularities are the same the migration will be done, otherwise the error will be generated. If destination qemu doesn't contain such bitmap it will be created. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20180313180320.339796-12-vsementsov@virtuozzo.com [Changed '+' to '*' as per list discussion. --js] Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13migration: add is_active_iterate handlerVladimir Sementsov-Ogievskiy
Only-postcopy savevm states (dirty-bitmap) don't need live iteration, so to disable them and stop transporting empty sections there is a new savevm handler. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180313180320.339796-10-vsementsov@virtuozzo.com
2018-03-13migration: introduce postcopy-only pendingVladimir Sementsov-Ogievskiy
There would be savevm states (dirty-bitmap) which can migrate only in postcopy stage. The corresponding pending is introduced here. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20180313180320.339796-6-vsementsov@virtuozzo.com
2018-03-12replay: fix save/load vm for non-empty queuePavel Dovgalyuk
This patch does not allows saving/loading vmstate when replay events queue is not empty. There is no reliable way to save events queue, because it describes internal coroutine state. Therefore saving and loading operations should be deferred to another record/replay step. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20180227095214.1060.32939.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-02qapi: Empty out qapi-schema.jsonMarkus Armbruster
The previous commit improved compile time by including less of the generated QAPI headers. This is impossible for stuff defined directly in qapi-schema.json, because that ends up in headers that that pull in everything. Move everything but include directives from qapi-schema.json to new sub-module qapi/misc.json, then include just the "misc" shard where possible. It's possible everywhere, except: * monitor.c needs qmp-command.h to get qmp_init_marshal() * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need qapi-event.h to get enum QAPIEvent Perhaps we'll get rid of those some other day. Adding a type to qapi/migration.json now recompiles some 120 instead of 2300 out of 5100 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-25-armbru@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-02-14migration: pass MigrationState to migrate_init()Peter Xu
Let the callers take the object, then pass it to migrate_init(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-12-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>