aboutsummaryrefslogtreecommitdiff
path: root/migration
AgeCommit message (Collapse)Author
2023-04-24migration: Create migrate_max_bandwidth() functionJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Move migrate_postcopy() to options.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create migrate_cpu_throttle_tailslow() functionJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create migrate_cpu_throttle_increment() functionJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create migrate_cpu_throttle_initial() to option.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Move migrate_announce_params() to option.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> --- Fix extra whitespace (fabiano)
2023-04-24migration: Create migrate_max_cpu_throttle()Juan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create migrate_checkpoint_delay()Juan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create migrate_throttle_trigger_threshold()Juan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Move migrate_use_block_incremental() to option.cJuan Quintela
To be consistent with every other parameter, rename to migrate_block_incremental(). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Use migrate_max_postcopy_bandwidth()Juan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move parameters functions to option.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_cap_set() to options.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move qmp_migrate_set_capabilities() to options.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move qmp_query_migrate_capabilities() to options.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_caps_check() to options.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Create migrate_rdma_pin_all() functionJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> --- Fixed missing space after comma (fabiano)
2023-04-24migration: Move migrate_use_return() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_return_path() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_block() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_block() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_xbzrle() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_xbzrle() to be consistent with all other capabilities. We change the type to return bool also for consistency. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_zero_copy_send() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_zero_copy_send() to be consistent with all other capabilities. We can remove the CONFIG_LINUX guard. We already check that we can't setup this capability in migrate_caps_check(). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_multifd() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_multifd() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_events() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_events() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_use_compression() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_compress() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Move migrate_colo_enabled() to options.cJuan Quintela
Once that we are there, we rename the function to migrate_colo() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: Create options.cJuan Quintela
We move there all capabilities helpers from migration.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Following David advise: - looked through the history, capabilities are newer than 2012, so we can remove that bit of the header. - This part is posterior to Anthony. Original Author is Orit. Once there, I put myself. Peter Xu also did quite a bit of work here. Anyone else wants/needs to be there? I didn't search too hard because nobody asked before to be added. What do you think?
2023-04-24migration: Create migrate_cap_set()Juan Quintela
And remove the convoluted use of qmp_migrate_set_capabilities() to enable disable MIGRATION_CAPABILITY_BLOCK. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24spice: move client_migrate_info command to ui/Juan Quintela
It has nothing to do with migration, except for the "migrate" in the name of the command. Move it with the rest of the ui commands. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-04-24migration: move migration_global_dump() to migration-hmp-cmds.cJuan Quintela
It is only used there, so we can make it static. Once there, remove spice.h that it is not used. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- fix David Edmonson ui/qemu-spice.h unintended removal
2023-04-24migration: Minor control flow simplificationEric Blake
No need to declare a temporary variable. Suggested-by: Juan Quintela <quintela@redhat.com> Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better") Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Pass migrate_caps_check() the old and new capsJuan Quintela
We used to pass the old capabilities array and the new capabilities as a list. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: rename enabled_capabilities to capabilitiesJuan Quintela
It is clear from the context what that means, and such a long name with the extra long names of the capabilities make very difficilut to stay inside the 80 columns limit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration/postcopy: Detect file system on dest hostPeter Xu
Postcopy requires the memory support userfaultfd to work. Right now we check it but it's a bit too late (when switching to postcopy migration). Do that early right at enabling of postcopy. Note that this is still only a best effort because ramblocks can be dynamically created. We can add check in hostmem creations and fail if postcopy enabled, but maybe that's too aggressive. Still, we have chance to fail the most obvious where we know there's an existing unsupported ramblock. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Handle block device inactivation failures betterEric Blake
Consider what happens when performing a migration between two host machines connected to an NFS server serving multiple block devices to the guest, when the NFS server becomes unavailable. The migration attempts to inactivate all block devices on the source (a necessary step before the destination can take over); but if the NFS server is non-responsive, the attempt to inactivate can itself fail. When that happens, the destination fails to get the migrated guest (good, because the source wasn't able to flush everything properly): (qemu) qemu-kvm: load of migration failed: Input/output error at which point, our only hope for the guest is for the source to take back control. With the current code base, the host outputs a message, but then appears to resume: (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1) (src qemu)info status VM status: running but a second migration attempt now asserts: (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. Whether the guest is recoverable on the source after the first failure is debatable, but what we do not want is to have qemu itself fail due to an assertion. It looks like the problem is as follows: In migration.c:migration_completion(), the source sets 'inactivate' to true (since COLO is not enabled), then tries savevm.c:qemu_savevm_state_complete_precopy() with a request to inactivate block devices. In turn, this calls block.c:bdrv_inactivate_all(), which fails when flushing runs up against the non-responsive NFS server. With savevm failing, we are now left in a state where some, but not all, of the block devices have been inactivated; but migration_completion() then jumps to 'fail' rather than 'fail_invalidate' and skips an attempt to reclaim those those disks by calling bdrv_activate_all(). Even if we do attempt to reclaim disks, we aren't taking note of failure there, either. Thus, we have reached a state where the migration engine has forgotten all state about whether a block device is inactive, because we did not set s->block_inactive in enough places; so migration allows the source to reach vm_start() and resume execution, violating the block layer invariant that the guest CPUs should not be restarted while a device is inactive. Note that the code in migration.c:migrate_fd_cancel() will also try to reactivate all block devices if s->block_inactive was set, but because we failed to set that flag after the first failure, the source assumes it has reclaimed all devices, even though it still has remaining inactivated devices and does not try again. Normally, qmp_cont() will also try to reactivate all disks (or correctly fail if the disks are not reclaimable because NFS is not yet back up), but the auto-resumption of the source after a migration failure does not go through qmp_cont(). And because we have left the block layer in an inconsistent state with devices still inactivated, the later migration attempt is hitting the assertion failure. Since it is important to not resume the source with inactive disks, this patch marks s->block_inactive before attempting inactivation, rather than after succeeding, in order to prevent any vm_start() until it has successfully reactivated all devices. See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Rename normal to normal_pagesJuan Quintela
Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of full pages transferred. The name "normal" refered to the fact that they were sent without any optimization (compression, xbzrle, zero_page, ...). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Rename duplicate to zero_pagesJuan Quintela
Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of pages composed of the same character, here comes the name "duplicated". But since years ago, it refers to the number of zero_pages. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make postcopy_requests atomicJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make dirty_sync_count atomicJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make downtime_bytes atomicJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make precopy_bytes atomicJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make dirty_sync_missed_zero_copy atomicJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2023-04-24migration: Make multifd_bytes atomicJuan Quintela
In the spirit of: commit 394d323bc3451e4d07f13341cb8817fac8dfbadd Author: Peter Xu <peterx@redhat.com> Date: Tue Oct 11 17:55:51 2022 -0400 migration: Use atomic ops properly for page accountings Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Update atomic stats out of the mutexJuan Quintela
Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Merge ram_counters and ram_atomic_countersJuan Quintela
Using MgrationStats as type for ram_counters mean that we didn't have to re-declare each value in another struct. The need of atomic counters have make us to create MigrationAtomicStats for this atomic counters. Create RAMStats type which is a merge of MigrationStats and MigrationAtomicStats removing unused members. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> --- Fix typos found by David Edmondson
2023-04-24migration: remove extra whitespace character for code style李皆俊
Fix code style. Signed-off-by: 李皆俊 <a_lijiejun@163.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-20postcopy-ram: do not use qatomic_mb_readPaolo Bonzini
It does not even pair with a qatomic_mb_set(), so it is clearer to use load-acquire in this case; they are synonyms. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-04-20migration: mark mixed functions that can suspendPaolo Bonzini
There should be no paths from a coroutine_fn to aio_poll, however in practice coroutine_mixed_fn will call aio_poll in the !qemu_in_coroutine() path. By marking mixed functions, we can track accurately the call paths that execute entirely in coroutine context, and find more missing coroutine_fn markers. This results in more accurate checks that coroutine code does not end up blocking. If the marking were extended transitively to all functions that call these ones, static analysis could be done much more efficiently. However, this is a start and makes it possible to use vrc's path-based searches to find potential bugs where coroutine_fns call blocking functions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-04-12migration: fix ram_state_pending_exact()Juan Quintela
I removed that bit on commit: commit c8df4a7aeffcb46020f610526eea621fa5b0cd47 Author: Juan Quintela <quintela@redhat.com> Date: Mon Oct 3 02:00:03 2022 +0200 migration: Split save_live_pending() into state_pending_* Fixes: c8df4a7aeffcb46020f610526eea621fa5b0cd47 Suggested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12migration/ram.c: Fix migration with compress enabledLukas Straub
Since ec6f3ab9, migration with compress enabled was broken, because the compress threads use a dummy QEMUFile which just acts as a buffer and that commit accidentally changed it to use the outgoing migration channel instead. Fix this by using the dummy file again in the compress threads. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-12migration: Recover behavior of preempt channel creation for pre-7.2Peter Xu
In 8.0 devel window we reworked preempt channel creation, so that there'll be no race condition when the migration channel and preempt channel got established in the wrong order in commit 5655aab079. However no one noticed that the change will also be not compatible with older qemus, majorly 7.1/7.2 versions where preempt mode started to be supported. Leverage the same pre-7.2 flag introduced in the previous patch to recover the behavior hopefully before 8.0 releases, so we don't break migration when we migrate from 8.0 to older qemu binaries. Fixes: 5655aab079 ("migration: Postpone postcopy preempt channel to be after main") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>