aboutsummaryrefslogtreecommitdiff
path: root/migration/migration.c
AgeCommit message (Collapse)Author
2024-04-01migration/postcopy: Ensure postcopy_start() sets errp if it failsAvihai Horon
There are several places where postcopy_start() fails without setting errp. This can cause a null pointer de-reference, as in case of error, the caller of postcopy_start() copies/prints the error set in errp. Fix it by setting errp in all of postcopy_start() error paths. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: 908927db28ea ("migration: Update error description whenever migration fails") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240328140252.16756-3-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com> (cherry picked from commit d0ad271a7613459bd0a3397c8071a4ad06f3f7eb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-12migration: Fix logic of channels and transport compatibility checkAvihai Horon
The commit in the fixes line mistakenly modified the channels and transport compatibility check logic so it now checks multi-channel support only for socket transport type. Thus, running multifd migration using a transport other than socket that is incompatible with multi-channels (such as "exec") would lead to a segmentation fault instead of an error message. For example: (qemu) migrate_set_capability multifd on (qemu) migrate -d "exec:cat > /tmp/vm_state" Segmentation fault (core dumped) Fix it by checking multi-channel compatibility for all transport types. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: d95533e1cdcc ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com> (cherry picked from commit 3205bebd4fc6dd501fb8b10c93ddce9da18e09db) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29migration: Fix use-after-free of migration state objectFabiano Rosas
We're currently allowing the process_incoming_migration_bh bottom-half to run without holding a reference to the 'current_migration' object, which leads to a segmentation fault if the BH is still live after migration_shutdown() has dropped the last reference to current_migration. In my system the bug manifests as migrate_multifd() returning true when it shouldn't and multifd_load_shutdown() calling multifd_recv_terminate_threads() which crashes due to an uninitialized multifd_recv_state. Fix the issue by holding a reference to the object when scheduling the BH and dropping it before returning from the BH. The same is already done for the cleanup_bh at migrate_fd_cleanup_schedule(). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969 Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com> (cherry picked from commit 27eb8499edb2bc952c29ddae0bdac9fc959bf7b1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-12-04Merge tag 'migration-20231201-pull-request' of ↵Stefan Hajnoczi
https://github.com/xzpeter/qemu into staging Migration patches for rc3: - One more memleak regression fix from Het # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZWoLbRIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wahYwD+OsD7CaZYjkl9KSooRfblEenD6SdfhAdC # oZc07f2UxocA/0s1keDBZUUcZOiGYPDFV5his4Jw4F+RRD1YIpVWZg4J # =T0/r # -----END PGP SIGNATURE----- # gpg: Signature made Fri 01 Dec 2023 11:35:57 EST # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [full] # gpg: aka "Peter Xu <peterx@redhat.com>" [full] # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20231201-pull-request' of https://github.com/xzpeter/qemu: migration: Plug memory leak with migration URIs Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-12-01migration: Plug memory leak with migration URIsHet Gala
migrate_uri_parse() allocates memory to 'channel' if the user opts for old syntax - uri, which is leaked because there is no code for freeing 'channel'. So, free channel to avoid memory leak in case where 'channels' is empty and uri parsing is required. Fixes: 5994024f ("migration: Implement MigrateChannelList to qmp migration flow") Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Markus Armbruster <armbru@redhat.com> Tested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20231129204301.131228-1-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>
2023-11-30migration: free 'saddr' since be no longer usedZongmin Zhou
Since socket_parse() will allocate memory for 'saddr',and its value will pass to 'addr' that allocated by migrate_uri_parse(), then 'saddr' will no longer used,need to free. But due to 'saddr->u' is shallow copying the contents of the union, the members of this union containing allocated strings,and will be used after that. So just free 'saddr' itself without doing a deep free on the contents of the SocketAddress. Fixes: 72a8192e225c ("migration: convert migration 'uri' into 'MigrateAddress'") Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231120031428.908295-1-zhouzongmin@kylinos.cn>
2023-11-02migration: Implement MigrateChannelList to hmp migration flow.Het Gala
Integrate MigrateChannelList with all transport backends (socket, exec and rdma) for both src and dest migration endpoints for hmp migration. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20231023182053.8711-14-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-11-02migration: Implement MigrateChannelList to qmp migration flow.Het Gala
Integrate MigrateChannelList with all transport backends (socket, exec and rdma) for both src and dest migration endpoints for qmp migration. For current series, limit the size of MigrateChannelList to single element (single interface) as runtime check. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-13-farosas@suse.de>
2023-11-02migration: modify migration_channels_and_uri_compatible() for new QAPI syntaxHet Gala
migration_channels_and_uri_compatible() check for transport mechanism suitable for multifd migration gets executed when the caller calls old uri syntax. It needs it to be run when using the modern MigrateChannel QAPI syntax too. After URI -> 'MigrateChannel' : migration_channels_and_uri_compatible() -> migration_channels_and_transport_compatible() passes object as argument and check for valid transport mechanism. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-12-farosas@suse.de>
2023-11-02migration: New migrate and migrate-incoming argument 'channels'Het Gala
MigrateChannelList allows to connect accross multiple interfaces. Add MigrateChannelList struct as argument to migration QAPIs. We plan to include multiple channels in future, to connnect multiple interfaces. Hence, we choose 'MigrateChannelList' as the new argument over 'MigrateChannel' to make migration QAPIs future proof. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-10-farosas@suse.de>
2023-11-02migration: Convert the file backend to the new QAPI syntaxFabiano Rosas
Convert the file: URI to accept a FileMigrationArgs to be compatible with the new migration QAPI. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-9-farosas@suse.de>
2023-11-02migration: convert exec backend to accept MigrateAddress.Het Gala
Exec transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for exec connection into strList struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-8-farosas@suse.de>
2023-11-02migration: convert rdma backend to accept MigrateAddressHet Gala
RDMA based transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for RDMA connection into well defined InetSocketAddress struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-7-farosas@suse.de>
2023-11-02migration: convert socket backend to accept MigrateAddressHet Gala
Socket transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for socket connection into well defined SocketAddress struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-6-farosas@suse.de>
2023-11-02migration: convert migration 'uri' into 'MigrateAddress'Het Gala
This patch parses 'migrate' and 'migrate-incoming' QAPI's 'uri' string containing migration connection related information and stores them inside well defined 'MigrateAddress' struct. Fabiano fixed for "file" transport. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-4-farosas@suse.de> Message-ID: <20231023182053.8711-5-farosas@suse.de>
2023-11-02migration: Change ram_dirty_bitmap_reload() retval to boolPeter Xu
Now we have a Error** passed into the return path thread stack, which is even clearer than an int retval. Change ram_dirty_bitmap_reload() and the callers to use a bool instead to replace errnos. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-5-peterx@redhat.com>
2023-11-02migration: Allow network to fail even during recoveryPeter Xu
Normally the postcopy recover phase should only exist for a super short period, that's the duration when QEMU is trying to recover from an interrupted postcopy migration, during which handshake will be carried out for continuing the procedure with state changes from PAUSED -> RECOVER -> POSTCOPY_ACTIVE again. Here RECOVER phase should be super small, that happens right after the admin specified a new but working network link for QEMU to reconnect to dest QEMU. However there can still be case where the channel is broken in this small RECOVER window. If it happens, with current code there's no way the src QEMU can got kicked out of RECOVER stage. No way either to retry the recover in another channel when established. This patch allows the RECOVER phase to fail itself too - we're mostly ready, just some small things missing, e.g. properly kick the main migration thread out when sleeping on rp_sem when we found that we're at RECOVER stage. When this happens, it fails the RECOVER itself, and rollback to PAUSED stage. Then the user can retry another round of recovery. To make it even stronger, teach QMP command migrate-pause to explicitly kick src/dst QEMU out when needed, so even if for some reason the migration thread didn't got kicked out already by a failing rethrn-path thread, the admin can also kick it out. This will be an super, super corner case, but still try to cover that. One can try to test this with two proxy channels for migration: (a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000 (b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock So the migration channel will be: (a) (b) src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst Then to make QEMU hang at RECOVER stage, one can do below: (1) stop the postcopy using QMP command postcopy-pause (2) kill the 2nd proxy (b) (3) try to recover the postcopy using /tmp/src.sock on src (4) src QEMU will go into RECOVER stage but won't be able to continue from there, because the channel is actually broken at (b) Before this patch, step (4) will make src QEMU stuck in RECOVER stage, without a way to kick the QEMU out or continue the postcopy again. After this patch, (4) will quickly fail qemu and bounce back to PAUSED stage. Admin can also kick QEMU from (4) into PAUSED when needed using migrate-pause when needed. After bouncing back to PAUSED stage, one can recover again. Reported-by: Xiaohui Li <xiaohli@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332 Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-3-peterx@redhat.com>
2023-11-02migration: Refactor error handling in source return pathPeter Xu
rp_state.error was a boolean used to show error happened in return path thread. That's not only duplicating error reporting (migrate_set_error), but also not good enough in that we only do error_report() and set it to true, we never can keep a history of the exact error and show it in query-migrate. To make this better, a few things done: - Use error_setg() rather than error_report() across the whole lifecycle of return path thread, keeping the error in an Error*. - With above, no need to have mark_source_rp_bad(), remove it, alongside with rp_state.error itself. - Use migrate_set_error() to apply that captured error to the global migration object when error occured in this thread. - Do the same when detected qemufile error in source return path We need to re-export qemu_file_get_error_obj() to do the last one. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-2-peterx@redhat.com>
2023-11-01migration: per-mode blockersSteve Sistare
Extend the blocker interface so that a blocker can be registered for one or more migration modes. The existing interfaces register a blocker for all modes, and the new interfaces take a varargs list of modes. Internally, maintain a separate blocker list per mode. The same Error object may be added to multiple lists. When a block is deleted, it is removed from every list, and the Error is freed. No functional change until a new mode is added. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-3-git-send-email-steven.sistare@oracle.com>
2023-11-01migration: Add tracepoints for downtime checkpointsPeter Xu
This patch is inspired by Joao Martin's patch here: https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com Add tracepoints for major downtime checkpoints on both src and dst. They share the same tracepoint with a string showing its stage. Besides the checkpoints in the previous patch, this patch also added destination checkpoints. On src, we have these checkpoints added: - src-downtime-start: right before vm stops on src - src-vm-stopped: after vm is fully stopped - src-iterable-saved: after all iterables saved (END sections) - src-non-iterable-saved: after all non-iterable saved (FULL sections) - src-downtime-stop: migration fully completed On dst, we have these checkpoints added: - dst-precopy-loadvm-completes: after loadvm all done for precopy - dst-precopy-bh-*: record BH steps to resume VM for precopy - dst-postcopy-bh-*: record BH steps to resume VM for postcopy On dst side, we don't have a good way to trace total time consumed by iterable or non-iterable for now. We can mark it by 1st time receiving a FULL / END section, but rather than that let's just rely on the other tracepoints added for vmstates to back up the information. With this patch, one can enable "vmstate_downtime*" tracepoints and it'll enable all tracepoints for downtime measurements necessary. Drop loadvm_postcopy_handle_run_bh() tracepoint alongside, because they service the same purpose, which was only for postcopy. We then have unified prefix for all downtime relevant tracepoints. Co-developed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-6-peterx@redhat.com>
2023-11-01migration: migration_stop_vm() helperPeter Xu
Provide a helper for non-COLO use case of migration to stop a VM. This prepares for adding some downtime relevant tracepoints to migration, where they may or may not apply to COLO. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-5-peterx@redhat.com>
2023-11-01migration: Add migration_downtime_start|end() helpersPeter Xu
Unify the three users on recording downtimes with the same pair of helpers. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-3-peterx@redhat.com>
2023-11-01migration: Set downtime_start even for postcopyPeter Xu
Postcopy calculates its downtime separately. It always sets MigrationState.downtime properly, but not MigrationState.downtime_start. Make postcopy do the same as other modes on properly recording the timestamp when the VM is going to be stopped. Drop the temporary variable in postcopy_start() along the way. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-2-peterx@redhat.com>
2023-10-31qemu-file: Make qemu_fflush() return errorsJuan Quintela
This let us simplify code of this shape. qemu_fflush(f); int ret = qemu_file_get_error(f); if (ret) { return ret; } into: int ret = qemu_fflush(f); if (ret) { return ret; } I updated all callers where there is any error check. qemu_fclose() don't need to check for f->last_error because qemu_fflush() returns it at the beggining of the function. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-13-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-10-31migration: Use migration_transferred_bytes()Juan Quintela
There are only two differnces with the old value: - the amount of QEMUFile that hasn't yet been flushed. It can be discussed what is more exact, the new or the old one. - the amount of transferred bytes that we forgot to account for (the newer is better, i.e. exact). Notice that this two values are used to: a - present to the user b - calculate the rate_limit So a few KB here and there is not going to make a difference. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-11-quintela@redhat.com>
2023-10-31migration: migration_rate_limit_reset() don't need the QEMUFileJuan Quintela
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-9-quintela@redhat.com>
2023-10-31migration: migration_transferred_bytes() don't need the QEMUFileJuan Quintela
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-8-quintela@redhat.com>
2023-10-31migration: migrate 'blk' command option is deprecated.Juan Quintela
Use blocked-mirror with NBD instead. Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-4-quintela@redhat.com>
2023-10-31migration: migrate 'inc' command option is deprecated.Juan Quintela
Use blockdev-mirror with NBD instead. Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-3-quintela@redhat.com>
2023-10-20migration: simplify notifiersSteve Sistare
Pass the callback function to add_migration_state_change_notifier so that migration can initialize the notifier on add and clear it on delete, which simplifies the call sites. Shorten the function names so the extra arg can be added more legibly. Hide the global notifier list in a new function migration_call_notifiers, and make it externally visible so future live update code can call it. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1686148954-250144-1-git-send-email-steven.sistare@oracle.com>
2023-10-20migration: simplify blockersSteve Sistare
Modify migrate_add_blocker and migrate_del_blocker to take an Error ** reason. This allows migration to own the Error object, so that if an error occurs in migrate_add_blocker, migration code can free the Error and clear the client handle, simplifying client code. It also simplifies the migrate_del_blocker call site. In addition, this is a pre-requisite for a proposed future patch that would add a mode argument to migration requests to support live update, and maintain a list of blockers for each mode. A blocker may apply to a single mode or to multiple modes, and passing Error** will allow one Error object to be registered for multiple modes. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>
2023-10-17migration: Create populate_compress()Juan Quintela
So we don't have to access compression_counters from outside ram-compress.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-7-quintela@redhat.com>
2023-10-17migration: Move compression_counters cleanup ram-compress.cJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-6-quintela@redhat.com>
2023-10-17migration: RDMA is not compatible with anything elseJuan Quintela
So give an error instead of just ignoring the other methods. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-4-quintela@redhat.com>
2023-10-17migration: Create migrate_rdma()Juan Quintela
Helper to say if we are doing a migration over rdma. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-2-quintela@redhat.com>
2023-10-17migration: hold the BQL during setupFiona Ebner
This is intended to be a semantic revert of commit 9b09503752 ("migration: run setup callbacks out of big lock"). There have been so many changes since that commit (e.g. a new setup callback dirty_bitmap_save_setup() that also needs to be adapted now), it's easier to do the revert manually. For snapshots, the bdrv_writev_vmstate() function is used during setup (in QIOChannelBlock backing the QEMUFile), but not holding the BQL while calling it could lead to an assertion failure. To understand how, first note the following: 1. Generated coroutine wrappers for block layer functions spawn the coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it. 2. If the host OS switches threads at an inconvenient time, it can happen that a bottom half scheduled for the main thread's AioContext is executed as part of a vCPU thread's aio_poll(). An example leading to the assertion failure is as follows: main thread: 1. A snapshot-save QMP command gets issued. 2. snapshot_save_job_bh() is scheduled. vCPU thread: 3. aio_poll() for the main thread's AioContext is called (e.g. when the guest writes to a pflash device, as part of blk_pwrite which is a generated coroutine wrapper). 4. snapshot_save_job_bh() is executed as part of aio_poll(). 3. qemu_savevm_state() is called. 4. qemu_mutex_unlock_iothread() is called. Now qemu_get_current_aio_context() returns 0x0. 5. bdrv_writev_vmstate() is executed during the usual savevm setup via qemu_fflush(). But this function is a generated coroutine wrapper, so it uses AIO_WAIT_WHILE. There, the assertion assert(qemu_get_current_aio_context() == qemu_get_aio_context()); will fail. To fix it, ensure that the BQL is held during setup. While it would only be needed for snapshots, adapting migration too avoids additional logic for conditional locking/unlocking in the setup callbacks. Writing the header could (in theory) also trigger qemu_fflush() and thus bdrv_writev_vmstate(), so the locked section also covers the qemu_savevm_state_header() call, even for migration for consistency. The section around multifd_send_sync_main() needs to be unlocked to avoid a deadlock. In particular, the multifd_save_setup() function calls socket_send_channel_create() using multifd_new_send_channel_async() as a callback and then waits for the callback to signal via the channels_ready semaphore. The connection happens via qio_task_run_in_thread(), but the callback is only executed via qio_task_thread_result() which is scheduled for the main event loop. Without unlocking the section, the main thread would never get to process the task result and the callback meaning there would be no signal via the channels_ready semaphore. The comment in ram_init_bitmaps() was introduced by 4987783400 ("migration: fix incorrect memory_global_dirty_log_start outside BQL") and is removed, because it referred to the qemu_mutex_lock_iothread() call. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231013105839.415989-1-f.ebner@proxmox.com>
2023-10-17migration: Add the configuration vmstate to the json writerNikolay Borisov
Make the migration json writer part of MigrationState struct, allowing the 'configuration' object be serialized to json. This will facilitate the parsing of the 'configuration' object in the next patch that fixes analyze-migration.py for arm. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231009184326.15777-2-farosas@suse.de>
2023-10-17migration: Allow user to specify available switchover bandwidthPeter Xu
Migration bandwidth is a very important value to live migration. It's because it's one of the major factors that we'll make decision on when to switchover to destination in a precopy process. This value is currently estimated by QEMU during the whole live migration process by monitoring how fast we were sending the data. This can be the most accurate bandwidth if in the ideal world, where we're always feeding unlimited data to the migration channel, and then it'll be limited to the bandwidth that is available. However in reality it may be very different, e.g., over a 10Gbps network we can see query-migrate showing migration bandwidth of only a few tens of MB/s just because there are plenty of other things the migration thread might be doing. For example, the migration thread can be busy scanning zero pages, or it can be fetching dirty bitmap from other external dirty sources (like vhost or KVM). It means we may not be pushing data as much as possible to migration channel, so the bandwidth estimated from "how many data we sent in the channel" can be dramatically inaccurate sometimes. With that, the decision to switchover will be affected, by assuming that we may not be able to switchover at all with such a low bandwidth, but in reality we can. The migration may not even converge at all with the downtime specified, with that wrong estimation of bandwidth, keeping iterations forever with a low estimation of bandwidth. The issue is QEMU itself may not be able to avoid those uncertainties on measuing the real "available migration bandwidth". At least not something I can think of so far. One way to fix this is when the user is fully aware of the available bandwidth, then we can allow the user to help providing an accurate value. For example, if the user has a dedicated channel of 10Gbps for migration for this specific VM, the user can specify this bandwidth so QEMU can always do the calculation based on this fact, trusting the user as long as specified. It may not be the exact bandwidth when switching over (in which case qemu will push migration data as fast as possible), but much better than QEMU trying to wildly guess, especially when very wrong. A new parameter "avail-switchover-bandwidth" is introduced just for this. So when the user specified this parameter, instead of trusting the estimated value from QEMU itself (based on the QEMUFile send speed), it trusts the user more by using this value to decide when to switchover, assuming that we'll have such bandwidth available then. Note that specifying this value will not throttle the bandwidth for switchover yet, so QEMU will always use the full bandwidth possible for sending switchover data, assuming that should always be the most important way to use the network at that time. This can resolve issues like "unconvergence migration" which is caused by hilarious low "migration bandwidth" detected for whatever reason. Reported-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231010221922.40638-1-peterx@redhat.com>
2023-10-17migration: refactor migration_completionWei Wang
Current migration_completion function is a bit long. Refactor the long implementation into different subfunctions: - migration_completion_precopy: completion code related to precopy - migration_completion_postcopy: completion code related to postcopy Rename await_return_path_close_on_source to close_return_path_on_source: It is renamed to match with open_return_path_on_source. This improves readability and is easier for future updates (e.g. add new subfunctions when completion code related to new features are needed). No functional changes intended. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230804093053.5037-1-wei.w.wang@intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-10-11migration: Add migration_rp_wait|kick()Peter Xu
It's just a simple wrapper for rp_sem on either wait() or kick(), make it even clearer on how it is used. Prepared to be used even for other things. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-ID: <20231004220240.167175-8-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-10-11migration: Introduce migrate_has_error()Peter Xu
Introduce a helper to detect whether MigrationState.error is set for whatever reason. This is preparation work for any thread (e.g. source return path thread) to setup errors in an unified way to MigrationState, rather than relying on its own way to set errors (mark_source_rp_bad()). Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-3-peterx@redhat.com>
2023-10-11migration: Display error in query-migrate irrelevant of statusPeter Xu
Display it as long as being set, irrelevant of FAILED status. E.g., it may also be applicable to PAUSED stage of postcopy, to provide hint on what has gone wrong. The error_mutex seems to be overlooked when referencing the error, add it to be very safe. This will change QAPI behavior by showing up error message outside !FAILED status, but it's intended and doesn't expect to break anyone. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-2-peterx@redhat.com>
2023-10-11migration: Set migration status early in incoming sideFabiano Rosas
We are sending a migration event of MIGRATION_STATUS_SETUP at qemu_start_incoming_migration but never actually setting the state. This creates a window between qmp_migrate_incoming and process_incoming_migration_co where the migration status is still MIGRATION_STATUS_NONE. Calling query-migrate during this time will return an empty response even though the incoming migration command has already been issued. Commit 7cf1fe6d68 ("migration: Add migration events on target side") has added support to the 'events' capability to the incoming part of migration, but chose to send the SETUP event without setting the state. I'm assuming this was a mistake. This introduces a change in behavior, any QMP client waiting for the SETUP event will hang, unless it has previously enabled the 'events' capability. Having the capability enabled is sufficient to continue to receive the event. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230712190742.22294-5-farosas@suse.de>
2023-10-04migration: file URISteve Sistare
Extend the migration URI to support file:<filename>. This can be used for any migration scenario that does not require a reverse path. It can be used as an alternative to 'exec:cat > file' in minimized containers that do not contain /bin/sh, and it is easier to use than the fd:<fdname> URI. It can be used in HMP commands, and as a qemu command-line parameter. For best performance, guest ram should be shared and x-ignore-shared should be true, so guest pages are not written to the file, in which case the guest may remain running. If ram is not so configured, then the user is advised to stop the guest first. Otherwise, a busy guest may re-dirty the same page, causing it to be appended to the file multiple times, and the file may grow unboundedly. That issue is being addressed in the "fixed-ram" patch series. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1694182931-61390-2-git-send-email-steven.sistare@oracle.com>
2023-09-29migration: Refactor repeated call of yank_unregister_instanceTejus GK
In the function qmp_migrate(), yank_unregister_instance() gets called twice which isn't required. Hence, refactoring it so that it gets called during the local_error cleanup. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Message-ID: <20230621130940.178659-3-tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-09-27migration: Move return path cleanup to main migration threadFabiano Rosas
Now that the return path thread is allowed to finish during a paused migration, we can move the cleanup of the QEMUFiles to the main migration thread. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-9-farosas@suse.de>
2023-09-27migration: Replace the return path retry logicFabiano Rosas
Replace the return path retry logic with finishing and restarting the thread. This fixes a race when resuming the migration that leads to a segfault. Currently when doing postcopy we consider that an IO error on the return path file could be due to a network intermittency. We then keep the thread alive but have it do cleanup of the 'from_dst_file' and wait on the 'postcopy_pause_rp' semaphore. When the user issues a migrate resume, a new return path is opened and the thread is allowed to continue. There's a race condition in the above mechanism. It is possible for the new return path file to be setup *before* the cleanup code in the return path thread has had a chance to run, leading to the *new* file being closed and the pointer set to NULL. When the thread is released after the resume, it tries to dereference 'from_dst_file' and crashes: Thread 7 "return path" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffd1dbf700 (LWP 9611)] 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154 154 return f->last_error; (gdb) bt #0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154 #1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206 #2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876 #3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541 #4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477 #5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 Here's the race (important bit is open_return_path happening before migration_release_dst_files): migration | qmp | return path --------------------------+-----------------------------+--------------------------------- qmp_migrate_pause() shutdown(ms->to_dst_file) f->last_error = -EIO migrate_detect_error() postcopy_pause() set_state(PAUSED) wait(postcopy_pause_sem) qmp_migrate(resume) migrate_fd_connect() resume = state == PAUSED open_return_path <-- TOO SOON! set_state(RECOVER) post(postcopy_pause_sem) (incoming closes to_src_file) res = qemu_file_get_error(rp) migration_release_dst_files() ms->rp_state.from_dst_file = NULL post(postcopy_pause_rp_sem) postcopy_pause_return_path_thread() wait(postcopy_pause_rp_sem) rp = ms->rp_state.from_dst_file goto retry qemu_file_get_error(rp) SIGSEGV ------------------------------------------------------------------------------------------- We can keep the retry logic without having the thread alive and waiting. The only piece of data used by it is the 'from_dst_file' and it is only allowed to proceed after a migrate resume is issued and the semaphore released at migrate_fd_connect(). Move the retry logic to outside the thread by waiting for the thread to finish before pausing the migration. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-8-farosas@suse.de>
2023-09-27migration: Consolidate return path closing codeFabiano Rosas
We'll start calling the await_return_path_close_on_source() function from other parts of the code, so move all of the related checks and tracepoints into it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-7-farosas@suse.de>
2023-09-27migration: Remove redundant cleanup of postcopy_qemufile_srcFabiano Rosas
This file is owned by the return path thread which is already doing cleanup. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-6-farosas@suse.de>
2023-09-27migration: Fix possible race when shutting down to_dst_fileFabiano Rosas
It's not safe to call qemu_file_shutdown() on the to_dst_file without first checking for the file's presence under the lock. The cleanup of this file happens at postcopy_pause() and migrate_fd_cleanup() which are not necessarily running in the same thread as migrate_fd_cancel(). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-5-farosas@suse.de>