Age | Commit message (Collapse) | Author |
|
Error reporting patches for 2024-04-24
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYouloSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTzLwP+wQjCWJHpTB+uQ3+U5Tb77BUJxuEjDMj
# txNIJBXHOo7erxTSCieLuQICm8e30z62QAK4nVStyMDcyGh1KfwdSDAxBFnuLpA2
# 7X5bXbvCrm4vXVASRTV1zKCYDlIXFfrMWLvN5KgM90RsodLcy0szlXg+qYyoIM3Z
# 8zp0Ug0fQPFHiOAQJi9ZTOsCYJBhZc2sbzgQEmf/g6q9bJaZHzPEHvVT4AQhTAtn
# 7BIJY+vGDZNZwbP/0obWy2lai3kbGak8OXpwq/bewdrxeRmvqmM7sk+V/P2tXQD+
# kZe0/HWuDoO5J8L3KHiJnBJ0KCk8fbo4I0T6v9vf55Sj8K0r7O9sykgXXWv8q0lO
# GrQa0YcyWAckI41stYQpwEpIlRanuZv/p8OZFJIqsTAfaw7RlbIBYA9xZCUnTton
# FbHO/t2BLfo8eO9/xRD4r1u6vMbVozImPETuUMPyLHzlrdw2thxddKQNInHYYZ2U
# SvvaByceEP2UywOnOflZhVL2dIhhnrBztiW2Vqod1fQHpfBAcJn909PZIlPZyMkr
# gUnABI/rtC/lW3pBee6HmfzJ6Fah0e0XCpCY20qFe27Bi/z3xKi5NWYuyAUG5csp
# CuTsc4pXfPVj5Z+Mk4pyY8PK5k4jSa7vAVLCLTNzXJLZlJTb6yuf0HsJ7768nHDc
# hSEIjLwQWYtw
# =r8Rv
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Apr 2024 12:52:58 AM PDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-error-2024-04-24' of https://repo.or.cz/qemu/armbru:
qapi: Inline and remove QERR_PROPERTY_VALUE_BAD definition
qapi: Inline and remove QERR_MIGRATION_ACTIVE definition
qapi: Correct error message for 'vcpu_dirty_limit' parameter
qapi: Inline and remove QERR_INVALID_PARAMETER_TYPE definition
qapi: Inline QERR_INVALID_PARAMETER_TYPE definition (constant value)
qapi: Inline and remove QERR_INVALID_PARAMETER definition
qapi: Inline and remove QERR_DEVICE_NO_HOTPLUG definition
qapi: Inline and remove QERR_DEVICE_HAS_NO_MEDIUM definition
qapi: Inline and remove QERR_BUS_NO_HOTPLUG definition
error: Drop superfluous #include "qapi/qmp/qerror.h"
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Address the comment added in commit 4629ed1e98
("qerror: Finally unused, clean up"), from 2015:
/*
* These macros will go away, please don't use
* in new code, and do not add new ones!
*/
Mechanical transformation using sed, manually
removing the definition in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-10-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[Straightforward conflict with commit aeaafb1e59f (migration: export
migration_is_running) resolved]
|
|
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error
reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
There are several places where postcopy_start() fails without setting
errp. This can cause a null pointer de-reference, as in case of error,
the caller of postcopy_start() copies/prints the error set in errp.
Fix it by setting errp in all of postcopy_start() error paths.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: 908927db28ea ("migration: Update error description whenever migration fails")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
After commit 9425ef3f990a ("migration: Use migrate_has_error() in
close_return_path_on_source()"), close_return_path_on_source() assumes
that migration error is set if an error occurs during migration.
This may not be true if migration errors in migration_completion(). For
example, if qemu_savevm_state_complete_precopy() errors, migration error
will not be set.
This in turn, will cause a migration hang bug, similar to the bug that
was fixed by commit 22b04245f0d5 ("migration: Join the return path
thread before releasing to_dst_file"), as shutdown() will not be issued
for the return-path channel.
Fix it by ensuring migration error is set in case of error in
migration_completion().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()")
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
With current code base I can observe extremely high sync count during
precopy, as long as one enables postcopy-ram=on before switchover to
postcopy.
To provide some context of when QEMU decides to do a full sync: it checks
must_precopy (which implies "data must be sent during precopy phase"), and
as long as it is lower than the threshold size we calculated (out of
bandwidth and expected downtime) QEMU will kick off the slow/exact sync.
However, when postcopy is enabled (even if still during precopy phase), RAM
only reports all pages as can_postcopy, and report must_precopy==0. Then
"must_precopy <= threshold_size" mostly always triggers and enforces a slow
sync for every call to migration_iteration_run() when postcopy is enabled
even if not used. That is insane.
It turns out it was a regress bug introduced in the previous refactoring in
8.0 as reported by Nina [1]:
(a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")
Then a workaround patch is applied at the end of release (8.0-rc4) to fix it:
(b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")
However that "workaround" was overlooked when during the cleanup in this
9.0 release in this commit..
(c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Then the issue was re-exposed as reported by Nina [1].
The problem with (b) is that it only fixed the case for RAM, rather than
all the rest of iterators. Here a slow sync should only be required if all
dirty data (precopy+postcopy) is less than the threshold_size that QEMU
calculated. It is even debatable whether a sync is needed when switched to
postcopy. Currently ram_state_pending_exact() will be mostly noop if
switched to postcopy, and that logic seems to apply too for all the rest of
iterators, as sync dirty bitmap during a postcopy doesn't make much sense.
However let's leave such change for later, as we're in rc phase.
So rather than reusing commit (b), this patch provides the complete fix for
all iterators. When at it, cleanup a little bit on the lines around.
[1] https://gitlab.com/qemu-project/qemu/-/issues/1565
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320214453.584374-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
This reverts commit decdc76772c453ff1444612e910caa0d45cd8eac in full
and also the relevant migration-tests from
7a09f092834641b7a793d50a3a261073bbb404a6.
After the addition of the new QAPI-based migration address API in 8.2
we've been converting an "fd:" URI into a SocketAddress, missing the
fact that the "fd:" syntax could also be used for a plain file instead
of a socket. This is a problem because the SocketAddress is part of
the API, so we're effectively asking users to create a "socket"
channel to pass in a plain file.
The easiest way to fix this situation is to deprecate the usage of
both SocketAddress and "fd:" when used with a plain file for
migration. Since this has been possible since 8.2, we can wait until
9.1 to deprecate it.
For 9.0, however, we should avoid adding further support to migration
to a plain file using the old "fd:" syntax or the new SocketAddress
API, and instead require the usage of either the old-style "file:" URI
or the FileMigrationArgs::filename field of the new API with the
"/dev/fdset/NN" syntax, both of which are already supported.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
When doing migration using the fd: URI, QEMU will fetch the file
descriptor passed in via the monitor at
fd_start_outgoing|incoming_migration(), which means the checks at
migration_channels_and_transport_compatible() happen too soon and we
don't know at that point whether the FD refers to a plain file or a
socket.
For this reason, we've been allowing a migration channel of type
SOCKET_ADDRESS_TYPE_FD to pass the initial verifications in scenarios
where the socket migration is not supported, such as with fd + multifd.
The commit decdc76772 ("migration/multifd: Add mapped-ram support to
fd: URI") was supposed to add a second check prior to starting
migration to make sure a socket fd is not passed instead of a file fd,
but failed to do so.
Add the missing verification and update the comment explaining this
situation which is currently incorrect.
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240315032040.7974-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Define and export migration_file_set_error to eliminate a dependency
on MigrationState.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Define and export migration_is_device to eliminate a dependency
on MigrationState.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Define and export migration_thread_is_self to eliminate a dependency
on MigrationState.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Delete the MigrationState parameter from migration_is_running and move
it to the public API in misc.h.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Delete the MigrationState parameter from migration_is_active so it
can be exported and used without including migration.h.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Delete the MigrationState parameter from migration_is_setup_or_active
and move it to the public API in misc.h.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/1710179338-294359-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
If we receive a file descriptor that points to a regular file, there's
nothing stopping us from doing multifd migration with mapped-ram to
that file.
Enable the fd: URI to work with multifd + mapped-ram.
Note that the fds passed into multifd are duplicated because we want
to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The
original fd doesn't need to be duplicated because monitor_get_fd()
transfers ownership to the caller.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240229153017.2221-23-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The new mapped-ram stream format uses a file transport and puts ram
pages in the migration file at their respective offsets and can be
done in parallel by using the pwritev system call which takes iovecs
and an offset.
Add support to enabling the new format along with multifd to make use
of the threading and page handling already in place.
This requires multifd to stop sending headers and leaving the stream
format to the mapped-ram code. When it comes time to write the data, we
need to call a version of qio_channel_write that can take an offset.
Usage on HMP is:
(qemu) stop
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability mapped-ram on
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate file:migfile
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-21-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
On the receiving side we don't need to differentiate between main
channel and threads, so whichever channel is defined first gets to be
the main one. And since there are no packets, use the atomic channel
count to index into the params array.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-19-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The mapped-ram migration format needs a channel that supports seeking
to be able to write each page to an arbitrary offset in the migration
stream.
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-9-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Add a new migration capability 'mapped-ram'.
The core of the feature is to ensure that RAM pages are mapped
directly to offsets in the resulting migration file instead of being
streamed at arbitrary points.
The reasons why we'd want such behavior are:
- The resulting file will have a bounded size, since pages which are
dirtied multiple times will always go to a fixed location in the
file, rather than constantly being added to a sequential
stream. This eliminates cases where a VM with, say, 1G of RAM can
result in a migration file that's 10s of GBs, provided that the
workload constantly redirties memory.
- It paves the way to implement O_DIRECT-enabled save/restore of the
migration stream as the pages are ensured to be written at aligned
offsets.
- It allows the usage of multifd so we can write RAM pages to the
migration file in parallel.
For now, enabling the capability has no effect. The next couple of
patches implement the core functionality.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
close_return_path_on_source() retrieves the migration error from the
the QEMUFile '->to_dst_file' to know if a shutdown is required. This
shutdown is required to exit the return-path thread.
Avoid relying on '->to_dst_file' and use migrate_has_error() instead.
(using to_dst_file is a heuristic to infer whether
rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic
method for detecting errors is more reliable. We also want to reduce
dependency on QEMUFile::last_error)
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[added some words about the motivation for this patch]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.
Make sure we only clean up the QEMUFile after the return path has been
waited for.
This fixes a hang when qemu_savevm_state_setup() produced an error
that was detected by migration_detect_error(). That skips
migration_completion() so close_return_path_on_source() would get
stuck waiting for the RP thread to terminate.
Reported-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The QMP command query_migrate might see incorrect throughput numbers
if it runs after we've set the migration completion status but before
migration_calculate_complete() has updated s->total_time and s->mbps.
The migration status would show COMPLETED, but the throughput value
would be the one from the last iteration and not the one from the
whole migration. This will usually be a larger value due to the time
period being smaller (one iteration).
Move migration_calculate_complete() earlier so that the status
MIGRATION_STATUS_COMPLETED is only emitted after the final counters
update. Keep everything under the BQL so the QMP thread sees the
updates as atomic.
Rename migration_calculate_complete to migration_completion_end to
reflect its new purpose of also updating s->state.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240226143335.14282-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Fail the migration request if options are set that are incompatible
with cpr.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
When migration for cpr is initiated, stop the vm and set state
RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
possibility of ram and device state being out of sync, and guarantees
that a guest in the suspended state remains suspended, because qmp_cont
rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Check the status returned by migration notifiers for event type
MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers
return an error status at this time.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-10-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Move common code for the error path in migrate_fd_connect to a shared
fail label. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Keep a separate list of migration notifiers for each migration mode.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Define MigrationNotifyFunc to improve type safety and simplify migration
notifiers.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
postcopy_after_devices and migration_in_postcopy_after_devices are no
longer used, so delete them.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-6-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state. Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch. For now, pass NULL for the
notifier error parameter, and do not check the return value.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com
[peterx: dropped unexpected update to roms/seabios-hppa]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
We currently have an unfavorable situation around multifd channels
creation and the migration thread execution.
We create the multifd channels with qio_channel_socket_connect_async
-> qio_task_run_in_thread, but only connect them at the
multifd_new_send_channel_async callback, called from
qio_task_complete, which is registered as a glib event.
So at multifd_send_setup() we create the channels, but they will only
be actually usable after the whole multifd_send_setup() calling stack
returns back to the main loop. Which means that the migration thread
is already up and running without any possibility for the multifd
channels to be ready on time.
We currently rely on the channels-ready semaphore blocking
multifd_send_sync_main() until channels start to come up and release
it. However there have been bugs recently found when a channel's
creation fails and multifd_send_cleanup() is allowed to run while
other channels are still being created.
Let's start to organize this situation by moving the
multifd_send_setup() call into the migration thread. That way we
unblock the main-loop to dispatch the completion callbacks and
actually have a chance of getting the multifd channels ready for when
the migration thread needs them.
The next patches will deal with the synchronization aspects.
Note that this takes multifd_send_setup() out of the BQL.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Hide the error handling inside multifd_send_setup to make it cleaner
for the next patch to move the function around.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-4-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The commit in the fixes line mistakenly modified the channels and
transport compatibility check logic so it now checks multi-channel
support only for socket transport type.
Thus, running multifd migration using a transport other than socket that
is incompatible with multi-channels (such as "exec") would lead to a
segmentation fault instead of an error message.
For example:
(qemu) migrate_set_capability multifd on
(qemu) migrate -d "exec:cat > /tmp/vm_state"
Segmentation fault (core dumped)
Fix it by checking multi-channel compatibility for all transport types.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: d95533e1cdcc ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Most of the multifd code uses send/recv to represent the two sides, but
some rare cases use save/load.
Since send/recv is the majority, replacing the save/load use cases to use
send/recv globally. Now we reach a consensus on the naming.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240202102857.110210-22-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
A memory page poisoned from the hypervisor level is no longer readable.
The migration of a VM will crash Qemu when it tries to read the
memory address space and stumbles on the poisoned page with a similar
stack trace:
Program terminated with signal SIGBUS, Bus error.
#0 _mm256_loadu_si256
#1 buffer_zero_avx2
#2 select_accel_fn
#3 buffer_is_zero
#4 save_zero_page
#5 ram_save_target_page_legacy
#6 ram_save_host_page
#7 ram_find_and_save_block
#8 ram_save_iterate
#9 qemu_savevm_state_iterate
#10 migration_iteration_run
#11 migration_thread
#12 qemu_thread_start
To avoid this VM crash during the migration, prevent the migration
when a known hardware poison exists on the VM.
Signed-off-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20240130190640.139364-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Now that the migration state reference counting is correct, further
wrap the bottom half dispatch process to avoid future issues.
Move BH creation and scheduling together and wrap the dispatch with an
intermediary function that will ensure we always keep the ref/unref
balanced.
Also move the responsibility of deleting the BH into the wrapper and
remove the now unnecessary pointers.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Wrap qemu_bh_schedule() to ensure we always hold a reference to the
current_migration object.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
We need to hold a reference to the current_migration object around
async calls to avoid it been freed while still in use.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
We're currently allowing the process_incoming_migration_bh bottom-half
to run without holding a reference to the 'current_migration' object,
which leads to a segmentation fault if the BH is still live after
migration_shutdown() has dropped the last reference to
current_migration.
In my system the bug manifests as migrate_multifd() returning true
when it shouldn't and multifd_load_shutdown() calling
multifd_recv_terminate_threads() which crashes due to an uninitialized
multifd_recv_state.
Fix the issue by holding a reference to the object when scheduling the
BH and dropping it before returning from the BH. The same is already
done for the cleanup_bh at migrate_fd_cleanup_schedule().
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
We're not currently reporting the errors set with migrate_set_error()
when incoming migration fails.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240104142144.9680-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The inital conditional statements in qmp migration functions is harder
to understand than necessary. It is better to get all errors out of
the way in the beginning itself to have better readability and error
handling.
Signed-off-by: Het Gala <het.gala@nutanix.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231205080039.197615-1-het.gala@nutanix.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20240102153529.486531-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().
The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.
The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)
There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
According to Error API, usage of ERRP_GUARD() or a local Error instead
of errp is needed if errp is passed to void functions, where it is later
dereferenced to see if an error occurred.
There are several places in migration.c that use local Error although it
is not needed. Change these places to use errp directly.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-11-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
If multifd_load_setup() fails in migration_ioc_process_incoming(),
error_setg() is called with errp. This will lead to an assert because in
that case errp already contains an error.
Fix it by removing the redundant error_setg().
Fixes: 6720c2b32725 ("migration: check magic value for deciding the mapping of channels")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-9-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Errp parameter in migration_fd_process_incoming() is unused.
Remove it.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-5-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Commit 6720c2b32725 ("migration: check magic value for deciding the
mapping of channels") extracted the only code that could fail in
migration_incoming_setup().
Now migration_incoming_setup() can't fail, so refactor it to return void
and remove errp parameter.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-4-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
MigrationState->hostname is set to NULL in migrate_init(). This is
redundant because it is already freed and set to NULL in
migrade_fd_cleanup(). Remove it.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|