aboutsummaryrefslogtreecommitdiff
path: root/docs/devel
diff options
context:
space:
mode:
authorRichard Henderson <richard.henderson@linaro.org>2023-06-30 08:11:08 +0200
committerRichard Henderson <richard.henderson@linaro.org>2023-06-30 08:11:08 +0200
commit408015a97dbe48a9dde8c0d2526c9312691952e7 (patch)
treeb41b1a9349392da698def212ff60e19f7043f851 /docs/devel
parentf7884164cbe3743c3bd2acc9daf877497fdb5fa3 (diff)
parent0cc889c8826cefa5b80110d31a62273b56aa1832 (diff)
Merge tag 'pull-vfio-20230630' of https://github.com/legoater/qemu into staging
vfio queue: * migration: New switchover ack to reduce downtime * VFIO migration pre-copy support * Removal of the VFIO migration experimental flag * Alternate offset for GPUDirect Cliques * Misc fixes # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmSeVHYACgkQUaNDx8/7 # 7KHeZw/+LRe9QQpx8hU//vKBvLet2QvI3WUaXGHiHbblbRT6HhiHjWHB2/8j6jji # QhAGJ6w9yoKODyY0kGpVFEnkmXOKyqwWssBheV219ntZs09pFGxZr/ldUhT22aBN # kH8mHU9BZ3J+zF/kKphpcIC1sPxVu/DlrtnJu5vDGuRAOu8+3kFV217JC1yGs1Vh # n+KOho8a8oP9qxtzfvQ9iZ4dpBOOKpE9vscS12wJAlen93AGB6esR7VaLxDjExRP # yL1pguQ8ZZ1gEXXbXO62djKo3IViobtD08KmCXTzQ6TVquLleJzqgjp+A0THnYAe # J9Rlja7LpsO9MYSxmRE9WcQccC+sAGn/t/ufB0tL8zR43FvfhbF5H0PzBBY0H7YA # JlzN+fgrKEEHJwMhXANNvSddhWCwvrkjNxo/80u3ySYMQR1Hav/tsXYBlk16e5nS # fmtrFGTwhsVdy1Q6ZqEOyTni1eiYt5stEQMZFODdUNj6b9FugSZ0BK+2WN/M0CzU # 6mKmJQgZAG/nBoRJm/XCO5OKQ6wm/4tm6F4HSH5EJ6mDT+DqETAk4GRUWTbYa2/G # yAAOlhTMu8Xc/NhMeJ7Z99dyq0SM8pi/XpVEIv7p9yBak8ix60iCWZtDE8vlDv3M # UfMVMTAvTS30kbS6FDN2Yyl6l8/ETdcwVIN4l02ipGzpMCtn9EQ= # =dKUj # -----END PGP SIGNATURE----- # gpg: Signature made Fri 30 Jun 2023 06:05:10 AM CEST # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1 # gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1 * tag 'pull-vfio-20230630' of https://github.com/legoater/qemu: vfio/pci: Free leaked timer in vfio_realize error path vfio/pci: Fix a segfault in vfio_realize MAINTAINERS: Promote Cédric to VFIO co-maintainer vfio/migration: Make VFIO migration non-experimental vfio/migration: Reset bytes_transferred properly vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques vfio: Implement a common device info helper vfio/migration: Add support for switchover ack capability vfio/migration: Add VFIO migration pre-copy support vfio/migration: Store VFIO migration flags in VFIOMigration vfio/migration: Refactor vfio_save_block() to return saved data size tests: Add migration switchover ack capability test migration: Enable switchover ack capability migration: Implement switchover ack logic migration: Add switchover ack capability Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Diffstat (limited to 'docs/devel')
-rw-r--r--docs/devel/vfio-migration.rst45
1 files changed, 35 insertions, 10 deletions
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index 1b68ccf115..b433cb5bb2 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -7,12 +7,21 @@ the guest is running on source host and restoring this saved state on the
destination host. This document details how saving and restoring of VFIO
devices is done in QEMU.
-Migration of VFIO devices currently consists of a single stop-and-copy phase.
-During the stop-and-copy phase the guest is stopped and the entire VFIO device
-data is transferred to the destination.
-
-The pre-copy phase of migration is currently not supported for VFIO devices.
-Support for VFIO pre-copy will be added later on.
+Migration of VFIO devices consists of two phases: the optional pre-copy phase,
+and the stop-and-copy phase. The pre-copy phase is iterative and allows to
+accommodate VFIO devices that have a large amount of data that needs to be
+transferred. The iterative pre-copy phase of migration allows for the guest to
+continue whilst the VFIO device state is transferred to the destination, this
+helps to reduce the total downtime of the VM. VFIO devices opt-in to pre-copy
+support by reporting the VFIO_MIGRATION_PRE_COPY flag in the
+VFIO_DEVICE_FEATURE_MIGRATION ioctl.
+
+When pre-copy is supported, it's possible to further reduce downtime by
+enabling "switchover-ack" migration capability.
+VFIO migration uAPI defines "initial bytes" as part of its pre-copy data stream
+and recommends that the initial bytes are sent and loaded in the destination
+before stopping the source VM. Enabling this migration capability will
+guarantee that and thus, can potentially reduce downtime even further.
Note that currently VFIO migration is supported only for a single device. This
is due to VFIO migration's lack of P2P support. However, P2P support is planned
@@ -29,10 +38,23 @@ VFIO implements the device hooks for the iterative approach as follows:
* A ``load_setup`` function that sets the VFIO device on the destination in
_RESUMING state.
+* A ``state_pending_estimate`` function that reports an estimate of the
+ remaining pre-copy data that the vendor driver has yet to save for the VFIO
+ device.
+
* A ``state_pending_exact`` function that reads pending_bytes from the vendor
driver, which indicates the amount of data that the vendor driver has yet to
save for the VFIO device.
+* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
+ active only when the VFIO device is in pre-copy states.
+
+* A ``save_live_iterate`` function that reads the VFIO device's data from the
+ vendor driver during iterative pre-copy phase.
+
+* A ``switchover_ack_needed`` function that checks if the VFIO device uses
+ "switchover-ack" migration capability when this capability is enabled.
+
* A ``save_state`` function to save the device config space if it is present.
* A ``save_live_complete_precopy`` function that sets the VFIO device in
@@ -111,8 +133,10 @@ Flow of state changes during Live migration
===========================================
Below is the flow of state change during live migration.
-The values in the brackets represent the VM state, the migration state, and
+The values in the parentheses represent the VM state, the migration state, and
the VFIO device state, respectively.
+The text in the square brackets represents the flow if the VFIO device supports
+pre-copy.
Live migration save path
------------------------
@@ -124,11 +148,12 @@ Live migration save path
|
migrate_init spawns migration_thread
Migration thread then calls each device's .save_setup()
- (RUNNING, _SETUP, _RUNNING)
+ (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
|
- (RUNNING, _ACTIVE, _RUNNING)
- If device is active, get pending_bytes by .state_pending_exact()
+ (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
+ If device is active, get pending_bytes by .state_pending_{estimate,exact}()
If total pending_bytes >= threshold_size, call .save_live_iterate()
+ [Data of VFIO device for pre-copy phase is copied]
Iterate till total pending bytes converge and are less than threshold
|
On migration completion, vCPU stops and calls .save_live_complete_precopy for