aboutsummaryrefslogtreecommitdiff
path: root/qapi/migration.json
AgeCommit message (Collapse)Author
2024-03-26qapi: Correct documentation indentation and whitespaceMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322140910.328840-12-armbru@redhat.com> [Add a previous patch's stray hunk]
2024-03-26qapi: Refill doc comments to conform to current conventionsMarkus Armbruster
For legibility, wrap text paragraphs so every line is at most 70 characters long. To check the generated documentation does not change, I compared the generated HTML before and after this commit with "wdiff -3". Finds no differences. Comparing with diff is not useful, as the refilled paragraphs are visible there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322140910.328840-11-armbru@redhat.com>
2024-03-26qapi: Start sentences with a capital letter, end them with a periodMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322140910.328840-9-armbru@redhat.com>
2024-03-26qapi: Fix abbreviation punctuation in doc commentsMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322140910.328840-8-armbru@redhat.com>
2024-03-26qapi: Fix bogus documentation of query-migrationthreadsMarkus Armbruster
The doc comment documents an argument that doesn't exist. Would fail compilation if it was marked up correctly. Delete. The Returns: section fails to refer to the data type, leaving the user to guess. Fix that. The command name violates QAPI naming rules: it should be query-migration-threads. Too late to fix. Reported-by: John Snow <jsnow@redhat.com> Fixes: 671326201dac (migration: Introduce interface query-migrationthreads) Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322135117.195489-4-armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>
2024-03-26qapi: Resync MigrationParameter and MigrateSetParametersMarkus Armbruster
Enum MigrationParameter mirrors the members of struct MigrateSetParameters. Differences to MigrateSetParameters's member documentation are pointless. Clean them up: * @compress-level, @compress-threads, @decompress-threads, and x-checkpoint-delay are more thoroughly documented for MigrationParameter, so use that version for both. * @max-cpu-throttle is almost the same. Use MigrationParameter's version for both. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322135117.195489-3-armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com>
2024-03-26qapi: Improve migration TLS documentationMarkus Armbruster
MigrateSetParameters is about setting parameters, and MigrationParameters is about querying them. Their documentation of @tls-creds and @tls-hostname has residual damage from a failed attempt at de-duplicating them (see commit de63ab61241 "migrate: Share common MigrationParameters struct" and commit 1bda8b3c695 "migration: Unshare MigrationParameters struct for now"). MigrateSetParameters documentation issues: * It claims plain text mode "was reported by omitting tls-creds" before 2.9. MigrateSetParameters is not used for reporting, so this is misleading. Delete. * It similarly claims hostname defaulting to migration URI "was reported by omitting tls-hostname" before 2.9. Delete as well. Rephrase the remaining @tls-hostname contents for clarity. Enum MigrationParameter mirrors the members of struct MigrateSetParameters. Differences to MigrateSetParameters's member documentation are pointless. Copy the new text to MigrationParameter. MigrationParameters documentation issues: * @tls-creds runs the two last sentences together without punctuation. Fix that. * Much of the contents on @tls-hostname only applies to setting parameters, resulting in confusion. Replace by a suitable abridged version of the new MigrateSetParameters text, and a note on @tls-hostname omission in 2.8. Additional damage is due to flawed doc fix commit 66fcb9d651d (qapi/migration: Add missing tls-authz documentation): since it copied the missing MigrateSetParameters text from MigrationParameters instead of MigrationParameter, the part on recreating @tls-authz on the fly is missing. Copy that, too. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240322135117.195489-2-armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> [Some typos corrected]
2024-03-11migration/multifd: Enable multifd zero page checking by default.Hao Xiang
1. Set default "zero-page-detection" option to "multifd". Now zero page checking can be done in the multifd threads and this becomes the default configuration. 2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide backward compatibility where zero page checking is done from the migration main thread. Signed-off-by: Hao Xiang <hao.xiang@bytedance.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240311180015.3359271-7-hao.xiang@linux.dev Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration/multifd: Implement zero page transmission on the multifd thread.Hao Xiang
1. Add zero_pages field in MultiFDPacket_t. 2. Implements the zero page detection and handling on the multifd threads for non-compression, zlib and zstd compression backends. 3. Added a new value 'multifd' in ZeroPageDetection enumeration. 4. Adds zero page counters and updates multifd send/receive tracing format to track the newly added counters. Signed-off-by: Hao Xiang <hao.xiang@bytedance.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xiang@linux.dev Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration/multifd: Add new migration option zero-page-detection.Hao Xiang
This new parameter controls where the zero page checking is running. 1. If this parameter is set to 'legacy', zero page checking is done in the migration main thread. 2. If this parameter is set to 'none', zero page checking is disabled. Signed-off-by: Hao Xiang <hao.xiang@bytedance.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20240311180015.3359271-4-hao.xiang@linux.dev Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-05Merge tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru into stagingPeter Maydell
QAPI patches patches for 2024-03-04 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXlaSISHGFybWJydUBy # ZWRoYXQuY29tAAoJEDhwtADrkYZTdZ8P/iMgqLoAFkCCjwfkUc/rqZUezK52Ynr7 # LYwOPI/xcYD7EnVogdRgFgjWFNoivQLP5yKsU/eRTk29pwdDzTscFm/0ztTQX/Gb # ypWV+GBcu5J8mKbp1KF5w68aDD8Bat4WRfEgDQ1DV7v6CoMiUzTiF3CGXkYzqK5Y # kYNq97vdEkBFvFdOl/7scs/XXN2jG27egDhMp68RTxnPHlXZiAO9/2Bul3uVe3x0 # fzQ2ViYv0qLnjE/PwENDqqE3Thv3Sxp5iEeQQ6GWi07EVh07UtHpOM3RYyrTU0Sb # VrTApSrg0oxlkOuR0CBd9Fi+timtbokBL0DWyUpXNTfIEZfLtA9H+8riUg3EOcDp # r7a4SI/27VdPxX6Kc6zA3bi+/j1o7CLTW2LGEwuZs52nmixoo1HTWPIFdyh13g/V # QjNbun0fViHb0FVLiyDlXF/7Y+EWUWIyqwwGqbvve1DyUHQmo3CUQAKGOpkeKSBe # 4eGciVDgpBoKhtw9Kv6LCDj2cwZKC8DxBMibf7GHkOnAsX2mnyuHcey7HvYNCoF+ # yYz7oIEXdlL2eWqg7CfBZK7lniCDln50RI4Ll1v+J4r1v1kRZGMLesTYXCdNc4ku # yb4kpU4t22/RODffLE7K+fc3Onwze3fcfxlZMN66F+wFtk4KdPR2aQBE66bB8J99 # vuSKlTbT4cGL # =s9AR # -----END PGP SIGNATURE----- # gpg: Signature made Mon 04 Mar 2024 06:24:34 GMT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru: migration: simplify exec migration functions qapi: New strv_from_str_list() qapi: New QAPI_LIST_LENGTH() docs/devel/writing-monitor-commands: Minor improvements docs/devel/writing-monitor-commands: Repair a decade of rot qapi: Reject "Returns" section when command doesn't return anything qga/qapi-schema: Fix guest-set-memory-blocks documentation qga/qapi-schema: Tweak documentation of fsfreeze commands qga/qapi-schema: Clean up "Returns" sections qga/qapi-schema: Delete useless "Returns" sections qga/qapi-schema: Move error documentation to new "Errors" sections qapi/yank: Tweak @yank's error description for consistency qapi: Clean up "Returns" sections qapi: Delete useless "Returns" sections qapi: Move error documentation to new "Errors" sections qapi: New documentation section tag "Errors" qapi: Slightly clearer error message for invalid "Returns" section qapi: Memorize since & returns sections Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-04qapi: Delete useless "Returns" sectionsMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240227113921.236097-6-armbru@redhat.com>
2024-03-01migration/ram: Introduce 'mapped-ram' migration capabilityFabiano Rosas
Add a new migration capability 'mapped-ram'. The core of the feature is to ensure that RAM pages are mapped directly to offsets in the resulting migration file instead of being streamed at arbitrary points. The reasons why we'd want such behavior are: - The resulting file will have a bounded size, since pages which are dirtied multiple times will always go to a fixed location in the file, rather than constantly being added to a sequential stream. This eliminates cases where a VM with, say, 1G of RAM can result in a migration file that's 10s of GBs, provided that the workload constantly redirties memory. - It paves the way to implement O_DIRECT-enabled save/restore of the migration stream as the pages are ensured to be written at aligned offsets. - It allows the usage of multifd so we can write RAM pages to the migration file in parallel. For now, enabling the capability has no effect. The next couple of patches implement the core functionality. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration: massage cpr-reboot documentationSteve Sistare
Re-wrap the cpr-reboot documentation to 70 columns, use '@' for cpr-reboot references, capitalize COLO and VFIO, and tweak the wording. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1709218462-3640-1-git-send-email-steven.sistare@oracle.com [peterx: s/qemu/QEMU per Markus's suggestion] Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: options incompatible with cprSteve Sistare
Fail the migration request if options are set that are incompatible with cpr. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: update cpr-reboot descriptionSteve Sistare
Clarify qapi for cpr-reboot migration mode, and add vfio support. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-14-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-26qapi: Require descriptions and tagged sections to be indentedMarkus Armbruster
By convention, we indent the second and subsequent lines of descriptions and tagged sections, except for examples. Turn this into a hard rule, and apply it to examples, too. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240216145841.2099240-11-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [Straightforward conflicts in qapi/migration.json resolved]
2024-02-26qapi: Misc cleanups to migrate QAPIsHet Gala
Signed-off-by: Het Gala <het.gala@nutanix.com> Message-ID: <20240216195659.189091-1-het.gala@nutanix.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-12qapi/migration: Add missing tls-authz documentationPeter Xu
As reported in Markus's recent enforcement series on qapi doc [1], we accidentally miss one entry for tls-authz. Add it. [1] https://lore.kernel.org/r/20240205074709.3613229-1-armbru@redhat.com Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Fabiano Rosas <farosas@suse.de> Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-ID: <20240207032836.268183-1-peterx@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Update of qapi/pragma.json squashed in, commit message adjusted]
2024-02-12qapi: Add missing union tag documentationMarkus Armbruster
Low-hanging fruit, and except for StatsFilter, the only members of these unions lacking documentation. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240205074709.3613229-16-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-12qapi: Indent tagged doc comment sections properlyMarkus Armbruster
docs/devel/qapi-code-gen demands that the "second and subsequent lines of sections other than "Example"/"Examples" should be indented". Commit a937b6aa739q (qapi: Reformat doc comments to conform to current conventions) missed a few instances, and messed up a few others. Clean that up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240205074709.3613229-5-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-01-30qapi/migration.json: Fix the member name for MigrationCapabilityHan Han
s/@compression/@compress/ Fixes: 864128df46 Signed-off-by: Han Han <hhan@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-29Make 'uri' optional for migrate QAPIHet Gala
'uri' argument should be optional, as 'uri' and 'channels' arguments are mutally exclusive in nature. Fixes: 074dbce5fcce (migration: New migrate and migrate-incoming argument 'channels') Signed-off-by: Het Gala <het.gala@nutanix.com> Link: https://lore.kernel.org/r/20240123064219.40514-1-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-01-26qapi: Fix malformed "Since:" section tags (again)Markus Armbruster
"Since X.Y" is not recognized as a tagged section, and therefore not formatted as such in generated documentation. Fix by adding the required colon. Previously fixed in commit 433a4fdc420 (qapi: Fix malformed "Since:" section tags) Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240120095327.666239-8-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com>
2023-11-15qapi/migration.json: spelling: transferingMichael Tokarev
Fixes: 074dbce5fcce "migration: New migrate and migrate-incoming argument 'channels'" Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-02migration: New migrate and migrate-incoming argument 'channels'Het Gala
MigrateChannelList allows to connect accross multiple interfaces. Add MigrateChannelList struct as argument to migration QAPIs. We plan to include multiple channels in future, to connnect multiple interfaces. Hence, we choose 'MigrateChannelList' as the new argument over 'MigrateChannel' to make migration QAPIs future proof. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-10-farosas@suse.de>
2023-11-02migration: New QAPI type 'MigrateAddress'Het Gala
This patch introduces well defined MigrateAddress struct and its related child objects. The existing argument of 'migrate' and 'migrate-incoming' QAPI - 'uri' is of type string. The current implementation follows double encoding scheme for fetching migration parameters like 'uri' and this is not an ideal design. Motive for intoducing struct level design is to prevent double encoding of QAPI arguments, as Qemu should be able to directly use the QAPI arguments without any level of encoding. Note: this commit only adds the type, and actual uses comes in later commits. Fabiano fixed for "file" transport. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-2-farosas@suse.de> Message-Id: <20231023182053.8711-3-farosas@suse.de>
2023-11-01cpr: reboot modeSteve Sistare
Add the cpr-reboot migration mode. Usage: $ qemu-system-$arch -monitor stdio ... QEMU 8.1.50 monitor - type 'help' for more information (qemu) migrate_set_capability x-ignore-shared on (qemu) migrate_set_parameter mode cpr-reboot (qemu) migrate -d file:vm.state (qemu) info status VM status: paused (postmigrate) (qemu) quit $ qemu-system-$arch -monitor stdio -incoming defer ... QEMU 8.1.50 monitor - type 'help' for more information (qemu) migrate_set_capability x-ignore-shared on (qemu) migrate_set_parameter mode cpr-reboot (qemu) migrate_incoming file:vm.state (qemu) info status VM status: running In this mode, the migrate command saves state to a file, allowing one to quit qemu, reboot to an updated kernel, and restart an updated version of qemu. The caller must specify a migration URI that writes to and reads from a file. Unlike normal mode, the use of certain local storage options does not block the migration, but the caller must not modify guest block devices between the quit and restart. To avoid saving guest RAM to the file, the memory backend must be shared, and the @x-ignore-shared migration capability must be set. Guest RAM must be non-volatile across reboot, such as by backing it with a dax device, but this is not enforced. The restarted qemu arguments must match those used to initially start qemu, plus the -incoming option. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-6-git-send-email-steven.sistare@oracle.com>
2023-11-01migration: mode parameterSteve Sistare
Create a mode migration parameter that can be used to select alternate migration algorithms. The default mode is normal, representing the current migration algorithm, and does not need to be explicitly set. No functional change until a new mode is added, except that the mode is shown by the 'info migrate' command. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
2023-10-31migration: Deprecate old compression methodJuan Quintela
Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-6-quintela@redhat.com>
2023-10-31migration: Deprecate block migrationJuan Quintela
It is obsolete. It is better to use driver-mirror with NBD instead. CC: Kevin Wolf <kwolf@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Hanna Czenczek <hreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-5-quintela@redhat.com>
2023-10-31migration: migrate 'blk' command option is deprecated.Juan Quintela
Use blocked-mirror with NBD instead. Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-4-quintela@redhat.com>
2023-10-31migration: migrate 'inc' command option is deprecated.Juan Quintela
Use blockdev-mirror with NBD instead. Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231018115513.2163-3-quintela@redhat.com>
2023-10-17migration: Improve json and formattingJuan Quintela
Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231013104736.31722-2-quintela@redhat.com>
2023-10-17migration: Allow user to specify available switchover bandwidthPeter Xu
Migration bandwidth is a very important value to live migration. It's because it's one of the major factors that we'll make decision on when to switchover to destination in a precopy process. This value is currently estimated by QEMU during the whole live migration process by monitoring how fast we were sending the data. This can be the most accurate bandwidth if in the ideal world, where we're always feeding unlimited data to the migration channel, and then it'll be limited to the bandwidth that is available. However in reality it may be very different, e.g., over a 10Gbps network we can see query-migrate showing migration bandwidth of only a few tens of MB/s just because there are plenty of other things the migration thread might be doing. For example, the migration thread can be busy scanning zero pages, or it can be fetching dirty bitmap from other external dirty sources (like vhost or KVM). It means we may not be pushing data as much as possible to migration channel, so the bandwidth estimated from "how many data we sent in the channel" can be dramatically inaccurate sometimes. With that, the decision to switchover will be affected, by assuming that we may not be able to switchover at all with such a low bandwidth, but in reality we can. The migration may not even converge at all with the downtime specified, with that wrong estimation of bandwidth, keeping iterations forever with a low estimation of bandwidth. The issue is QEMU itself may not be able to avoid those uncertainties on measuing the real "available migration bandwidth". At least not something I can think of so far. One way to fix this is when the user is fully aware of the available bandwidth, then we can allow the user to help providing an accurate value. For example, if the user has a dedicated channel of 10Gbps for migration for this specific VM, the user can specify this bandwidth so QEMU can always do the calculation based on this fact, trusting the user as long as specified. It may not be the exact bandwidth when switching over (in which case qemu will push migration data as fast as possible), but much better than QEMU trying to wildly guess, especially when very wrong. A new parameter "avail-switchover-bandwidth" is introduced just for this. So when the user specified this parameter, instead of trusting the estimated value from QEMU itself (based on the QEMUFile send speed), it trusts the user more by using this value to decide when to switchover, assuming that we'll have such bandwidth available then. Note that specifying this value will not throttle the bandwidth for switchover yet, so QEMU will always use the full bandwidth possible for sending switchover data, assuming that should always be the most important way to use the network at that time. This can resolve issues like "unconvergence migration" which is caused by hilarious low "migration bandwidth" detected for whatever reason. Reported-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231010221922.40638-1-peterx@redhat.com>
2023-10-11migration: Display error in query-migrate irrelevant of statusPeter Xu
Display it as long as being set, irrelevant of FAILED status. E.g., it may also be applicable to PAUSED stage of postcopy, to provide hint on what has gone wrong. The error_mutex seems to be overlooked when referencing the error, add it to be very safe. This will change QAPI behavior by showing up error message outside !FAILED status, but it's intended and doesn't expect to break anyone. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-2-peterx@redhat.com>
2023-10-10migration/dirtyrate: use QEMU_CLOCK_HOST to report start-timeAndrei Gudkov
Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as the source for start-time field. This translates to clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds since host boot. This is not very useful. The only reasonable use case of start-time I can imagine is to check whether previously completed measurements are too old or not. But this makes sense only if start-time is reported as host wall-clock time. This patch replaces source of start-time from QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-10-10migration/calc-dirty-rate: millisecond-granularity periodAndrei Gudkov
This patch allows to measure dirty page rate for sub-second intervals of time. An optional argument is introduced -- calc-time-unit. For example: {"execute": "calc-dirty-rate", "arguments": {"calc-time": 500, "calc-time-unit": "millisecond"} } Millisecond granularity allows to make predictions whether migration will succeed or not. To do this, calculate dirty rate with calc-time set to max allowed downtime (e.g. 300ms), convert measured rate into volume of dirtied memory, and divide by network throughput. If the value is lower than max allowed downtime, then migration will converge. Measurement results for single thread randomly writing to a 1/4/24GiB memory region: +----------------+-----------------------------------------------+ | calc-time | dirty rate MiB/s | | (milliseconds) +----------------+---------------+--------------+ | | theoretical | page-sampling | dirty-bitmap | | | (at 3M wr/sec) | | | +----------------+----------------+---------------+--------------+ | 1GiB | +----------------+----------------+---------------+--------------+ | 100 | 6996 | 7100 | 3192 | | 200 | 4606 | 4660 | 2655 | | 300 | 3305 | 3280 | 2371 | | 400 | 2534 | 2525 | 2154 | | 500 | 2041 | 2044 | 1871 | | 750 | 1365 | 1341 | 1358 | | 1000 | 1024 | 1052 | 1025 | | 1500 | 683 | 678 | 684 | | 2000 | 512 | 507 | 513 | +----------------+----------------+---------------+--------------+ | 4GiB | +----------------+----------------+---------------+--------------+ | 100 | 10232 | 8880 | 4070 | | 200 | 8954 | 8049 | 3195 | | 300 | 7889 | 7193 | 2881 | | 400 | 6996 | 6530 | 2700 | | 500 | 6245 | 5772 | 2312 | | 750 | 4829 | 4586 | 2465 | | 1000 | 3865 | 3780 | 2178 | | 1500 | 2694 | 2633 | 2004 | | 2000 | 2041 | 2031 | 1789 | +----------------+----------------+---------------+--------------+ | 24GiB | +----------------+----------------+---------------+--------------+ | 100 | 11495 | 8640 | 5597 | | 200 | 11226 | 8616 | 3527 | | 300 | 10965 | 8386 | 2355 | | 400 | 10713 | 8370 | 2179 | | 500 | 10469 | 8196 | 2098 | | 750 | 9890 | 7885 | 2556 | | 1000 | 9354 | 7506 | 2084 | | 1500 | 8397 | 6944 | 2075 | | 2000 | 7574 | 6402 | 2062 | +----------------+----------------+---------------+--------------+ Theoretical values are computed according to the following formula: size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20), where size is in bytes, time is in seconds, and wps is number of writes per second. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-08-02qapi: Craft the dirty-limit capability commentHyman Huang(黄勇)
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Message-ID: <169073570563.19893.2928364761104733482-2@git.sr.ht> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-08-02qapi: Reformat the dirty-limit migration doc commentsHyman Huang(黄勇)
Reformat the dirty-limit migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Message-ID: <169073570563.19893.2928364761104733482-1@git.sr.ht> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Whitespace tidied up] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-07-26Merge tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru into ↵Richard Henderson
staging QAPI patches patches for 2023-07-26 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmTBFvUSHGFybWJydUBy # ZWRoYXQuY29tAAoJEDhwtADrkYZTML4QAKhHciLnEudtZ6SFSqpOgt80IJnw8a+r # z1AowVYtgPhlZ8TtQJFXpBtAZtKu8xb/QdFxomm4bdNQnWX6CXCoheF5ZJ9V3Rrz # A3pA1wt5KTnRif6R9/Rs1dYXEr4cWagg1UNT3g2eOV3fvdDHvJMPOsqK/jWeXuC1 # T94yFMv1bZSLyiLgB7QQNYDZhIWQ06RGU6tZdWaZQReA8N8maXiZN5NnUISK32Rq # L2X0FtgzyJQ+dLHtbXOw6kIwZdOLNauOM78skZoiZUyFVaH2aDUIg3mnfRw36hN6 # feXGtw68PkTQGexKmonPDljIacfMDApmNBelLwsvB9MTrwVV+hKZPy1ZEwPIFDJ9 # yid63pp2CtQ1TZ3dSjZ1cGbRR+g2NI5X4g1DlcFPAxydMkv9/m5NwQx8OYqVIzqg # VXeS0++O2BM5+ORjlJxMx3RsyH2O1I8DCfwmifzYSo+3Xg/4nCV3f38czbavjCfJ # 4T3ooZx0+PRtjlOlfZTkgxV14TMV+XzQr3bsN4wbPdnjnueSE1tyoVGy8MwQ5aXi # 2oAsjrR8g7iqU6f+6PyRNn5F6D0ge+AYQ7bYS51i3Hyih/y2QUJECpL3XAgOxREb # /68SEtr4m/GJvmQNdwwwu6e1JFo8LknwMfkfzQAOCK1npAJGsWPmJ6iY7KtWgS8F # oDwqng/WOhvV # =mNMX # -----END PGP SIGNATURE----- # gpg: Signature made Wed 26 Jul 2023 05:52:05 AM PDT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru: qapi: Reformat recent doc comments to conform to current conventions qapi/trace: Tidy up trace-event-get-state, -set-state documentation qapi/qdev: Tidy up device_add documentation qapi/block: Tidy up block-latency-histogram-set documentation qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-26qapi: Reformat recent doc comments to conform to current conventionsMarkus Armbruster
Since commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions), a number of comments not conforming to the current formatting conventions were added. No problem, just sweep the entire documentation once more. To check the generated documentation does not change, I compared the generated HTML before and after this commit with "wdiff -3". Finds no differences. Comparing with diff is not useful, as the reflown paragraphs are visible there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-7-armbru@redhat.com>
2023-07-26migration: skipped field is really obsolete.Juan Quintela
Has return zero for more than 10 years. Specifically we introduced the field in 1.5.0 commit f1c72795af573b24a7da5eb52375c9aba8a37972 Author: Peter Lieven <pl@kamp.de> Date: Tue Mar 26 10:58:37 2013 +0100 migration: do not sent zero pages in bulk stage during bulk stage of ram migration if a page is a zero page do not send it at all. the memory at the destination reads as zero anyway. even if there is an madvise with QEMU_MADV_DONTNEED at the target upon receipt of a zero page I have observed that the target starts swapping if the memory is overcommitted. it seems that the pages are dropped asynchronously. this patch also updates QMP to return the number of skipped pages in MigrationStats. but removed its usage in 1.5.3 commit 9ef051e5536b6368a1076046ec6c4ec4ac12b5c6 Author: Peter Lieven <pl@kamp.de> Date: Mon Jun 10 12:14:19 2013 +0200 Revert "migration: do not sent zero pages in bulk stage" Not sending zero pages breaks migration if a page is zero at the source but not at the destination. This can e.g. happen if different BIOS versions are used at source and destination. It has also been reported that migration on pseries is completely broken with this patch. This effectively reverts commit f1c72795af573b24a7da5eb52375c9aba8a37972. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230612193344.3796-2-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26migration: Extend query-migrate to provide dirty page limit infoHyman Huang(黄勇)
Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26migration: Introduce dirty-limit capabilityHyman Huang(黄勇)
Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26qapi/migration: Introduce vcpu-dirty-limit parametersHyman Huang(黄勇)
Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26qapi/migration: Introduce x-vcpu-dirty-limit-period parameterHyman Huang(黄勇)
Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-10migration.json: Don't use space before colonJuan Quintela
So all the file is consistent. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230612191604.2219-1-quintela@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-07-10qapi: better docs for calc-dirty-rate and friendsAndrei Gudkov
Rewrote calc-dirty-rate documentation. Briefly described different modes of dirty page rate measurement. Added some examples. Fixed obvious grammar errors. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Message-Id: <fe7d32a621ebd69ef6974beb2499c0b5dccb9e19.1684854849.git.gudkov.andrei@huawei.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> [Prose tweaked and spacing corrected, as per review] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-06-30migration: Add switchover ack capabilityAvihai Horon
Migration downtime estimation is calculated based on bandwidth and remaining migration data. This assumes that loading of migration data in the destination takes a negligible amount of time and that downtime depends only on network speed. While this may be true for RAM, it's not necessarily true for other migrated devices. For example, loading the data of a VFIO device in the destination might require from the device to allocate resources, prepare internal data structures and so on. These operations can take a significant amount of time which can increase migration downtime. This patch adds a new capability "switchover ack" that prevents the source from stopping the VM and completing the migration until an ACK is received from the destination that it's OK to do so. This can be used by migrated devices in various ways to reduce downtime. For example, a device can send initial precopy metadata to pre-allocate resources in the destination and use this capability to make sure that the pre-allocation is completed before the source VM is stopped, so it will have full effect. This new capability relies on the return path capability to communicate from the destination back to the source. The actual implementation of the capability will be added in the following patches. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>