aboutsummaryrefslogtreecommitdiff
path: root/migration
AgeCommit message (Collapse)Author
2020-08-28migration: tls: fix memory leak in migration_tls_get_credsZhenyu Ye
Currently migration_tls_get_creds() adds the reference of creds but there was no place to unref it. So the OBJECT(creds) will never be freed and result in memory leak. The leak stack: Direct leak of 104 byte(s) in 1 object(s) allocated from: #0 0xffffa88bd20b in __interceptor_malloc (/usr/lib64/libasan.so.4+0xd320b) #1 0xffffa7f0cb1b in g_malloc (/usr/lib64/libglib-2.0.so.0+0x58b1b) #2 0x14b58cb in object_new_with_type qom/object.c:634 #3 0x14b597b in object_new qom/object.c:645 #4 0x14c0e4f in user_creatable_add_type qom/object_interfaces.c:59 #5 0x141c78b in qmp_object_add qom/qom-qmp-cmds.c:312 #6 0x140e513 in qmp_marshal_object_add qapi/qapi-commands-qom.c:279 #7 0x176ba97 in do_qmp_dispatch qapi/qmp-dispatch.c:165 #8 0x176bee7 in qmp_dispatch qapi/qmp-dispatch.c:208 #9 0x136e337 in monitor_qmp_dispatch monitor/qmp.c:150 #10 0x136eae3 in monitor_qmp_bh_dispatcher monitor/qmp.c:239 #11 0x1852e93 in aio_bh_call util/async.c:89 #12 0x18531b7 in aio_bh_poll util/async.c:117 #13 0x18616bf in aio_dispatch util/aio-posix.c:459 #14 0x1853f37 in aio_ctx_dispatch util/async.c:268 #15 0xffffa7f06a7b in g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0+0x52a7b) Since we're fine to use the borrowed reference when using the creds, so just remove the object_ref() in migration_tls_get_creds(). Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Message-Id: <20200722033228.71-1-yezhenyu2@huawei.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-28migration: improve error reporting of block driver state nameDaniel P. Berrangé
With blockdev, a BlockDriverState may not have a device name, so using a node name is required as an alternative. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200827111606.1408275-2-berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-28migration: add vsock as data channel supportLongpeng(Mike)
The vsock channel is more widely use in some new features, for example, the Nitro/Enclave. It can also be used as the migration channel. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Message-Id: <20200806074030.174-3-longpeng2@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-28migration: unify the framework of socket-type channelLongpeng(Mike)
Currently, the only difference of tcp channel and unix channel in migration/socket.c is the way to build SocketAddress, but socket_parse() can handle these two types, so use it to instead of tcp_build_address() and unix_build_address(). The socket-type channel can be further unified based on the up, this would be helpful for us to add other socket-type channels. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Message-Id: <20200806074030.174-2-longpeng2@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-27migration: Rename class type checking macrosEduardo Habkost
Rename the macros to make them consistent with the MIGRATION_OBJ macro name. This will make future conversion to OBJECT_DECLARE* easier. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-By: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20200825192110.3528606-51-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-21migration: Add block-bitmap-mapping parameterMax Reitz
This migration parameter allows mapping block node names and bitmap names to aliases for the purpose of block dirty bitmap migration. This way, management tools can use different node and bitmap names on the source and destination and pass the mapping of how bitmaps are to be transferred to qemu (on the source, the destination, or even both with arbitrary aliases in the migration stream). While touching this code, fix a bug where bitmap names longer than 255 bytes would fail an assertion in qemu_put_counted_string(). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200820150725.68687-2-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-08-21meson: convert migration directory to MesonMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21trace: switch position of headers to what Meson requiresPaolo Bonzini
Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27migration: Fix typos in bitmap migration commentsEric Blake
Noticed while reviewing the file for newer patches. Fixes: b35ebdf076 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727203206.134996-1-eblake@redhat.com>
2020-07-27migration/savevm: don't worry if bitmap migration postcopy failedVladimir Sementsov-Ogievskiy
First, if only bitmaps postcopy is enabled (and not ram postcopy) postcopy_pause_incoming crashes on an assertion assert(mis->to_src_file). And anyway, bitmaps postcopy is not prepared to be somehow recovered. The original idea instead is that if bitmaps postcopy failed, we just lose some bitmaps, which is not critical. So, on failure we just need to remove unfinished bitmaps and guest should continue execution on destination. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-18-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: cancel migration on shutdownVladimir Sementsov-Ogievskiy
If target is turned off prior to postcopy finished, target crashes because busy bitmaps are found at shutdown. Canceling incoming migration helps, as it removes all unfinished (and therefore busy) bitmaps. Similarly on source we crash in bdrv_close_all which asserts that all bdrv states are removed, because bdrv states involved into dirty bitmap migration are referenced by it. So, we need to cancel outgoing migration as well. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-17-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: relax error handling in incoming partVladimir Sementsov-Ogievskiy
Bitmaps data is not critical, and we should not fail the migration (or use postcopy recovering) because of dirty-bitmaps migration failure. Instead we should just lose unfinished bitmaps. Still we have to report io stream violation errors, as they affect the whole migration stream. While touching this, tighten code that was previously blindly calling malloc on a size read from the migration stream, as a corrupted stream (perhaps from a malicious user) should not be able to convince us to allocate an inordinate amount of memory. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727194236.19551-16-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: typo fixes, enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: keep bitmap state for all bitmapsVladimir Sementsov-Ogievskiy
Keep bitmap state for disabled bitmaps too. Keep the state until the end of the process. It's needed for the following commit to implement bitmap postcopy canceling. To clean-up the new list the following logic is used: We need two events to consider bitmap migration finished: 1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received 2. dirty_bitmap_mig_before_vm_start should be called These two events may come in any order, so we understand which one is last, and on the last of them we remove bitmap migration state from the list. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-15-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: simplify dirty_bitmap_load_completeVladimir Sementsov-Ogievskiy
bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in postcopy, bitmap successor must be enabled, and reclaim operation will enable the bitmap. So, actually we need just call _reclaim_ in both if branches, and making differences only to add an assertion seems not really good. The logic becomes simple: on load complete we do reclaim and that's all. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-14-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename finish_lock to just lockVladimir Sementsov-Ogievskiy
finish_lock is bad name, as lock used not only on process end. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-13-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: refactor state global variablesVladimir Sementsov-Ogievskiy
Move all state variables into one global struct. Reduce global variable usage, utilizing opaque pointer where possible. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <20200727194236.19551-12-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_initVladimir Sementsov-Ogievskiy
No reasons to keep two public init functions. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200727194236.19551-11-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanupVladimir Sementsov-Ogievskiy
Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to stress that it is on save part. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-10-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: rename state structure typesVladimir Sementsov-Ogievskiy
Rename types to be symmetrical for load/save part and shorter. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-9-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-27migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_startVladimir Sementsov-Ogievskiy
Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking is wrong as we do not already own the mutex. Moreover, the adjacent call to bdrv_dirty_bitmap_enable_successor grabs the mutex. Fixes: 58f72b965e9e1q Cc: qemu-stable@nongnu.org # v3.0 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727194236.19551-8-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-17migration/block-dirty-bitmap: fix add_bitmaps_to_listVladimir Sementsov-Ogievskiy
We shouldn't fail when finding an unnamed bitmap in a unnamed node or node with auto-generated node name, as bitmap migration ignores such bitmaps in the first place. Fixes: 82640edb88faa Fixes: 4ff5cc121b089 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200626130658.76498-1-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: commit message grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-07-13migration/migration.c: Remove superfluous breaksLiao Pingfang
Remove superfluous breaks, as there is a "return" before them. Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-07-13migration/savevm: respect qemu_fclose() error code in save_snapshot()Denis V. Lunev
qemu_fclose() could return error, f.e. if bdrv_co_flush() will return the error. This validation will become more important once we will start waiting of asynchronous IO operations, started from bdrv_write_vmstate(), which are coming soon. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Juan Quintela <quintela@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <fam@euphon.net> CC: Juan Quintela <quintela@redhat.com> CC: Denis Plotnikov <dplotnikov@virtuozzo.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-07-13migration: fix memory leak in qmp_migrate_set_parametersZheng Chuan
"tmp.tls_hostname" and "tmp.tls_creds" allocated by migrate_params_test_apply() is forgot to free at the end of qmp_migrate_set_parameters(). Fix that. The leak stack: Direct leak of 2 byte(s) in 2 object(s) allocated from: #0 0xffffb597c20b in __interceptor_malloc (/usr/lib64/libasan.so.4+0xd320b) #1 0xffffb52dcb1b in g_malloc (/usr/lib64/libglib-2.0.so.0+0x58b1b) #2 0xffffb52f8143 in g_strdup (/usr/lib64/libglib-2.0.so.0+0x74143) #3 0xaaaac52447fb in migrate_params_test_apply (/usr/src/debug/qemu-4.1.0/migration/migration.c:1377) #4 0xaaaac52fdca7 in qmp_migrate_set_parameters (/usr/src/debug/qemu-4.1.0/qapi/qapi-commands-migration.c:192) #5 0xaaaac551d543 in qmp_dispatch (/usr/src/debug/qemu-4.1.0/qapi/qmp-dispatch.c:165) #6 0xaaaac52a0a8f in qmp_dispatch (/usr/src/debug/qemu-4.1.0/monitor/qmp.c:125) #7 0xaaaac52a1c7f in monitor_qmp_dispatch (/usr/src/debug/qemu-4.1.0/monitor/qmp.c:214) #8 0xaaaac55cb0cf in aio_bh_call (/usr/src/debug/qemu-4.1.0/util/async.c:117) #9 0xaaaac55d4543 in aio_bh_poll (/usr/src/debug/qemu-4.1.0/util/aio-posix.c:459) #10 0xaaaac55cae0f in aio_dispatch (/usr/src/debug/qemu-4.1.0/util/async.c:268) #11 0xffffb52d6a7b in g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0+0x52a7b) #12 0xaaaac55d1e3b(/usr/bin/qemu-kvm-4.1.0+0x1622e3b) #13 0xaaaac4e314bb(/usr/bin/qemu-kvm-4.1.0+0xe824bb) #14 0xaaaac47f45ef(/usr/bin/qemu-kvm-4.1.0+0x8455ef) #15 0xffffb4bfef3f in __libc_start_main (/usr/lib64/libc.so.6+0x23f3f) #16 0xaaaac47ffacb(/usr/bin/qemu-kvm-4.1.0+0x850acb) Direct leak of 2 byte(s) in 2 object(s) allocated from: #0 0xffffb597c20b in __interceptor_malloc (/usr/lib64/libasan.so.4+0xd320b) #1 0xffffb52dcb1b in g_malloc (/usr/lib64/libglib-2.0.so.0+0x58b1b) #2 0xffffb52f8143 in g_strdup (/usr/lib64/libglib-2.0.so.0+0x74143) #3 0xaaaac5244893 in migrate_params_test_apply (/usr/src/debug/qemu-4.1.0/migration/migration.c:1382) #4 0xaaaac52fdca7 in qmp_migrate_set_parameters (/usr/src/debug/qemu-4.1.0/qapi/qapi-commands-migration.c:192) #5 0xaaaac551d543 in qmp_dispatch (/usr/src/debug/qemu-4.1.0/qapi/qmp-dispatch.c) #6 0xaaaac52a0a8f in qmp_dispatch (/usr/src/debug/qemu-4.1.0/monitor/qmp.c:125) #7 0xaaaac52a1c7f in monitor_qmp_dispatch (/usr/src/debug/qemu-4.1.0/monitor/qmp.c:214) #8 0xaaaac55cb0cf in aio_bh_call (/usr/src/debug/qemu-4.1.0/util/async.c:117) #9 0xaaaac55d4543 in aio_bh_poll (/usr/src/debug/qemu-4.1.0/util/aio-posix.c:459) #10 0xaaaac55cae0f in in aio_dispatch (/usr/src/debug/qemu-4.1.0/util/async.c:268) #11 0xffffb52d6a7b in g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0+0x52a7b) #12 0xaaaac55d1e3b(/usr/bin/qemu-kvm-4.1.0+0x1622e3b) #13 0xaaaac4e314bb(/usr/bin/qemu-kvm-4.1.0+0xe824bb) #14 0xaaaac47f45ef (/usr/bin/qemu-kvm-4.1.0+0x8455ef) #15 0xffffb4bfef3f in __libc_start_main (/usr/lib64/libc.so.6+0x23f3f) #16 0xaaaac47ffacb(/usr/bin/qemu-kvm-4.1.0+0x850acb) Signed-off-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: KeQian Zhu <zhukeqian1@huawei.com> Reviewed-by: HaiLiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-07-10cpu-throttle: new module, extracted from cpus.cClaudio Fontana
move the vcpu throttling functionality into its own module. This functionality is not specific to any accelerator, and it is used currently by migration to slow down guests to try to have migrations converge, and by the cocoa MacOS UI to throttle speed. cpu-throttle contains the controls to adjust and inspect throttle settings, start (set) and stop vcpu throttling, and the throttling function itself that is run periodically on vcpus to make them take a nap. Execution of the throttling function on all vcpus is triggered by a timer, registered at module initialization. No functionality change. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20200629093504.3228-3-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-07Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio,acpi: features, fixes, cleanups. vdpa support virtio-mem support a handy script for disassembling acpi tables misc fixes and cleanups Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 07 Jul 2020 13:00:35 BST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (41 commits) vhost-vdpa: introduce vhost-vdpa net client vhost-vdpa: introduce vhost-vdpa backend vhost_net: introduce set_config & get_config vhost: implement vhost_force_iommu method vhost: introduce new VhostOps vhost_force_iommu vhost: implement vhost_vq_get_addr method vhost: introduce new VhostOps vhost_vq_get_addr vhost: implement vhost_dev_start method vhost: introduce new VhostOps vhost_dev_start vhost: check the existence of vhost_set_iotlb_callback virtio-pci: implement queue_enabled method virtio-bus: introduce queue_enabled method vhost_net: use the function qemu_get_peer net: introduce qemu_get_peer MAINTAINERS: add VT-d entry docs: vhost-user: add Virtio status protocol feature tests/acpi: remove stale allowed tables numa: Auto-enable NUMA when any memory devices are possible virtio-mem: Exclude unplugged memory during migration virtio-mem: Add trace events ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/arm/virt.c # hw/virtio/trace-events
2020-07-03migration: Count new_dirty instead of real_dirtyKeqian Zhu
real_dirty_pages becomes equal to total ram size after dirty log sync in ram_init_bitmaps, the reason is that the bitmap of ramblock is initialized to be all set, so old path counts them as "real dirty" at beginning. This causes wrong dirty rate and false positive throttling. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200622032037.31112-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-03migration: postcopy take proper error returnDr. David Alan Gilbert
This function returns a boolean success and we're returning -1; lets just use the 'out' error path. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: 58b7c17e226 ("Disable mlock around incoming postcopy") Buglink: https://bugs.launchpad.net/qemu/+bug/1885720 Message-Id: <20200701093557.130096-1-dgilbert@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-02migration/colo: Use ram_block_discard_disable()David Hildenbrand
COLO will copy all memory in a RAM block, disable discarding of RAM. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Lukas Straub <lukasstraub2@web.de> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com> Cc: Juan Quintela <quintela@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200626072248.78761-10-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-02migration/rdma: Use ram_block_discard_disable()David Hildenbrand
RDMA will pin all guest memory (as documented in docs/rdma.txt). We want to disable RAM block discards - however, to keep it simple use ram_block_discard_is_required() instead of inhibiting. Note: It is not sufficient to limit disabling to pin_all. Even when only conditionally pinning 1 MB chunks, as soon as one page within such a chunk was discarded and one page not, the discarded pages will be pinned as well. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200626072248.78761-9-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-02virtio-balloon: Rip out qemu_balloon_inhibit()David Hildenbrand
The only remaining special case is postcopy. It cannot handle concurrent discards yet, which would result in requesting already sent pages from the source. Special-case it in virtio-balloon instead. Introduce migration_in_incoming_postcopy(), to find out if incoming postcopy is active. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200626072248.78761-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-02migration/rdma: Plug memory leaks in qemu_rdma_registration_stop()Markus Armbruster
qemu_rdma_registration_stop() uses the ERROR() macro to create, report to stderr, and store an Error object. The stored Error object is never used, and its memory is leaked. Even where ERROR() doesn't leak, it is ill-advised. The whole point of passing an Error to the caller is letting the caller handle the error. Error handling may report to stderr, to somewhere else, or not at all. Also reporting in the callee mixes up concerns that should be kept separate. Since I don't know what reporting to stderr is supposed to accomplish, I'm not touching it. Commit 2a1bc8bde7 "migration/rdma: rdma_accept_incoming_migration fix error handling" plugged the same leak in rdma_accept_incoming_migration(). Plug the memory leak the same way: keep the report part, delete the store part. The report part uses fprintf(). If it's truly an error, it should use error_report() instead. But I don't know, so I leave it alone, just like commit 2a1bc8bde7 did. Fixes: 2da776db4846eadcb808598a5d3484d149773c05 Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-27-armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-26osdep: Make MIN/MAX evaluate arguments only onceEric Blake
I'm not aware of any immediate bugs in qemu where a second runtime evaluation of the arguments to MIN() or MAX() causes a problem, but proactively preventing such abuse is easier than falling prey to an unintended case down the road. At any rate, here's the conversation that sparked the current patch: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html Update the MIN/MAX macros to only evaluate their argument once at runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are promoting the temporaries to the same type as the final comparison (we have to trigger type promotion, as typeof(bitfield) won't compile; and we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our uses of MAX are on void* pointers where such addition is undefined). However, we are unable to work around gcc refusing to compile ({}) in a constant context (such as the array length of a static variable), even when only used in the dead branch of a __builtin_choose_expr(), so we have to provide a second macro pair MIN_CONST and MAX_CONST for use when both arguments are known to be compile-time constants and where the result must also be usable as a constant; this second form evaluates arguments multiple times but that doesn't matter for constants. By using a void expression as the expansion if a non-constant is presented to this second form, we can enlist the compiler to ensure the double evaluation is not attempted on non-constants. Alas, as both macros now rely on compiler intrinsics, they are no longer usable in preprocessor #if conditions; those will just have to be open-coded or the logic rewritten into #define or runtime 'if' conditions (but where the compiler dead-code-elimination will probably still apply). I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all forms of macro mis-use. As the errors can sometimes be cryptic, I'm demonstrating the gcc output: Use of MIN when MIN_CONST is needed: In file included from /home/eblake/qemu/qemu-img.c:25: /home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function 249 | ({ \ | ^ /home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’ 92 | char array[MIN(1, 2)] = ""; | ^~~ Use of MIN_CONST when MIN is needed: /home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’: /home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be 1225 | i = MIN_CONST(i, n); | ^ Use of MIN in the preprocessor: In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20: /home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’: /home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions 249 | ({ \ | ^ Fix the resulting callsites that used #if or computed a compile-time constant min or max to use the new macros. cpu-defs.h is interesting, as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes dynamic. It may be worth improving glib's MIN/MAX definitions to be saner, but that is a task for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200625162602.700741-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-18Merge remote-tracking branch ↵Peter Maydell
'remotes/dgilbert/tags/pull-migration-20200617a' into staging Migration (and HMP and virtiofs) pull 2020-06-17 Migration: HMP/migration and test changes from Mao Zhongyi multifd fix from Laurent Vivier HMP qom-set partial reversion/change from David Hildenbrand now you need -j to pass json format, but it's regained the old 100M type format. Memory leak fix from Pan Nengyuan Virtiofs fchmod seccomp fix from Max Reitz Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Wed 17 Jun 2020 19:34:58 BST # gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full] # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-migration-20200617a: migration: fix multifd_send_pages() next channel docs/xbzrle: update 'cache miss rate' and 'encoding rate' to docs monitor/hmp-cmds: improvements for the 'info migrate' monitor/hmp-cmds: add 'goto end' to reduce duplicate code. monitor/hmp-cmds: delete redundant Error check before invoke hmp_handle_error() monitor/hmp-cmds: don't silently output when running 'migrate_set_downtime' fails monitor/hmp-cmds: add units for migrate_parameters tests/migration: fix unreachable path in stress test tests/migration: mem leak fix hmp: Make json format optional for qom-set qom-hmp-cmds: fix a memleak in hmp_qom_get virtiofsd: Whitelist fchmod Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-18migration: fix xbzrle encoding rate calculationWei Wang
It's reported an error of implicit conversion from "unsigned long" to "double" when compiling with Clang 10. Simply make the encoding rate 0 when the encoded_size is 0. Fixes: e460a4b1a4 Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reported-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200617201309.1640952-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-17migration: fix multifd_send_pages() next channelLaurent Vivier
multifd_send_pages() loops around the available channels, the next channel to use between two calls to multifd_send_pages() is stored inside a local static variable, next_channel. It works well, except if the number of channels decreases between two calls to multifd_send_pages(). In this case, the loop can try to access the data of a channel that doesn't exist anymore. The problem can be triggered if we start a migration with a given number of channels and then we cancel the migration to restart it with a lower number. This ends generally with an error like: qemu-system-ppc64: .../util/qemu-thread-posix.c:77: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed. This patch fixes the error by capping next_channel with the current number of channels before using it. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20200617113154.593233-1-lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-15qdev: qdev_create(), qdev_try_create() are now unused, dropMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-31-armbru@redhat.com>
2020-06-10accel: Move Xen accelerator code under accel/xen/Philippe Mathieu-Daudé
This code is not related to hardware emulation. Move it under accel/ with the other hypervisors. Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200508100222.7112-1-philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01migration/migration.c: Fix hang in ram_save_host_pageLukas Straub
migration_rate_limit will erroneously ratelimit a shutdown socket, which causes the migration thread to hang in ram_save_host_page if the socket is shutdown. Fix this by explicitly testing if the socket has errors or was shutdown in migration_rate_limit. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <e79085bbe2d46dfa007dd41820194d5e2d4fcd80.1590007004.git.lukasstraub2@web.de> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Move colo_notify_compares_event to the right placeLukas Straub
If the secondary has to failover during checkpointing, it still is in the old state (i.e. different state than primary). Thus we can't expose the primary state until after the checkpoint is sent. This fixes sporadic connection reset of client connections during failover. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <d4555dd5146a54518c4d9d4efd996b7c745c6687.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Relaunch failover even if there was an errorLukas Straub
If vmstate_loading is true, secondary_vm_do_failover will set failover status to FAILOVER_STATUS_RELAUNCH and return success without initiating failover. However, if there is an error during the vmstate_loading section, failover isn't relaunched. Instead we then wait for failover on colo_incoming_sem. Fix this by relaunching failover even if there was an error. Also, to make this work properly, set vmstate_loading to false when returning during the vmstate_loading section. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <f60b0a8e2fadaaec792e04819dfc46951842d6ba.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Flush ram cache only after receiving device stateLukas Straub
If we suceed in receiving ram state, but fail receiving the device state, there will be a mismatch between the two. Fix this by flushing the ram cache only after the vmstate has been received. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <3289d007d494cb0e2f05b1cf4ae6a78d300fede3.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Use cpu_synchronize_all_states()Lukas Straub
cpu_synchronize_all_pre_loadvm() marks all vcpus as dirty, so the registers are loaded from CPUState before we continue running the vm. However if we failover during checkpoint, CPUState is not initialized and the registers are loaded with garbage. This causes guest hangs and crashes. Fix this by using cpu_synchronize_all_states(), which initializes CPUState from the current cpu registers additionally to marking the vcpus as dirty. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <9675031ce557b73ebd10e7bd20ebbf57f30b177c.1589193382.git.lukasstraub2@web.de> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Use event instead of semaphoreLukas Straub
If multiple packets miscompare in a short timeframe, the semaphore value will be increased multiple times. This causes multiple checkpoints even if one would be sufficient. Fix this by using a event instead of a semaphore for triggering checkpoints. Now, checkpoint requests will be ignored until the checkpoint event is sent to colo-compare (which releases the miscompared packets). Benchmark results (iperf3): Client-to-server tcp: without patch: ~66 Mbit/s with patch: ~61 Mbit/s Server-to-client tcp: without patch: ~702 Kbit/s with patch: ~16 Mbit/s Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <fd601ba1beb524aada54ba66e87ebfc12cf4574b.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/rdma: cleanup rdma context before g_free to avoid memleaksPan Nengyuan
When error happen in initializing 'rdma_return_path', we should cleanup rdma context before g_free(rdma) to avoid some memleaks. This patch fix that. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200508100755.7875-3-pannengyuan@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/rdma: fix potential nullptr access in rdma_start_incoming_migrationPan Nengyuan
'rdma' is NULL when taking the first error branch in rdma_start_incoming_migration. And it will cause a null pointer access in label 'err'. Fix that. Fixes: 59c59c67ee6b0327ae932deb303caa47919aeb1e Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200508100755.7875-2-pannengyuan@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Note this is CID 1428762
2020-05-28migration: forbid bitmap migration by generated node-nameVladimir Sementsov-Ogievskiy
It actually never worked with libvirt, as auto-generated names are different on source and destination. It's unsafe and useless to migrate by auto-generated node-names, so let's forbid it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200521220648.3255-7-vsementsov@virtuozzo.com> [eblake: shorter subject line] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28migration: add_bitmaps_to_list: check disk name onceVladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200521220648.3255-6-vsementsov@virtuozzo.com> [eblake: shorter subject line] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28migration: fix bitmaps pre-blockdev migration with mirror jobVladimir Sementsov-Ogievskiy
Important thing for bitmap migration is to select destination block node to obtain the migrated bitmap. Prepatch, on source we use bdrv_get_device_or_node_name() to identify the node, and on target we do bdrv_lookup_bs. bdrv_get_device_or_node_name() returns blk name only for direct children of blk. So, bitmaps of direct children of blks are migrated by blk name and others - by node name. Old libvirt is unprepared to bitmap migration by node-name, node-names are mostly auto-generated. So actually only migration by blk name works for it. Newer libvirt will use new interface (which will be added soon) to specify node-mapping for bitmaps migration explicitly. Still, let's improve the current behavior a bit. Now, consider classic libvirt migrations assisted by mirror block job: mirror block job inserts filter, so our source is not a direct child of blk, and bitmaps are migrated by node-names. And this just doesn't work with auto-generated node names. Let's fix it by using blk-name even if some implicit filters are inserted. Note2: we, of course, can't skip filters and use blk name to migrate bitmaps in filtered node by blk name for this blk if these filters have named bitmaps which should be migrated. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652424 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200521220648.3255-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: comment typo fix, shorter subject line] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-05-28migration: refactor init_dirty_bitmap_migrationVladimir Sementsov-Ogievskiy
Split out handling one bs, it is needed for the following commit, which will handle BlockBackends separately. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200521220648.3255-2-vsementsov@virtuozzo.com> [eblake: shorter subject line] Signed-off-by: Eric Blake <eblake@redhat.com>