aboutsummaryrefslogtreecommitdiff
path: root/migration/colo.c
AgeCommit message (Collapse)Author
2023-10-31qemu-file: Make qemu_fflush() return errorsJuan Quintela
This let us simplify code of this shape. qemu_fflush(f); int ret = qemu_file_get_error(f); if (ret) { return ret; } into: int ret = qemu_fflush(f); if (ret) { return ret; } I updated all callers where there is any error check. qemu_fclose() don't need to check for f->last_error because qemu_fflush() returns it at the beggining of the function. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-13-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-18migration: process_incoming_migration_co(): move colo part to coloVladimir Sementsov-Ogievskiy
Let's make better public interface for COLO: instead of colo_process_incoming_thread and not trivial logic around creating the thread let's make simple colo_incoming_co(), hiding implementation from generic code. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515130640.46035-4-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-18migration: split migration_incoming_coVladimir Sementsov-Ogievskiy
Originally, migration_incoming_co was introduced by 25d0c16f625feb3b6 "migration: Switch to COLO process after finishing loadvm" to be able to enter from COLO code to one specific yield point, added by 25d0c16f625feb3b6. Later in 923709896b1b0 "migration: poll the cm event for destination qemu" we reused this variable to wake the migration incoming coroutine from RDMA code. That was doubtful idea. Entering coroutines is a very fragile thing: you should be absolutely sure which yield point you are going to enter. I don't know how much is it safe to enter during qemu_loadvm_state() which I think what RDMA want to do. But for sure RDMA shouldn't enter the special COLO-related yield-point. As well, COLO code doesn't want to enter during qemu_loadvm_state(), it want to enter it's own specific yield-point. As well, when in 8e48ac95865ac97d "COLO: Add block replication into colo process" we added bdrv_invalidate_cache_all() call (now it's called activate_all()) it became possible to enter the migration incoming coroutine during that call which is wrong too. So, let't make these things separate and disjoint: loadvm_co for RDMA, non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO, non-NULL only around specific yield. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515130640.46035-3-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-10build: move COLO under CONFIG_REPLICATIONVladimir Sementsov-Ogievskiy
We don't allow to use x-colo capability when replication is not configured. So, no reason to build COLO when replication is disabled, it's unusable in this case. Note also that the check in migrate_caps_check() is not the only restriction: some functions in migration/colo.c will just abort if called with not defined CONFIG_REPLICATION, for example: migration_iteration_finish() case MIGRATION_STATUS_COLO: migrate_start_colo_process() colo_process_checkpoint() abort() It could probably make sense to have possibility to enable COLO without REPLICATION, but this requires deeper audit of colo & replication code, which may be done later if needed. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Acked-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230428194928.1426370-4-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-10colo: make colo_checkpoint_notify static and provide simpler APIVladimir Sementsov-Ogievskiy
colo_checkpoint_notify() is mostly used in colo.c. Outside we use it once when x-checkpoint-delay migration parameter is set. So, let's simplify the external API to only that function - notify COLO that parameter was set. This make external API more robust and hides implementation details from external callers. Also this helps us to make COLO module optional in further patch (i.e. we are going to add possibility not build the COLO module). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20230428194928.1426370-3-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-04-24migration: Create migrate_checkpoint_delay()Juan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-04-24migration: Create options.cJuan Quintela
We move there all capabilities helpers from migration.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Following David advise: - looked through the history, capabilities are newer than 2012, so we can remove that bit of the header. - This part is posterior to Anthony. Original Author is Orit. Once there, I put myself. Peter Xu also did quite a bit of work here. Anyone else wants/needs to be there? I didn't search too hard because nobody asked before to be added. What do you think?
2023-02-23error: Drop superfluous #include "qapi/qmp/qerror.h"Markus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20230207075115.1525-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
2022-12-14qapi migration: Elide redundant has_FOO in generated CMarkus Armbruster
The has_FOO for pointer-valued FOO are redundant, except for arrays. They are also a nuisance to work with. Recent commit "qapi: Start to elide redundant has_FOO in generated C" provided the means to elide them step by step. This is the step for qapi/migration.json. Said commit explains the transformation in more detail. The invariant violations mentioned there do not occur here. Cc: Juan Quintela <quintela@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221104160712.3005652-17-armbru@redhat.com>
2022-06-23migration: remove the QEMUFileOps abstractionDaniel P. Berrangé
Now that all QEMUFile callbacks are removed, the entire concept can be deleted. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-12-15COLO: Move some trace code behind qemu_mutex_unlock_iothread()Rao, Lei
There is no need to put some trace code in the critical section. So, moving it behind qemu_mutex_unlock_iothread() can reduce the lock time. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15migration/colo: Optimize COLO primary node start code pathZhang Chen
Optimize COLO primary start path from: MIGRATION_STATUS_XXX --> MIGRATION_STATUS_ACTIVE --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED To: MIGRATION_STATUS_XXX --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED No need to start primary COLO through "MIGRATION_STATUS_ACTIVE". Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15Fixed a QEMU hang when guest poweroff in COLO modeRao, Lei
When the PVM guest poweroff, the COLO thread may wait a semaphore in colo_process_checkpoint().So, we should wake up the COLO thread before migration shutdown. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15migration/colo: More accurate update checkpoint timeZhang Chen
Previous operation(like vm_start and replication_start_all) will consume extra time before update the timer, so reduce time in this patch. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-09Reset the auto-converge counter at every checkpoint.Rao, Lei
if we don't reset the auto-converge counter, it will continue to run with COLO running, and eventually the system will hang due to the CPU throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03Changed the last-mode to none of first start COLORao, Lei
When we first stated the COLO, the last-mode is as follows: { "execute": "query-colo-status" } {"return": {"last-mode": "primary", "mode": "primary", "reason": "none"}} The last-mode is unreasonable. After the patch, will be changed to the following: { "execute": "query-colo-status" } {"return": {"last-mode": "none", "mode": "primary", "reason": "none"}} Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03Removed the qemu_fclose() in colo_process_incoming_threadRao, Lei
After the live migration, the related fd will be cleanup in migration_incoming_state_destroy(). So, the qemu_close() in colo_process_incoming_thread is not necessary. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03colo: fixed 'Segmentation fault' when the simplex mode PVM poweroffRao, Lei
The GDB statck is as follows: Program terminated with signal SIGSEGV, Segmentation fault. 0 object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:832 if (type->class->interfaces && [Current thread is 1 (Thread 0x7f756e97eb00 (LWP 1811577))] (gdb) bt 0 object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:832 1 0x000055c8f2c3dd14 in object_dynamic_cast (obj=0x55c8f543ac00, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:763 2 0x000055c8f2c3ddce in object_dynamic_cast_assert (obj=0x55c8f543ac00, typename=0x55c8f2f7379e "qio-channel", file=0x55c8f2f73780 "migration/qemu-file-channel.c", line=117, func=0x55c8f2f73800 <__func__.18724> "channel_shutdown") at qom/object.c:786 3 0x000055c8f2bbc6ac in channel_shutdown (opaque=0x55c8f543ac00, rd=true, wr=true, errp=0x0) at migration/qemu-file-channel.c:117 4 0x000055c8f2bba56e in qemu_file_shutdown (f=0x7f7558070f50) at migration/qemu-file.c:67 5 0x000055c8f2ba5373 in migrate_fd_cancel (s=0x55c8f4ccf3f0) at migration/migration.c:1699 6 0x000055c8f2ba1992 in migration_shutdown () at migration/migration.c:187 7 0x000055c8f29a5b77 in main (argc=69, argv=0x7fff3e9e8c08, envp=0x7fff3e9e8e38) at vl.c:4512 The root cause is that we still want to shutdown the from_dst_file in migrate_fd_cancel() after qemu_close in colo_process_checkpoint(). So, we should set the s->rp_state.from_dst_file = NULL after qemu_close(). Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-03Some minor optimizations for COLORao, Lei
Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-06-11Remove migrate_set_block_enabled in checkpointRao, Lei
We can detect disk migration in migrate_prepare, if disk migration is enabled in COLO mode, we can directly report an error.and there is no need to disable block migration at every checkpoint. Signed-off-by: Lei Rao <lei.rao@intel.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-05-26replication: move include out of root directoryPaolo Bonzini
The replication.h file is included from migration/colo.c and tests/unit/test-replication.c, so it should be in include/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-08Remove superfluous timer_del() callsPeter Maydell
This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
2020-10-09error: Remove NULL checks on error_propagate() calls (again)Markus Armbruster
Patch created mechanically by rerunning: $ spatch --sp-file scripts/coccinelle/error_propagate_null.cocci \ --macro-file scripts/cocci-macro-file.h \ --use-gitgrep . Cc: Jens Freimann <jfreimann@redhat.com> Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com> Cc: Juan Quintela <quintela@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200722084048.1726105-4-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2020-09-17migration/: fix some comment spelling errorszhaolichang
I found that there are many spelling errors in the comments of qemu, so I used the spellcheck tool to check the spelling errors and finally found some spelling errors in the migration folder. Signed-off-by: zhaolichang <zhaolichang@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200917075029.313-3-zhaolichang@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-06-01migration/colo.c: Move colo_notify_compares_event to the right placeLukas Straub
If the secondary has to failover during checkpointing, it still is in the old state (i.e. different state than primary). Thus we can't expose the primary state until after the checkpoint is sent. This fixes sporadic connection reset of client connections during failover. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <d4555dd5146a54518c4d9d4efd996b7c745c6687.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Relaunch failover even if there was an errorLukas Straub
If vmstate_loading is true, secondary_vm_do_failover will set failover status to FAILOVER_STATUS_RELAUNCH and return success without initiating failover. However, if there is an error during the vmstate_loading section, failover isn't relaunched. Instead we then wait for failover on colo_incoming_sem. Fix this by relaunching failover even if there was an error. Also, to make this work properly, set vmstate_loading to false when returning during the vmstate_loading section. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <f60b0a8e2fadaaec792e04819dfc46951842d6ba.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Flush ram cache only after receiving device stateLukas Straub
If we suceed in receiving ram state, but fail receiving the device state, there will be a mismatch between the two. Fix this by flushing the ram cache only after the vmstate has been received. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <3289d007d494cb0e2f05b1cf4ae6a78d300fede3.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Use cpu_synchronize_all_states()Lukas Straub
cpu_synchronize_all_pre_loadvm() marks all vcpus as dirty, so the registers are loaded from CPUState before we continue running the vm. However if we failover during checkpoint, CPUState is not initialized and the registers are loaded with garbage. This causes guest hangs and crashes. Fix this by using cpu_synchronize_all_states(), which initializes CPUState from the current cpu registers additionally to marking the vcpus as dirty. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <9675031ce557b73ebd10e7bd20ebbf57f30b177c.1589193382.git.lukasstraub2@web.de> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-01migration/colo.c: Use event instead of semaphoreLukas Straub
If multiple packets miscompare in a short timeframe, the semaphore value will be increased multiple times. This causes multiple checkpoints even if one would be sufficient. Fix this by using a event instead of a semaphore for triggering checkpoints. Now, checkpoint requests will be ignored until the checkpoint event is sent to colo-compare (which releases the miscompared packets). Benchmark results (iperf3): Client-to-server tcp: without patch: ~66 Mbit/s with patch: ~61 Mbit/s Server-to-client tcp: without patch: ~702 Kbit/s with patch: ~16 Mbit/s Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <fd601ba1beb524aada54ba66e87ebfc12cf4574b.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-07migration/colo: Add missing error-propagation codePhilippe Mathieu-Daudé
Running the coccinelle script produced: $ spatch \ --macro-file scripts/cocci-macro-file.h --include-headers \ --sp-file scripts/coccinelle/find-missing-error_propagate.cocci \ --keep-comments --smpl-spacing --dir . HANDLING: ./migration/colo.c [[manual check required: error_propagate() might be missing in migrate_set_block_enabled() ./migration/colo.c:439:4]] Add the missing error_propagate() after review. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20200413205250.687-1-f4bug@amsat.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-04-29migration/colo: Fix qmp_xen_colo_do_checkpoint() error handlingMarkus Armbruster
The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. qmp_xen_colo_do_checkpoint() passes @errp first to replication_do_checkpoint_all(), and then to colo_notify_filters_event(). If both fail, this will trip the assertion in error_setv(). Similar code in secondary_vm_do_failover() calls colo_notify_filters_event() only after replication_do_checkpoint_all() succeeded. Do the same here. Fixes: 0e8818f023616677416840d6ddc880db8de3c967 Cc: Zhang Chen <chen.zhang@intel.com> Cc: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20200422130719.28225-12-armbru@redhat.com>
2020-03-25migration/colo: fix use after free of local_errVladimir Sementsov-Ogievskiy
local_err is used again in secondary_vm_do_failover() after replication_stop_all(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-13COLO: Optimize memory back-up processzhanghailiang
This patch will reduce the downtime of VM for the initial process, Previously, we copied all these memory in preparing stage of COLO while we need to stop VM, which is a time-consuming process. Here we optimize it by a trick, back-up every page while in migration process while COLO is enabled, though it affects the speed of the migration, but it obviously reduce the downtime of back-up all SVM'S memory in COLO preparing stage. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-5-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> minor typo fixes
2020-02-28migration/colo: wrap incoming checkpoint process into new helperzhanghailiang
Split checkpoint incoming process into a helper. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-08-16sysemu: Split sysemu/runstate.h off sysemu/sysemu.hMarkus Armbruster
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include exec/memory.h slightly lessMarkus Armbruster
Drop unnecessary inclusions from headers. Downgrade a few more to exec/hwaddr.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-17-armbru@redhat.com>
2019-07-02migration/colo.c: Add missed filter notify for Xen COLO.Zhang Chen
We need to notify net filter to do checkpoint for Xen COLO, like KVM side. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-05-14migration/colo.c: Remove redundant input parameterZhang Chen
The colo_do_failover no need the input parameter. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20190426090730.2691-2-chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-03-25Migration/colo.c: Make user obtain the last COLO mode info after failoverZhang Chen
Add the last_colo_mode to save the status after failover. This patch can solve the issue that user want to get last colo mode use query_colo_status after failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-03-25Migration/colo.c: Add the necessary checks for colo_do_failoverZhang Chen
Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-03-25Migration/colo.c: Add new COLOExitReason to handle all failover stateZhang Chen
In this patch we add the processing state for COLOExitReason, because we have to identify COLO in the failover processing state or failover error state. In the way, we can handle all the failover state. We have improved the description of the COLOExitReason by the way. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-03-25Migration/colo.c: Fix COLO failover status errorZhang Chen
When finished COLO failover, the status is FAILOVER_STATUS_COMPLETED. The origin codes misunderstand the FAILOVER_STATUS_REQUIRE. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-03-06Migration/colo.c: Make COLO node running after failoverZhang Chen
Delay to close COLO for auto start VM after failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-4-chen.zhang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-03-06Migration/colo.c: Fix double close bug when occur COLO failoverZhang Chen
In migration_incoming_state_destroy(void) will check the mis->to_src_file to double close the mis->to_src_file when occur COLO failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-2-chen.zhang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-12-14qapi: add conditions to REPLICATION type/commands on the schemaMarc-André Lureau
Add #if defined(CONFIG_REPLICATION) in generated code, and adjust the code accordingly. Made conditional: * xen-set-replication, query-xen-replication-status, xen-colo-do-checkpoint Before the patch, we first register the commands unconditionally in generated code (requires a stub), then conditionally unregister in qmp_unregister_commands_hack(). Afterwards, we register only when CONFIG_REPLICATION. The command fails exactly the same, with CommandNotFound. Improvement, because now query-qmp-schema is accurate, and we're one step closer to killing qmp_unregister_commands_hack(). * enum BlockdevDriver value "replication" in command blockdev-add * BlockdevOptions variant @replication and related structures. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20181213123724.4866-23-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-11-21migration/colo.c: Fix compilation issue when disable replicationZhang Chen
This compilation issue will occur when user use --disable-replication to config Qemu. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Message-Id: <20181101021226.6353-1-zhangckid@gmail.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-10-19COLO: quick failover process by kick COLO threadzhanghailiang
COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem), while failover works begin, It's better to wakeup it to quick the process. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19COLO: notify net filters about checkpoint/failover eventzhanghailiang
Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19savevm: split the process of different stages for loadvm/savevmZhang Chen
There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>