path: root/migration
Age Commit message Author
2020-01-20migration/postcopy: enable random order target page arrivalWei Yang
Now that the number of target pages received is used to track one host page, we can handle target pages arriving in random order within one host page. This is a preparation for enabling compress during postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration/postcopy: set all_zero to true on the first target pageWei Yang
For the first target page, all_zero is set to true for this round of checks. Now that target_pages has been introduced, we can use this variable instead of checking the address offset. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration/postcopy: count target page number to decide the place_neededWei Yang
In postcopy, a whole host page must be placed rather than a single target page. Currently, the code relies on the page offset to decide whether this is the last target page. We can also count the target pages during the iteration. When the number of target pages equals (host page size / target page size), this is the last target page in the host page. This is a preparation for non-ordered target page transmission. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
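The counting rule described in this commit can be sketched as follows. This is a minimal illustration, not QEMU's code: the `HostPageTracker` type and `tracker_page_received()` are hypothetical names, and the page sizes are supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receive-side state for one host page (not a QEMU type). */
typedef struct {
    size_t host_page_size;    /* e.g. 2 MiB for a huge page */
    size_t target_page_size;  /* e.g. 4 KiB */
    size_t target_pages;      /* target pages received for the current host page */
} HostPageTracker;

/*
 * Record one received target page. Returns true when the host page is
 * complete and should be placed, regardless of arrival order.
 */
static bool tracker_page_received(HostPageTracker *t)
{
    t->target_pages++;
    if (t->target_pages == t->host_page_size / t->target_page_size) {
        t->target_pages = 0;  /* start counting for the next host page */
        return true;
    }
    return false;
}
```

Because the decision depends only on the count, it no longer matters which offset within the host page arrives last.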
2020-01-20migration/postcopy: wait for decompress thread in precopyWei Yang
Compress is not supported with postcopy, so it is safe to wait for the decompress thread only in precopy. This is a preparation for a later patch. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration/postcopy: reduce memset when it is zero page and matches_target_page_sizeWei Yang
In this case, page_buffer content would not be used. Skip this to save some time. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration/ram: Yield periodically to the main loopYury Kotov
Usually, incoming migration coroutine yields to the main loop while its IO-channel is waiting for data to receive. But there is a case when RAM migration and data receive have the same speed: VM with huge zeroed RAM. In this case, IO-channel won't read and thus the main loop is stuck and for instance, it doesn't respond to QMP commands. For this case, yield periodically, but not too often, so as not to affect the speed of migration. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
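A minimal sketch of the "yield periodically, but not too often" idea, assuming a power-of-two interval so the check is a single mask operation. `YIELD_INTERVAL`, `load_pages()` and the callback type are hypothetical; the real patch yields from a coroutine rather than calling a callback.

```c
#include <stddef.h>

/* Hypothetical interval; must be a power of two for the mask test below. */
#define YIELD_INTERVAL 1024u

typedef void (*YieldFn)(void *opaque);

/* Test helper: counts how often the loop yielded. */
static void count_yield(void *opaque)
{
    ++*(size_t *)opaque;
}

/*
 * Process npages pages, handing control back to the caller's main loop
 * once every YIELD_INTERVAL pages. Returns the number of yields.
 */
static size_t load_pages(size_t npages, YieldFn yield, void *opaque)
{
    size_t yields = 0;

    for (size_t i = 0; i < npages; i++) {
        /* ... load page i here ... */
        if ((i & (YIELD_INTERVAL - 1)) == YIELD_INTERVAL - 1) {
            yield(opaque);
            yields++;
        }
    }
    return yields;
}
```

A larger interval costs less throughput but makes the main loop less responsive, which is the trade-off the commit message refers to.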
2020-01-20migration: savevm_state_handler_insert: constant-time element insertionScott Cheloha
savevm_state's SaveStateEntry TAILQ is a priority queue. Priority sorting is maintained by searching from head to tail for a suitable insertion spot. Insertion is thus an O(n) operation. If we instead keep track of the head of each priority's subqueue within that larger queue we can reduce this operation to O(1) time. savevm_state_handler_remove() becomes slightly more complex to accommodate these gains: we need to replace the head of a priority's subqueue when removing it. With O(1) insertion, booting VMs with many SaveStateEntry objects is more plausible. For example, a ppc64 VM with maxmem=8T has 40000 such objects to insert. Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
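The subqueue-head technique can be sketched with a hand-rolled doubly linked list. This is an illustration under assumptions, not the QEMU implementation: `Entry`, `Queue`, `PRI_MAX`, `queue_insert()` and `queue_remove()` are hypothetical names, and higher priorities are assumed to sort first.

```c
#include <stddef.h>

#define PRI_MAX 3  /* hypothetical highest priority level */

typedef struct Entry {
    int priority;                /* 0..PRI_MAX, higher sorts earlier */
    struct Entry *prev, *next;
} Entry;

typedef struct {
    Entry *head, *tail;
    Entry *pri_head[PRI_MAX + 1]; /* first entry of each priority, or NULL */
} Queue;

/* Insert at the end of e's priority subqueue. Cost is bounded by PRI_MAX. */
static void queue_insert(Queue *q, Entry *e)
{
    Entry *before = NULL;

    /* Find the head of the nearest lower, non-empty priority. */
    for (int i = e->priority - 1; i >= 0; i--) {
        if (q->pri_head[i]) {
            before = q->pri_head[i];
            break;
        }
    }

    e->next = before;
    e->prev = before ? before->prev : q->tail;
    if (e->prev) {
        e->prev->next = e;
    } else {
        q->head = e;
    }
    if (before) {
        before->prev = e;
    } else {
        q->tail = e;
    }
    if (!q->pri_head[e->priority]) {
        q->pri_head[e->priority] = e;
    }
}

/* Remove e, promoting its successor if e was the head of its subqueue. */
static void queue_remove(Queue *q, Entry *e)
{
    if (q->pri_head[e->priority] == e) {
        Entry *n = e->next;
        q->pri_head[e->priority] =
            (n && n->priority == e->priority) ? n : NULL;
    }
    if (e->prev) {
        e->prev->next = e->next;
    } else {
        q->head = e->next;
    }
    if (e->next) {
        e->next->prev = e->prev;
    } else {
        q->tail = e->prev;
    }
    e->prev = e->next = NULL;
}
```

The search loop runs over priority levels, not over queue entries, so insertion time does not grow with the number of entries.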
2020-01-20migration: add savevm_state_handler_remove()Scott Cheloha
Create a function to abstract common logic needed when removing a SaveStateEntry element from the savevm_state.handlers queue. For now we just remove the element. Soon it will involve additional cleanup. Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration: Fix the re-run check of the migrate-incoming commandYury Kotov
The current check sets an error but doesn't fail the command. This may cause a problem if new connection attempt by the same URI affects the first connection. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration: Fix incorrect integer->float conversion caught by clangFangrui Song
Clang does not like qmp_migrate_set_downtime()'s code to clamp double @value to 0..INT64_MAX:

    qemu/migration/migration.c:2038:24: error: implicit conversion from 'long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Werror,-Wimplicit-int-float-conversion]

The warning will be enabled by default in clang 10. It is not available for clang <= 9.

The clamp is actually useless; @value is checked to be within 0..MAX_MIGRATE_DOWNTIME_SECONDS immediately before. Delete it.

While there, make the conversion from double to int64_t explicit.

Signed-off-by: Fangrui Song <i@maskray.me> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Patch split, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
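The value change that clang reports can be reproduced in isolation. The sketch below is illustrative only: `int64_max_as_double()` and `double_to_int64_checked()` are hypothetical helpers, and the range check assumes the caller passes a `max` well below 2^63, as the commit says the existing downtime check does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * INT64_MAX (2^63 - 1) has no exact double representation, so the
 * conversion rounds it up to 2^63. That is the change clang warns about.
 */
static double int64_max_as_double(void)
{
    return (double)INT64_MAX;
}

/*
 * Range-check first, then convert explicitly. Assumes 0 <= max < 2^63 so
 * the cast is always well defined. NaN fails the comparison and is rejected.
 */
static bool double_to_int64_checked(double value, double max, int64_t *out)
{
    if (!(value >= 0 && value <= max)) {
        return false;
    }
    *out = (int64_t)value;
    return true;
}
```

With the range already validated against a small bound, a further clamp to INT64_MAX adds nothing, which is why the commit deletes it.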
2020-01-20migration: Rate limit inside host pagesDr. David Alan Gilbert
When using hugepages, rate limiting is necessary within each huge page, since a 1G huge page can take a significant time to send, so you end up with bursty behaviour. Fixes: 4c011c37ecb3 ("postcopy: Send whole huge pages") Reported-by: Lin Ma <LMa@suse.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20ram.c: remove unneeded labelsDaniel Henrique Barboza
ram_save_queue_pages() has an 'err' label that can be replaced by 'return -1' instead. Same thing with ram_discard_range(), and in this case we can also get rid of the 'ret' variable and return either '-1' on error or the result of ram_block_discard_range(). CC: Juan Quintela <quintela@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20migration: Make sure that we don't call write() in case of errorJuan Quintela
If we are exiting due to an error, finish, etc., don't even try to touch the channel with an IO operation. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2020-01-20multifd: Initialize local variableJuan Quintela
Fill everything with zero, so the padding fields are also initialized. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06vmstate: replace DeviceState with VMStateIfMarc-André Lureau
Replace DeviceState dependency with VMStateIf on vmstate API. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com>
2019-12-17colo: fix return without releasing RCUPaolo Bonzini
Use WITH_RCU_READ_LOCK_GUARD to avoid exiting colo_init_ram_cache without releasing RCU. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17migration: fix maybe-uninitialized warningMarc-André Lureau
../migration/ram.c: In function ‘multifd_recv_thread’:
/home/elmarco/src/qq/include/qapi/error.h:165:5: error: ‘block’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  165 | error_setg_internal((errp), __FILE__, __LINE__, __func__, \
      | ^~~~~~~~~~~~~~~~~~~
../migration/ram.c:818:15: note: ‘block’ was declared here
  818 | RAMBlock *block;
      | ^~~~~
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-16migration: ram: Switch to ram block writebackBeata Michalska
Switch to ram block writeback for pmem migration. Signed-off-by: Beata Michalska <beata.michalska@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20191121000843.24844-4-beata.michalska@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-25net/virtio: fix dev_unplug_pendingJens Freimann
.dev_unplug_pending is set up by virtio-net code independent of whether failover support was set for the device or not. This gives a wrong result when we check for existing primary devices in migration code. Fix this by actually calling dev_unplug_pending() instead of just checking if the function pointer was set. When the feature was not negotiated dev_unplug_pending() will always return false. This prevents us from going into the wait-unplug state when there's no primary device present. Fixes: 9711cd0dfc3f ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-10-29migration: add new migration state wait-unplugJens Freimann
This patch adds a new migration state called wait-unplug. It is entered after the SETUP state if failover devices are present. It will transition into ACTIVE once all devices were successfully unplugged from the guest. So if a guest doesn't respond or takes long to honor the unplug request the user will see the migration state 'wait-unplug'. In the migration thread we query failover devices if they're still pending the guest unplug. When all are unplugged the migration continues. If one device won't unplug migration will stay in wait_unplug state. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191029114905.6856-9-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-26core: replace getpagesize() with qemu_real_host_page_sizeWei Yang
There are three page sizes in qemu: real host page size, host page size, and target page size. Each has a dedicated variable to represent it. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-17block/dirty-bitmap: refactor bdrv_dirty_bitmap_nextVladimir Sementsov-Ogievskiy
bdrv_dirty_bitmap_next is always used in the same pattern. So, split it into _next and _first, instead of combining two functions into one, and add a FOR_EACH_DIRTY_BITMAP macro. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>
2019-10-17block/dirty-bitmap: add bs linkVladimir Sementsov-Ogievskiy
Add bs field to BdrvDirtyBitmap structure. Drop BlockDriverState parameter from bitmap APIs where possible. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-3-vsementsov@virtuozzo.com [Rebased on top of block-copy. --js] Signed-off-by: John Snow <jsnow@redhat.com>
2019-10-11migration: Support gtree migrationEric Auger
Introduce support for GTree migration. A custom save/restore is implemented. Each item is made of a key and a data. If the key is a pointer to an object, 2 VMSDs are passed into the GTree VMStateField. When putting the items, the tree is traversed in sorted order by g_tree_foreach. On the get() path, gtrees must be allocated using the proper key compare, key destroy and value destroy. This must be handled beforehand, for example in a pre_load method. Tests are added to test save/dump of structs containing gtrees including the virtio-iommu domain/mappings scenario. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20191011121724.433-1-eric.auger@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> uintptr_t fixup for test on 32bit
2019-10-11migration/multifd: pages->used would be cleared when attach to multifd_send_stateWei Yang
When we found an available channel in multifd_send_pages(), its pages->used is cleared and then attached to multifd_send_state. It is not necessary to do this twice. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-5-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/multifd: initialize packet->magic/version once at setup stageWei Yang
MultiFDPacket_t's magic and version fields never change during migration, so initialize these two fields in the setup stage. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-4-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/multifd: use pages->allocated instead of the static maxWei Yang
multifd_send_fill_packet() prepares metadata for the pages that follow. It is more proper to fill in pages->allocated instead of the static max value, especially since we want to support flexible packet sizes. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-3-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()Wei Yang
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-2-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNINGWei Yang
Currently, we set PostcopyState blindly to RUNNING, even if we found the previous state is not LISTENING. This will lead to a corner case. First let's look at the code flow:

    qemu_loadvm_state_main()
        ret = loadvm_process_command()
            loadvm_postcopy_handle_run()
                return -1;
        if (ret < 0) {
            if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
                ...
        }

From the above snippet, the corner case is that loadvm_postcopy_handle_run() always sets the state to RUNNING and only then checks the previous state. If the previous state is not LISTENING, it returns -1, but at this moment PostcopyState has already been set to RUNNING. Then ret is checked in qemu_loadvm_state_main(); when it is -1, PostcopyState is checked. The current logic would pause postcopy and retry if PostcopyState is RUNNING. This is not what we expect, because postcopy is not active yet. This patch makes sure the state is set to RUNNING only if the previous state is LISTENING, by checking the state first. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Suggested by: Peter Xu <peterx@redhat.com> Message-Id: <20191010011316.31363-3-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
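The check-then-set ordering this commit describes can be sketched with a small state machine. The enum values and `handle_run()` below are hypothetical stand-ins chosen for illustration, not QEMU's PostcopyState API.

```c
/* Hypothetical states mirroring the ones named in the commit message. */
typedef enum {
    PS_NONE,
    PS_ADVISE,
    PS_DISCARD,
    PS_LISTENING,
    PS_RUNNING,
    PS_END
} PState;

/*
 * Move to RUNNING only from LISTENING. On any other previous state,
 * fail and leave the state untouched, so a caller that inspects the
 * state after the failure does not mistake it for an active postcopy.
 */
static int handle_run(PState *state)
{
    if (*state != PS_LISTENING) {
        return -1;
    }
    *state = PS_RUNNING;
    return 0;
}
```

Setting the state before the check would leave RUNNING behind on the error path, which is the corner case the patch removes.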
2019-10-11migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setupWei Yang
Functions postcopy_ram_incoming_setup and postcopy_ram_incoming_cleanup are a pair. Rename to make this clear to readers. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191010011316.31363-2-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/postcopy: postpone setting PostcopyState to ENDWei Yang
There are two places that call postcopy_ram_incoming_cleanup():

    postcopy_ram_listen_thread on migration success
    loadvm_postcopy_handle_listen on setup failure

On success, the vm will never accept another migration. On failure, PostcopyState transitions from LISTENING to END and would be checked in qemu_loadvm_state_main(). If PostcopyState is RUNNING, migration would be paused and retried. Currently PostcopyState is set to END in function postcopy_ram_incoming_cleanup(). With the above analysis, we can take this step out and postpone it until the end of the listen thread to indicate that the listen thread is done. This is a preparation patch for later cleanup. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up in merge to the 1 parameter postcopy_state_set
2019-10-11migration/postcopy: mis->have_listen_thread check will never be touchedWei Yang
If mis->have_listen_thread is true, the current PostcopyState must be LISTENING or RUNNING. However, the check at the beginning of the function makes sure the state transition happens only when the previous PostcopyState is ADVISE or DISCARD, so this check is never reached. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: report SaveStateEntry id and name on failureWei Yang
This provides helpful information on which entry failed. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-5-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: pass in_postcopy instead of check state againWei Yang
Not necessary to do the check again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/postcopy: fix typo in mark_postcopy_blocktime_begin's commentWei Yang
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-3-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/postcopy: map large zero page in postcopy_ram_incoming_setup()Wei Yang
postcopy_ram_incoming_setup() and postcopy_ram_incoming_cleanup() are counterparts. It is reasonable to map/unmap the large zero page in these two functions respectively. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005135021.21721-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration/postcopy: allocate tmp_page in setup stageWei Yang
During migration, a tmp page is allocated so that we can place a whole host page during postcopy. Currently the page is allocated during the load stage, which is a little late. More importantly, if the allocation fails, the error is not checked properly: even if the page is NULL, we would still use it. This patch moves the allocation to the setup stage; if it fails, an error message is printed and the caller is notified. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: Don't try and recover return path in non-postcopyDr. David Alan Gilbert
In normal precopy we can't do reconnection recovery - but we also don't need to, since you can just rerun migration. At the moment if the 'return-path' capability is on, we use the return path in precopy to give a positive 'OK' to the end of migration; however if migration fails then we fall into the postcopy recovery path and hang. This fixes it by only running the return path in the postcopy case. Reported-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: Use automatic rcu_read unlock in rdma.cDr. David Alan Gilbert
Use the automatic read unlocker in migration/rdma.c. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-5-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: Use automatic rcu_read unlock in ram.cDr. David Alan Gilbert
Use the automatic read unlocker in migration/ram.c Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-4-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: Fix missing rcu_read_unlockDr. David Alan Gilbert
Use the automatic rcu_read unlocker to fix a missing unlock. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-10-11migration: use migration_is_active to represent active stateWei Yang
Wrap the check into a function to make it easy to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190717005341.14140-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration/postcopy: Recognise the recovery states as 'in_postcopy'Dr. David Alan Gilbert
Various parts of the migration code do different things when they're in postcopy mode; prior to this patch this has been 'postcopy-active'. This patch extends 'in_postcopy' to include 'postcopy-paused' and 'postcopy-recover'. In particular, when you set the max-postcopy-bandwidth parameter, this only affects the current migration fd if we're 'in_postcopy'; this leads to a race in the postcopy recovery test where it increases the speed from 4k/sec to unlimited, but that increase can get ignored if the change is made between the point at which the reconnection happens and it transitions back to active. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190923174942.12182-1-dgilbert@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration/rdma.c: Swap synchronize_rcu for call_rcuDr. David Alan Gilbert
This fixes a deadlock that can occur on the migration source after a failed RDMA migration; as the source tries to cleanup it clears a pair of pointers and uses synchronize_rcu to wait; this is happening on the main thread. With the CPUs running a CPU thread can be an rcu reader and attempt to grab the main lock (kvm_handle_io->address_space_write->flatview_write->flatview_write_continue-> prepare_mmio_access->qemu_mutex_lock_iothread_impl) Replace the synchronize_rcu with a call_rcu to postpone the freeing. Fixes: 74637e6f08fceda98806 ("migration: implement bi-directional RDMA QIOChannel") ( https://bugzilla.redhat.com/show_bug.cgi?id=1746787 ) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190913163507.1403-3-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration/rdma: Don't moan about disconnects at the endDr. David Alan Gilbert
If we've already finished the migration or something has already gone wrong, don't moan about the migration stream disconnecting. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190913163507.1403-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration: remove sent parameter in get_queued_page_not_dirtyWei Yang
This is a cleanup for previous removal of unsentmap. The sent parameter is not necessary now. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration/postcopy: unsentmap is not necessary for postcopyWei Yang
Commit f3f491fcd6dd594ba695 ('Postcopy: Maintain unsentmap') introduced unsentmap to track not yet sent pages. This is not necessary since:
* unsentmap is a sub-set of bmap before postcopy start
* unsentmap is the summation of bmap and unsentmap after canonicalizing
This patch just removes it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration/postcopy: not necessary to do discard when canonicalizing bitmapWei Yang
All pages, either partially sent or partially dirty, will be discarded in postcopy_send_discard_bm_ram(), since we update the unsentmap to be unsentmap = unsentmap | dirty in ram_postcopy_send_discard_bitmap(). So it is not necessary to discard when canonicalizing the bitmap. By doing so, we separate the page discard into two individual steps:
* canonicalize bitmap
* discard page
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-25migration: fix vmdesc leak on vmstate_save() errorMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20190912122514.22504-2-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-09-16block: Remove unused masksNir Soffer
Replace confusing usage:

    ~BDRV_SECTOR_MASK

With more clear:

    (BDRV_SECTOR_SIZE - 1)

Remove BDRV_SECTOR_MASK and the unused BDRV_BLOCK_OFFSET_MASK which was its last user.

Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-id: 20190827185913.27427-3-nsoffer@redhat.com Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
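As a small illustration of why `(SIZE - 1)` reads more directly than a negated mask, the sketch below extracts the offset within a sector. `SECTOR_SIZE`, `sector_offset()` and `is_sector_aligned()` are local, hypothetical names; the value 512 is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sector size for this example; must be a power of two. */
#define SECTOR_SIZE 512ULL

/* Byte offset within its sector: keep only the low bits. */
static uint64_t sector_offset(uint64_t byte_offset)
{
    return byte_offset & (SECTOR_SIZE - 1);
}

/* An offset is sector-aligned when those low bits are all zero. */
static bool is_sector_aligned(uint64_t byte_offset)
{
    return sector_offset(byte_offset) == 0;
}
```

Writing the low-bit mask directly avoids the double negation of defining a mask as `~(SIZE - 1)` and then inverting it again at the call site.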