aboutsummaryrefslogtreecommitdiff
path: root/migration/ram.c
AgeCommit message (Collapse)Author
2015-12-11Fix xbzrle vs last_sent_block updateDr. David Alan Gilbert
My fix (84e7b80a) replaced the last_sent_block update that I'd removed earlier; however it was too aggressive in the xbzrle case. save_xbzrle_page might return '0' to mean that the page didn't need sending since it was the same as the last sent version; in this case we can't update 'last_sent_block' since we didn't actually send it. Symptom: 'Illegal RAM offset 1018000' as we try and send a page to the wrong RAMBlock; potentially that could be a data corruption if you were really unlucky. Fixes: 84e7b80a05c0c44b90533c6cd2f1db5c932ccf77 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-id: 1449765106-6528-1-git-send-email-dgilbert@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-19Set last_sent_blockDr. David Alan Gilbert
In a82d593b61054b3dea43 I accidentally removed the setting of last_sent_block, put it back. Symptoms: Multithreaded compression only uses one thread. Migration is a bit less efficient since it won't use 'cont' flags. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: a82d593b61054b3dea43 Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12Postcopy: Fix TP!=HP zero caseDr. David Alan Gilbert
Where the target page size is different from the host page we special case it, but I messed up on the zero case check. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12migration: print ram_addr_t as RAM_ADDR_FMT not %zxJuan Quintela
Not all the wold is 64bits (yet). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2015-11-10Host page!=target page: Cleanup bitmapsDr. David Alan Gilbert
Prior to the start of postcopy, ensure that everything that will be transferred later is a whole host-page in size. This is accomplished by discarding partially transferred host pages and marking any that are partially dirty as fully dirty. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Don't sync dirty bitmaps in postcopyDr. David Alan Gilbert
Once we're in postcopy the source processors are stopped and memory shouldn't change any more, so there's no need to look at the dirty map. There are two notes to this: 1) If we do resync and a page had changed then the page would get sent again, which the destination wouldn't allow (since it might have also modified the page) 2) Before disabling this I'd seen very rare cases where a page had been marked dirtied although the memory contents are apparently identical Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10postcopy: Check order of received target pagesDr. David Alan Gilbert
Ensure that target pages received within a host page are in order. This shouldn't trigger, but in the cases where the sender goes wrong and sends stuff out of order it produces a corruption that's really nasty to debug. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Postcopy: Use helpers to map pages during migrationDr. David Alan Gilbert
In postcopy, the destination guest is running at the same time as it's receiving pages; as we receive new pages we must put them into the guests address space atomically to avoid a running CPU accessing a partially written page. Use the helpers in postcopy-ram.c to map these pages. qemu_get_buffer_in_place is used to avoid a copy out of qemu_file in the case that postcopy is going to do a copy anyway. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Page request: Consume pages off the post-copy queueDr. David Alan Gilbert
When transmitting RAM pages, consume pages that have been queued by MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning. Note: a) After a queued page the linear walk carries on from after the unqueued page; there is a reasonable chance that the destination was about to ask for other closeby pages anyway. b) We have to be careful of any assumptions that the page walking code makes, in particular it does some short cuts on its first linear walk that break as soon as we do a queued page. c) We have to be careful to not break up host-page size chunks, since this makes it harder to place the pages on the destination. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Page request: Process incoming page requestDr. David Alan Gilbert
On receiving MIG_RPCOMM_REQ_PAGES look up the address and queue the page. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10postcopy: Incoming initialisationDr. David Alan Gilbert
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10migration_completion: Take current stateDr. David Alan Gilbert
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we complete migration, and we need to know which we expect to be in to change state safely. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Postcopy: Maintain unsentmapDr. David Alan Gilbert
Maintain an 'unsentmap' of pages that have yet to be sent. This is used in the following patches to discard some set of the pages already sent as we enter postcopy mode. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Add qemu_savevm_state_complete_postcopyDr. David Alan Gilbert
Add qemu_savevm_state_complete_postcopy to complement qemu_savevm_state_complete_precopy together with a new save_live_complete_postcopy method on devices. The save_live_complete_precopy method is called on all devices during a precopy migration, and all non-postcopy devices during a postcopy migration at the transition. The save_live_complete_postcopy method is called at the end of postcopy for all postcopiable devices. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Modify save_live_pending for postcopyDr. David Alan Gilbert
Modify save_live_pending to return separate postcopiable and non-postcopiable counts. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Rename save_live_complete to save_live_complete_precopyDr. David Alan Gilbert
In postcopy we're going to need to perform the complete phase for postcopiable devices at a different point, start out by renaming all of the 'complete's to make the difference obvious. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10ram_load: Factor out host_from_stream_offset call and checkDr. David Alan Gilbert
The main RAM load loop has a call to host_from_stream_offset for each page type that actually loads data with the same test; factor it out before the switch. The host = NULL is to silence a bogus gcc warning of an unitialised in the RAM_SAVE_COMPRESS_PAGE case, it doesn't seem to realise that host is always initialised by the if at the top in the cases the switch takes. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10ram_debug_dump_bitmap: Dump a migration bitmap as textDr. David Alan Gilbert
Useful for debugging the migration bitmap and other bitmaps of the same format (including the sentmap in postcopy). The bitmap is printed to stderr. Lines that are all the expected value are excluded so the output can be quite compact for many bitmaps. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10qemu_ram_block_by_nameDr. David Alan Gilbert
Add a function to find a RAMBlock by name; use it in two of the places that already open code that loop; we've got another use later in postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-04migration: code clean upLiang Li
Just clean up code, no behavior change. Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com>al3 Reviewed-by: Amit Shah <amit.shah@redhat.com>al3 Signed-off-by: Juan Quintela <quintela@redhat.com>al3
2015-11-04migration: rename cancel to cleanup in SaveVMHandlesLiang Li
'cleanup' seems more appropriate than 'cancel'. Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com>al3 Reviewed-by: Amit Shah <amit.shah@redhat.com>al3 Signed-off-by: Juan Quintela <quintela@redhat.com>al3
2015-11-04migration: defer migration_end & blk_mig_cleanupLiang Li
Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a lazy collapsing of small sptes into large sptes mechanism, now migration_end() is a time consuming operation because it calls memroy_global_dirty_log_stop(), which will trigger the dropping of small sptes operation and takes about dozens of milliseconds, so call migration_end() before all the vmsate data has already been transferred to the destination will prolong VM downtime. This operation should be deferred after all the data has been transferred to the destination. blk_mig_cleanup() can be deferred too. For a VM with 8G RAM, this patch can reduce the VM downtime about 30 ms. Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>al3 Reviewed-by: Amit Shah <amit.shah@redhat.com>al3 Signed-off-by: Juan Quintela <quintela@redhat.com>al3
2015-10-15migration: fix deadlockDenis V. Lunev
Release qemu global mutex before call synchronize_rcu(). synchronize_rcu() waiting for all readers to finish their critical sections. There is at least one critical section in which we try to get QGM (critical section is in address_space_rw() and prepare_mmio_access() is trying to aquire QGM). Both functions (migration_end() and migration_bitmap_extend()) are called from main thread which is holding QGM. Thus there is a race condition that ends up with deadlock: main thread working thread Lock QGA | | Call KVM_EXIT_IO handler | | | Open rcu reader's critical section Migration cleanup bh | | | synchronize_rcu() is | waiting for readers | | prepare_mmio_access() is waiting for QGM \ / deadlock The patch changes bitmap freeing from direct g_free after synchronize_rcu to free inside call_rcu. Signed-off-by: Denis V. Lunev <den@openvz.org> Reported-by: Igor Redko <redkoi@virtuozzo.com> Tested-by: Igor Redko <redkoi@virtuozzo.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> CC: Anna Melekhova <annam@virtuozzo.com> CC: Juan Quintela <quintela@redhat.com> CC: Amit Shah <amit.shah@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Wen Congyang <wency@cn.fujitsu.com>
2015-09-30migration: Dynamic cpu throttling for auto-convergeJason J. Herne
Remove traditional auto-converge static 30ms throttling code and replace it with a dynamic throttling algorithm. Additionally, be more aggressive when deciding when to start throttling. Previously we waited until four unproductive memory passes. Now we begin throttling after only two unproductive memory passes. Four seemed quite arbitrary and only waiting for two passes allows us to complete the migration faster. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-29ram_find_and_save_block: Split out the findingDr. David Alan Gilbert
Split out the finding of the dirty page and all the wrap detection into a separate function since it was getting a bit hairy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1443018431-11170-3-git-send-email-dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> [Fix comment -- Amit] Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29Move dirty page search state into separate structureDr. David Alan Gilbert
Pull the search state for one iteration of the dirty page search into a structure. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1443018431-11170-2-git-send-email-dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29migration/ram.c: Use RAMBlock rather than MemoryRegionDr. David Alan Gilbert
RAM migration mainly works on RAMBlocks but in a few places uses data from MemoryRegions to access the same information that's already held in RAMBlocks; clean it up just to avoid the MemoryRegion use. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1439463094-5394-2-git-send-email-dgilbert@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-07-15migration: reduce the count of strlen callLiang Li
'strlen' is called three times in 'save_page_header', it's inefficient. Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-09migration: fix RCU deadlockPaolo Bonzini
migration_end calls synchronize_rcu() within a critical section. That causes a deadlock; move the call after rcu_read_unlock(). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-07migration: extend migration_bitmapLi Zhijian
Prevously, if we hotplug a device(e.g. device_add e1000) during migration is processing in source side, qemu will add a new ram block but migration_bitmap is not extended. In this case, migration_bitmap will overflow and lead qemu abort unexpectedly. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-07migration: protect migration_bitmapLi Zhijian
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-07Rework ram_control_load_hook to hook during block loadDr. David Alan Gilbert
We need the names of RAMBlocks as they're loaded for RDMA, reuse a slightly modified ram_control_load_hook: a) Pass a 'data' parameter to use for the name in the block-reg case b) Only some hook types now require the presence of a hook function. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12arch_init: Clean up the duplicate variable 'len' defining in ram_load()zhanghailiang
There are two places that define 'len' variable, It's OK for compiling, but makes it difficult for reading. Remove the local one which defined in the inside 'while' loop. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12migration: reduce include filesJuan Quintela
To make changes easier, with the copy, I maintained almost all include files. Now I remove the unnecessary ones on this patch. This compiles on linux x64 with all architectures configured, and cross-compiles for windows 32 and 64 bits. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12migration: Add myself to the copyright list of both filesJuan Quintela
If anyone feels like adding himself to the list, just sent me a patch. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12migration: move ram stuff to migration/ramJuan Quintela
For historic reasons, ram migration have been on arch_init.c. Just split it into migration/ram.c, the same that happened with block.c. There is only code movement, no changes altogether. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>