aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-10-24iostatus: forward block_job_iostatus_reset to block jobPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-iotests: add mirroring test casePaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24mirror: implement completionPaolo Bonzini
Switching to the target of the migration is done mostly asynchronously, and reported to management via the BLOCK_JOB_COMPLETED event; the only synchronous phase is opening the backing files. bdrv_open_backing_file can always be done, even for migration of the full image (aka sync: 'full'). In this case, qmp_drive_mirror will create the target disk with no backing file at all, and bdrv_open_backing_file will be a no-op. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qmp: add drive-mirror commandPaolo Bonzini
This adds the monitor commands that start the mirroring job. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24mirror: introduce mirror jobPaolo Bonzini
This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU. The job is possibly never-ending, but it is logically structured into two phases: 1) copy all data as fast as possible until the target first gets in sync with the source; 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data. The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation). It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: introduce BLOCK_JOB_READY eventPaolo Bonzini
Even for jobs that need to be manually completed, management may want to take care itself of the completion, not requiring the user to issue a command to terminate the job. In this case we want to avoid that they poll us continuously, waiting for completion to become available. Thus, add a new event that signals the phase switch and the availability of the block-job-complete command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: add block-job-completePaolo Bonzini
While streaming can be dropped as soon as it progressed through the whole image, mirroring needs to be completed manually for two reasons: 1) so that management knows exactly when the VM switches to the target; 2) because for other use cases such as replication, we may leave the operation running for the whole life of the virtual machine. Add a new block job command that manually completes background operations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: rename block_job_complete to block_job_completedPaolo Bonzini
The imperative will be used for the QMP command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: export dirty bitmap information in query-blockPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: introduce new dirty bitmap functionalityPaolo Bonzini
Assert that write_compressed is never used with the dirty bitmap. Setting the bits early is wrong, because a coroutine might concurrently examine them and copy incomplete data from the source. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: add bdrv_open_backing_filePaolo Bonzini
Mirroring runs without the backing file so that it can be copied outside QEMU. However, we need to add it at the time the job is completed and QEMU switches to the target. Factor out the common bits of opening an image and completing a mirroring operation. The new function does not assume that the file is closed immediately after it returns failure, so it keeps the BDRV_O_NO_BACKING flag up-to-date. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: add bdrv_query_statsPaolo Bonzini
qmp_query_blockstat cannot have errors, remove the Error argument and create a new public function bdrv_query_stats out of it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: add bdrv_query_infoPaolo Bonzini
Extract it out of the implementation of "info block". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-config: Add new -add-fd command line optionCorey Bryant
This option can be used for passing file descriptors on the command line. It mirrors the existing add-fd QMP command which allows an fd to be passed to QEMU via SCM_RIGHTS and added to an fd set. This can be combined with commands such as -drive to link file descriptors in an fd set to a drive: qemu-kvm -add-fd fd=3,set=2,opaque="rdwr:/path/to/file" -add-fd fd=4,set=2,opaque="rdonly:/path/to/file" -drive file=/dev/fdset/2,index=0,media=disk This example adds dups of fds 3 and 4, and the accompanying opaque strings to the fd set with ID=2. qemu_open() already knows how to handle a filename of this format. qemu_open() searches the corresponding fd set for an fd and when it finds a match, QEMU goes on to use a dup of that fd just like it would have used an fd that it opened itself. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24monitor: Prevent removing fd from set during initCorey Bryant
If an fd is added to an fd set via the command line, and it is not referenced by another command line option (ie. -drive), then clean it up after QEMU initialization is complete. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24monitor: Enable adding an inherited fd to an fd setCorey Bryant
qmp_add_fd() gets an fd that was received over a socket with SCM_RIGHTS and adds it to an fd set. This patch adds support that will enable adding an fd that was inherited on the command line to an fd set. Note: All of the code added to monitor_fdset_add_fd(), with the exception of the error path for non-valid fdset-id, is code motion from qmp_add_fd(). Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24monitor: Allow add-fd to any specified fd setCorey Bryant
The first call to add an fd to an fd set was previously not allowed to choose the fd set ID. The ID was generated as the first available and ensuing calls could add more fds by specifying the fd set ID. This change allows users to choose the fd set ID on the first call. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: bdrv_create(): don't leak cco.filename on errorLuiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-img: document 'info --backing-chain'Kashyap Chamarthy
Signed-off-by: Kashyap Chamarthy <kashyap.cv@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-iotests: Add 043 backing file chain infinite loop testStefan Hajnoczi
This new test verifies that qemu-img info --backing-chain safely aborts when an image file has a backing file infinite loop. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-img: Add --backing-chain option to info commandStefan Hajnoczi
The qemu-img info --backing-chain option enumerates the backing file chain. For example, for base.qcow2 <- snap1.qcow2 <- snap2.qcow2 the output becomes: $ qemu-img info --backing-chain snap2.qcow2 image: snap2.qcow2 file format: qcow2 virtual size: 100M (104857600 bytes) disk size: 196K cluster_size: 65536 backing file: snap1.qcow2 backing file format: qcow2 image: snap1.qcow2 file format: qcow2 virtual size: 100M (104857600 bytes) disk size: 196K cluster_size: 65536 backing file: base.qcow2 backing file format: qcow2 image: base.qcow2 file format: qcow2 virtual size: 100M (104857600 bytes) disk size: 136K cluster_size: 65536 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-iotests: add relative backing file tests for block-commit (040)Jeff Cody
The previous block commit used absolute filenames for all block-commit images and commands; this adds relative filenames for the same tests. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: in commit, determine base image from the top imageJeff Cody
This simplifies some code and error checking, and also fixes a bug. bdrv_find_backing_image() should only be passed absolute filenames, or filenames relative to the chain. In the QMP message handler for block commit, when looking up the base do so from the determined top image, so we know it is reachable from top. Some of the error messages put out by block-commit have changed slightly, which causes 2 tests cases for block-commit to fail. This patch updates the test cases to look for the correct error output. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24block: make bdrv_find_backing_image compare canonical filenamesJeff Cody
Currently, bdrv_find_backing_image compares bs->backing_file with what is passed in as a backing_file name. Mismatches may occur, however, when bs->backing_file and backing_file are not both absolute or relative. Use path_combine() to make sure any relative backing filenames are relative to the current image filename being searched, and then use realpath() to make all comparisons based on absolute filenames. If either backing_file or bs->backing_file is determine to be a protocol, then no filename normalization is performed. This also changes bdrv_find_backing_image to no longer be recursive, but iterative. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-img rebase: use empty string to rebase without backing fileAlex Bligh
This patch allows an empty filename to be passed as the new base image name for qemu-img rebase to mean base the image on no backing file (i.e. independent of any backing file). According to Eric Blake, qemu-img rebase already supports this when '-u' is used; this adds support when -u is not used. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qmp: fix __accept() in qmp.pyJeff Cody
In QEMUMonitorProtocol, commit e9d17b6 removed the __sockfile creation from __negotiate_capabilities(), which breaks _accept(). This causes failures in qemu-io python based tests (i.e. tests 030 and 040). This patch creates the sockfile in __accept() as well. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24qemu-iotests: Test qemu-img operation on zero size imageKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-24qemu-img: Fix division by zero for zero size imagesKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-23Rename target_phys_addr_t to hwaddrAvi Kivity
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are reserved) and its purpose doesn't match the name (most target_phys_addr_t addresses are not target specific). Replace it with a finger-friendly, standards conformant hwaddr. Outstanding patchsets can be fixed up with the command git rebase -i --exec 'find -name "*.[ch]" | xargs s/target_phys_addr_t/hwaddr/g' origin Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22Merge remote-tracking branch 'qemu-kvm/memory/urgent' into stagingAnthony Liguori
* qemu-kvm/memory/urgent: memory: abort if a memory region is destroyed during a transaction i440fx: avoid destroying memory regions within a transaction memory: Make eventfd adhere to device endianness
2012-10-22Merge remote-tracking branch 'awilliam/tags/vfio-pci-for-qemu-20121017.0' ↵Anthony Liguori
into staging * awilliam/tags/vfio-pci-for-qemu-20121017.0: vfio-pci: Mark non-migratable vfio-pci: Fix debug build
2012-10-22usb-serial: only expose device in guest when the chardev is openGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22usb-serial: don't magically zap chardev on umplugGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22serial: add pci-serial documentationGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22serial: add 2x + 4x pci variantGerd Hoffmann
Add multiport serial card implementation, with two variants, one featuring two and one featuring four ports. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22serial: add windows inf file for the pci card to docsGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22serial: add pci variantGerd Hoffmann
So we get a hot-pluggable 16550 uart. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22serial: split serial.cGerd Hoffmann
Split serial.c into serial.c, serial.h and serial-isa.c. While being at creating a serial.h header file move the serial prototypes from pc.h to the new serial.h. The latter leads to s/pc.h/serial.h/ in tons of boards which just want the serial bits from pc.h Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22Call MADV_HUGEPAGE for guest RAM allocationsLuiz Capitulino
This makes it possible for QEMU to use transparent huge pages (THP) when transparent_hugepage/enabled=madvise. Otherwise THP is only used when it's enabled system wide. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22Merge remote-tracking branch 'quintela/migration-next-20121017' into stagingAnthony Liguori
* quintela/migration-next-20121017: (41 commits) cpus: create qemu_in_vcpu_thread() savevm: make qemu_file_put_notify() return errors savevm: un-export qemu_file_set_error() block-migration: handle errors with the return codes correctly block-migration: Switch meaning of return value block-migration: make flush_blks() return errors buffered_file: buffered_put_buffer() don't need to set last_error savevm: Only qemu_fflush() can generate errors savevm: make qemu_fill_buffer() be consistent savevm: unexport qemu_ftell() savevm: unfold qemu_fclose_internal() savevm: make qemu_fflush() return an error code savevm: Remove qemu_fseek() virtio-net: use qemu_get_buffer() in a temp buffer savevm: unexport qemu_fflush migration: make migrate_fd_wait_for_unfreeze() return errors buffered_file: make buffered_flush return the error code buffered_file: callers of buffered_flush() already check for errors buffered_file: We can access directly to bandwidth_limit buffered_file: unfold migrate_fd_close ...
2012-10-22Merge remote-tracking branch 'qemu-kvm/memory/dma' into stagingAnthony Liguori
* qemu-kvm/memory/dma: (23 commits) pci: honor PCI_COMMAND_MASTER pci: give each device its own address space memory: add address_space_destroy() dma: make dma access its own address space memory: per-AddressSpace dispatch s390: avoid reaching into memory core internals memory: use AddressSpace for MemoryListener filtering memory: move tcg flush into a tcg memory listener memory: move address_space_memory and address_space_io out of memory core memory: manage coalesced mmio via a MemoryListener xen: drop no-op MemoryListener callbacks kvm: drop no-op MemoryListener callbacks xen_pt: drop no-op MemoryListener callbacks vfio: drop no-op MemoryListener callbacks memory: drop no-op MemoryListener callbacks memory: provide defaults for MemoryListener operations memory: maintain a list of address spaces memory: export AddressSpace memory: prepare AddressSpace for exporting xen_pt: use separate MemoryListeners for memory and I/O ...
2012-10-22pci: honor PCI_COMMAND_MASTERAvi Kivity
Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when the bit is clear. Honor PCI_COMMAND_MASTER by inserting a memory region into the device's bus master address space, and tying its enable status to PCI_COMMAND_MASTER. Tested using setpci -s 03 COMMAND=3 while a ping was running on a NIC in slot 3. The kernel (Linux) detected the stall and recovered after the command setpci -s 03 COMMAND=7 was issued. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22pci: give each device its own address spaceAvi Kivity
Accesses from different devices can resolve differently (depending on bridge settings, iommus, and PCI_COMMAND_MASTER), so set up an address space for each device. Currently iommus are expressed outside the memory API, so this doesn't work if an iommu is present. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22memory: add address_space_destroy()Avi Kivity
Since address spaces can be created dynamically by device hotplug, they can also be destroyed dynamically. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22dma: make dma access its own address spaceAvi Kivity
Instead of accessing the cpu address space, use an address space configured by the caller. Eventually all dma functionality will be folded into AddressSpace, but we have to start from something. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22memory: per-AddressSpace dispatchAvi Kivity
Currently we use a global radix tree to dispatch memory access. This only works with a single address space; to support multiple address spaces we make the radix tree a member of AddressSpace (via an intermediate structure AddressSpaceDispatch to avoid exposing too many internals). A side effect is that address_space_io also gains a dispatch table. When we remove all the pre-memory-API I/O registrations, we can use that for dispatching I/O and get rid of the original I/O dispatch. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22s390: avoid reaching into memory core internalsAvi Kivity
use cpu_physical_memory_is_io() instead. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22memory: use AddressSpace for MemoryListener filteringAvi Kivity
Using the AddressSpace type reduces confusion, as you can't accidentally supply the MemoryRegion you're interested in. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22memory: move tcg flush into a tcg memory listenerAvi Kivity
We plan to make the core listener listen to all address spaces; this will cause many more flushes than necessary. Prepare for that by moving the flush into a tcg-specific listener. Later we can avoid registering the listener if tcg is disabled. Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22memory: move address_space_memory and address_space_io out of memory coreAvi Kivity
With this change, memory.c no longer knows anything about special address spaces, so it is prepared for AddressSpace based DMA. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>