aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-01-25pseries: Improve handling of multiple PCI host bridgesDavid Gibson
Multiple - even many - PCI host bridges (i.e. PCI domains) are very common on real PAPR compliant hardware. For reasons related to the PAPR specified IOMMU interfaces, PCI device assignment with VFIO will generally require at least two (virtual) PHBs and possibly more depending on which devices are assigned. At the moment the qemu PAPR PCI code will not deal with this well, leaving several crucial parameters of PHBs other than the default one uninitialized. This patch reworks the code to allow this. Every PHB needs a unique BUID (Bus Unit Identifier, the id used for the PAPR PCI related interfaces) and a unique LIOBN (Logical IO Bus Number, the id used for the PAPR IOMMU related interfaces). In addition they need windows in CPU real address space to access PCI memory space, PCI IO space and MSIs. Properties are added to the PCI host bridge qdevice to allow configuration of all these. To simplify configuration of multiple PHBs for common cases, a convenience "index" property is also added. This can be set instead of the low-level properties, and will generate suitable values for the other parameters, different for each index value. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25target-ppc: Give a meaningful error if too many threads are specifiedMike Qiu
Currently the target-ppc tcg code only supports a single thread. You can specify more, but they're treated identically to multiple cores. On KVM we obviously can't support more threads than the hardware; if more are specified it will cause strange and cryptic errors. This patch clarifies the situation by giving a simple meaningful error if more threads are specified than we can support. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25cuda: Move ADB bus into CUDA stateAndreas Färber
Replace the global adb_bus with a CUDA-internal one, accessed using regular qdev child bus accessor. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25adb: QOM'ify ADB devicesAndreas Färber
They were not qdev'ified before. Derive ADBDevice from DeviceState and convert reset callbacks to DeviceClass::reset, ADBDevice::opaque pointer to ADBDevice subtypes for mouse and keyboard and adb_{kbd,mouse}_init() to regular qdev functions. Fixing Coding Style issues and splitting keyboard and mouse off into their own files is left for a later point in time. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25adb: QOM'ify Apple Desktop BusAndreas Färber
It was not a qbus before, turn it into a first-class bus and initialize it properly from CUDA. Leave it a global variable as long as devices are not QOM'ified yet. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25cuda: QOM'ify CUDAAndreas Färber
It was not qdev'ified before. Turn it into a SysBusDevice and embed it in MacIO. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25ide/macio: QOM'ify MacIO IDEAndreas Färber
It was not qdev'ified before. Turn it into a SysBusDevice. Embed them into the MacIO devices. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25mac_nvram: QOM'ify MacIO NVRAMAndreas Färber
It was not qdev'ified before. Turn it into a SysBusDevice and initialize it via static properties. Prepare Old World specific MacIO state and embed the NVRAM state there. Drop macio_nvram_setup_bar() in favor of sysbus_mmio_map() or direct use of Memory API. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25mac_nvram: Mark as Big EndianAndreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25mac_nvram: Clean up public APIAndreas Färber
The state data field is accessed in uint8_t quantities, so switch from uint32_t argument and return value to uint8_t. Fix debug format specifiers while at it. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25macio: Split MacIO in twoAndreas Färber
Let the machines create two different types. This prepares to move knowledge about sub-devices from the machines into the devices. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25macio: Delay qdev init until all fields are initializedAndreas Färber
This turns macio_bar_setup() into an implementation detail of the qdev initfn, to be removed step by step. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25macio: QOM'ify some moreAndreas Färber
Move bar MemoryRegion initialization to an instance_init. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25ppc: Move Mac machines to hw/ppc/Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de> [agraf: squash in MAINTAINERS fix] Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-25ide: Add fall through annotationsKevin Wolf
Add comments to help static analysers detect that these cases are intentional, and clean up some whitespace in the environment of these comments. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2013-01-25block: Create proper size file for disk mirrorVishvananda Ishaya
The qmp monitor command to mirror a disk was passing -1 for size along with the disk's backing file. This size of the resulting disk is the size of the backing file, which is incorrect if the disk has been resized. Therefore we should always pass in the size of the current disk. Signed-off-by: Vishvananda Ishaya <vishvananda@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25ahci: Add migration supportJason Baron
Jason tested these patches by migrating Windows 7 and Fedora 17 guests (while under I/O) on both piix with ahci attached and on q35 (which has a built-in AHCI controller). Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25ahci: Change data types in preparation for migrationKevin Wolf
The size of an int depends on the host, so in order to be able to migrate these fields, make them either int32_t or bool, depending on the use. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25ahci: Remove unused AHCIDevice fieldsJason Baron
'dma_status' and 'dma_cb' are written to, but never read. Remove these fields in preparation for AHCI migration bits. Signed-off-by: Jason Baron <jbaron@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25hbitmap: add assertion on hbitmap_iter_initPaolo Bonzini
hbitmap_iter_init causes an out-of-bounds access when the "first" argument is or greater than or equal to the size of the bitmap. Forbid this with an assertion, and remove the failing testcase. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: do nothing on zero-sized diskPaolo Bonzini
On a zero-sized disk we need to break out of the job successfully before bdrv_dirty_iter_init is called, otherwise you will get an assertion failure with the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block/vdi: Check for bad signatureStefan Weil
vdi_open did not check for a bad signature. This check was only in vdi_probe. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block/vdi: Improved return values from vdi_openStefan Weil
vdi_open returned -1 in case of any error, but it should return an error code (negative value of errno or -EMEDIUMTYPE). Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block/vdi: Improve debug output for signatureStefan Weil
The signature is a 32 bit value and needs up to 8 hex digits for printing. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: Use error code EMEDIUMTYPE for wrong format in some block driversStefan Weil
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: Add special error code for wrong formatStefan Weil
The block drivers need a special error code for "wrong format". From the available error codes EMEDIUMTYPE fits best. It is not available on all platforms, so a definition in qemu-common.h and a specific error report are needed. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: support arbitrarily-sized iterationsPaolo Bonzini
Yet another optimization is to extend the mirroring iteration to include more adjacent dirty blocks. This limits the number of I/O operations and makes mirroring efficient even with a small granularity. Most of the infrastructure is already in place; we only need to put a loop around the computation of the origin and sector count of the iteration. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: support more than one in-flight AIO operationPaolo Bonzini
With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks, and there is a free list to access them. Because of copy-on-write, a single operation may already require multiple chunks to be available on the free list. In addition, two different iterations on the HBitmap may want to copy the same cluster. We avoid this by keeping a bitmap of in-flight I/O operations, and blocking until the previous iteration completes. This should be a pretty rare occurrence, though; as long as there is no overlap the next iteration can start before the previous one finishes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: add buf-size argument to drive-mirrorPaolo Bonzini
This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: switch mirror_iteration to AIOPaolo Bonzini
There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch) and some of the logic to count in-flight operations and only complete the job when there is none. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: allow customizing the granularityPaolo Bonzini
The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress. Allow the user to customize it, while providing a sane default so that in general there will be no extra allocated space in the target compared to the source. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: allow customizing the granularity of the dirty bitmapPaolo Bonzini
Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: return count of dirty sectors, not chunksPaolo Bonzini
Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: perform COW if the cluster size is bigger than the granularityPaolo Bonzini
When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens if the granularity of the dirty bitmap is smaller than the cluster size (and only for clusters that are allocated in the source after the job has started copying). So far, the granularity was fixed to 1MB; to avoid the problem we detected the situation and required the backing files to be available in that case only. However, we want to lower the granularity for efficiency, so we need a better solution. The solution is to always copy a whole cluster the first time it is touched. The code keeps a bitmap of clusters that have already been allocated by the mirroring job, and only does "manual" copy-on-write if the chunk being copied is zero in the bitmap. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: make round_to_clusters publicPaolo Bonzini
This is needed in the following patch. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: implement dirty bitmap using HBitmapPaolo Bonzini
This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter. Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts) Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25host-utils: add ffslPaolo Bonzini
We can provide fast versions based on the other functions defined by host-utils.h. Some care is required on glibc, which provides ffsl already. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25add hierarchical bitmap data type and test casesPaolo Bonzini
HBitmaps provides an array of bits. The bits are stored as usual in an array of unsigned longs, but HBitmap is also optimized to provide fast iteration over set bits; going from one bit to the next is O(logB n) worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough that the number of levels is in fact fixed. In order to do this, it stacks multiple bitmaps with progressively coarser granularity; in all levels except the last, bit N is set iff the N-th unsigned long is nonzero in the immediately next level. When iteration completes on the last level it can examine the 2nd-last level to quickly skip entire words, and even do so recursively to skip blocks of 64 words or powers thereof (32 on 32-bit machines). Given an index in the bitmap, it can be split in group of bits like this (for the 64-bit case): bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word So it is easy to move up simply by shifting the index right by log2(BITS_PER_LONG) bits. To move down, you shift the index left similarly, and add the word index within the group. Iteration uses ffs (find first set bit) to find the next word to examine; this operation can be done in constant time in most current architectures. Setting or clearing a range of m bits on all levels, the work to perform is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap. When iterating on a bitmap, each bit (on any level) is only visited once. Hence, The total cost of visiting a bitmap with m bits in it is the number of bits that are set in all bitmaps. Unless the bitmap is extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized cost of advancing from one bit to the next is usually constant. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25QAPI: Introduce memchar-read QMP commandLei Li
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-01-25QAPI: Introduce memchar-write QMP commandLei Li
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-01-25qemu-char: Add new char backend CirMemCharDriverLei Li
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-01-25docs: document virtio-balloon statsLuiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-01-25balloon: re-enable balloon statsLuiz Capitulino
The statistics are now available through device properties via a polling mechanism. First a client has to enable polling, then it can query available stats. Polling is enabled by setting an update interval (in seconds) to a property named guest-stats-polling-interval, like this: { "execute": "qom-set", "arguments": { "path": "/machine/peripheral-anon/device[1]", "property": "guest-stats-polling-interval", "value": 4 } } Then the available stats can be retrieved by querying the guest-stats property. The returned object is a dict containing all available stats. Example: { "execute": "qom-get", "arguments": { "path": "/machine/peripheral-anon/device[1]", "property": "guest-stats" } } { "return": { "stats": { "stat-swap-out": 0, "stat-free-memory": 844943360, "stat-minor-faults": 219028, "stat-major-faults": 235, "stat-total-memory": 1044406272, "stat-swap-in": 0 }, "last-update": 1358529861 } } Please, check the next commit for full documentation. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-01-25balloon: drop old stats code & APILuiz Capitulino
Next commit will re-enable balloon stats with a different interface, but this old code conflicts with it. Let's drop it. It's important to note that the QMP and HMP interfaces are also dropped by this commit. That shouldn't be a problem though, because: 1. All QMP fields are optional 2. This feature has always been disabled Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-01-25block: Monitor command commit neglects to report some errorsJeff Cody
The non-live bdrv_commit() function may return one of the following errors: -ENOTSUP, -EBUSY, -EACCES, -EIO. The only error that is checked in the HMP handler is -EBUSY, so the monitor command 'commit' silently fails for all error cases other than 'Device is in use'. Report error using monitor_printf() and strerror(), and convert existing qerror_report() calls in do_commit() to monitor_printf(). Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-01-24Merge remote-tracking branch 'bonzini/scsi-next' into stagingAnthony Liguori
# By Paolo Bonzini (1) and Peter Lieven (1) # Via Paolo Bonzini * bonzini/scsi-next: iscsi: add support for iovectors iscsi: do not leak acb->buf when commands are aborted
2013-01-24Revert "serial: fix retry logic"Michael Tokarev
This reverts commit 67c5322d7000fd105a926eec44bc1765b7d70bdd: I'm not sure if the retry logic has ever worked when not using FIFO mode. I found this while writing a test case although code inspection confirms it is definitely broken. The TSR retry logic will never actually happen because it is guarded by an 'if (s->tsr_rety > 0)' but this is the only place that can ever make the variable greater than zero. That effectively makes the retry logic an 'if (0) I believe this is a typo and the intention was >= 0. Once this is fixed thoug I see double transmits with my test case. This is because in the non FIFO case, serial_xmit may get invoked while LSR.THRE is still high because the character was processed but the retransmit timer was still active. We can handle this by simply checking for LSR.THRE and returning early. It's possible that the FIFO paths also need some attention. Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Even if the previous logic was never worked, new logic breaks stuff - namely, qemu -enable-kvm -nographic -kernel /boot/vmlinuz-$(uname -r) -append console=ttyS0 -serial pty the above command will cause the virtual machine to stuck at startup using 100% CPU till one connects to the pty and sends any char to it. Note this is rather typical invocation for various headless virtual machines by libvirt. So revert this change for now, till a better solution will be found. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-01-24iscsi: add support for iovectorsPeter Lieven
This patch adds support for directly passing the iovec array from QEMUIOVector if libiscsi supports it (1.8.0 or newer). Signed-off-by: Peter Lieven <pl@kamp.de> [Preserve the improvements from commit 4cc841b, iscsi: partly avoid iovec linearization in iscsi_aio_writev, 2012-11-19 - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-24iscsi: do not leak acb->buf when commands are abortedPaolo Bonzini
acb->buf is freed in the WRITE(16) callback, but this may not get called at all when commands are aborted. Add another free in the ABORT TASK callback, which requires setting acb->buf to NULL everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-24target-cris: Fix typo in D_LOG() macroAndreas Färber
It's __VA_ARGS__. Fixes the build with CRIS_[OP_]HELPER_DEBUG defined. Broken since r6338 / 93fcfe39a0383377e647b821c9f165fd927cd4e0 (Convert references to logfile/loglevel to use qemu_log*() macros). Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>