aboutsummaryrefslogtreecommitdiff
path: root/hw/block
AgeCommit message (Collapse)Author
2016-01-26xen: Switch to libxengnttab interface for compat shims.Ian Campbell
In Xen 4.7 we are refactoring parts libxenctrl into a number of separate libraries which will provide backward and forward API and ABI compatiblity. One such library will be libxengnttab which provides access to grant tables. In preparation for this switch the compatibility layer in xen_common.h (which support building with older versions of Xen) to use what will be the new library API. This means that the gnttab shim will disappear for versions of Xen which include libxengnttab. To simplify things for the <= 4.0.0 support we wrap the int fd in a malloc(sizeof int) such that the handle is always a pointer. This leads to less typedef headaches and the need for XC_HANDLER_INITIAL_VALUE etc for these interfaces. Note that this patch does not add any support for actually using libxengnttab, it just adjusts the existing shims. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-01-25fdc: change auto fallback drive for ISA FDC to 288John Snow
The 2.88 drive is more suitable as a default because it can still read 1.44 images correctly, but the reverse is not true. Since there exist virtio-win drivers that are shipped on 2.88 floppy images, this patch will allow VMs booted without a floppy disk inserted to later insert a 2.88MB floppy and have that work. This patch has been tested with msdos, freedos, fedora, windows 8 and windows 10 without issue: if problems do arise for certain guests being unable to cope with 2.88MB drives as the default, they are in the minority and can use type=144 as needed (or insert a proper boot medium and omit type=144/288 or use type=auto) to obtain different drive types. As icing, the default will remain auto/144 for any pre-2.6 machine types, hopefully minimizing the impact of this change in legacy hw to basically zero. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-13-git-send-email-jsnow@redhat.com
2016-01-25fdc: rework pick_geometryJohn Snow
This one is the crazy one. fd_revalidate currently uses pick_geometry to tell if the diskette geometry has changed upon an eject/insert event, but it won't allow us to insert a 1.44MB diskette into a 2.88MB drive. This is inflexible. The new algorithm applies a new heuristic to guessing disk geometries that allows us to switch diskette types as long as the physical size matches before falling back to the old heuristic. The old one is roughly: - If the size (sectors) and type matches, choose it. - Fall back to the first geometry that matched our type. The new one is: - If the size (sectors) and type matches, choose it. - If the size (sectors) and physical size match, choose it. - Fall back to the first geometry that matched our type. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1453495865-9649-11-git-send-email-jsnow@redhat.com
2016-01-25fdc: add physical disk sizesJohn Snow
2.88MB capable drives can accept 1.44MB floppies, for instance. To rework the pick_geometry function, we need to know if our current drive can even accept the type of disks we're considering. NB: This allows us to distinguish between all of the "total sectors" collisions between 1.20MB and 1.44MB diskette types, by using the physical drive size as a differentiator. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-10-git-send-email-jsnow@redhat.com
2016-01-25fdc: add drive type optionJohn Snow
This patch adds a new explicit Floppy Drive Type option. The existing behavior in QEMU is to automatically guess a drive type based on the media inserted, or if a diskette is not present, arbitrarily assign one. This behavior can be described as "auto." This patch adds the option to pick an explicit behavior: 120, 144, 288 or none. The new "auto" option is intended to mimic current behavior, while the other types pick one explicitly. Set the type given by the CLI during fd_init. If the type remains the default (auto), we'll attempt to scan an inserted diskette if present to determine a type. If auto is selected but no diskette is present, we fall back to a predetermined default (currently 1.44MB to match legacy QEMU behavior.) Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-9-git-send-email-jsnow@redhat.com
2016-01-25fdc: Add fallback optionJohn Snow
Currently, QEMU chooses a drive type automatically based on the inserted media. If there is no disk inserted, it chooses a 1.44MB drive type. Change this behavior to be configurable, but leave it defaulted to 1.44. This is not earnestly intended to be used by a user or a management library, but rather exists so that pre-2.6 board types can configure it to be a legacy value. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-8-git-send-email-jsnow@redhat.com
2016-01-25fdc: add pick_driveJohn Snow
Split apart pick_geometry by creating a pick_drive routine that will only ever called during device bring-up instead of relying on pick_geometry to be used in both cases. With this change, the drive field is changed to be 'write once'. It is not altered after the initialization routines exit. media_validated does not need to be migrated. The target VM will just revalidate the media on post_load anyway. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-7-git-send-email-jsnow@redhat.com
2016-01-25fdc: Throw an assertion on misconfigured fd_formats tableJohn Snow
pick_geometry is a convoluted function that makes it difficult to tell at a glance what QEMU's current behavior for choosing a floppy drive type is when it can't quite identify the diskette. The code iterates over all entries in the candidate geometry table ("fd_formats") and if our specific drive type matches a row in the table, then either "match" is set to that entry (an exact match) and the loop exits, or "first_match" will be non-negative (the first such entry that shares the same drive type), and the loop continues. If our specific drive type is NONE, then all drive types in the candidate geometry table are considered. After iteration, if "match" was not set, we fall back to "first match". This means that either "match" was set, or we exited the loop without an exact match, in which case: - If drive type is NONE, the default is truly fd_formats[0], a 1.44MB type, because "first_match" will always get set to the first item. - If drive type is not NONE, pick_geometry's iteration was fussier and only looked at rows that matched our drive type. However, since all possible drive types are represented in the table, we still know that "first match" was set. - If drive type is not NONE and the fd_formats table lists no options for our drive type, we choose fd_formats[1], an incomprehensibly bizarre choice that can never happen anyway. Correct this: If first_match is -1, it can ONLY mean we didn't edit our fd_formats table correctly. Throw an assertion instead. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-6-git-send-email-jsnow@redhat.com
2016-01-25fdc: add disk fieldJohn Snow
Currently, 'drive' is used both to represent the current diskette type as well as the current drive type. This patch adds a 'disk' field that is updated explicitly to match the type of the disk. As of this patch, disk and drive are always the same, but forthcoming patches to change the behavior of pick_geometry will invalidate this assumption. disk does not need to be migrated because it is not user-visible state nor is it currently used for any calculations. It is purely informative, and will be rebuilt automatically via fd_revalidate on the new host. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-5-git-send-email-jsnow@redhat.com
2016-01-25fdc: add drive type qapi enumJohn Snow
Change the floppy drive type to a QAPI enum type, to allow us to specify the floppy drive type from the CLI in a forthcoming patch. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-4-git-send-email-jsnow@redhat.com
2016-01-25fdc: reduce number of pick_geometry argumentsJohn Snow
Modify this function to operate directly on FDrive objects instead of unpacking and passing all of those parameters manually. Reduces the complexity in the caller and reduces the number of args to just one. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-3-git-send-email-jsnow@redhat.com
2016-01-25fdc: move pick_geometryJohn Snow
Code motion: I want to refactor this function to work with FDrive directly, so shuffle it below that definition. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1453495865-9649-2-git-send-email-jsnow@redhat.com
2016-01-21ssi: Move ssi.h into a separate directoryAlistair Francis
Move the ssi.h include file into the ssi directory. While touching the code also fix the typdef lines as checkpatch complains. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-21m25p80.c: Add sst25wf080 SPI flash deviceAlistair Francis
Add the sst25wf080 SPI flash device. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-20block: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-01-13error: Use error_prepend() where it makes obvious senseMarkus Armbruster
Done with this Coccinelle semantic patch @@ expression FMT, E1, E2; expression list ARGS; @@ - error_setg(E1, FMT, ARGS, error_get_pretty(E2)); + error_propagate(E1, E2);/*###*/ + error_prepend(E1, FMT/*@@@*/, ARGS); followed by manual cleanup, first because I can't figure out how to make Coccinelle transform strings, and second to get rid of now superfluous error_propagate(). We now use or propagate the original error whole instead of just its message obtained with error_get_pretty(). This avoids suppressing its hint (see commit 50b7b00), but I can't see how the errors touched in this commit could come with hints. It also improves the message printed with &error_abort when we screw up (see commit 1e9b65b). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-13hw: Inline the qdev_prop_set_drive_nofail() wrapperMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1449764955-10741-3-git-send-email-armbru@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-07block: Rename BLOCK_OP_TYPE_MIRROR to BLOCK_OP_TYPE_MIRROR_SOURCEFam Zheng
It's necessary to distinguish source and target before we can add blockdev-mirror, because we would want a concrete type of operation to check on target bs before starting. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1450932306-13717-2-git-send-email-famz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-12-23Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2015-12-22' into ↵Peter Maydell
staging Xen 2015/12/22 # gpg: Signature made Tue 22 Dec 2015 16:17:57 GMT using RSA key ID 70E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" * remotes/sstabellini/tags/xen-2015-12-22: xen_disk: treat "vhd" as "vpc" xen/pass-through: correctly deal with RW1C bits xen/MSI-X: really enforce alignment xen/MSI-X: latch MSI-X table writes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-22block: replace IOV_MAX with BlockLimits.max_iovStefan Hajnoczi
Request merging must not result in a huge request that exceeds the maximum number of iovec elements. Use BlockLimits.max_iov instead of hardcoding IOV_MAX. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-22virtio-blk: trivial code optimizationGonglei
1. avoid possible superflous checking 2. make code more robustness ["make code more robustness" refers to avoiding integer underflows/overflows. --Stefan] Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-id: 1447207166-12612-1-git-send-email-arei.gonglei@huawei.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-18xen/blkif: Avoid double access to src->nr_segmentsStefano Stabellini
src is stored in shared memory and src->nr_segments is dereferenced twice at the end of the function. If a compiler decides to compile this into two separate memory accesses then the size limitation could be bypassed. Fix it by removing the double access to src->nr_segments. This is part of XSA-155. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-12-11xen_disk: treat "vhd" as "vpc"Stefano Stabellini
The Xen toolstack uses "vhd" to specify a disk in VHD format, however the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so that QEMU can find the right driver to use for it. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-12-07virtio-blk: Drop x-data-plane optionFam Zheng
The official way of enabling dataplane is through the "iothread" property that references an iothread object created by "-object iothread". Since the old "x-data-plane=on" way now even crashes, it's probably easier to just drop it: $ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \ -device virtio-blk-pci,drive=d0,x-data-plane=on ERROR:/home/fam/work/qemu/qom/object.c:1515: object_get_canonical_path_component: assertion failed: (obj->parent != NULL) Aborted Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1449485967-19240-1-git-send-email-famz@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-25Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches # gpg: Signature made Wed 25 Nov 2015 13:33:14 GMT using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: qemu-iotests: Add -nographic when starting QEMU in 119 and 120 block/qapi: Plug memory leak on query-block error path raw-posix.c: Make GetBSDPath() handle caching options nand: fix flash erase when oob is in memory test-aio: Fix event notifier cleanup tests/Makefile: Add more dependencies for test-timed-average Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-25nand: fix flash erase when oob is in memoryRicard Wanderlof
For the "main area on file, oob in memory" case, fix the shifts so that we erase the correct number of pages. Signed-off-by: Ricard Wanderlöf <ricardw@axis.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-25Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20151125' into ↵Peter Maydell
staging Xen 2015/11/25 # gpg: Signature made Wed 25 Nov 2015 11:19:26 GMT using RSA key ID 70E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" * remotes/sstabellini/tags/xen-20151125: xen_disk: Remove ioreq.postsync xen: fix usage of xc_domain_create in domain builder Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-25xen_disk: Remove ioreq.postsyncAlberto Garcia
This code has been dead for three years (since commit 7e7b7cba1). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-11-24virtio-blk: Move resetting of req->mr_next to virtio_blk_handle_rw_errorFam Zheng
"werror=report" would free the req in virtio_blk_handle_rw_error, we mustn't write to it in that case. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1448239280-15025-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-11-18nand: fix address overflowRabin Vincent
The shifts of the address mask and value shift beyond 32 bits when there are 5 address cycles. Cc: qemu-stable@nongnu.org Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-17virtio-blk: Fix double completion for werror=stopFam Zheng
When a request R is absorbed by request M, it is appended to the "mr_next" queue led by M, and is completed together with the completion of M, in virtio_blk_rw_complete. During DMA restart in virtio_blk_dma_restart_bh, requests in s->rq are parsed and submitted again, possibly with a stale req->mr_next. It could be a problem if the request merging in virtio_blk_handle_request hasn't refreshed every mr_next pointer, in which case, virtio_blk_rw_complete could walk through unexpected requests following the stale pointers. Fix this by unsetting the pointer in virtio_blk_rw_complete. It is safe because this req is either completed and freed right away, or it will be restarted and parsed from scratch out of the vq later. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-11-12xen_disk: Account for failed and invalid operationsAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: e0cbb96cb0e1f86c37c7ce332efdf02b57b9d365.1446044838.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12virtio-blk: Account for failed and invalid operationsAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 4f623ce52c9d673d35a043fc2959526b41b685c6.1446044838.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12nvme: Account for failed and invalid operationsAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 678dc67da229759d404b44f7cc2bf5ed8bf8ad14.1446044838.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12xen_disk: Account for flush operationsAlberto Garcia
Currently both BLKIF_OP_WRITE and BLKIF_OP_FLUSH_DISKCACHE are being accounted as write operations. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 7a2a14e3ac62027aa6267a6c02abc70717be9c0a.1446044837.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-29virtio: sync the dataplane vring state to the virtqueue before virtio_savePavel Butsykin
When creating snapshot with the dataplane enabled, the snapshot file gets not the actual state of virtqueue, because the current state is stored in VirtIOBlockDataPlane. Therefore, before saving snapshot need to sync the dataplane vring state to the virtqueue. The dataplane will resume its work at the next notify virtqueue. When snapshot loads with loadvm we get a message: VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0: delta 0x15f5 error while loading state for instance 0x0 of device '0000:00:08.0/virtio-blk' Error -1 while loading VM state to reproduce the error I used the following hmp commands: savevm snap1 loadvm snap1 qemu parameters: --enable-kvm -smp 4 -m 1024 -drive file=/var/lib/libvirt/images/centos6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0 -set device.virtio-disk0.x-data-plane=on Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Message-id: 1445859777-2982-1-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-29virtio-blk: switch off scsi-passthrough by defaultCornelia Huck
Devices that are compliant with virtio-1 do not support scsi passthrough any more (and it has not been a recommended setup anyway for quite some time). To avoid having to switch it off explicitly in newer qemus that turn on virtio-1 by default, let's switch the default to scsi=false for 2.5. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Message-id: 1444991154-79217-4-git-send-email-cornelia.huck@de.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-29virtio-blk: convert to virtqueue_mapMichael S. Tsirkin
Drop deprecated use of virtqueue_map_sg. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2015-10-23dataplane: Mark host notifiers' client type as "external"Fam Zheng
They will be excluded by type in the nested event loops in block layer, so that unwanted events won't be processed there. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23aio: Add "is_external" flag for event handlersFam Zheng
All callers pass in false, and the real external ones will switch to true in coming patches. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23block: Prepare for NULL BDSMax Reitz
blk_bs() will not necessarily return a non-NULL value any more (unless blk_is_available() is true or it can be assumed to otherwise, e.g. because it is called immediately after a successful blk_new_with_bs() or blk_new_open()). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23hw/block/fdc: Implement tray statusMax Reitz
The tray of an FDD is open iff there is no medium inserted (there are only two states for an FDD: "medium inserted" or "no medium inserted"). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-12block: switch from g_slice allocator to mallocPaolo Bonzini
Simplify memory allocation by sticking with a single API. GSlice is not that fast anyway (tcmalloc/jemalloc are better). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-12virtio-blk: use blk_io_plug/unplug for Linux AIO batchingStefan Hajnoczi
The raw-posix block driver implements Linux AIO batching so multiple requests can be submitted with a single io_submit(2) system call. Batching is currently only used by virtio-scsi and virtio-blk-data-plane. Enable batching for regular virtio-blk so the number of io_submit(2) system calls is reduced for workloads with queue depth > 1. In 4KB random read performance tests with queue depth 32, the CPU utilization on the host is reduced by 9.4%. The fio job is as follows: [global] bs=4k ioengine=libaio iodepth=32 direct=1 sync=0 time_based=1 runtime=30 clocksource=gettimeofday ramp_time=5 [job1] rw=randread filename=/dev/vdb size=4096M write_bw_log=fio write_iops_log=fio write_lat_log=fio log_avg_msec=1000 This benchmark was run on an raw image on LVM. The disk was an SSD drive and -drive cache=none,aio=native was used. Tested-by: Pradeep Surisetty <psuriset@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2015-09-18Fix bad error handling after memory_region_init_ram()Markus Armbruster
Symptom: $ qemu-system-x86_64 -m 10000000 Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456: upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory Aborted (core dumped) Root cause: commit ef701d7 screwed up handling of out-of-memory conditions. Before the commit, we report the error and exit(1), in one place, ram_block_add(). The commit lifts the error handling up the call chain some, to three places. Fine. Except it uses &error_abort in these places, changing the behavior from exit(1) to abort(), and thus undoing the work of commit 3922825 "exec: Don't abort when we can't allocate guest memory". The three places are: * memory_region_init_ram() Commit 4994653 (right after commit ef701d7) lifted the error handling further, through memory_region_init_ram(), multiplying the incorrect use of &error_abort. Later on, imitation of existing (bad) code may have created more. * memory_region_init_ram_ptr() The &error_abort is still there. * memory_region_init_rom_device() Doesn't need fixing, because commit 33e0eb5 (soon after commit ef701d7) lifted the error handling further, and in the process changed it from &error_abort to passing it up the call chain. Correct, because the callers are realize() methods. Fix the error handling after memory_region_init_ram() with a Coccinelle semantic patch: @r@ expression mr, owner, name, size, err; position p; @@ memory_region_init_ram(mr, owner, name, size, ( - &error_abort + &error_fatal | err@p ) ); @script:python@ p << r.p; @@ print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column) When the last argument is &error_abort, it gets replaced by &error_fatal. This is the fix. If the last argument is anything else, its position is reported. This lets us check the fix is complete. Four positions get reported: * ram_backend_memory_alloc() Error is passed up the call chain, ultimately through user_creatable_complete(). As far as I can tell, it's callers all handle the error sanely. * fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize() DeviceClass.realize() methods, errors handled sanely further up the call chain. We're good. Test case again behaves: $ qemu-system-x86_64 -m 10000000 qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory [Exit 1 ] The next commits will repair the rest of commit ef701d7's damage. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com> Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
2015-09-14Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs # gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (44 commits) cutils: work around platform differences in strto{l,ul,ll,ull} cpu-exec: fix lock hierarchy for user-mode emulation exec: make mmap_lock/mmap_unlock globally available tcg: comment on which functions have to be called with mmap_lock held tcg: add memory barriers in page_find_alloc accesses remove unused spinlock. replace spinlock by QemuMutex. cpus: remove tcg_halt_cond and tcg_cpu_thread globals cpus: protect work list with work_mutex scripts/dump-guest-memory.py: fix after RAMBlock change configure: Add support for jemalloc add macro file for coccinelle configure: factor out adding disas configure vhost-scsi: fix wrong vhost-scsi firmware path checkpatch: remove tests that are not relevant outside the kernel checkpatch: adapt some tests to QEMU CODING_STYLE: update mixed declaration rules qmp: Add example usage of strto*l() qemu wrapper cutils: Add qemu_strtoull() wrapper cutils: Add qemu_strtoll() wrapper ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-11maint: remove unused include for signal.hDaniel P. Berrange
A number of files were including signal.h but not using any of the functions it provides Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-10virtio: avoid leading underscores for helpersCornelia Huck
Commit ef546f1275f6563e8934dd5e338d29d9f9909ca6 ("virtio: add feature checking helpers") introduced a helper __virtio_has_feature. We don't want to use reserved identifiers, though, so let's rename __virtio_has_feature to virtio_has_feature and virtio_has_feature to virtio_vdev_has_feature. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-09i8257: rewrite DMA_schedule to avoid hooking into the CPU loopPaolo Bonzini
The i8257 DMA controller uses an idle bottom half, which by default does not cause the main loop to exit. Therefore, the DMA_schedule function is there to ensure that the CPU relinquishes the iothread mutex to the iothread. However, this is not enough since the iothread will call aio_compute_timeout() and go to sleep again. In the iothread world, forcing execution of the idle bottom half is much simpler, and only requires a call to qemu_notify_event(). Do it, removing the need for the "cpu_request_exit" pseudo-irq. The next patch will remove it. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-07hw/block/nvme.c: Use pow2ceil() rather than hand-calculationPeter Maydell
Use pow2ceil() to round up to the next power of 2, rather than an inline calculation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1437741192-20955-4-git-send-email-peter.maydell@linaro.org