aboutsummaryrefslogtreecommitdiff
path: root/block/raw-posix.c
AgeCommit message (Collapse)Author
2017-01-09block: Rename raw-{posix,win32} to file-*.cEric Blake
These files deal with the file protocol, not the raw format (the file protocol is often used with other formats, and the raw format is not forced to use the file protocol). Rename things to make it a bit easier to follow. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-11raw-posix: Rename 'raw_s' to 'rs'Fam Zheng
It is too confusing because it sounds like a BDRVRawState variable. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1477565117-17230-1-git-send-email-famz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-10-27raw-posix: Don't use bdrv_ioctl()Kevin Wolf
Instead of letting raw-posix use the bdrv_ioctl() abstraction to issue an ioctl to itself, just call ioctl() directly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-24block: improve error handling in raw_openHalil Pasic
Make raw_open for POSIX more consistent in handling errors by setting the error object also when qemu_open fails. The error object was set generally set in case of errors, but I guess this case was overlooked. Do the same for win32. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Tested-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com> (POSIX only) Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-29block/qapi: Move 'aio' option to file driverKevin Wolf
The option whether or not to use a native AIO interface really isn't a generic option for all drivers, but only applies to the native file protocols. This patch moves the option in blockdev-add to the appropriate places (raw-posix and raw-win32). We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility because so far the AIO option was usually specified on the wrong layer (the top-level format driver, which didn't even look at it) and then inherited by the protocol driver (where it was actually used). We can't forbid this use except in new interfaces. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-07-20block: Convert .bdrv_aio_discard() to byte-basedEric Blake
Another step towards byte-based interfaces everywhere. Replace the sector-based driver callback .bdrv_aio_discard() with a new byte-based .bdrv_aio_pdiscard(). Only raw-posix and RBD drivers are affected, so it was not worth splitting into multiple patches. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-9-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20raw-posix: Switch paio_submit() to byte-basedEric Blake
The only remaining uses of paio_submit() were flush (with no offset or count) and discard (which we are switching to byte-based); furthermore, the similarly named paio_submit_co() is already byte-based. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-7-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-18linux-aio: share one LinuxAioState within an AioContextPaolo Bonzini
This has better performance because it executes fewer system calls and does not use a bottom half per disk. Originally proposed by Ming Lei. [Changed #include "raw-aio.h" to "block/raw-aio.h" in win32-aio.c to fix build error as reported by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1467650000-51385-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> squash! linux-aio: share one LinuxAioState within an AioContext
2016-07-13raw-posix: Use qemu_dupFam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-05block: Fix error message styleEric Blake
error_setg() is not supposed to be used for multi-sentence messages; tweak the message to append a hint instead. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-05block: Move request_alignment into BlockLimitEric Blake
It makes more sense to have ALL block size limit constraints in the same struct. Improve the documentation while at it. Simplify a couple of conditionals, now that we have audited and documented that request_alignment is always non-zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-05block: Switch transfer length bounds to byte-basedEric Blake
Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_transfer_length and opt_transfer_length. Rename them (dropping the _length suffix) so that the compiler will help us catch the change in semantics across any rebased code, and improve the documentation. Use unsigned values, so that we don't have to worry about negative values and so that bit-twiddling is easier; however, we are still constrained by 2^31 of signed int in most APIs. When a value comes from an external source (iscsi and raw-posix), sanitize the results to ensure that opt_transfer is a power of 2. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-20coccinelle: Remove unnecessary variables for function return valueEduardo Habkost
Use Coccinelle script to replace 'ret = E; return ret' with 'return E'. The script will do the substitution only when the function return type and variable type are the same. Manual fixups: * audio/audio.c: coding style of "read (...)" and "write (...)" * block/qcow2-cluster.c: wrap line to make it shorter * block/qcow2-refcount.c: change indentation of wrapped line * target-tricore/op_helper.c: fix coding style of "remainder|quotient" * target-mips/dsp_helper.c: reverted changes because I don't want to argue about checkpatch.pl * ui/qemu-pixman.c: fix line indentation * block/rbd.c: restore blank line between declarations and statements Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Unused Coccinelle rule name dropped along with a redundant comment; whitespace touched up in block/qcow2-cluster.c; stale commit message paragraph deleted] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20error: Remove unnecessary local_err variablesEduardo Habkost
This patch simplifies code that uses a local_err variable just to immediately use it for an error_propagate() call. Coccinelle patch used to perform the changes added to scripts/coccinelle/remove_local_err.cocci. Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-3-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Blank line in s390-virtio-ccw.c restored] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20error: Remove NULL checks on error_propagate() callsEduardo Habkost
error_propagate() already ignores local_err==NULL, so there's no need to check it before calling. Coccinelle patch used to perform the changes added to scripts/coccinelle/error_propagate_null.cocci. Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-2-git-send-email-ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-16raw-posix: Implement .bdrv_co_preadv/pwritevKevin Wolf
The raw-posix block driver actually supports byte-aligned requests now on non-O_DIRECT images, like it already (and previously incorrectly) claimed in bs->request_alignment. For some block drivers this means that a RMW cycle can be avoided when they write sub-sector metadata e.g. for cluster allocation. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16raw-posix: Switch to bdrv_co_* interfacesKevin Wolf
In order to use the modern byte-based .bdrv_co_preadv/pwritev() interface, this patch switches raw-posix to coroutine-based interfaces as a first step. In terms of semantics and performance, it doesn't make a difference with the existing code whether we go from a coroutine to a callback-based interface already in block/io.c or only in linux-aio.c As there have been concerns in the past that this change may be a step in the wrong direction with respect to a possible AIO fast path, the old callback-based interface for linux-aio is left around and can be reactivated when a fast path (e.g. directly from virtio-blk dataplane, bypassing the whole block layer) is implemented. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08raw-posix: Fetch max sectors for host block deviceFam Zheng
This is sometimes a useful value we should count in. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08raw-posix: Convert to bdrv_co_pwrite_zeroes()Eric Blake
Another step on our continuing quest to switch to byte-based interfaces. Signed-off-by: Eric Blake <eblake@redhat.com> [ kwolf: Fixed up trace_paio_submit_co() call for qiov == NULL ] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Honor BDRV_REQ_FUA during write_zeroesEric Blake
The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but back then, it did not matter because there was no FUA flag. It became observable when commit 93f5e6d8 paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit 9eeb6dd, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targetted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is do to a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing). Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12linux-aio: make it more type safePaolo Bonzini
Replace void* with an opaque LinuxAioState type. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: plug whole tree at once, introduce bdrv_io_unplugged_begin/endPaolo Bonzini
Extract the handling of io_plug "depth" from linux-aio.c and let the main bdrv_drain loop do nothing but wait on I/O. Like the two newly introduced functions, bdrv_io_plug and bdrv_io_unplug now operate on all children. The visit order is now symmetrical between plug and unplug, making it possible for formats to implement plug/unplug. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30block/raw-posix.c: Make physical devices usable in QEMU under Mac OS X hostProgrammingkid
Mac OS X can be picky when it comes to allowing the user to use physical devices in QEMU. Most mounted volumes appear to be off limits to QEMU. If an issue is detected, a message is displayed showing the user how to unmount a volume. Now QEMU uses both CD and DVD media. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-22util: move declarations out of qemu-common.hVeronia Bahaa
Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22include/qemu/osdep.h: Don't include qapi/error.hMarkus Armbruster
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-02raw: Assign bs to file in raw_co_get_block_statusFam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1453780743-16806-5-git-send-email-famz@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02block: Add "file" output parameter to block status query functionsFam Zheng
The added parameter can be used to return the BDS pointer which the valid offset is referring to. Its value should be ignored unless BDRV_BLOCK_OFFSET_VALID in ret is set. Until block drivers fill in the right value, let's clear it explicitly right before calling .bdrv_get_block_status. The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest case 102 passing, and will be fixed once all drivers return the right file pointer. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-01-20block: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-01-19block/raw-posix: avoid bogus fixup for cylinders on DASD disksChristian Borntraeger
large volume DASD that have > 64k cylinders do claim to have 0xFFFE cylinders as special value in the old 16 bit field. We want to pass this "token" along to the guest, instead of calculating the real number. Otherwise qemu might fail with "cyls must be between 1 and 65535" Cc: qemu-stable@nongnu.org Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18raw-posix: Make aio=native option bindingKevin Wolf
Traditionally, aio=native was treated as an advice that could simply be ignored if an error occurs while initialising Linux AIO or the feature wasn't compiled in. This behaviour was deprecated in commit 96518254 (qemu 2.3; error during init) and commit 1501ecc1 (qemu 2.5; not compiled in). This patch changes raw-posix to error out in these cases instead of printing a deprecation warning. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-17qapi: Don't let implicit enum MAX member collideEric Blake
Now that we guarantee the user doesn't have any enum values beginning with a single underscore, we can use that for our own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious that the sentinel is generated. This patch was mostly generated by applying a temporary patch: |diff --git a/scripts/qapi.py b/scripts/qapi.py |index e6d014b..b862ec9 100644 |--- a/scripts/qapi.py |+++ b/scripts/qapi.py |@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = { | max_index = c_enum_const(name, 'MAX', prefix) | ret += mcgen(''' | [%(max_index)s] = NULL, |+// %(max_index)s | }; | ''', | max_index=max_index) then running: $ cat qapi-{types,event}.c tests/test-qapi-types.c | sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list $ git grep -l _MAX | xargs sed -i -f list The only things not generated are the changes in scripts/qapi.py. Rejecting enum members named 'MAX' is now useless, and will be dropped in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> [Rebased to current master, commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-11-25raw-posix.c: Make GetBSDPath() handle caching optionsProgrammingkid
Add support for caching options that can be specified from the command line. The CD-ROM raw char device bypasses the host page cache and therefore has alignment requirements. Alignment probing is necessary so only use the raw char device if BDRV_O_NOCACHE is set. This patch fixes -cdrom /dev/cdrom on Mac OS X hosts, where bdrv_read() used to fail due to misaligned requests during image format probing. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-11-12block: Drop BlockDriver.bdrv_ioctlFam Zheng
Now the callback is not used any more, drop the field along with all implementations in block drivers, which are iscsi and raw. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1447064214-29930-8-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23block: Make bdrv_is_inserted() return a boolMax Reitz
Make bdrv_is_inserted(), blk_is_inserted(), and the callback BlockDriver.bdrv_is_inserted() return a bool. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23block: Remove host floppy supportMax Reitz
It has been deprecated as of 2.3, so we can now remove it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-16raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailableStefan Hajnoczi
raw-posix.c silently ignores BDRV_O_NATIVE_AIO if libaio is unavailable. It is confusing when aio=native performance is identical to aio=threads because the binary was accidentally built without libaio. Print a deprecation warning if -drive aio=native is used with a binary that does not support libaio. There are probably users using aio=native who would be inconvenienced if QEMU suddenly refused to start their guests. In the future this will become an error. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-12block: switch from g_slice allocator to mallocPaolo Bonzini
Simplify memory allocation by sticking with a single API. GSlice is not that fast anyway (tcmalloc/jemalloc are better). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-02block/raw-posix: Open file descriptor O_RDWR to work around glibc ↵Richard W.M. Jones
posix_fallocate emulation issue. https://bugzilla.redhat.com/show_bug.cgi?id=1265196 The following command fails on an NFS mountpoint: $ qemu-img create -f qcow2 -o preallocation=falloc disk.img 262144 Formatting 'disk.img', fmt=qcow2 size=262144 encryption=off cluster_size=65536 preallocation='falloc' lazy_refcounts=off qemu-img: disk.img: Could not preallocate data for the new file: Bad file descriptor The reason turns out to be because NFS doesn't support the posix_fallocate call. glibc emulates it instead. However glibc's emulation involves using the pread(2) syscall. The pread syscall fails with EBADF if the file descriptor is opened without the read open-flag (ie. open (..., O_WRONLY)). I contacted glibc upstream about this, and their response is here: https://bugzilla.redhat.com/show_bug.cgi?id=1265196#c9 There are two possible fixes: Use Linux fallocate directly, or (this fix) work around the problem in qemu by opening the file with O_RDWR instead of O_WRONLY. Signed-off-by: Richard W.M. Jones <rjones@redhat.com> BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1265196 Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-04block/raw-posix: Use raw_normalize_devicepath()Max Reitz
The filename given to qemu_open() in block/raw-posix.c should generally have been processed by raw_normalize_devicepath(); unless we are only probing (in which case the caller often checks whether the file is a block device or not, and this property will be changed by raw_normalize_devicepath() on NetBSD) or it is about a deprecated device (i.e. floppy). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-07-07block/raw-posix: Don't think /dev/fd/<NN> is a floppy drive.Richard W.M. Jones
In libguestfs we use /dev/fd/<NN> to pass pre-opened file descriptors to qemu-img. Lately I've discovered that although this works, qemu believes that these are floppy disk images. That in itself isn't much of a problem, but now qemu prints a warning about host floppy pass-thru being deprecated. Extend the existing test so that it ignores /dev/fd/ as well as /dev/fdset/ A simple test of this, if you are using the bash shell, is: qemu-img info <( cat /dev/null ) without this patch: $ qemu-img info <( cat /dev/null ) qemu-img: Host floppy pass-through is deprecated Support for it will be removed in a future release. qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek with this patch: $ qemu-img info <( cat /dev/null ) qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1435761614-31358-1-git-send-email-rjones@redhat.com Fixes: https://bugs.launchpad.net/qemu/+bug/1470536 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23raw-posix: Introduce hdev_is_sg()Dimitris Aragiorgis
Until now, an SG device was identified only by checking if its path started with "/dev/sg". Then, hdev_open() would set the bs->sg flag accordingly. The patch relies on the actual properties of the device instead of the specified file path. To this end, test for an SG device (e.g. /dev/sg0) by ensuring that all of the following holds: - The specified file name corresponds to a character device - The device supports the SG_GET_VERSION_NUM ioctl - The device supports the SG_GET_SCSI_ID ioctl Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Message-id: 1435056300-14924-6-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23raw-posix: Use DPRINTF for DEBUG_FLOPPYDimitris Aragiorgis
Get rid of several #ifdef DEBUG_FLOPPY and substitute them with DPRINTF. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-5-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINTDimitris Aragiorgis
Building the QEMU tools fails if we #define DEBUG_BLOCK inside block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with a simple DPRINTF() (that does not cause bit-rot). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-4-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23block: Use bdrv_is_sg() everywhereDimitris Aragiorgis
Instead of checking bs->sg use bdrv_is_sg() consistently throughout the code. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-22qerror: Move #include out of qerror.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-12raw-posix: Fix .bdrv_co_get_block_status() for unaligned image sizeKevin Wolf
Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1433840108-9996-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22block: align bounce buffers to pageDenis V. Lunev
The following sequence int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644); for (i = 0; i < 100000; i++) write(fd, buf, 4096); performs 5% better if buf is aligned to 4096 bytes. The difference is quite reliable. On the other hand we do not want at the moment to enforce bounce buffering if guest request is aligned to 512 bytes. The patch changes default bounce buffer optimal alignment to MAX(page size, 4k). 4k is chosen as maximal known sector size on real HDD. The justification of the performance improve is quite interesting. From the kernel point of view each request to the disk was split by two. This could be seen by blktrace like this: 9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img] 9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img] After the patch the pattern becomes normal: 9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img] 9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img] 9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img] 9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img] and the amount of requests sent to disk (could be calculated counting number of lines in the output of blktrace) is reduced about 2 times. Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest does his job well and real requests comes properly aligned (to page). Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-3-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22block: minimal bounce buffer alignmentDenis V. Lunev
The patch introduces new concept: minimal memory alignment for bounce buffers. Original so called "optimal" value is actually minimal required value for aligment. It should be used for validation that the IOVec is properly aligned and bounce buffer is not required. Though, from the performance point of view, it would be better if bounce buffer or IOVec allocated by QEMU will be aligned stricter. The patch does not change any alignment value yet. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-2-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-03-19raw-posix: Deprecate aio=threads fallback without O_DIRECTKevin Wolf
Currently, if the user requests aio=native, but forgets to choose a cache mode that sets O_DIRECT, that request is silently ignored and raw falls back to aio=threads. Deprecate that behaviour so we can make it an error in future qemu versions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2015-03-19raw-posix: Deprecate host floppy passthroughMarkus Armbruster
Raise your hand if you have a physical floppy drive in a computer you've powered on in 2015. Okay, I see we got a few weirdos in the audience. That's okay, weirdos are welcome here. Kidding aside, media change detection doesn't fully work, isn't going to be fixed, and floppy passthrough just isn't earning its keep anymore. Deprecate block driver host_floppy now, so we can drop it after a grace period. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>