aboutsummaryrefslogtreecommitdiff
path: root/block.c
AgeCommit message (Collapse)Author
2010-06-03Merge remote branch 'kwolf/for-anthony' into stagingAnthony Liguori
2010-06-01Monitor: Drop QMP documentation from codeLuiz Capitulino
Previous commit added QMP documentation to the qemu-monitor.hx file, it's is a copy of this information. While it's good to keep it near code, maintaining two copies of the same information is too hard and has little benefit as we don't expect client writers to consult the code to find how to use a QMP command. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-05-28block: Add missing bdrv_delete() for SG_IO BlockDriver in find_image_format()Nicholas A. Bellinger
This patch adds a missing bdrv_delete() call in find_image_format() so that a SG_IO BlockDriver properly releases the temporary BlockDriverState *bs created from bdrv_file_open() Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Reported-by: Chris Krumme <chris.krumme@windriver.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28add support for protocol driver create_optionsMORITA Kazutaka
This patch enables protocol drivers to use their create options which are not supported by the format. For example, protcol drivers can use a backing_file option with raw format. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28block: Fix multiwrite with overlapping requestsKevin Wolf
With overlapping requests, the total number of sectors is smaller than the sum of the nb_sectors of both requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-26Add cache=unsafe parameter to -driveAlexander Graf
Usually the guest can tell the host to flush data to disk. In some cases we don't want to flush though, but try to keep everything in cache. So let's add a new cache value to -drive that allows us to set the cache policy to most aggressive, disabling flushes. We call this mode "unsafe", as guest data is not guaranteed to survive host crashes anymore. This patch also adds a noop function for aio, so we can do nothing in AIO fashion. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-05-21block: Add SG_IO device check in refresh_total_sectors()Nicholas Bellinger
This patch adds a special case check for scsi-generic devices in refresh_total_sectors() to skip the subsequent BlockDriver->bdrv_getlength() that will be returning -ESPIPE from block/raw-posic.c:raw_getlength() for BlockDriverState->sg=1 devices. Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-21block: Make find_image_format() return 'raw' BlockDriver for SG_IO devicesNicholas Bellinger
This patch adds a special BlockDriverState->sg check in block.c:find_image_format() after bdrv_file_open() -> block/raw-posix.c:hdev_open() has been called to determine if we are dealing with a Linux host scsi-generic device. The patch then returns the BlockDriver * from bdrv_find_format("raw"), skipping the subsequent bdrv_read() and rest of find_image_format(). Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-21block: fix sector comparism in multiwrite_req_compareChristoph Hellwig
The difference between the start sectors of two requests can be larger than the size of the "int" type, which can lead to a not correctly sorted multiwrite array and thus spurious I/O errors and filesystem corruption due to incorrect request merges. So instead of doing the cute sector arithmetics trick spell out the exact comparisms. Spotted by Kevin Wolf based on a testcase from Michael Tokarev. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17block: Remove special case for vvfatKevin Wolf
The special case doesn't really us buy anything. Without it vvfat works more consistently as a protocol. We get raw on top of vvfat now, which works just as well as using vvfat directly. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17Fix docs for block stats monitor commandDaniel P. Berrange
The 'parent' field in the 'query-blockstats' monitor command is part of the top level block device QDict, not part of the 2nd level 'stats' QDict. * block.c: Fix docs for 'parent' field in block stats monitor command output Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17use qemu_free() instead of free()Bruce Rogers
There is a call to free() where qemu_free() should instead be used. Signed-off-by: Bruce Rogers <brogers@novell.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17block: Fix bdrv_commitKevin Wolf
When reopening the image, don't guess the driver, but use the same driver as was used before. This is important if the format=... option was used for that image. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17block: Fix protocol detection for Windows devicesKevin Wolf
We can't assume the file protocol for Windows devices, they need the same detection as other files for which an explicit protocol is not specified. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17block: Avoid unchecked casts for AIOCBsKevin Wolf
Use container_of for one direction and &acb->common for the other one. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Release allocated options after bdrv_openJan Kiszka
They aren't used afterwards nor supposed to be stored by a bdrv_create handler. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Add wr_highest_sector blockstatKevin Wolf
This adds the wr_highest_sector blockstat which implements what is generally known as the high watermark. It is the highest offset of a sector written to the respective BlockDriverState since it has been opened. The query-blockstat QMP command is extended to add this value to the result, and also to add the statistics of the underlying protocol in a new "parent" field. Note that to get the "high watermark" of a qcow2 image, you need to look into the wr_highest_sector field of the parent (which can be a file, a host_device, ...). The wr_highest_sector of the qcow2 BlockDriverState itself is the highest offset on the _virtual_ disk that the guest has written to. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Cache total_sectors to reduce bdrv_getlength callsStefan Hajnoczi
The BlockDriver bdrv_getlength function is called from the I/O code path when checking that the request falls within the device. Unfortunately this involves an lseek system call in the raw protocol; every read or write request will incur this lseek cost. Jan Kiszka <jan.kiszka@siemens.com> identified this issue and its latency overhead. This patch caches device length in the existing total_sectors variable so lseek calls can be avoided for fixed size devices. Growable devices fall back to the full bdrv_getlength code path because I have not added logic to detect extending the size of the device in a write. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Set backing_hd to NULL after deleting itStefan Hajnoczi
It is safer to set backing_hd to NULL after deleting it so that any use after deletion is obvious during development. Happy segfaulting! This patch should be applied after Kevin Wolf's "vmdk: Convert to bdrv_open" so that vmdk does not segfault on close. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: bdrv_has_zero_initKevin Wolf
This fixes the problem that qemu-img's use of no_zero_init only considered the no_zero_init flag of the format driver, but not of the underlying protocols. Between the raw/file split and this fix, converting to host devices is broken. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Open the underlying image file in generic codeKevin Wolf
Format drivers shouldn't need to bother with things like file names, but rather just get an open BlockDriverState for the underlying protocol. This patch introduces this behaviour for bdrv_open implementation. For protocols which need to access the filename to open their file/device/connection/... a new callback bdrv_file_open is introduced which doesn't get an underlying file opened. For now, also some of the more obscure formats use bdrv_file_open because they open() the file themselves instead of using the block.c functions. They need to be fixed in later patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Avoid forward declaration of bdrv_open_commonKevin Wolf
Move bdrv_open_common so it's defined before its callers and remove the forward declaration. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Split bdrv_openKevin Wolf
bdrv_open contains quite some code that is only useful for opening images (as opposed to opening files by a protocol), for example snapshots. This patch splits the code so that we have bdrv_open_file() for files (uses protocols), bdrv_open() for images (uses format drivers) and bdrv_open_common() for the code common for opening both images and files. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: separate raw images from the file protocolChristoph Hellwig
We're running into various problems because the "raw" file access, which is used internally by the various image formats is entangled with the "raw" image format, which maps the VM view 1:1 to a file system. This patch renames the raw file backends to the file protocol which is treated like other protocols (e.g. nbd and http) and adds a new "raw" image format which is just a wrapper around calls to the underlying protocol. The patch is surprisingly simple, besides changing the probing logical in block.c to only look for image formats when using bdrv_open and renaming of the old raw protocols to file there's almost nothing in there. For creating images, a new bdrv_create_file is introduced which guesses the protocol to use. This allows using qemu-img create -f raw (or just using the default) for both files and host devices. Converting the other format drivers to use this function to create their images is left for later patches. The only issues still open are in the handling of the host devices. Firstly in current qemu we can specifiy the host* format names on various command line acceping images, but the new code can't do that without adding some translation. Second the layering breaks the no_zero_init flag in the BlockDriver used by qemu-img. I'm not happy how this is done per-driver instead of per-state so I'll prepare a separate patch to clean this up. There's some more cleanup opportunity after this patch, e.g. using separate lists and registration functions for image formats vs protocols and maybe even host drivers, but this can be done at a later stage. Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT case that I don't quite understand, but which I fear won't work as expected - possibly even before this patch. Note that this patch requires various recent block patches from Kevin and me, which should all be in his block queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23block: Free iovec arrays allocated by multiwrite_merge()Stefan Hajnoczi
A new iovec array is allocated when creating a merged write request. This patch ensures that the iovec array is deleted in addition to its qiov owner. Reported-by: Leszek Urbanski <tygrys@moo.pl> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23block: Convert first_drv to QLISTStefan Hajnoczi
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23block: Convert bdrv_first to QTAILQStefan Hajnoczi
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23block: Do not export bdrv_firstStefan Hajnoczi
The bdrv_first linked list of BlockDriverStates is currently extern so that block migration can iterate the list. However, since there is already a bdrv_iterate() function there is no need to expose bdrv_first. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23block: get rid of the BDRV_O_FILE flagChristoph Hellwig
BDRV_O_FILE is only used to communicate between bdrv_file_open and bdrv_open. It affects two things: first bdrv_open only searches for protocols using find_protocol instead of all image formats and host drivers. We can easily move that to the caller and pass the found driver to bdrv_open. Second it is used to not force a read-write open of a snapshot file. But we never use bdrv_file_open to open snapshots and this behaviour doesn't make sense to start with. qemu-io abused the BDRV_O_FILE for it's growable option, switch it to using bdrv_file_open to make sure we only open files as growable were we can actually support that. This patch requires Kevin's "[PATCH] Replace calls of old bdrv_open" to be applied first. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23Replace calls of old bdrv_openKevin Wolf
What is known today as bdrv_open2 becomes the new bdrv_open. All remaining callers of the old function are converted to the new one. In some places they even know the right format, so they should have used bdrv_open2 from the beginning. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23blkdebug: Add events and rulesKevin Wolf
Block drivers can trigger a blkdebug event whenever they reach a place where it could be useful to inject an error for testing/debugging purposes. Rules are read from a blkdebug config file and describe which action is taken when an event is triggered. For now this is only injecting an error (with a few options) or changing the state (which is an integer). Rules can be declared to be active only in a specific state; this way later rules can distiguish on which path we came to trigger their event. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-10block: Fix multiwrite memory leak in error caseKevin Wolf
Previously multiwrite_user_cb was never called if a request in the multiwrite batch failed right away because it did set mcb->error immediately. Make it look more like a normal callback to fix this. Reported-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-10block: Fix error code in multiwrite for immediate failuresKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-10block: Fix multiwrite error handlingKevin Wolf
When two requests of the same multiwrite batch fail, the callback of all requests in that batch were called twice. This could have any kind of nasty effects, in my case it lead to use after free and eventually a segfault. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-17Wrong error message in block_passwd commandShahar Havivi
Signed-off-by: Shahar Havivi <shaharh@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-19block: more read-only changes, related to backing filesNaphtali Sprei
Open backing file read-only where possible Upgrade backing file to read-write during commit, back to read-only after commit If upgrade fail, back to read-only. If also fail, "disconnect" the drive. Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10Monitor: remove unneeded checksLuiz Capitulino
It's not needed to check the return of qobject_from_jsonf() anymore, as an assert() has been added there. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10block: saner flags filtering in bdrv_open2Christoph Hellwig
Clean up the current mess about figuring out which flags to pass to the driver. BDRV_O_FILE, BDRV_O_SNAPSHOT and BDRV_O_NO_BACKING are flags only used by the block layer internally so filter them out directly. Previously BDRV_O_NO_BACKING could accidentally be passed to the drivers, but wasn't ever used. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10block: BLOCK_IO_ERROR QMP eventLuiz Capitulino
This commit introduces the bdrv_mon_event() function, which should be called by block subsystems (eg. IDE) when a I/O error occurs, so that an QMP event is emitted. The following information is currently provided in the event: - device name - operation (ie. "read" or "write") - action taken (eg. "stop") Event example: { "event": "BLOCK_IO_ERROR", "data": { "device": "ide0-hd1", "operation": "write", "action": "stop" }, "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09Count dirty blocks and expose an API to get dirty countLiran Schour
This will manage dirty counter for each device and will allow to get the dirty counter from above. Signed-off-by: Liran Schour <lirans@il.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26block: avoid creating too large iovecs in multiwrite_mergeChristoph Hellwig
If we go over the maximum number of iovecs support by syscall we get back EINVAL from the kernel which translate to I/O errors for the guest. Add a MAX_IOV defintion for platforms that don't have it. For now we use the same 1024 define that's used on Linux and various other platforms, but until the windows block backend implements some kind of vectored I/O it doesn't matter. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26win32: pair qemu_memalign() with qemu_vfree()Herve Poussineau
Win32 suffers from a very big memory leak when dealing with SCSI devices. Each read/write request allocates memory with qemu_memalign (ie VirtualAlloc) but frees it with qemu_free (ie free). Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks. Signed-off-by: Herve Poussineau <hpoussin@reactos.org> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26block: clean up bdrv_open2 structure a bitChristoph Hellwig
Check the whitelist as early as possible instead of continuing the setup, and move all the error handling code to the end of the function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26No need anymoe for bdrv_set_read_onlyNaphtali Sprei
Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26block: Return original error codes in bdrv_pread/writeKevin Wolf
Don't assume -EIO but return the real error. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20Revert "block: prevent multiwrite_merge from creating too large iovecs"Anthony Liguori
This reverts commit 0076bc0c1d93adcbc7f1af184e04902cf37e9ab8. Kevin Wolf pointed out that this breaks the mingw32 build. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20block: prevent multiwrite_merge from creating too large iovecsChristoph Hellwig
If we go over the maximum number of iovecs support by syscall we get back EINVAL from the kernel which translate to I/O errors for the guest. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20block: fix cache flushing in bdrv_commitChristoph Hellwig
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20Disable fall-back to read-only when cannot open drive's file for read-writeNaphtali Sprei
Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20Clean-up a little bit the RW related bits of BDRV_O_FLAGS. BDRV_O_RDONLY ↵Naphtali Sprei
gone (and so is BDRV_O_ACCESS). Default value for bdrv_flags (0/zero) is READ-ONLY. Need to explicitly request READ-WRITE. Instead of using the field 'readonly' of the BlockDriverState struct for passing the request, pass the request in the flags parameter to the function. Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>