path: root/block
Age | Commit message | Author
2010-07-15 | Make default invocation of block drivers safer (v3) | Anthony Liguori
CVE-2008-2004 described a vulnerability in QEMU whereby a malicious user could trick the block probing code into accessing arbitrary files in a guest. To mitigate this, we added an explicit format parameter to -drive which disables block probing. Fast forward to today, and the vast majority of users do not use this parameter. libvirt does not use this by default, nor does virt-manager. Most users want block probing, so we should try to make it safer. This patch adds some logic to the raw device which attempts to detect a write operation to the beginning of a raw device. If the first 4 bytes happen to match an image file that has a backing file that we support, it scrubs the signature to all zeros. If a user specifies an explicit format parameter, this behavior is disabled. I contend that while a legitimate guest could write such a signature to the header, we would behave incorrectly anyway upon the next invocation of QEMU. This simply changes the incorrect behavior to not involve a security vulnerability. I've tested this pretty extensively both in the positive and negative case. I'm not 100% confident in the block layer's ability to deal with zero sized writes, particularly with respect to the aio functions, so some additional eyes would be appreciated. Even in the case of a single sector write, we have to make sure to invoke the completion from a bottom half, so just removing the zero sized write is not an option. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
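The scrubbing rule in this message can be modelled in a few lines. This is a minimal sketch, not the patch itself: `scrub_first_sector_write`, `BACKING_FILE_MAGICS` and the `explicit_format` flag are hypothetical names, and the magic set lists only the qcow/qcow2 signature as an illustration, since the commit message does not say which signatures the real code checks.

```python
# Illustrative model of "scrub the signature on writes to offset 0".
# All names are hypothetical; only the qcow/qcow2 magic is listed here.
QCOW_MAGIC = b"QFI\xfb"
BACKING_FILE_MAGICS = {QCOW_MAGIC}


def scrub_first_sector_write(offset, data, explicit_format=False):
    """Return the bytes that would actually be written to the raw image."""
    data = bytes(data)
    # With an explicit format there is no probing, so nothing is scrubbed.
    if explicit_format or offset != 0 or len(data) < 4:
        return data
    # A probeable signature at the very start of the image is zeroed out.
    if data[:4] in BACKING_FILE_MAGICS:
        return b"\x00" * 4 + data[4:]
    return data
```

Only the first four bytes change; the rest of the guest's write goes through untouched, which matches the argument that a guest writing such a header would have been misprobed on the next start anyway.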
2010-07-07 | sheepdog: fix compile error on systems without TCP_CORK | MORITA Kazutaka
WIN32 is not the only system which doesn't have TCP_CORK (e.g. OS X). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
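The idea behind this fix is to test for the feature itself rather than for one platform that lacks it. The sketch below shows the same pattern in Python, where `socket.TCP_CORK` exists only on platforms that define it; `set_cork` is a hypothetical helper, not part of the sheepdog driver.

```python
import socket


def set_cork(sock, enabled):
    """Toggle TCP_CORK if the platform defines it; report whether it was set."""
    cork = getattr(socket, "TCP_CORK", None)
    if cork is None:
        # Platform without TCP_CORK: skip the optimisation instead of failing.
        return False
    sock.setsockopt(socket.IPPROTO_TCP, cork, 1 if enabled else 0)
    return True
```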
2010-07-06 | block: add sheepdog driver for distributed storage support | MORITA Kazutaka
Sheepdog is a distributed storage system for QEMU. It provides highly available block level storage volumes to VMs like Amazon EBS. This patch adds a qemu block driver for Sheepdog. Sheepdog features are:
- No node in the cluster is special (no metadata node, no control node, etc)
- Linear scalability in performance and capacity
- No single point of failure
- Autonomous management (zero configuration)
- Useful volume management support such as snapshot and cloning
- Thin provisioning
- Autonomous load balancing
More details are available at the project site: http://www.osrg.net/sheepdog/ Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06 | raw-posix: Fix test for host CD-ROM | Markus Armbruster
raw_pread_aligned() retries up to two times if the block device backs a virtual CD-ROM (a drive with media=cdrom and if=ide, scsi, xen or none). This makes no sense. Whether retrying reads can correct read errors can only depend on what we're reading, not on how the result gets used. We need to check whether we're reading from a physical CD-ROM or floppy here. I doubt retrying is useful even then. Left for another day. Impact:
* Virtual CD-ROM backed by host_cdrom behaves the same.
* Virtual CD-ROM backed by file or host_device no longer retries.
* A drive backed by host_cdrom now retries even if it's not a virtual CD-ROM.
* Any drive backed by host_floppy now retries.
While there, clean up gratuitous use of goto. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06 | qcow2/vdi: Change check to distinguish error cases | Kevin Wolf
This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 | blkdebug: Initialize state as 1 | Kevin Wolf
state = 0 in rules means that the rule is valid for any state. Therefore it's impossible to have a rule that works only in the initial state. This changes the initial state from 0 to 1 to make this possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
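A small model makes the problem visible. This is a sketch under the assumption that a rule matches when its state is 0 or equals the current state; `rule_applies`, `ANY_STATE` and `INITIAL_STATE` are hypothetical names, not blkdebug identifiers.

```python
ANY_STATE = 0      # a rule with state 0 is valid in every state
INITIAL_STATE = 1  # after this change; previously the initial state was 0


def rule_applies(rule_state, current_state):
    """A rule fires if it is a wildcard or names the current state."""
    return rule_state == ANY_STATE or rule_state == current_state
```

With the initial state at 1, a rule written for state 1 fires only before the first state change, which was impossible to express while the initial state coincided with the wildcard value.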
2010-07-02 | blkdebug: Free QemuOpts after having read the config | Kevin Wolf
Forgetting to free them means that the next instance inherits all rules and gets its own rules only additionally. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 | blkdebug: Fix set_state_opts definition | Kevin Wolf
The list head was initialized to point to the wrong list, so all actions ended up being handled as inject-error even if they were set-state in fact. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 | qcow2: Fix error handling during metadata preallocation | Kevin Wolf
People were wondering why qemu-img check failed after they tried to preallocate a large qcow2 file and ran out of disk space. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 | qcow2: Don't try to check tables that couldn't be loaded | Kevin Wolf
Trying to check them leads to a second error message which is more confusing than helpful:
    Can't get refcount for cluster 0: Invalid argument
    ERROR cluster 0 refcount=-22 reference=1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 | qcow2: Fix qemu-img check segfault on corrupted images | Kevin Wolf
With corrupted images, we can easily get a cluster index that exceeds the array size of the temporary refcount table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
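The fix amounts to a bounds check before touching the temporary table. A minimal sketch of that check follows; `inc_refcounts` and its parameters are hypothetical names chosen for illustration and do not reproduce the qcow2 code.

```python
def inc_refcounts(refcount_table, cluster_size, offset, size):
    """Count one reference per cluster in [offset, offset + size).

    Returns the number of clusters that fell outside the table, so that a
    corrupted offset is reported as an error instead of indexing past the end.
    """
    if size <= 0:
        return 0
    errors = 0
    first = offset // cluster_size
    last = (offset + size - 1) // cluster_size
    for k in range(first, last + 1):
        if k < 0 or k >= len(refcount_table):
            errors += 1  # invalid cluster index: report it, do not touch the table
            continue
        refcount_table[k] += 1
    return errors
```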
2010-06-22 | vpc: Use bdrv_(p)write_sync for metadata writes | Kevin Wolf
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
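The point of a synchronous metadata write is that the data is flushed before the caller proceeds. The sketch below models that with POSIX calls; `pwrite_sync` is a hypothetical stand-in, and the assumption that the QEMU helper behaves as "write, then flush" is inferred from its name and this message rather than from the source.

```python
import os


def pwrite_sync(fd, data, offset):
    """Write data at offset, then flush; return 0 or a negative errno value."""
    try:
        written = 0
        while written < len(data):
            # os.pwrite may write fewer bytes than requested, so loop.
            written += os.pwrite(fd, data[written:], offset + written)
        os.fsync(fd)  # metadata must be stable before dependent updates follow
    except OSError as e:
        return -e.errno
    return 0
```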
2010-06-22 | vmdk: Use bdrv_(p)write_sync for metadata writes | Kevin Wolf
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 | qcow2: Use bdrv_(p)write_sync for metadata writes | Kevin Wolf
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 | qcow: Use bdrv_(p)write_sync for metadata writes | Kevin Wolf
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 | cow: Use bdrv_(p)write_sync for metadata writes | Kevin Wolf
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. While at it, correct the wrong usage of errno. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15 | cow: use qemu block API | Christoph Hellwig
Use bdrv_pread/bdrv_pwrite to access the backing device instead of pread/pwrite, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15 | cow: stop using mmap | Christoph Hellwig
We don't have an equivalent to mmap in the qemu block API, so read and write the bitmap directly. At least in the dumb implementation added in this patch this is a lot less efficient, but it means cow can also work on Windows, and over nbd or curl. And it fixes qemu-iotests testcase 012, which did not work properly due to issues with read-only mmap access. In addition we can also get rid of the now unused get_mmap_addr function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15 | cow: use pread/pwrite | Christoph Hellwig
Use pread/pwrite instead of lseek + read/write in preparation of using the qemu block API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
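The difference between the two styles is whether the read depends on, and modifies, the descriptor's implicit file offset. A minimal sketch using the POSIX calls exposed by Python's `os` module (the helper names are hypothetical):

```python
import os


def read_with_seek(fd, offset, length):
    """Old style: two calls, and the shared file offset is moved."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_with_pread(fd, offset, length):
    """New style: one call with an explicit offset; file offset untouched."""
    return os.pread(fd, length, offset)
```

Because every access carries its own offset, the positional form maps directly onto an offset-based block API, which is why it is a natural preparatory step.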
2010-06-15 | qcow2: Restore L1 entry on l2_allocate failure | Kevin Wolf
If writing the L1 table to disk failed, we need to restore its old content in memory to avoid inconsistencies. Reported-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
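The pattern is "update in memory, try to persist, roll back on failure". A sketch of it, with hypothetical names (`set_l1_entry`, `write_l1_entry`) and a writer callback that follows the 0/-errno convention used throughout this log:

```python
def set_l1_entry(l1_table, index, new_l2_offset, write_l1_entry):
    """Point an L1 entry at a new L2 table, undoing the change if the write fails."""
    old_l2_offset = l1_table[index]
    l1_table[index] = new_l2_offset
    ret = write_l1_entry(index, new_l2_offset)
    if ret < 0:
        # Disk still holds the old entry, so memory must match it again.
        l1_table[index] = old_l2_offset
        return ret
    return 0
```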
2010-06-15 | qcow2: Return real error code in load_refcount_block | Kevin Wolf
This fixes load_refcount_block which completely ignored the return value of write_refcount_block and always returned -EIO for bdrv_pwrite failure. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15 | qcow2: Allow alloc_clusters_noref to return errors | Kevin Wolf
Currently it would consider blocks for which get_refcount fails as used. However, it's unlikely that get_refcount would succeed for the next cluster, so it's not really helpful. Return an error instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
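The changed behaviour can be sketched as a scan for a run of free clusters that aborts on the first lookup error instead of skipping over it. This is an illustrative model only: the function signature and the `get_refcount` callback (returning a refcount, or a negative errno on failure) are assumptions, not the qcow2 code.

```python
def alloc_clusters_noref(get_refcount, free_cluster_index, nb_clusters):
    """Find nb_clusters consecutive free clusters starting the scan at
    free_cluster_index. Returns the index of the first cluster of the run,
    or a negative errno if a refcount lookup fails."""
    run = 0
    while run < nb_clusters:
        refcount = get_refcount(free_cluster_index)
        free_cluster_index += 1
        if refcount < 0:
            return refcount  # propagate the error rather than treating it as "used"
        if refcount != 0:
            run = 0          # cluster in use: start a new run after it
        else:
            run += 1
    return free_cluster_index - nb_clusters
```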
2010-06-15 | qcow2: Allow get_refcount to return errors | Kevin Wolf
get_refcount might need to load a refcount block from disk, so errors may happen. Return the error code instead of assuming a refcount of 1 and change the callers to respect error return values. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15 | vpc: Read/write multiple sectors at once | Kevin Wolf
This changes the vpc block driver (for VHD) to read/write multiple sectors at once instead of doing a request for each single sector. Before this, running qemu-iotests for VPC took ages, now it's actually quite reasonable to run it always (down from ~1 hour to 40 seconds for me). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
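One way to picture the change: instead of issuing one request per sector, a request is cut only where it crosses a block boundary. The sketch below assumes that contiguity holds within a block; `split_request` and `sectors_per_block` are hypothetical names, not taken from the vpc driver.

```python
def split_request(sector_num, nb_sectors, sectors_per_block):
    """Split a sector range into (start, count) chunks that stay within a block."""
    chunks = []
    while nb_sectors > 0:
        # Sectors remaining in the block that contains sector_num.
        left_in_block = sectors_per_block - (sector_num % sectors_per_block)
        n = min(nb_sectors, left_in_block)
        chunks.append((sector_num, n))
        sector_num += n
        nb_sectors -= n
    return chunks
```

A ten-sector request starting two sectors before a block boundary becomes two I/O calls rather than ten.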
2010-06-14 | Merge remote branch 'kwolf/for-anthony' into staging | Anthony Liguori
Conflicts: hw/pc.c
2010-06-13 | Move stdbool.h | Paul Brook
Move inclusion of stdbool.h to common header files, instead of including in an ad-hoc manner. Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-06-04 | Cleanup: raw-posix.c: Be more consistent using BDRV_SECTOR_SIZE instead of 512 | Jes Sorensen
Clean up raw-posix.c to be more consistent using BDRV_SECTOR_SIZE instead of hard coded 512 values. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Fix corruption after error in update_refcount | Kevin Wolf
After it is done with updating refcounts in the cache, update_refcount writes all changed entries to disk. If a refcount block allocation fails, however, there was no change yet and therefore first_index = last_index = -1. Don't treat -1 as a normal sector index (resulting in a 512 byte write!) but return without updating anything in this case. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Fix corruption after refblock allocation | Kevin Wolf
Refblock allocation code needs to take into consideration that update_refcount will load a different refcount block into the cache, so it must initialize the cache for a new refcount block only afterwards. Not doing this means that not only the refcount in the wrong block is updated, but also that the caller will work on the wrong block. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Return right error code in write_refcount_block_entries | Kevin Wolf
write_refcount_block_entries used to return -EIO for any errors. Change this to return the real error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Change l2_load to return 0/-errno | Kevin Wolf
Provide the error code to the caller instead of just indicating success/error. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Allow qcow2_get_cluster_offset to return errors | Kevin Wolf
qcow2_get_cluster_offset() looks up a given virtual disk offset and returns the offset of the corresponding cluster in the image file. Errors (e.g. L2 table can't be read) are currently indicated by a return value of 0, which is unfortunately the same as for any unallocated cluster. So in effect we can't check for errors. This makes the old return value a by-reference parameter and returns the usual 0/-errno error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
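The interface change separates "no error, cluster unallocated" from "lookup failed". Python has no by-reference parameters, so this sketch returns a `(ret, cluster_offset)` pair instead; the function shape and the `load_l2` callback are hypothetical and only model the calling convention described above.

```python
def get_cluster_offset(l1_table, load_l2, l1_index, l2_index):
    """Return (ret, cluster_offset): ret is 0 or a negative errno, and
    cluster_offset is 0 for an unallocated cluster."""
    l2_offset = l1_table[l1_index]
    if l2_offset == 0:
        return 0, 0              # success: no L2 table, so unallocated
    ret, l2_table = load_l2(l2_offset)
    if ret < 0:
        return ret, 0            # failure is now distinguishable from "unallocated"
    return 0, l2_table[l2_index]
```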
2010-05-28 | qcow2: Fix error handling in l2_allocate | Kevin Wolf
l2_allocate has some intermediate states in which the image is inconsistent. Change the order to write to the L1 table only after the new L2 table has successfully been initialized. Also reset the L2 cache in the failure case, as it is very likely wrong. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 | qcow2: Clear L2 table cache after write error | Kevin Wolf
If the L2 table was already updated in cache, but writing it to disk has failed, we must not continue using the changed version in the cache to stay consistent with what's on the disk. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-24 | Merge remote branch 'kwolf/for-anthony' into staging | Anthony Liguori
2010-05-22 | Fix %lld or %llx printf format use | Blue Swirl
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-05-21 | vvfat: More build fixes with DEBUG | Kevin Wolf
Casting a pointer to an int doesn't work on 64 bit platforms. Use the %p printf conversion specifier instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-21 | vvfat: Fix compilation with DEBUG defined | Riccardo Magliocchetti
gcc does not like passing a NULL where an int value is expected:
    block/vvfat.c: In function ‘checkpoint’:
    block/vvfat.c:2868: error: passing argument 2 of ‘remove_mapping’ makes integer from pointer without a cast
Signed-off-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | block/vdi: Fix image opening and creation for odd disk sizes | Stefan Weil
The fix is based on a patch from Kevin Wolf. Here is his comment: "The number of blocks needs to be rounded up to cover all of the virtual hard disk. Without this fix, we can't even open our own images if their size is not a multiple of the block size." While Kevin's patch addressed vdi_create, my modification also fixes vdi_open, which now accepts images with odd disk sizes. v3: Don't allow reading of disk images with too large disk sizes. Neither VBoxManage nor old versions of qemu-img read such images. This change requires rounding of odd disk sizes before we do the checks. Cc: Kevin Wolf <kwolf@redhat.com> Cc: François Revol <revol@free.fr> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
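The rounding in question is ordinary ceiling division. A minimal sketch, where `blocks_needed` is a hypothetical helper and the 1 MiB default block size is an assumption made for the example:

```python
def blocks_needed(disk_size, block_size=1 << 20):
    """Number of blocks required to cover disk_size bytes, rounded up."""
    return (disk_size + block_size - 1) // block_size
```

Truncating division would give one block for a disk of 1 MiB plus one byte, leaving the last byte uncovered; rounding up gives two.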
2010-05-17 | dmg: use qemu block API | Christoph Hellwig
Use bdrv_pread to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Dmg actually does an lseek to a negative offset in the open routine, which we replace with offset arithmetic after doing a bdrv_getlength. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
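Replacing a seek relative to the end of the file with explicit arithmetic looks roughly like this. The sketch uses plain file descriptors, with `os.fstat(fd).st_size` standing in for a length query such as bdrv_getlength; both helper names and the `tail_size` parameter are hypothetical.

```python
import os


def read_tail_with_seek(fd, tail_size):
    """Old style: negative offset relative to the end of the file."""
    os.lseek(fd, -tail_size, os.SEEK_END)
    return os.read(fd, tail_size)


def read_tail_with_length(fd, tail_size):
    """New style: query the length, compute the absolute offset, read there."""
    length = os.fstat(fd).st_size
    offset = length - tail_size
    if offset < 0:
        raise ValueError("image is shorter than the requested tail")
    return os.pread(fd, tail_size, offset)
```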
2010-05-17 | dmg: use pread | Christoph Hellwig
Use pread instead of lseek + read in preparation of using the qemu block API. Note that dmg actually uses the implicit file offset a lot in dmg_open, and we had to replace it with an offset variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | dmg: fix reading of uncompressed chunks | Christoph Hellwig
When dmg_read_chunk encounters an uncompressed chunk it currently calls read without any previous adjustment of the file position. This seems very wrong, and the "reference" implementation in dmg2img does a seek to the same offset as done in the various compression cases, so do the same here. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | block/vpc: Fix conversion from size to disk geometry | Stefan Weil
The VHD algorithm calculates a disk geometry which is usually smaller than the requested size. QEMU tried to round up but failed for certain sizes: qemu-img create -f vpc disk.vpc 9437184 would create an image with 9435136 bytes (which is too small for qemu-img convert). Instead of hacking the geometry algorithm, the patch increases the number of sectors until we get enough sectors. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
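The numbers in this message can be reproduced with a short model. The sketch below assumes the CHS calculation published in the VHD format specification and 512-byte sectors; `calculate_geometry` and `covering_geometry` are hypothetical names, and the loop is only one reading of "increases the number of sectors until we get enough sectors", not a copy of the patch.

```python
MAX_SECTORS = 65535 * 16 * 255  # largest size the CHS fields can express


def calculate_geometry(total_sectors):
    """CHS geometry for a sector count; the result may cover fewer sectors."""
    total_sectors = min(total_sectors, MAX_SECTORS)
    if total_sectors >= 65535 * 16 * 63:
        spt, heads = 255, 16
        cyl_times_heads = total_sectors // spt
    else:
        spt = 17
        cyl_times_heads = total_sectors // spt
        heads = max((cyl_times_heads + 1023) // 1024, 4)
        if cyl_times_heads >= heads * 1024 or heads > 16:
            spt, heads = 31, 16
            cyl_times_heads = total_sectors // spt
        if cyl_times_heads >= heads * 1024:
            spt, heads = 63, 16
            cyl_times_heads = total_sectors // spt
    return cyl_times_heads // heads, heads, spt


def covering_geometry(total_sectors):
    """Grow the request until the geometry covers at least total_sectors."""
    if total_sectors > MAX_SECTORS:
        raise ValueError("size cannot be expressed as a CHS geometry")
    extra = 0
    while True:
        cyls, heads, spt = calculate_geometry(total_sectors + extra)
        if cyls * heads * spt >= total_sectors:
            return cyls, heads, spt
        extra += 1
```

For the 9437184-byte request (18432 sectors), `calculate_geometry` yields a geometry of 271 x 4 x 17 = 18428 sectors, i.e. the 9435136 bytes quoted above, while `covering_geometry` returns a geometry large enough for the full request.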
2010-05-17 | parallels: use qemu block API | Christoph Hellwig
Use bdrv_pread to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | parallels: use pread | Christoph Hellwig
Use pread instead of lseek + read in preparation of using the qemu block API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | block/vdi: Allow disk images of size 0 | Stefan Weil
Even if it is not very useful, users may create images of size 0. Without the special option CONFIG_ZERO_MALLOC, qemu_mallocz aborts execution when it is told to allocate 0 bytes, so avoid this kind of call. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | block: Avoid unchecked casts for AIOCBs | Kevin Wolf
Use container_of for one direction and &acb->common for the other one. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | bochs: use qemu block API | Christoph Hellwig
Use bdrv_pread to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | bochs: use pread | Christoph Hellwig
Use pread instead of lseek + read in preparation of using the qemu block API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 | cloop: use qemu block API | Christoph Hellwig
Use bdrv_pread to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>