aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio-blk.c
AgeCommit message (Collapse)Author
2010-09-21virtio-blk: propagate the required alignmentChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-09trace: Trace virtio-blk, multiwrite, and paio_submitStefan Hajnoczi
This patch adds trace events that make it possible to observe virtio-blk. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-08-30virtio-blk: Fix migration of queued requestsKevin Wolf
in_sg[].iovec and out_sg[].ioved are pointer to (source) host memory and therefore invalid after migration. When loading the device state we must create a new mapping on the destination host. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-24Rearrange block headersBlue Swirl
Changing block.h or blockdev.h resulted in recompiling most objects. Move DriveInfo typedef and BlockInterfaceType enum definitions to qemu-common.h and rearrange blockdev.h use to decrease churn. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-26virtio-blk: Create exit function to unregister savevmAlex Williamson
Otherwise we can't migrate after we've removed a virtio block device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-13ide scsi virtio-blk: Reject empty drives unless media is removableMarkus Armbruster
Disks without media make no sense. For SCSI, a Linux guest kernel complains during boot. I didn't try other combinations. scsi-generic doesn't need the additional check, because it already requires bdrv_is_sg(), which fails without media. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-13virtio-blk: Fix virtio-blk-s390 to require driveMarkus Armbruster
Move the check from virtio_blk_init_pci(), where it protects only virtio-blk-pci, to virtio_blk_init(). Without that, virtio-blk-s390 initializes without a drive. I figure that can lead to null pointer dereferences. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06Merge remote branch 'kwolf/for-anthony' into stagingAnthony Liguori
2010-07-06savevm: Add DeviceState paramAlex Williamson
When available, we'd like to be able to access the DeviceState when registering a savevm. For buses with a get_dev_path() function, this will allow us to create more unique savevm id strings. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06Add virtio disk identification supportjohn cooper
This patch adds the final missing bits for support of passing a serial/id string to a virtio-blk guest driver. The guest-side component already exists in the virtio driver, and has recently been reworked by Ryan to export a /sys interface for retrieval of the id from guest userland. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02block: Fix virtual media change for if=noneMarkus Armbruster
BlockDriverState member removable controls whether virtual media change (monitor commands change, eject) is allowed. It is set when the "type hint" is BDRV_TYPE_CDROM or BDRV_TYPE_FLOPPY. The type hint is only set by drive_init(). It sets BDRV_TYPE_FLOPPY for if=floppy. It sets BDRV_TYPE_CDROM for media=cdrom and if=ide, scsi, xen, or none. if=ide and if=scsi work, because the type hint makes it a CD-ROM. if=xen likewise, I think. For the same reason, if=none works when it's used by ide-drive or scsi-disk. For other guest devices, there are problems: * fdc: you can't change virtual media $ qemu [...] -drive if=none,id=foo,... -global isa-fdc.driveA=foo QEMU 0.12.50 monitor - type 'help' for more information (qemu) eject foo Device 'foo' is not removable unless you add media=cdrom, but that makes it readonly. * virtio: if you add media=cdrom, you can change virtual media. If you eject, the guest gets I/O errors. If you change, the guest sees the drive's contents suddenly change. * scsi-generic: if you add media=cdrom, you can change virtual media. I didn't test what that does to the guest or the physical device, but it can't be pretty. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02qdev: Decouple qdev_prop_drive from DriveInfoMarkus Armbruster
Make the property point to BlockDriverState, cutting out the DriveInfo middleman. This prepares the ground for block devices that don't have a DriveInfo. Currently all user-defined ones have a DriveInfo, because the only way to define one is -drive & friends (they go through drive_init()). DriveInfo is closely tied to -drive, and like -drive, it mixes information about host and guest part of the block device. I'm working towards a new way to define block devices, with clean host/guest separation, and I need to get DriveInfo out of the way for that. Fortunately, the device models are perfectly happy with BlockDriverState, except for two places: ide_drive_initfn() and scsi_disk_initfn() need to check the DriveInfo for a serial number set with legacy -drive serial=... Use drive_get_by_blockdev() there. Device model code should now use DriveInfo only when explicitly dealing with drives defined the old way, i.e. without -device. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22virtio-blk: fix the list operation in virtio_blk_load().Yoshiaki Tamura
Although it is really rare to get in to the while loop, the list operation in the loop is obviously wrong. Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15block: Move error actions from DriveInfo to BlockDriverStateMarkus Armbruster
That's where they belong semantically (block device host part), even though the actions are actually executed by guest device code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-14virtio-blk: simplify multiwrite calling conventionsChristoph Hellwig
Pass the MultiReqBuffer structure down all the way to the I/O submission instead of takin it apart. Also mark num_writes unsigned as it can't go negative, and take the check for any pending I/O requests into the submission function. Last but not least rename do_multiwrite to virtio_submit_multiwrite to fit the general naming scheme and make clear what it does. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-06-14virtio-blk: stop tracking old_bsChristoph Hellwig
There is a 1:1 relation between VirtIOBlock and BlockDriverState instances, no need to track it because it won't change. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-06-04blockdev: Collect block device code in new blockdev.cMarkus Armbruster
Anything that moves hundreds of lines out of vl.c can't be all bad. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-04Cleanup: virtio-blk.c: Be more consistent using BDRV_SECTOR_SIZE insteadJes Sorensen
Clean up virtio-blk.c to be more consistent using BDRV_SECTOR_SIZE instead of hard coded 512 values. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-21virtio-blk: fix barrier supportChristoph Hellwig
Before issuing the barrier to the block driver we need to flush our oustanding queue of write requests, as the flush is supposed to be issued after them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-21virtio-blk: Avoid zeroing every request structureStefan Hajnoczi
The VirtIOBlockRequest structure is about 40 KB in size. This patch avoids zeroing every request by only initializing fields that are read. The other fields are either written to or may not be used at all. Oprofile shows about 10% of CPU samples in memset called by virtio_blk_alloc_request(). The workload is dd if=/dev/vda of=/dev/null iflag=direct bs=8k running concurrently 4 times. This patch makes memset disappear to the bottom of the profile. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23Remove un-needed codeBruce Rogers
The bdrv_set_geometry_hint call below is not needed - it's just setting what was just read. Signed-off-by: Bruce Rogers <brogers@novell.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-18virtio-blk: Fix use after free in error caseKevin Wolf
virtio_blk_req_complete frees the request, so we can't access it any more when calling bdrv_mon_event. Use the pointer that was copied earlier. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-03-17block: add logical_block_size propertyChristoph Hellwig
Add a logical block size attribute as various guest side tools only increase the filesystem sector size based on it, not the advisory physical block size. For scsi we already have support for a different logical block size in place for CDROMs that we can built upon. Only my recent block device characteristics VPD page needs some fixups. Note that we leave the logial block size for CDROMs hardcoded as the 2k value is expected for it in general. For virtio-blk we already have a feature flag claiming to support a variable logical block size that was added for the s390 kuli hypervisor. Interestingly it does not actually change the units in which the protocol works, which is still fixed at 512 bytes, but only communicates a different minimum I/O granularity. So all we need to do in virtio is to add a trap for unaligned I/O and round down the device size to the next multiple of the logical block size. IDE does not support any other logical block size than 512 bytes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-03-08block: Emit BLOCK_IO_ERROR before vm_stop() callLuiz Capitulino
The next commit will move the STOP event into do_vm_stop(), to have the expected event sequence we need to emit the I/O error event before calling vm_stop(). The expected sequence is: { "event": "BLOCK_IO_ERROR" [...] } { "event": "STOP" } Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10virtio-blk: add topology supportChristoph Hellwig
Export all topology information in the block config structure, guarded by a new VIRTIO_BLK_F_TOPOLOGY feature flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10block: add topology qdev propertiesChristoph Hellwig
Add three new qdev properties to export block topology information to the guest. This is needed to get optimal I/O alignment for RAID arrays or SSDs. The options are: - physical_block_size to specify the physical block size of the device, this is going to increase from 512 bytes to 4096 kilobytes for many modern storage devices - min_io_size to specify the minimal I/O size without performance impact, this is typically set to the RAID chunk size for arrays. - opt_io_size to specify the optimal sustained I/O size, this is typically the RAID stripe width for arrays. I decided to not auto-probe these values from blkid which might easily be possible as I don't know how to deal with these issues on migration. Note that we specificly only set the physical_block_size, and not the logial one which is the unit all I/O is described in. The reason for that is that IDE does not support increasing the logical block size and at last for now I want to stick to one meachnisms in queue and allow for easy switching of transports for a given backing image which would not be possible if scsi and virtio use real 4k sectors, while ide only uses the physical block exponent. To make this more common for the different block drivers introduce a new BlockConf structure holding all common block properties and a DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring what is done for network drivers. Also switch over all block drivers to use it, except for the floppy driver which has weird driveA/driveB properties and probably won't require any advanced block options ever. Example usage for a virtio device with 4k physical block size and 8k optimal I/O size: -drive file=scratch.img,media=disk,cache=none,id=scratch \ -device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192 aliguori: updated patch to take into account BLOCK events Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10virtio-blk: revert serial number supporthch@lst.de
The addition of the whole ATA IDENTIY page caused the config space to go above the allowed size in the PCI spec, and thus the feature was already reverted in the Linux guest driver and disabled by default in qemu. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10virtio-blk: Generate BLOCK_IO_ERROR QMP eventLuiz Capitulino
Just call bdrv_mon_event() in the right place. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-29virtio-blk: Fix error cases which ignored rerror/werrorKevin Wolf
If an I/O request fails right away instead of getting an error only in the callback, we still need to consider rerror/werror. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-29virtio-blk: Fix restart after read errorKevin Wolf
Current code assumes that only write requests are ever going to be restarted. This is wrong since rerror=stop exists. Instead of directly starting writes, use the same request processing as used for new requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-29virtio_blk: Factor virtio_blk_handle_request outKevin Wolf
We need a function that handles a single request. Create one by splitting out code from virtio_blk_handle_output. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13virtio-blk: remove dead variable in virtio_blk_handle_scsiChristoph Hellwig
As pointed out by clang size is only ever written to, but never actually used. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-11virtio: add features as qdev propertiesMichael S. Tsirkin
Add feature bits as properties to virtio. This makes it possible to e.g. define machine without indirect buffer support, which is required for 0.10 compatibility, or without hardware checksum support, which is required for 0.11 compatibility. Since default values for optional features are now set by qdev, get_features callback has been modified: it sets non-optional bits, and clears bits not supported by host. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03virtio-blk: Implement rerror optionKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03Rename DriveInfo.onerror to on_write_errorKevin Wolf
Either rename variables and functions to refer to write errors (which is what they actually do) or introduce a parameter to distinguish reads and writes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-12virtio-blk: Pass read errors to the guestKevin Wolf
We need to signal not only write errors, but also read errors to the guest driver. This fixes a regression introduced by 869a5c6d. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09Pass the drive's readonly attribute to the guest OSNaphtali Sprei
Implemented for virtio-blk and for scsi Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05drive cleanup fixes.Gerd Hoffmann
Changes: * drive_uninit() wants a DriveInfo now. * drive_uninit() also calls bdrv_delete(), so callers don't need to do that. * drive_uninit() calls are moved over to the ->exit() callbacks, destroy_bdrvs() is zapped. * setting bdrv->private is not needed any more as the only user (destroy_bdrvs) is gone. * usb-storage needs no drive_uninit, scsi-disk will handle that. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11virtio-blk: add volatile writecache featureChristoph Hellwig
Add a new VIRTIO_BLK_F_WCACHE feature to virtio-blk to indicate that we have a volatile write cache that needs controlled flushing. Implement a VIRTIO_BLK_T_FLUSH operation to flush it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11qemu: make virtio-blk PCI compliant by defaultMichael S. Tsirkin
commit bf011293faaa7f87e4de83185931e7411b794128 made virtio-blk-pci not PCI-compliant, since it makes region 0 (which is an i/o region) size > 256, and, since PCI 2.1, i/o regions are limited to 256 bytes size. When the ATA serial number feature is off, which is the default, make the device spec compliant again, by making region 0 smaller. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: Vadim Rozenfeld <vrozenfe@redhat.com> Tested-by: Vadim Rozenfeld <vrozenfe@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11virtio-blk: Use bdrv_aio_multiwriteKevin Wolf
It is quite common for virtio-blk to submit more than one write request in a row to the qemu block layer. Use bdrv_aio_multiwrite to allow block drivers to optimize its handling of the requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27virtio-blk: handle NULL returns from bdrv_aio_{read, write}Christoph Hellwig
The bdrv_aio_{read,write} routines can return a NULL pointer when the I/O submission fails. Currently we ignore this and will wait forever for an I/O completion and leading to a hang of the guest. I can easily reproduce this using the native Linux AIO patch, but it's also possible using normal pthreads-based AIO. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-10qdev-ify virtio-blk.Gerd Hoffmann
First user of the new drive property. With this patch applied host and guest config can be specified separately, like this: -drive if=none,id=disk1,file=/path/to/disk.img -device virtio-blk-pci,drive=disk1 You can set any property for virtio-blk-pci now. You can set the pci address via addr=. You can switch the device into 0.10 compat mode using class=0x0180. As this is per device you can have one 0.10 and one 0.11 virtio block device in a single virtual machine. Old syntax continues to work. Internally it does the same as the two lines above though. One side effect this has is a different initialization order, which might result in a different pci address being assigned by default. Long term plan here is to have this working for all block devices, i.e. once all scsi is properly qdev-ified you will be able to do something like this: -drive if=none,id=sda,file=/path/to/disk.img -device lsi,id=lsi,addr=<pciaddr> -device scsi-disk,drive=sda,bus=lsi.0,lun=<n> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Message-Id:
2009-07-30Fix VM state change handlers running out of orderMarkus Armbruster
When a VM state change handler changes VM state, other VM state change handlers can see the state transitions out of order. bmdma_map(), scsi_disk_init() and virtio_blk_init() install VM state change handlers to restart DMA. These handlers can vm_stop() by running into a write error on a drive with werror=stop. This throws the VM state change handler callback into disarray. Here's an example case I observed: 0. The virtual IDE drive goes south. All future writes return errors. 1. Something encounters a write error, and duly stops the VM with vm_stop(). 2. vm_stop() calls vm_state_notify(0). 3. vm_state_notify() runs the callbacks in list vm_change_state_head. It contains ide_dma_restart_cb() installed by bmdma_map(). It also contains audio_vm_change_state_handler() installed by audio_init(). 4. audio_vm_change_state_handler() stops audio stuff. 5. User continues VM with monitor command "c". This runs vm_start(). 6. vm_start() calls vm_state_notify(1). 7. vm_state_notify() runs the callbacks in vm_change_state_head. 8. ide_dma_restart_cb() happens to come first. It does its work, runs into a write error, and duly stops the VM with vm_stop(). 9. vm_stop() runs vm_state_notify(0). 10. vm_state_notify() runs the callbacks in vm_change_state_head. 11. audio_vm_change_state_handler() stops audio stuff. Which isn't running. 12. vm_stop() finishes, ide_dma_restart_cb() finishes, step 7's vm_state_notify() resumes running handlers. 13. audio_vm_change_state_handler() starts audio stuff. Oopsie. Fix this by moving the actual write from each VM state change handler into a new bottom half (suggested by Gleb Natapov). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-24Add serial number support for virtio_blkjohn cooper
[brought forward to current qemu-kvm.git] This patch implements the missing qemu logic to interpret a '-drive .. serial=XYZ ..' flag for a virtio_blk device. The serial number string is contained in a skeletal IDENTIFY DEVICE data structure and this structure is made available to the guest virtio_blk driver via pci i/o region 0. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16virtio blk: fix warning.Gerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-13Avoid gcc 4.4 warning about uninitialized fieldBlue Swirl
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-05-18Separate virtio PCI codePaul Brook
Split the PCI host bindings from the VRing transport implementation. Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-14Virtio-blk qdev conversionPaul Brook
Signed-off-by: Paul Brook <paul@codesourcery.com>
2009-05-14Virtio-net qdev conversionPaul Brook
Signed-off-by: Paul Brook <paul@codesourcery.com>