aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)Author
2020-11-15nomaintainer: Fix Lesser GPL version numberChetan Pant
There is no "version 2" of the "Lesser" General Public License. It is either "GPL version 2.0" or "Lesser GPL version 2.1". This patch replaces all occurrences of "Lesser GPL version 2" with "Lesser GPL version 2.1" in comment section. This patch contains all the files, whose maintainer I could not get from ‘get_maintainer.pl’ script. Signed-off-by: Chetan Pant <chetan4windows@gmail.com> Message-Id: <20201023124424.20177-1-chetan4windows@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com> [thuth: Adapted exec.c and qdev-monitor.c to new location] Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-11-12meson: move vhost_user_blk_server to meson.buildStefan Hajnoczi
The --enable/disable-vhost-user-blk-server options were implemented in ./configure. There has been confusion about them and part of the problem is that the shell syntax used for setting the default value is not easy to read. Move the option over to meson where the conditions are easier to understand: have_vhost_user_blk_server = (targetos == 'linux') if get_option('vhost_user_blk_server').enabled() if targetos != 'linux' error('vhost_user_blk_server requires linux') endif elif get_option('vhost_user_blk_server').disabled() or not have_system have_vhost_user_blk_server = false endif This patch does not change behavior. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201110171121.1265142-2-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-11-09block: Fix some code style problems, "foo* bar" should be "foo *bar"shiliyang
There have some code style problems be found when read the block driver code. So I fixes some problems of this error, ERROR: "foo* bar" should be "foo *bar". Signed-off-by: Liyang Shi <shiliyang@huawei.com> Reported-by: Euler Robot <euler.robot@huawei.com> Message-Id: <3211f389-6d22-46c1-4a16-e6a2ba66f070@huawei.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-09block: Fixes nfs compiling error on msys2/mingwYonggang Luo
These compiling errors are fixed: ../block/nfs.c:27:10: fatal error: poll.h: No such file or directory 27 | #include <poll.h> | ^~~~~~~~ compilation terminated. ../block/nfs.c:63:5: error: unknown type name 'blkcnt_t' 63 | blkcnt_t st_blocks; | ^~~~~~~~ ../block/nfs.c: In function 'nfs_client_open': ../block/nfs.c:550:27: error: 'struct _stat64' has no member named 'st_blocks' 550 | client->st_blocks = st.st_blocks; | ^ ../block/nfs.c: In function 'nfs_get_allocated_file_size': ../block/nfs.c:751:41: error: 'struct _stat64' has no member named 'st_blocks' 751 | return (task.ret < 0 ? task.ret : st.st_blocks * 512); | ^ ../block/nfs.c: In function 'nfs_reopen_prepare': ../block/nfs.c:805:31: error: 'struct _stat64' has no member named 'st_blocks' 805 | client->st_blocks = st.st_blocks; | ^ ../block/nfs.c: In function 'nfs_get_allocated_file_size': ../block/nfs.c:752:1: error: control reaches end of non-void function [-Werror=return-type] 752 | } | ^ On msys2/mingw, there is no st_blocks in struct _stat64 yet, we disable the usage of it on msys2/mingw, and create a typedef long long blkcnt_t; for further implementation Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Message-Id: <20201105123116.674-2-luoyonggang@gmail.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-09qcow2: Document and enforce the QCowL2Meta invariantsAlberto Garcia
The QCowL2Meta structure is used to store information about a part of a write request that touches clusters that need changes in their L2 entries. This happens with newly-allocated clusters or subclusters. This structure has changed a bit since it was first created and its current documentation is not quite up-to-date. A write request can span a region consisting of a combination of clusters of different types, and qcow2_alloc_host_offset() can repeatedly call handle_copied() and handle_alloc() to add more clusters to the mix as long as they all are contiguous on the image file. Because of this a write request has a list of QCowL2Meta structures, one for each part of the request that needs changes in the L2 metadata. Each one of them spans nb_clusters and has two copy-on-write regions located immediately before and after the middle region touched by that part of the write request. Even when those regions themselves are empty their offsets must be correct because they are used to know the location of the middle region. This was not always the case but it is not a problem anymore because the only two places where QCowL2Meta structures are created (calculate_l2_meta() and qcow2_co_truncate()) ensure that the copy-on-write regions are correctly defined, and so do assertions like the ones in perform_cow(). The conditional initialization of the 'written_to' variable is therefore unnecessary and is removed by this patch. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201007161323.4667-1-berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-09block: Remove unused includeAlexChen
The "qemu-common.h" include is not used, remove it. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: AlexChen <alex.chen@huawei.com> Message-Id: <5F8FFB94.3030209@huawei.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-05Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc,pci,vhost,virtio: fixes Lots of fixes all over the place. virtio-mem and virtio-iommu patches are kind of fixes but it seems better to just make them behave sanely than try to educate users about the limitations ... Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed 04 Nov 2020 18:40:03 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (31 commits) contrib/vhost-user-blk: fix get_config() information leak block/export: fix vhost-user-blk get_config() information leak block/export: make vhost-user-blk config space little-endian configure: introduce --enable-vhost-user-blk-server libvhost-user: follow QEMU comment style vhost-blk: set features before setting inflight feature Revert "vhost-blk: set features before setting inflight feature" net: Add vhost-vdpa in show_netdevs() vhost-vdpa: Add qemu_close in vhost_vdpa_cleanup vfio: Don't issue full 2^64 unmap virtio-iommu: Set supported page size mask vfio: Set IOMMU page size as per host supported page size memory: Add interface to set iommu page size mask virtio-iommu: Add notify_flag_changed() memory region callback virtio-iommu: Add replay() memory region callback virtio-iommu: Call memory notifiers in attach/detach virtio-iommu: Add memory notifiers for map/unmap virtio-iommu: Store memory region in endpoint struct virtio-iommu: Fix virtio_iommu_mr() hw/smbios: Fix leaked fd in save_opt_one() error path ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-11-03block/export: fix vhost-user-blk get_config() information leakStefan Hajnoczi
Refuse get_config() requests in excess of sizeof(struct virtio_blk_config). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201027173528.213464-5-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-03block/export: make vhost-user-blk config space little-endianStefan Hajnoczi
VIRTIO 1.0 devices have little-endian configuration space. The vhost-user-blk-server.c code already uses little-endian for virtqueue processing but not for the configuration space fields. Fix this so the vhost-user-blk export works on big-endian hosts. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201027173528.213464-4-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-03configure: introduce --enable-vhost-user-blk-serverStefan Hajnoczi
Make it possible to compile out the vhost-user-blk server. It is enabled by default on Linux. Note that vhost-user-server.c depends on libvhost-user, which requires CONFIG_LINUX. The CONFIG_VHOST_USER dependency was erroneous since that option controls vhost-user frontends (previously known as "master") and not device backends (previously known as "slave"). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201027173528.213464-3-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-03block/nvme: Fix nvme_submit_command() on big-endian hostPhilippe Mathieu-Daudé
The Completion Queue Command Identifier is a 16-bit value, so nvme_submit_command() is unlikely to work on big-endian hosts, as the relevant bits are truncated. Fix by using the correct byte-swap function. Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver") Reported-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20201029093306.1063879-25-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Fix use of write-only doorbells page on Aarch64 archPhilippe Mathieu-Daudé
qemu_vfio_pci_map_bar() calls mmap(), and mmap(2) states: 'offset' must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE). In commit f68453237b9 we started to use an offset of 4K which broke this contract on Aarch64 arch. Fix by mapping at offset 0, and and accessing doorbells at offset=4K. Fixes: f68453237b9 ("block/nvme: Map doorbells pages write-only") Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-24-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Align iov's va and size on host page sizeEric Auger
Make sure iov's va and size are properly aligned on the host page size. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-23-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Change size and alignment of prp_list_pagesEric Auger
In preparation of 64kB host page support, let's change the size and alignment of the prp_list_pages so that the VFIO DMA MAP succeeds with 64kB host page size. We align on the host page size. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-22-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Change size and alignment of queueEric Auger
In preparation of 64kB host page support, let's change the size and alignment of the queue so that the VFIO DMA MAP succeeds. We align on the host page size. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-21-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Change size and alignment of IDENTIFY response bufferEric Auger
In preparation of 64kB host page support, let's change the size and alignment of the IDENTIFY command response buffer so that the VFIO DMA MAP succeeds. We align on the host page size. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-20-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Correct minimum device page sizePhilippe Mathieu-Daudé
While trying to simplify the code using a macro, we forgot the 12-bit shift... Correct that. Fixes: fad1eb68862 ("block/nvme: Use register definitions from 'block/nvme.h'") Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-19-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Set request_alignment at initializationPhilippe Mathieu-Daudé
Commit bdd6a90a9e5 ("block: Add VFIO based NVMe driver") sets the request_alignment in nvme_refresh_limits(). For consistency, also set it during initialization. Reported-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-18-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Simplify nvme_cmd_sync()Philippe Mathieu-Daudé
As all commands use the ADMIN queue, it is pointless to pass it as argument each time. Remove the argument, and rename the function as nvme_admin_cmd_sync() to make this new behavior clearer. Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20201029093306.1063879-17-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Simplify ADMIN queue accessPhilippe Mathieu-Daudé
We don't need to dereference from BDRVNVMeState each time. Use a NVMeQueuePair pointer on the admin queue. The nvme_init() becomes easier to review, matching the style of nvme_add_io_queue(). Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-16-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Correctly initialize Admin Queue AttributesPhilippe Mathieu-Daudé
From the specification chapter 3.1.8 "AQA - Admin Queue Attributes" the Admin Submission Queue Size field is a 0’s based value: Admin Submission Queue Size (ASQS): Defines the size of the Admin Submission Queue in entries. Enabling a controller while this field is cleared to 00h produces undefined results. The minimum size of the Admin Submission Queue is two entries. The maximum size of the Admin Submission Queue is 4096 entries. This is a 0’s based value. This bug has never been hit because the device initialization uses a single command synchronously :) Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-15-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Use definitions instead of magic values in add_io_queue()Philippe Mathieu-Daudé
Replace magic values by definitions, and simplifiy since the number of queues will never reach 64K. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-14-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Make nvme_init_queue() return boolean indicating errorPhilippe Mathieu-Daudé
Just for consistency, following the example documented since commit e3fe3988d7 ("error: Document Error API usage rules"), return a boolean value indicating an error is set or not. Directly pass errp as the local_err is not requested in our case. This simplifies a bit nvme_create_queue_pair(). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-12-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Make nvme_identify() return boolean indicating errorPhilippe Mathieu-Daudé
Just for consistency, following the example documented since commit e3fe3988d7 ("error: Document Error API usage rules"), return a boolean value indicating an error is set or not. Directly pass errp as the local_err is not requested in our case. Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20201029093306.1063879-11-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Use unsigned integer for queue counter/sizePhilippe Mathieu-Daudé
We can not have negative queue count/size/index, use unsigned type. Rename 'nr_queues' as 'queue_count' to match the spec naming. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-10-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Move definitions before structure declarationsPhilippe Mathieu-Daudé
To be able to use some definitions in structure declarations, move them earlier. No logical change. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-9-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Trace queue pair creation/deletionPhilippe Mathieu-Daudé
Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-8-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Improve nvme_free_req_queue_wait() trace informationPhilippe Mathieu-Daudé
What we want to trace is the block driver state and the queue index. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-7-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Trace nvme_poll_queue() per queuePhilippe Mathieu-Daudé
As we want to enable multiple queues, report the event in each nvme_poll_queue() call, rather than once in the callback calling nvme_poll_queues(). Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-6-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Trace controller capabilitiesPhilippe Mathieu-Daudé
Controllers have different capabilities and report them in the CAP register. We are particularly interested by the page size limits. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-5-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Report warning with warn_report()Philippe Mathieu-Daudé
Instead of displaying warning on stderr, use warn_report() which also displays it on the monitor. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-4-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/nvme: Use hex format to display offset in trace eventsPhilippe Mathieu-Daudé
Use the same format used for the hw/vfio/ trace events. Suggested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20201029093306.1063879-3-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com>
2020-11-03block/vvfat: Fix bad printf format specifiersAlexChen
We should use printf format specifier "%u" instead of "%d" for argument of type "unsigned int". In addition, fix two error format problems found by checkpatch.pl: ERROR: space required after that ',' (ctx:VxV) + fprintf(stderr,"%s attributes=0x%02x begin=%u size=%d\n", ^ ERROR: line over 90 characters + fprintf(stderr, "%d, %s (%u, %d)\n", i, commit->path ? commit->path : "(null)", commit->param.rename.cluster, commit->action); Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Message-Id: <5FA12620.6030705@huawei.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-30nbd: Add 'qemu-nbd -A' to expose allocation depthEric Blake
Allow the server to expose an additional metacontext to be requested by savvy clients. qemu-nbd adds a new option -A to expose the qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this can also be set via QMP when using block-export-add. qemu as client is hacked into viewing the key aspects of this new context by abusing the already-experimental x-dirty-bitmap option to collapse all depths greater than 2, which results in a tri-state value visible in the output of 'qemu-img map --output=json' (yes, that means x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like renaming it as it would introduce a needless break of back-compat, even though we make no compat guarantees with x- members): unallocated (depth 0) => "zero":false, "data":true local (depth 1) => "zero":false, "data":false backing (depth 2+) => "zero":true, "data":true libnbd as client is probably a nicer way to get at the information without having to decipher such hacks in qemu as client. ;) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-11-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-10-30block: Return depth level during bdrv_is_allocated_aboveEric Blake
When checking for allocation across a chain, it's already easy to count the depth within the chain at which the allocation is found. Instead of throwing that information away, return it to the caller. Existing callers only cared about allocated/non-allocated, but having a depth available will be used by NBD in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-9-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-27block: End quiescent sections when a BDS is deletedGreg Kurz
If a BDS gets deleted during blk_drain_all(), it might miss a call to bdrv_do_drained_end(). This means missing a call to aio_enable_external() and the AIO context remains disabled for ever. This can cause a device to become irresponsive and to disrupt the guest execution, ie. hang, loop forever or worse. This scenario is quite easy to encounter with virtio-scsi on POWER when punching multiple blockdev-create QMP commands while the guest is booting and it is still running the SLOF firmware. This happens because SLOF disables/re-enables PCI devices multiple times via IO/MEM/MASTER bits of PCI_COMMAND register after the initial probe/feature negotiation, as it tends to work with a single device at a time at various stages like probing and running block/network bootloaders without doing a full reset in-between. This naturally generates many dataplane stops and starts, and thus many drain sections that can race with blockdev_create_run(). In the end, SLOF bails out. It is somehow reproducible on x86 but it requires to generate articial dataplane start/stop activity with stop/cont QMP commands. In this case, seabios ends up looping for ever, waiting for the virtio-scsi device to send a response to a command it never received. Add a helper that pairs all previously called bdrv_do_drained_begin() with a bdrv_do_drained_end() and call it from bdrv_close(). While at it, update the "/bdrv-drain/graph-change/drain_all" test in test-bdrv-drain so that it can catch the issue. BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1874441 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160346526998.272601.9045392804399803158.stgit@bahia.lan> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-27qcow2: Skip copy-on-write when allocating a zero clusterAlberto Garcia
Since commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a when a write request results in a new allocation QEMU first tries to see if the rest of the cluster outside the written area contains only zeroes. In that case, instead of doing a normal copy-on-write operation and writing explicit zero buffers to disk, the code zeroes the whole cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK. This improves performance very significantly but it only happens when we are writing to an area that was completely unallocated before. Zero clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and are therefore slower to allocate. This happens because the code uses bdrv_is_allocated_above() rather bdrv_block_status_above(). The former is not as accurate for this purpose but it is faster. However in the case of qcow2 the underlying call does already report zero clusters just fine so there is no reason why we cannot use that information. After testing 4KB writes on an image that only contains zero clusters this patch results in almost five times more IOPS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <6d77cab968c501c44d6e1089b9bc91b04170b49e.1603731354.git.berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-27qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status()Alberto Garcia
If a BlockDriverState supports backing files but has none then any unallocated area reads back as zeroes. bdrv_co_block_status() is only reporting this is if want_zero is true, but this is an inexpensive test and there is no reason not to do it in all cases. Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <66fa0914a0e2b727ab6d1b63ca773d7cd29a9a9e.1603731354.git.berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-26Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell
staging * fix --disable-tcg builds (Claudio) * Fixes for macOS --enable-modules build and OpenBSD curses/iconv detection (myself) * Start preparing for meson 0.56 (myself) * Move directory configuration to meson (myself) * Start untangling qemu_init (myself) * Windows fixes (Sunil) * Remove -no-kbm (Thomas) # gpg: Signature made Mon 26 Oct 2020 11:12:17 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: machine: move SMP initialization from vl.c machine: move UP defaults to class_base_init machine: remove deprecated -machine enforce-config-section option win32: boot broken when bind & data dir are the same WHPX: Fix WHPX build break configure: move install_blobs from configure to meson configure: remove unused variable from config-host.mak configure: move directory options from config-host.mak to meson configure: allow configuring localedir Makefile: separate meson rerun from the rest of the ninja invocation Remove deprecated -no-kvm option replay: do not build if TCG is not available qtest: unbreak non-TCG builds in bios-tables-test hw/core/qdev-clock: add a reference on aliased clocks do not use colons in test names meson: rewrite curses/iconv test build: fix macOS --enable-modules build Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-23block/io: fix bdrv_is_allocated_aboveVladimir Sementsov-Ogievskiy
bdrv_is_allocated_above wrongly handles short backing files: it reports after-EOF space as UNALLOCATED which is wrong, as on read the data is generated on the level of short backing file (if all overlays have unallocated areas at that place). Reusing bdrv_common_block_status_above fixes the issue and unifies code path. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20200924194003.22080-5-vsementsov@virtuozzo.com [Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/io: bdrv_common_block_status_above: support bs == baseVladimir Sementsov-Ogievskiy
We are going to reuse bdrv_common_block_status_above in bdrv_is_allocated_above. bdrv_is_allocated_above may be called with include_base == false and still bs == base (for ex. from img_rebase()). So, support this corner case. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20200924194003.22080-4-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/io: bdrv_common_block_status_above: support include_baseVladimir Sementsov-Ogievskiy
In order to reuse bdrv_common_block_status_above in bdrv_is_allocated_above, let's support include_base parameter. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200924194003.22080-3-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/io: fix bdrv_co_block_status_aboveVladimir Sementsov-Ogievskiy
bdrv_co_block_status_above has several design problems with handling short backing files: 1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but without BDRV_BLOCK_ALLOCATED flag, when actually short backing file which produces these after-EOF zeros is inside requested backing sequence. 2. With want_zero=false, it may return pnum=0 prior to actual EOF, because of EOF of short backing file. Fix these things, making logic about short backing files clearer. With fixed bdrv_block_status_above we also have to improve is_zero in qcow2 code, otherwise iotest 154 will fail, because with this patch we stop to merge zeros of different types (produced by fully unallocated in the whole backing chain regions vs produced by short backing files). Note also, that this patch leaves for another day the general problem around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated vs go-to-backing. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com [Fix s/comes/come/ as suggested by Eric Blake --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/export: add vhost-user-blk multi-queue supportStefan Hajnoczi
Allow the number of queues to be configured using --export vhost-user-blk,num-queues=N. This setting should match the QEMU --device vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers its own value if the vhost-user-blk backend offers fewer queues than QEMU. The vhost-user-blk-server.c code is already capable of multi-queue. All virtqueue processing runs in the same AioContext. No new locking is needed. Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit. Note that the feature bit only announces the presence of the num_queues configuration space field. It does not promise that there is more than 1 virtqueue, so we can set it unconditionally. I tested multi-queue by running a random read fio test with numjobs=4 on an -smp 4 guest. After the benchmark finished the guest /proc/interrupts file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/ directory shows that Linux blk-mq has 4 queues configured. An automated test is included in the next commit. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20201001144604.559733-2-stefanha@redhat.com [Fixed accidental tab characters as suggested by Markus Armbruster --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/export: add iothread and fixed-iothread optionsStefan Hajnoczi
Make it possible to specify the iothread where the export will run. By default the block node can be moved to other AioContexts later and the export will follow. The fixed-iothread option forces strict behavior that prevents changing AioContext while the export is active. See the QAPI docs for details. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200929125516.186715-5-stefanha@redhat.com [Fix stray '#' character in block-export.json and add missing "(since: 5.2)" as suggested by Eric Blake. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block: move block exports to libblockdevStefan Hajnoczi
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd. They are not used by other programs and are not otherwise needed in libblock. Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss. Since bdrv_close_all() (libblock) calls blk_exp_close_all() (libblockdev) a stub function is required.. Make qemu-nbd.c use signal handling utility functions instead of duplicating the code. This helps because os-posix.c is in libblockdev and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks. Once we use the signal handling utility functions we also end up providing the necessary symbol. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200929125516.186715-4-stefanha@redhat.com [Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23util/vhost-user-server: use static library in meson.buildStefan Hajnoczi
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build the static library once and then reuse it throughout QEMU. Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the vhost-user tools (vhost-user-gpu, etc) do. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200924151549.913737-14-stefanha@redhat.com [Added CONFIG_LINUX again because libvhost-user doesn't build on macOS. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23util/vhost-user-server: move header to include/Stefan Hajnoczi
Headers used by other subsystems are located in include/. Also add the vhost-user-server and vhost-user-blk-server headers to MAINTAINERS. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200924151549.913737-13-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/export: convert vhost-user-blk server to block export APIStefan Hajnoczi
Use the new QAPI block exports API instead of defining our own QOM objects. This is a large change because the lifecycle of VuBlockDev needs to follow BlockExportDriver. QOM properties are replaced by QAPI options objects. VuBlockDev is renamed VuBlkExport and contains a BlockExport field. Several fields can be dropped since BlockExport already has equivalents. The file names and meson build integration will be adjusted in a future patch. libvhost-user should probably be built as a static library that is linked into QEMU instead of as a .c file that results in duplicate compilation. The new command-line syntax is: $ qemu-storage-daemon \ --blockdev file,node-name=drive0,filename=test.img \ --export vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock Note that unix-socket is optional because we may wish to accept chardevs too in the future. Markus noted that supported address families are not explicit in the QAPI schema. It is unlikely that support for more address families will be added since file descriptor passing is required and few address families support it. If a new address family needs to be added, then the QAPI 'features' syntax can be used to advertize them. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20200924151549.913737-12-stefanha@redhat.com [Skip test on big-endian host architectures because this device doesn't support them yet (as already mentioned in a code comment). --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23block/export: report flush errorsStefan Hajnoczi
Propagate the flush return value since errors are possible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200924151549.913737-11-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>