aboutsummaryrefslogtreecommitdiff
path: root/tools/virtiofsd/fuse_virtio.c
AgeCommit message (Collapse)Author
2023-02-16virtiofsd: Remove sourceDr. David Alan Gilbert
Now remove all the source. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-08-01misc: fix commonly doubled up wordsDaniel P. Berrangé
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220707163720.1421716-5-berrange@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-04-21util: replace qemu_get_local_state_pathname()Marc-André Lureau
Simplify the function to only return the directory path. Callers are adjusted to use the GLib function to build paths, g_build_filename(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-39-marcandre.lureau@redhat.com>
2021-10-25virtiofsd: Error on bad socket group nameDr. David Alan Gilbert
Make the '--socket-group=' option fail if the group name is unknown: ./tools/virtiofsd/virtiofsd .... --socket-group=zaphod vhost socket: unable to find group 'zaphod' Reported-by: Xiaoling Gao <xiagao@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211014122554.34599-1-dgilbert@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-10-25virtiofsd: Add a helper to stop all queuesVivek Goyal
Use a helper to stop all the queues. Later in the patch series I am planning to use this helper at one more place later in the patch series. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-6-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-10-25virtiofsd: Add a helper to send element on virtqueueVivek Goyal
We have open coded logic to take locks and push element on virtqueue at three places. Add a helper and use it everywhere. Code is easier to read and less number of lines of code. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-5-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-10-25virtiofsd: Remove unused virtio_fs_config definitionVivek Goyal
"struct virtio_fs_config" definition seems to be unused in fuse_virtio.c. Remove it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-4-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-09-16virtiofsd: Reverse req_list before processing itSergio Lopez
With the thread pool disabled, we add the requests in the queue to a GList, processing by iterating over there afterwards. For adding them, we're using "g_list_prepend()", which is more efficient but causes the requests to be processed in reverse order, breaking the read-ahead and request-merging optimizations in the host for sequential operations. According to the documentation, if you need to process the request in-order, using "g_list_prepend()" and then reversing the list with "g_list_reverse()" is more efficient than using "g_list_append()", so let's do it that way. Testing on a spinning disk (to boost the increase of read-ahead and request-merging) shows a 4x improvement on sequential write fio test: Test: fio --directory=/mnt/virtio-fs --filename=fio-file1 --runtime=20 --iodepth=16 --size=4G --direct=1 --blocksize=4K --ioengine libaio --rw write --name seqwrite-libaio Without "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=22.4MiB/s][w=5735 IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=710: Tue Aug 24 12:58:16 2021 write: IOPS=5709, BW=22.3MiB/s (23.4MB/s)(446MiB/20002msec); 0 zone resets ... With "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=84.0MiB/s][w=21.5k IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=716: Tue Aug 24 13:00:15 2021 write: IOPS=21.3k, BW=83.1MiB/s (87.2MB/s)(1663MiB/20001msec); 0 zone resets ... Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210824131158.39970-1-slp@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-09virtiofsd: Add missing newline in error messageHubert Jasudowicz
Signed-off-by: Hubert Jasudowicz <hubert.jasudowicz@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <e5914ad202a13e9c1bc2a5efa267ff3bd4f48db6.1625173475.git.hubert.jasudowicz@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-26virtiofsd: Set req->reply_sent right after sending replyVivek Goyal
There is no reason to set it in label "err". We should be able to set it right after sending reply. It is easier to read. Also got rid of label "err" because now only thing it was doing was return a code. We can return from the error location itself and no need to first jump to label "err". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-8-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: Check EOF before short readVivek Goyal
In virtio_send_data_iov() we are checking first for short read and then EOF condition. Change the order. Basically check for error and EOF first and last remaining piece is short ready which will lead to retry automatically at the end of while loop. Just that it is little simpler to read to the code. There is no need to call "continue" and also one less call of "len-=ret". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-7-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: Simplify skip byte logicVivek Goyal
We need to skip bytes in two cases. a. Before we start reading into in_sg, we need to skip iov_len bytes in the beginning which typically will have fuse_out_header. b. If preadv() does a short read, then we need to retry preadv() with remainig bytes and skip the bytes preadv() read in short read. For case a, there is no reason that skipping logic be inside the while loop. Move it outside. And only retain logic "b" inside while loop. Also get rid of variable "skip_size". Looks like we can do without it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-6-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: get rid of in_sg_left variableVivek Goyal
in_sg_left seems to be being used primarly for debugging purpose. It is keeping track of how many bytes are left in the scatter list we are reading into. We already have another variable "len" which keeps track how many bytes are left to be read. And in_sg_left is greater than or equal to len. We have already ensured that in the beginning of function. if (in_len < tosend_len) { fuse_log(FUSE_LOG_ERR, "%s: elem %d too small for data len %zd\n", __func__, elem->index, tosend_len); ret = E2BIG; goto err; } So in_sg_left seems like a redundant variable. It probably was useful for debugging when code was being developed. Get rid of it. It helps simplify this function. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-5-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: Use iov_discard_front() to skip bytesVivek Goyal
There are places where we need to skip few bytes from front of the iovec array. We have our own custom code for that. Looks like iov_discard_front() can do same thing. So use that helper instead. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-4-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: Get rid of unreachable code in readVivek Goyal
pvreadv() can return following. - error - 0 in case of EOF - short read We seem to handle all the cases already. We are retrying read in case of short read. So another check for short read seems like dead code. Get rid of it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-3-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26virtiofsd: Check for EINTR in preadv() and retryVivek Goyal
We don't seem to check for EINTR and retry. There are other places in code where we check for EINTR. So lets add a check. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-2-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13virtiofsd: Fix check of chown()'s return valueGreg Kurz
Otherwise you always get this warning when using --socket-group=users vhost socket failed to set group to users (100) While here, print out the error if chown() fails. Fixes: f6698f2b03b0 ("tools/virtiofsd: add support for --socket-group") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <162040394890.714971.15502455176528384778.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-06virtiofsd/fuse_virtio.c: Changed allocations of locals to GLibMahmoud Mandour
Replaced the allocation of local variables from malloc() to GLib allocation functions. In one instance, dropped the usage to an assert after a malloc() call and used g_malloc() instead. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-8-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-06virtiofsd: Changed allocations of fv_VuDev & its internals to GLib functionsMahmoud Mandour
Changed the allocations of fv_VuDev structs, VuDev structs, and fv_QueueInfo strcuts from using calloc()/realloc() & free() to using the equivalent functions from GLib. In instances, removed the pair of allocation and assertion for non-NULL checking with a GLib function that aborts on error. Removed NULL-checking for fv_VuDev struct allocation and used a GLib function that crashes on error; namely, g_new0(). This is because allocating one struct should not be a problem on an healthy system. Also following the pattern of aborting-on-null behaviour that is taken with allocating VuDev structs and fv_QueueInfo structs. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-6-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-06virtiofsd: Changed allocations of iovec to GLib's functionsMahmoud Mandour
Replaced the calls to malloc()/calloc() and their respective calls to free() of iovec structs with GLib's allocation and deallocation functions and used g_autofree when appropriate. Replaced the allocation of in_sg_cpy to g_new() instead of a call to calloc() and a null-checking assertion. Not g_new0() because the buffer is immediately overwritten using memcpy. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210427181333.148176-1-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-06virtiofsd: Don't assume header layoutDr. David Alan Gilbert
virtiofsd incorrectly assumed a fixed set of header layout in the virt queue; assuming that the fuse and write headers were conveniently separated from the data; the spec doesn't allow us to take that convenience, so fix it up to deal with it the hard way. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210428110100.27757-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-15virtiofsd: Release vu_dispatch_lock when stopping queueGreg Kurz
QEMU can stop a virtqueue by sending a VHOST_USER_GET_VRING_BASE request to virtiofsd. As with all other vhost-user protocol messages, the thread that runs the main event loop in virtiofsd takes the vu_dispatch lock in write mode. This ensures that no other thread can access virtqueues or memory tables at the same time. In the case of VHOST_USER_GET_VRING_BASE, the main thread basically notifies the queue thread that it should terminate and waits for its termination: main() virtio_loop() vu_dispatch_wrlock() vu_dispatch() vu_process_message() vu_get_vring_base_exec() fv_queue_cleanup_thread() pthread_join() Unfortunately, the queue thread ends up calling virtio_send_msg() at some point, which itself needs to grab the lock: fv_queue_thread() g_list_foreach() fv_queue_worker() fuse_session_process_buf_int() do_release() lo_release() fuse_reply_err() send_reply() send_reply_iov() fuse_send_reply_iov_nofree() fuse_send_msg() virtio_send_msg() vu_dispatch_rdlock() <-- Deadlock ! Simply have the main thread to release the lock before going to sleep and take it back afterwards. A very similar patch was already sent by Vivek Goyal sometime back: https://listman.redhat.com/archives/virtio-fs/2021-January/msg00073.html The only difference here is that this done in fv_queue_set_started() because fv_queue_cleanup_thread() can also be called from virtio_loop() without the lock being held. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210312092212.782255-8-groug@kaod.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-16virtiofsd: vu_dispatch locking should never failGreg Kurz
pthread_rwlock_rdlock() and pthread_rwlock_wrlock() can fail if a deadlock condition is detected or the current thread already owns the lock. They can also fail, like pthread_rwlock_unlock(), if the mutex wasn't properly initialized. None of these are ever expected to happen with fv_VuDev::vu_dispatch_rwlock. Some users already check the return value and assert, some others don't. Introduce rdlock/wrlock/unlock wrappers that just do the former and use them everywhere for improved consistency and robustness. This is just cleanup. It doesn't fix any actual issue. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210203182434.93870-1-groug@kaod.org> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-18virtiofsd: Use --thread-pool-size=0 to mean no thread poolVivek Goyal
Right now we create a thread pool and main thread hands over the request to thread in thread pool to process. Number of threads in thread pool can be managed by option --thread-pool-size. In tests we have noted that many of the workloads are getting better performance if we don't use a thread pool at all and process all the requests in the context of a thread receiving the request. Hence give user an option to be able to run virtiofsd without using a thread pool. To implement this, I have used existing option --thread-pool-size. This option defines how many maximum threads can be in the thread pool. Thread pool size zero freezes thead pool. I can't see why will one start virtiofsd with a frozen thread pool (hence frozen file system). So I am redefining --thread-pool-size=0 to mean, don't use a thread pool. Instead process the request in the context of thread receiving request from the queue. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20201109143548.GA1479853@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-10Clean up includesMarkus Armbruster
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes, with the changes to the following files manually reverted: contrib/libvhost-user/libvhost-user-glib.h contrib/libvhost-user/libvhost-user.c contrib/libvhost-user/libvhost-user.h contrib/plugins/hotblocks.c contrib/plugins/hotpages.c contrib/plugins/howvec.c contrib/plugins/lockstep.c linux-user/mips64/cpu_loop.c linux-user/mips64/signal.c linux-user/sparc64/cpu_loop.c linux-user/sparc64/signal.c linux-user/x86_64/cpu_loop.c linux-user/x86_64/signal.c target/s390x/gen-features.c tests/fp/platform.h tests/migration/s390x/a-b-bios.c tests/plugin/bb.c tests/plugin/empty.c tests/plugin/insn.c tests/plugin/mem.c tests/test-rcu-simpleq.c tests/test-rcu-slist.c tests/test-rcu-tailq.c tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c contrib/plugins/, tests/plugin/, and tests/test-rcu-slist.c appear not to include osdep.h intentionally. The remaining reverts are the same as in commit bbfff19688d. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201113061216.2483385-1-armbru@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Alexander Bulekov <alxndr@bu.edu>
2020-12-08libvhost-user: make it a meson subprojectMarc-André Lureau
By making libvhost-user a subproject, check it builds standalone (without the global QEMU cflags etc). Note that the library still relies on QEMU include/qemu/atomic.h and linux_headers/. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20201125100640.366523-6-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-02tools/virtiofsd: Check vu_init() return value (CID 1435958)Philippe Mathieu-Daudé
Since commit 6f5fd837889, vu_init() can fail if malloc() returns NULL. This fixes the following Coverity warning: CID 1435958 (#1 of 1): Unchecked return value (CHECKED_RETURN) Fixes: 6f5fd837889 ("libvhost-user: support many virtqueues") Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201102092339.2034297-1-philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-23libvhost-user: Allow vu_message_read to be replacedCoiby Xu
Allow vu_message_read to be replaced by one which will make use of the QIOChannel functions. Thus reading vhost-user message won't stall the guest. For slave channel, we still use the default vu_message_read. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Coiby Xu <coiby.xu@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200918080912.321299-2-coiby.xu@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-12tools/virtiofsd: add support for --socket-groupAlex Bennée
If you like running QEMU as a normal user (very common for TCG runs) but you have to run virtiofsd as a root user you run into connection problems. Adding support for an optional --socket-group allows the users to keep using the command line. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200925125147.26943-2-alex.bennee@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Split long line
2020-09-25virtiofsd: Used glib "shared" thread poolVivek Goyal
glib offers thread pools and it seems to support "exclusive" and "shared" thread pools. https://developer.gnome.org/glib/stable/glib-Thread-Pools.html#g-thread-pool-new Currently we use "exlusive" thread pools but its performance seems to be poor. I tried using "shared" thread pools and performance seems much better. I posted performance results here. https://www.redhat.com/archives/virtio-fs/2020-September/msg00080.html So lets switch to shared thread pools. We can think of making it optional once somebody can show in what cases exclusive thread pools offer better results. For now, my simple performance tests across the board see better results with shared thread pools. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20200921213216.GE13362@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With seccomp fix from Miklos
2020-08-28virtiofsd: probe unshare(CLONE_FS) and print an errorStefan Hajnoczi
An assertion failure is raised during request processing if unshare(CLONE_FS) fails. Implement a probe at startup so the problem can be detected right away. Unfortunately Docker/Moby does not include unshare in the seccomp.json list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always include unshare (e.g. podman is unaffected): https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the default seccomp.json is missing unshare. Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200727190223.422280-4-stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-03-03virtiofsd: Fix xattr operationsMisono Tomohiro
Current virtiofsd has problems about xattr operations and they does not work properly for directory/symlink/special file. The fundamental cause is that virtiofsd uses openat() + f...xattr() systemcalls for xattr operation but we should not open symlink/special file in the daemon. Therefore the function is restricted. Fix this problem by: 1. during setup of each thread, call unshare(CLONE_FS) 2. in xattr operations (i.e. lo_getxattr), if inode is not a regular file or directory, use fchdir(proc_loot_fd) + ...xattr() + fchdir(root.fd) instead of openat() + f...xattr() (Note: for a regular file/directory openat() + f...xattr() is still used for performance reason) With this patch, xfstests generic/062 passes on virtiofs. This fix is suggested by Miklos Szeredi and Stefan Hajnoczi. The original discussion can be found here: https://www.redhat.com/archives/virtio-fs/2019-October/msg00046.html Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Message-Id: <20200227055927.24566-3-misono.tomohiro@jp.fujitsu.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-02-10virtiofsd: fv_create_listen_socket error path socket leakDr. David Alan Gilbert
If we fail when bringing up the socket we can leak the listen_fd; in practice the daemon will exit so it's not really a problem. Fixes: Coverity CID 1413121 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-01-23virtiofsd: stop all queue threads on exit in virtio_loop()Eryu Guan
On guest graceful shutdown, virtiofsd receives VHOST_USER_GET_VRING_BASE request from VMM and shuts down virtqueues by calling fv_set_started(), which joins fv_queue_thread() threads. So when virtio_loop() returns, there should be no thread is still accessing data in fuse session and/or virtio dev. But on abnormal exit, e.g. guest got killed for whatever reason, vhost-user socket is closed and virtio_loop() breaks out the main loop and returns to main(). But it's possible fv_queue_worker()s are still working and accessing fuse session and virtio dev, which results in crash or use-after-free. Fix it by stopping fv_queue_thread()s before virtio_loop() returns, to make sure there's no-one could access fuse session and virtio dev. Reported-by: Qingming Su <qingming.su@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: add --thread-pool-size=NUM optionStefan Hajnoczi
Add an option to control the size of the thread pool. Requests are now processed in parallel by default. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: process requests in a thread poolStefan Hajnoczi
Introduce a thread pool so that fv_queue_thread() just pops VuVirtqElements and hands them to the thread pool. For the time being only one worker thread is allowed since passthrough_ll.c is not thread-safe yet. Future patches will lift this restriction so that multiple FUSE requests can be processed in parallel. The main new concept is struct FVRequest, which contains both VuVirtqElement and struct fuse_chan. We now have fv_VuDev for a device, fv_QueueInfo for a virtqueue, and FVRequest for a request. Some of fv_QueueInfo's fields are moved into FVRequest because they are per-request. The name FVRequest conforms to QEMU coding style and I expect the struct fv_* types will be renamed in a future refactoring. This patch series is not optimal. fbuf reuse is dropped so each request does malloc(se->bufsize), but there is no clean and cheap way to keep this with a thread pool. The vq_lock mutex is held for longer than necessary, especially during the eventfd_write() syscall. Performance can be improved in the future. prctl(2) had to be added to the seccomp whitelist because glib invokes it. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: prevent fv_queue_thread() vs virtio_loop() racesStefan Hajnoczi
We call into libvhost-user from the virtqueue handler thread and the vhost-user message processing thread without a lock. There is nothing protecting the virtqueue handler thread if the vhost-user message processing thread changes the virtqueue or memory table while it is running. This patch introduces a read-write lock. Virtqueue handler threads are readers. The vhost-user message processing thread is a writer. This will allow concurrency for multiqueue in the future while protecting against fv_queue_thread() vs virtio_loop() races. Note that the critical sections could be made smaller but it would be more invasive and require libvhost-user changes. Let's start simple and improve performance later, if necessary. Another option would be an RCU-style approach with lighter-weight primitives. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23Virtiofsd: fix memory leak on fuse queueinfoLiu Bo
For fuse's queueinfo, both queueinfo array and queueinfos are allocated in fv_queue_set_started() but not cleaned up when the daemon process quits. This fixes the leak in proper places. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Prevent multiply running with same vhost_user_socketMasayoshi Mizuma
virtiofsd can run multiply even if the vhost_user_socket is same path. ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [1] 244965 virtio_session_mount: Waiting for vhost-user socket connection... ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [2] 244966 virtio_session_mount: Waiting for vhost-user socket connection... ]# The user will get confused about the situation and maybe the cause of the unexpected problem. So it's better to prevent the multiple running. Create a regular file under localstatedir directory to exclude the vhost_user_socket. To create and lock the file, use qemu_write_pidfile() because the API has some sanity checks and file lock. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Applied fixes from Stefan's review and moved osdep include Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: cleanup allocated resource in seLiu Bo
This cleans up unfreed resources in se on quiting, including se->virtio_dev, se->vu_socket_path, se->vu_socketfd. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Kill threads when queues are stoppedDr. David Alan Gilbert
Kill the threads we've started when the queues get stopped. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With improvements by: Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Pass write iov's all the way throughDr. David Alan Gilbert
Pass the write iov pointing to guest RAM all the way through rather than copying the data. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Plumb fuse_bufvec through to do_write_bufDr. David Alan Gilbert
Let fuse_session_process_buf_int take a fuse_bufvec * instead of a fuse_buf; and then through to do_write_buf - where in the best case it can pass that straight through to op.write_buf without copying (other than skipping a header). Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: add --fd=FDNUM fd passing optionStefan Hajnoczi
Although --socket-path=PATH is useful for manual invocations, management tools typically create the UNIX domain socket themselves and pass it to the vhost-user device backend. This way QEMU can be launched immediately with a valid socket. No waiting for the vhost-user device backend is required when fd passing is used. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Fast path for virtio readDr. David Alan Gilbert
Readv the data straight into the guests buffer. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With fix by: Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Keep track of repliesDr. David Alan Gilbert
Keep track of whether we sent a reply to a request; this is a bit paranoid but it means: a) We should always recycle an element even if there was an error in the request b) Never try and send two replies on one queue element Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Send replies to messagesDr. David Alan Gilbert
Route fuse out messages back through the same queue elements that had the command that triggered the request. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Start reading commands from queueDr. David Alan Gilbert
Pop queue elements off queues, copy the data from them and pass that to fuse. Note: 'out' in a VuVirtqElement is from QEMU 'in' in libfuse is into the daemon So we read from the out iov's to get a fuse_in_header When we get a kick we've got to read all the elements until the queue is empty. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Poll kick_fd for queueDr. David Alan Gilbert
In the queue thread poll the kick_fd we're passed. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-01-23virtiofsd: Start queue threadsDr. David Alan Gilbert
Start a thread for each queue when we get notified it's been started. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> fix by: Signed-off-by: Jun Piao <piaojun@huawei.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>