author | Dr. David Alan Gilbert <dgilbert@redhat.com> | 2023-01-18 12:11:51 +0000 |
---|---|---|
committer | Dr. David Alan Gilbert <dgilbert@redhat.com> | 2023-02-16 18:15:08 +0000 |
commit | e0dc2631ec4ac718ebe22ddea0ab25524eb37b0e (patch) | |
tree | 0121bd6c8d29247933112f92b7ae059f5780d883 /docs/tools | |
parent | 8ab5e8a503b55eb27672777cfedea902bb22a246 (diff) |
virtiofsd: Remove source
Now remove all the source.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Diffstat (limited to 'docs/tools')
-rw-r--r-- | docs/tools/virtiofsd.rst | 403 |
1 files changed, 0 insertions, 403 deletions
diff --git a/docs/tools/virtiofsd.rst b/docs/tools/virtiofsd.rst
deleted file mode 100644
index 995a754a7b..0000000000
--- a/docs/tools/virtiofsd.rst
+++ /dev/null
@@ -1,403 +0,0 @@
-QEMU virtio-fs shared file system daemon
-========================================
-
-Synopsis
---------
-
-**virtiofsd** [*OPTIONS*]
-
-Description
------------
-
-Share a host directory tree with a guest through a virtio-fs device. This
-program is a vhost-user backend that implements the virtio-fs device. Each
-virtio-fs device instance requires its own virtiofsd process.
-
-This program is designed to work with QEMU's ``--device vhost-user-fs-pci``
-but should work with any virtual machine monitor (VMM) that supports
-vhost-user. See the Examples section below.
-
-This program must be run as the root user. The program drops privileges where
-possible during startup although it must be able to create and access files
-with any uid/gid:
-
-* The ability to invoke syscalls is limited using seccomp(2).
-* Linux capabilities(7) are dropped.
-
-In "namespace" sandbox mode the program switches into a new file system
-namespace and invokes pivot_root(2) to make the shared directory tree its root.
-A new pid and net namespace is also created to isolate the process.
-
-In "chroot" sandbox mode the program invokes chroot(2) to make the shared
-directory tree its root. This mode is intended for container environments where
-the container runtime has already set up the namespaces and the program does
-not have permission to create namespaces itself.
-
-Both sandbox modes prevent "file system escapes" due to symlinks and other file
-system objects that might lead to files outside the shared directory.
-
-Options
--------
-
-.. program:: virtiofsd
-
-.. option:: -h, --help
-
-  Print help.
-
-.. option:: -V, --version
-
-  Print version.
-
-.. option:: -d
-
-  Enable debug output.
-
-.. option:: --syslog
-
-  Print log messages to syslog instead of stderr.
-
-.. option:: -o OPTION
-
-  * debug -
-    Enable debug output.
-
-  * flock|no_flock -
-    Enable/disable flock. The default is ``no_flock``.
-
-  * modcaps=CAPLIST
-    Modify the list of capabilities allowed; CAPLIST is a colon separated
-    list of capabilities, each preceded by either + or -, e.g.
-    ''+sys_admin:-chown''.
-
-  * log_level=LEVEL -
-    Print only log messages matching LEVEL or more severe. LEVEL is one of
-    ``err``, ``warn``, ``info``, or ``debug``. The default is ``info``.
-
-  * posix_lock|no_posix_lock -
-    Enable/disable remote POSIX locks. The default is ``no_posix_lock``.
-
-  * readdirplus|no_readdirplus -
-    Enable/disable readdirplus. The default is ``readdirplus``.
-
-  * sandbox=namespace|chroot -
-    Sandbox mode:
-    - namespace: Create mount, pid, and net namespaces and pivot_root(2) into
-      the shared directory.
-    - chroot: chroot(2) into shared directory (use in containers).
-    The default is "namespace".
-
-  * source=PATH -
-    Share host directory tree located at PATH. This option is required.
-
-  * timeout=TIMEOUT -
-    I/O timeout in seconds. The default depends on cache= option.
-
-  * writeback|no_writeback -
-    Enable/disable writeback cache. The cache allows the FUSE client to buffer
-    and merge write requests. The default is ``no_writeback``.
-
-  * xattr|no_xattr -
-    Enable/disable extended attributes (xattr) on files and directories. The
-    default is ``no_xattr``.
-
-  * posix_acl|no_posix_acl -
-    Enable/disable posix acl support. Posix ACLs are disabled by default.
-
-  * security_label|no_security_label -
-    Enable/disable security label support. Security labels are disabled by
-    default. This will allow client to send a MAC label of file during
-    file creation. Typically this is expected to be SELinux security
-    label. Server will try to set that label on newly created file
-    atomically wherever possible.
-
-  * killpriv_v2|no_killpriv_v2 -
-    Enable/disable ``FUSE_HANDLE_KILLPRIV_V2`` support. KILLPRIV_V2 is enabled
-    by default as long as the client supports it. Enabling this option helps
-    with performance in write path.
-
-.. option:: --socket-path=PATH
-
-  Listen on vhost-user UNIX domain socket at PATH.
-
-.. option:: --socket-group=GROUP
-
-  Set the vhost-user UNIX domain socket gid to GROUP.
-
-.. option:: --fd=FDNUM
-
-  Accept connections from vhost-user UNIX domain socket file descriptor FDNUM.
-  The file descriptor must already be listening for connections.
-
-.. option:: --thread-pool-size=NUM
-
-  Restrict the number of worker threads per request queue to NUM. The default
-  is 0.
-
-.. option:: --cache=none|auto|always
-
-  Select the desired trade-off between coherency and performance. ``none``
-  forbids the FUSE client from caching to achieve best coherency at the cost of
-  performance. ``auto`` acts similar to NFS with a 1 second metadata cache
-  timeout. ``always`` sets a long cache lifetime at the expense of coherency.
-  The default is ``auto``.
-
-Extended attribute (xattr) mapping
-----------------------------------
-
-By default the name of xattr's used by the client are passed through to the server
-file system. This can be a problem where either those xattr names are used
-by something on the server (e.g. selinux client/server confusion) or if the
-``virtiofsd`` is running in a container with restricted privileges where it
-cannot access some attributes.
-
-Mapping syntax
-~~~~~~~~~~~~~~
-
-A mapping of xattr names can be made using -o xattrmap=mapping where the ``mapping``
-string consists of a series of rules.
-
-The first matching rule terminates the mapping.
-The set of rules must include a terminating rule to match any remaining attributes
-at the end.
-
-Each rule consists of a number of fields separated with a separator that is the
-first non-white space character in the rule. This separator must then be used
-for the whole rule.
-White space may be added before and after each rule.
-
-Using ':' as the separator a rule is of the form:
-
-``:type:scope:key:prepend:``
-
-**scope** is:
-
-- 'client' - match 'key' against a xattr name from the client for
-  setxattr/getxattr/removexattr
-- 'server' - match 'prepend' against a xattr name from the server
-  for listxattr
-- 'all' - can be used to make a single rule where both the server
-  and client matches are triggered.
-
-**type** is one of:
-
-- 'prefix' - is designed to prepend and strip a prefix; the modified
-  attributes then being passed on to the client/server.
-
-- 'ok' - Causes the rule set to be terminated when a match is found
-  while allowing matching xattr's through unchanged.
-  It is intended both as a way of explicitly terminating
-  the list of rules, and to allow some xattr's to skip following rules.
-
-- 'bad' - If a client tries to use a name matching 'key' it's
-  denied using EPERM; when the server passes an attribute
-  name matching 'prepend' it's hidden. In many ways it's use is very like
-  'ok' as either an explicit terminator or for special handling of certain
-  patterns.
-
-- 'unsupported' - If a client tries to use a name matching 'key' it's
-  denied using ENOTSUP; when the server passes an attribute
-  name matching 'prepend' it's hidden. In many ways it's use is very like
-  'ok' as either an explicit terminator or for special handling of certain
-  patterns.
-
-**key** is a string tested as a prefix on an attribute name originating
-on the client. It maybe empty in which case a 'client' rule
-will always match on client names.
-
-**prepend** is a string tested as a prefix on an attribute name originating
-on the server, and used as a new prefix. It may be empty
-in which case a 'server' rule will always match on all names from
-the server.
-
-e.g.:
-
-  ``:prefix:client:trusted.:user.virtiofs.:``
-
-  will match 'trusted.' attributes in client calls and prefix them before
-  passing them to the server.
-
-  ``:prefix:server::user.virtiofs.:``
-
-  will strip 'user.virtiofs.' from all server replies.
-
-  ``:prefix:all:trusted.:user.virtiofs.:``
-
-  combines the previous two cases into a single rule.
-
-  ``:ok:client:user.::``
-
-  will allow get/set xattr for 'user.' xattr's and ignore
-  following rules.
-
-  ``:ok:server::security.:``
-
-  will pass 'security.' xattr's in listxattr from the server
-  and ignore following rules.
-
-  ``:ok:all:::``
-
-  will terminate the rule search passing any remaining attributes
-  in both directions.
-
-  ``:bad:server::security.:``
-
-  would hide 'security.' xattr's in listxattr from the server.
-
-A simpler 'map' type provides a shorter syntax for the common case:
-
-``:map:key:prepend:``
-
-The 'map' type adds a number of separate rules to add **prepend** as a prefix
-to the matched **key** (or all attributes if **key** is empty).
-There may be at most one 'map' rule and it must be the last rule in the set.
-
-Note: When the 'security.capability' xattr is remapped, the daemon has to do
-extra work to remove it during many operations, which the host kernel normally
-does itself.
-
-Security considerations
-~~~~~~~~~~~~~~~~~~~~~~~
-
-Operating systems typically partition the xattr namespace using
-well defined name prefixes. Each partition may have different
-access controls applied. For example, on Linux there are multiple
-partitions
-
- * ``system.*`` - access varies depending on attribute & filesystem
- * ``security.*`` - only processes with CAP_SYS_ADMIN
- * ``trusted.*`` - only processes with CAP_SYS_ADMIN
- * ``user.*`` - any process granted by file permissions / ownership
-
-While other OS such as FreeBSD have different name prefixes
-and access control rules.
-
-When remapping attributes on the host, it is important to
-ensure that the remapping does not allow a guest user to
-evade the guest access control rules.
-
-Consider if ``trusted.*`` from the guest was remapped to
-``user.virtiofs.trusted*`` in the host. An unprivileged
-user in a Linux guest has the ability to write to xattrs
-under ``user.*``. Thus the user can evade the access
-control restriction on ``trusted.*`` by instead writing
-to ``user.virtiofs.trusted.*``.
-
-As noted above, the partitions used and access controls
-applied, will vary across guest OS, so it is not wise to
-try to predict what the guest OS will use.
-
-The simplest way to avoid an insecure configuration is
-to remap all xattrs at once, to a given fixed prefix.
-This is shown in example (1) below.
-
-If selectively mapping only a subset of xattr prefixes,
-then rules must be added to explicitly block direct
-access to the target of the remapping. This is shown
-in example (2) below.
-
-Mapping examples
-~~~~~~~~~~~~~~~~
-
-1) Prefix all attributes with 'user.virtiofs.'
-
-::
-
- -o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"
-
-
-This uses two rules, using : as the field separator;
-the first rule prefixes and strips 'user.virtiofs.',
-the second rule hides any non-prefixed attributes that
-the host set.
-
-This is equivalent to the 'map' rule:
-
-::
-
- -o xattrmap=":map::user.virtiofs.:"
-
-2) Prefix 'trusted.' attributes, allow others through
-
-::
-
-   "/prefix/all/trusted./user.virtiofs./
-    /bad/server//trusted./
-    /bad/client/user.virtiofs.//
-    /ok/all///"
-
-
-Here there are four rules, using / as the field
-separator, and also demonstrating that new lines can
-be included between rules.
-The first rule is the prefixing of 'trusted.' and
-stripping of 'user.virtiofs.'.
-The second rule hides unprefixed 'trusted.' attributes
-on the host.
-The third rule stops a guest from explicitly setting
-the 'user.virtiofs.' path directly to prevent access
-control bypass on the target of the earlier prefix
-remapping.
-Finally, the fourth rule lets all remaining attributes
-through.
-
-This is equivalent to the 'map' rule:
-
-::
-
- -o xattrmap="/map/trusted./user.virtiofs./"
-
-3) Hide 'security.' attributes, and allow everything else
-
-::
-
-    "/bad/all/security./security./
-     /ok/all///'
-
-The first rule combines what could be separate client and server
-rules into a single 'all' rule, matching 'security.' in either
-client arguments or lists returned from the host. This stops
-the client seeing any 'security.' attributes on the server and
-stops it setting any.
-
-SELinux support
----------------
-One can enable support for SELinux by running virtiofsd with option
-"-o security_label". But this will try to save guest's security context
-in xattr security.selinux on host and it might fail if host's SELinux
-policy does not permit virtiofsd to do this operation.
-
-Hence, it is preferred to remap guest's "security.selinux" xattr to say
-"trusted.virtiofs.security.selinux" on host.
-
-"-o xattrmap=:map:security.selinux:trusted.virtiofs.:"
-
-This will make sure that guest and host's SELinux xattrs on same file
-remain separate and not interfere with each other. And will allow both
-host and guest to implement their own separate SELinux policies.
-
-Setting trusted xattr on host requires CAP_SYS_ADMIN. So one will need
-add this capability to daemon.
-
-"-o modcaps=+sys_admin"
-
-Giving CAP_SYS_ADMIN increases the risk on system. Now virtiofsd is more
-powerful and if gets compromised, it can do lot of damage to host system.
-So keep this trade-off in my mind while making a decision.
-
-Examples
---------
-
-Export ``/var/lib/fs/vm001/`` on vhost-user UNIX domain socket
-``/var/run/vm001-vhost-fs.sock``:
-
-.. parsed-literal::
-
-  host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
-  host# |qemu_system| \\
-        -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \\
-        -device vhost-user-fs-pci,chardev=char0,tag=myfs \\
-        -object memory-backend-memfd,id=mem,size=4G,share=on \\
-        -numa node,memdev=mem \\
-        ...
-  guest# mount -t virtiofs myfs /mnt
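
Note for readers of the removed text: the first-match behaviour of the ``xattrmap`` rules on the client side (setxattr/getxattr/removexattr) can be modelled in a few lines of Python. This is only an illustrative sketch written from the rule descriptions in the deleted documentation, not code taken from virtiofsd. The names ``parse_xattrmap`` and ``map_client_name`` are hypothetical, the 'map' shorthand and the server-side (listxattr) direction are deliberately left out, and raising ``ValueError`` when no rule matches is an assumption of the sketch.

```python
import errno


def parse_xattrmap(mapping):
    """Split a mapping string into (type, scope, key, prepend) tuples.

    The separator of each rule is its first non-whitespace character,
    and each of the four fields is terminated by that separator.
    """
    rules = []
    i, n = 0, len(mapping)
    while True:
        # White space (including newlines) may surround each rule.
        while i < n and mapping[i].isspace():
            i += 1
        if i >= n:
            break
        sep = mapping[i]
        i += 1
        fields = []
        for _ in range(4):
            end = mapping.find(sep, i)
            if end < 0:
                raise ValueError("unterminated rule in mapping")
            fields.append(mapping[i:end])
            i = end + 1
        rules.append(tuple(fields))
    return rules


def map_client_name(rules, name):
    """Apply the first matching client-side rule to an xattr name."""
    for rtype, scope, key, prepend in rules:
        if scope not in ("client", "all"):
            continue  # 'server' rules only apply to listxattr results
        if not name.startswith(key):
            continue  # an empty key matches every name
        if rtype == "prefix":
            return prepend + name
        if rtype == "ok":
            return name
        if rtype == "bad":
            raise OSError(errno.EPERM, "xattr name denied by rule", name)
        if rtype == "unsupported":
            raise OSError(errno.ENOTSUP, "xattr name not supported", name)
        raise ValueError("rule type not handled by this sketch: " + rtype)
    # Assumption: a rule set without a terminating rule is treated as invalid.
    raise ValueError("no terminating rule matched " + name)


# Mapping example (2) from the removed documentation.
EXAMPLE = """/prefix/all/trusted./user.virtiofs./
             /bad/server//trusted./
             /bad/client/user.virtiofs.//
             /ok/all///"""

rules = parse_xattrmap(EXAMPLE)
print(map_client_name(rules, "trusted.foo"))
print(map_client_name(rules, "user.foo"))
```

With the rules of example (2), a guest name under ``trusted.`` is prefixed, a plain ``user.`` name falls through to the final 'ok' rule unchanged, and a direct attempt to use ``user.virtiofs.`` is rejected with EPERM by the third rule.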