author | Paolo Bonzini <pbonzini@redhat.com> | 2017-08-22 06:50:18 +0200 |
---|---|---|
committer | Paolo Bonzini <pbonzini@redhat.com> | 2017-09-22 21:07:24 +0200 |
commit | b855f8d175a0a26c9798cbc5962bb8c0d9538231 (patch) | |
tree | 06ef9b853e3700eb33aaccd9ffba4c0922baaf42 /docs | |
parent | 7c9e527659c67d4d7b41d9504f93d2d7ee482488 (diff) |
scsi: build qemu-pr-helper
Introduce a privileged helper to run persistent reservation commands.
This lets virtual machines send persistent reservations without using
CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions
and SCM_RIGHTS to restrict access to processes that can access its socket
and prove that they have an open file descriptor for a raw SCSI device.
The next patch will also correct the usage of persistent reservations
with multipath devices.
It would also be possible to add support for Linux's IOC_PR_* ioctls in
the future, to support NVMe devices. For now, however, only SCSI is
supported.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'docs')
-rw-r--r-- | docs/interop/pr-helper.rst | 83 | ||||
-rw-r--r-- | docs/pr-manager.rst | 33 |
2 files changed, 116 insertions, 0 deletions
diff --git a/docs/interop/pr-helper.rst b/docs/interop/pr-helper.rst
new file mode 100644
index 0000000000..9f76d5bcf9
--- /dev/null
+++ b/docs/interop/pr-helper.rst
@@ -0,0 +1,83 @@
+..
+
+======================================
+Persistent reservation helper protocol
+======================================
+
+QEMU's SCSI passthrough devices, ``scsi-block`` and ``scsi-generic``,
+can delegate implementation of persistent reservations to an external
+(and typically privileged) program. Persistent Reservations allow
+restricting access to block devices to specific initiators in a shared
+storage setup.
+
+For a more detailed reference please refer to the SCSI Primary
+Commands standard, specifically the section on Reservations and the
+"PERSISTENT RESERVE IN" and "PERSISTENT RESERVE OUT" commands.
+
+This document describes the socket protocol used between QEMU's
+``pr-manager-helper`` object and the external program.
+
+.. contents::
+
+Connection and initialization
+-----------------------------
+
+All data transmitted on the socket is big-endian.
+
+After QEMU connects to the helper program's socket, the helper starts a
+simple feature negotiation process by writing four bytes corresponding to
+the features it exposes (``supported_features``). QEMU reads it,
+then writes four bytes corresponding to the desired features of the
+helper program (``requested_features``).
+
+If a bit is 1 in ``requested_features`` and 0 in ``supported_features``,
+the corresponding feature is not supported by the helper and the connection
+is closed. On the other hand, it is acceptable for a bit to be 0 in
+``requested_features`` and 1 in ``supported_features``; in this case,
+the helper will not enable the feature.
+
+Right now no feature is defined, so the two parties always write four
+zero bytes.
+
+Command format
+--------------
+
+It is invalid to send multiple commands concurrently on the same
+socket. It is however possible to connect multiple sockets to the
+helper and send multiple commands to the helper for one or more
+file descriptors.
+
+A command consists of a request and a response. A request consists
+of a 16-byte SCSI CDB. A file descriptor must be passed to the helper
+together with the SCSI CDB using ancillary data.
+
+The CDB has the following limitations:
+
+- the command (stored in the first byte) must be one of 0x5E
+  (PERSISTENT RESERVE IN) or 0x5F (PERSISTENT RESERVE OUT).
+
+- the allocation length (stored in bytes 7-8 of the CDB for PERSISTENT
+  RESERVE IN) or parameter list length (stored in bytes 5-8 of the CDB
+  for PERSISTENT RESERVE OUT) is limited to 8 KiB.
+
+For PERSISTENT RESERVE OUT, the parameter list is sent right after the
+CDB. The length of the parameter list is taken from the CDB itself.
+
+The helper's reply has the following structure:
+
+- 4 bytes for the SCSI status
+
+- 4 bytes for the payload size (nonzero only for PERSISTENT RESERVE IN
+  and only if the SCSI status is 0x00, i.e. GOOD)
+
+- 96 bytes for the SCSI sense data
+
+- if the size is nonzero, the payload follows
+
+The sense data is always sent to keep the protocol simple, even though
+it is only valid if the SCSI status is CHECK CONDITION (0x02).
+
+The payload size is always less than or equal to the allocation length
+specified in the CDB for the PERSISTENT RESERVE IN command.
+
+If the protocol is violated, the helper closes the socket.
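The wire format described in ``pr-helper.rst`` above can be sketched from the client side in a few lines of Python. This is only an illustration under stated assumptions, not QEMU's implementation: the function names (``negotiate``, ``transfer_length``, ``build_request``, ``parse_reply``, ``send_request``) are hypothetical, and sending the CDB in one ``sendmsg`` with the file descriptor followed by the parameter list in a second write is one plausible reading of "passed together with the SCSI CDB using ancillary data". ``socket.send_fds`` requires Python 3.9 or later on a Unix platform.

```python
import socket
import struct

PR_IN, PR_OUT = 0x5E, 0x5F        # PERSISTENT RESERVE IN / OUT opcodes
CDB_LEN = 16                      # every request starts with a 16-byte CDB
MAX_TRANSFER = 8 * 1024           # allocation / parameter list length limit
SENSE_LEN = 96                    # sense data is always present in a reply
HEADER = struct.Struct(">II")     # big-endian SCSI status, payload size


def negotiate(supported: bytes, requested: int = 0) -> bytes:
    """Check the helper's 4-byte supported_features word and return the
    4-byte requested_features word to write back."""
    (supported_bits,) = struct.unpack(">I", supported)
    if requested & ~supported_bits:
        raise ValueError("helper does not support a requested feature")
    return struct.pack(">I", requested)


def transfer_length(cdb: bytes) -> int:
    """Validate a CDB against the documented limits and return its
    allocation length (IN) or parameter list length (OUT)."""
    if len(cdb) != CDB_LEN:
        raise ValueError("CDB must be exactly 16 bytes")
    if cdb[0] == PR_IN:
        length = int.from_bytes(cdb[7:9], "big")    # bytes 7-8
    elif cdb[0] == PR_OUT:
        length = int.from_bytes(cdb[5:9], "big")    # bytes 5-8
    else:
        raise ValueError("only PERSISTENT RESERVE IN/OUT are accepted")
    if length > MAX_TRANSFER:
        raise ValueError("length exceeds 8 KiB")
    return length


def build_request(cdb: bytes, params: bytes = b"") -> bytes:
    """Return the request bytes: the CDB, plus the parameter list for OUT."""
    length = transfer_length(cdb)
    expected = length if cdb[0] == PR_OUT else 0
    if len(params) != expected:
        raise ValueError("parameter list does not match the CDB")
    return cdb + params


def parse_reply(data: bytes):
    """Split a complete reply into (status, sense, payload)."""
    status, size = HEADER.unpack_from(data)
    end = HEADER.size + SENSE_LEN + size
    if len(data) < end:
        raise ValueError("truncated reply")
    sense = data[HEADER.size:HEADER.size + SENSE_LEN]
    return status, sense, data[HEADER.size + SENSE_LEN:end]


def send_request(sock: socket.socket, fd: int, cdb: bytes,
                 params: bytes = b"") -> None:
    """Assumed framing: CDB with the device fd as SCM_RIGHTS ancillary
    data, then the parameter list (if any) as ordinary stream data."""
    request = build_request(cdb, params)
    socket.send_fds(sock, [request[:CDB_LEN]], [fd])
    sock.sendall(request[CDB_LEN:])
```

Keeping validation and parsing as pure functions makes them easy to check without a running helper; only ``send_request`` touches the socket. Since no feature bits are defined yet, ``negotiate`` is normally called with the four zero bytes read from the helper and returns four zero bytes.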
diff --git a/docs/pr-manager.rst b/docs/pr-manager.rst
index b6089fb57c..7107e59fb8 100644
--- a/docs/pr-manager.rst
+++ b/docs/pr-manager.rst
@@ -49,3 +49,36 @@ Alternatively, using ``-blockdev``::
     -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock
     -blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0
     -device scsi-block,drive=hd
+
+----------------------------------
+Invoking :program:`qemu-pr-helper`
+----------------------------------
+
+QEMU provides an implementation of the persistent reservation helper,
+called :program:`qemu-pr-helper`. The helper should be started as a
+system service and supports the following options:
+
+-d, --daemon              run in the background
+-q, --quiet               decrease verbosity
+-f, --pidfile=path        PID file when running as a daemon
+-k, --socket=path         path to the socket
+-T, --trace=trace-opts    tracing options
+
+By default, the socket and PID file are placed in the runtime state
+directory, for example :file:`/var/run/qemu-pr-helper.sock` and
+:file:`/var/run/qemu-pr-helper.pid`. The PID file is not created
+unless :option:`-d` is passed too.
+
+:program:`qemu-pr-helper` can also use the systemd socket activation
+protocol. In this case, the systemd socket unit should specify a
+Unix stream socket, like this::
+
+    [Socket]
+    ListenStream=/var/run/qemu-pr-helper.sock
+
+After connecting to the socket, :program:`qemu-pr-helper` can optionally drop
+root privileges, except for those capabilities that are needed for
+its operation. To do this, add the following options:
+
+-u, --user=user           user to drop privileges to
+-g, --group=group         group to drop privileges to
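As a small illustration of how the options documented above combine, the sketch below assembles a :program:`qemu-pr-helper` command line in Python. The function name ``helper_argv`` and the ``qemu`` user and group in the usage note are hypothetical; only the long option names come from the documentation added by this patch, and the code does not start the helper.

```python
from typing import List, Optional


def helper_argv(socket_path: Optional[str] = None,
                pidfile: Optional[str] = None,
                daemon: bool = False,
                quiet: bool = False,
                user: Optional[str] = None,
                group: Optional[str] = None) -> List[str]:
    """Build an argument vector for qemu-pr-helper from the documented options.

    Options left at their defaults are omitted, so the helper falls back
    to its own default socket and PID file locations.
    """
    argv = ["qemu-pr-helper"]
    if daemon:
        argv.append("--daemon")
    if quiet:
        argv.append("--quiet")
    if pidfile is not None:
        # Per the documentation, the PID file is only created with --daemon.
        argv.append(f"--pidfile={pidfile}")
    if socket_path is not None:
        argv.append(f"--socket={socket_path}")
    if user is not None:
        argv.append(f"--user={user}")
    if group is not None:
        argv.append(f"--group={group}")
    return argv
```

For example, ``helper_argv(socket_path="/var/run/qemu-pr-helper.sock", daemon=True, user="qemu", group="qemu")`` yields a list that could be passed to ``subprocess.run`` by a launcher script running as root.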