author Stefan Hajnoczi <stefanha@redhat.com> 2023-11-07 18:59:40 +0800
committer Stefan Hajnoczi <stefanha@redhat.com> 2023-11-07 18:59:41 +0800
commit f6b615b52d1d92f02103596a30df95f31138a2e4
tree b8d13f1b7e485177a8b6b470df30eaac268b3466 /docs
parent 7eee58ae3bb15a2bceb368997ce1a48fd3c607e7
parent 94cd94f1c0137b56000c01208e03d0907ad34910
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes

virtio sound card support

vhost-user: back-end state migration

cxl:
     line length reduction
     enabling fabric management

vhost-vdpa:
     shadow virtqueue hash calculation Support
     shadow virtqueue RSS Support

tests:
    CPU topology related smbios test cases

Fixes, cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
# GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
# nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
# D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
# BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
# 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
# =vEv+
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits)
  acpi/tests/avocado/bits: enable console logging from bits VM
  acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point
  hw/cxl: Add tunneled command support to mailbox for switch cci.
  hw/cxl: Add dummy security state get
  hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions
  hw/cxl/mbox: Add Get Background Operation Status Command
  hw/cxl: Add support for device sanitation
  hw/cxl/mbox: Wire up interrupts for background completion
  hw/cxl/mbox: Add support for background operations
  hw/cxl: Implement Physical Ports status retrieval
  hw/pci-bridge/cxl_downstream: Set default link width and link speed
  hw/cxl/mbox: Add Physical Switch Identify command.
  hw/cxl/mbox: Add Information and Status / Identify command
  hw/cxl: Add a switch mailbox CCI function
  hw/pci-bridge/cxl_upstream: Move defintion of device to header.
  hw/cxl/mbox: Generalize the CCI command processing
  hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState
  hw/cxl/mbox: Split mailbox command payload into separate input and output
  hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant
  hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue.
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Diffstat (limited to 'docs')
-rw-r--r-- docs/interop/vhost-user.rst 301
-rw-r--r-- docs/system/device-emulation.rst 1
-rw-r--r-- docs/system/devices/virtio-snd.rst 49
3 files changed, 331 insertions, 20 deletions
diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 768fb5c28c..9f1103f85a 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -108,6 +108,43 @@ A vring state description
:num: a 32-bit number
+A vring descriptor index for split virtqueues
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
++-------------+---------------------+
+| vring index | index in avail ring |
++-------------+---------------------+
+
+:vring index: 32-bit index of the respective virtqueue
+
+:index in avail ring: 32-bit value, of which currently only the lower 16
+ bits are used:
+
+ - Bits 0–15: Index of the next *Available Ring* descriptor that the
+ back-end will process. This is a free-running index that is not
+ wrapped by the ring size.
+ - Bits 16–31: Reserved (set to zero)
+
+Vring descriptor indices for packed virtqueues
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
++-------------+--------------------+
+| vring index | descriptor indices |
++-------------+--------------------+
+
+:vring index: 32-bit index of the respective virtqueue
+
+:descriptor indices: 32-bit value:
+
+ - Bits 0–14: Index of the next *Available Ring* descriptor that the
+ back-end will process. This is a free-running index that is not
+ wrapped by the ring size.
+ - Bit 15: Driver (Available) Ring Wrap Counter
+ - Bits 16–30: Index of the entry in the *Used Ring* where the back-end
+ will place the next descriptor. This is a free-running index that
+ is not wrapped by the ring size.
+ - Bit 31: Device (Used) Ring Wrap Counter
+
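As an illustration outside the patch itself, the two bit layouts above can be sketched in Python. The function names are hypothetical and not part of QEMU or the vhost-user specification; only the bit positions are taken from the text.

```python
def pack_split_index(avail_idx: int) -> int:
    """Build the 32-bit 'index in avail ring' value for a split virtqueue."""
    if not 0 <= avail_idx <= 0xFFFF:
        raise ValueError("available ring index must fit in 16 bits")
    # Bits 16-31 are reserved and stay zero.
    return avail_idx


def pack_packed_indices(avail_idx: int, avail_wrap: bool,
                        used_idx: int, used_wrap: bool) -> int:
    """Build the 32-bit 'descriptor indices' value for a packed virtqueue."""
    if not 0 <= avail_idx <= 0x7FFF or not 0 <= used_idx <= 0x7FFF:
        raise ValueError("each index must fit in 15 bits")
    return (avail_idx                          # bits 0-14
            | (int(bool(avail_wrap)) << 15)    # bit 15: driver wrap counter
            | (used_idx << 16)                 # bits 16-30
            | (int(bool(used_wrap)) << 31))    # bit 31: device wrap counter


def unpack_packed_indices(value: int) -> tuple:
    """Split a packed-virtqueue value into (avail_idx, avail_wrap, used_idx, used_wrap)."""
    return (value & 0x7FFF,
            (value >> 15) & 1,
            (value >> 16) & 0x7FFF,
            (value >> 31) & 1)
```

Keeping the range checks in the packing helpers means an out-of-range index is rejected instead of silently spilling into a wrap counter bit.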
A vring address description
^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -285,6 +322,32 @@ VhostUserShared
:UUID: 16 bytes UUID, whose first three components (a 32-bit value, then
two 16-bit values) are stored in big endian.
+Device state transfer parameters
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
++--------------------+-----------------+
+| transfer direction | migration phase |
++--------------------+-----------------+
+
+:transfer direction: a 32-bit enum, describing the direction in which
+ the state is transferred:
+
+ - 0: Save: Transfer the state from the back-end to the front-end,
+ which happens on the source side of migration
+ - 1: Load: Transfer the state from the front-end to the back-end,
+ which happens on the destination side of migration
+
+:migration phase: a 32-bit enum, describing the state in which the VM
+ guest and devices are:
+
+ - 0: Stopped (in the period after the transfer of memory-mapped
+ regions before switch-over to the destination): The VM guest is
+ stopped, and the vhost-user device is suspended (see
+ :ref:`Suspended device state <suspended_device_state>`).
+
+ In the future, additional phases might be added e.g. to allow
+ iterative migration while the device is running.
+
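Outside the patch itself, a minimal sketch of how this two-field payload could be serialized. The enum and function names are hypothetical, and native byte order is an assumption that this hunk does not state.

```python
import enum
import struct


class TransferDirection(enum.IntEnum):
    SAVE = 0   # back-end -> front-end, on the migration source
    LOAD = 1   # front-end -> back-end, on the migration destination


class MigrationPhase(enum.IntEnum):
    STOPPED = 0   # guest stopped, vhost-user device suspended


def pack_transfer_params(direction: TransferDirection,
                         phase: MigrationPhase) -> bytes:
    """Serialize the two 32-bit enums as one 8-byte payload.

    "=II" means two unsigned 32-bit integers in native byte order without
    padding; the byte order is an assumption of this sketch.
    """
    return struct.pack("=II", direction, phase)
```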
C structure
-----------
@@ -344,6 +407,7 @@ in the ancillary data:
* ``VHOST_USER_SET_VRING_ERR``
* ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``)
* ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
+* ``VHOST_USER_SET_DEVICE_STATE_FD``
If *front-end* is unable to send the full message or receives a wrong
reply it will close the connection. An optional reconnection mechanism
@@ -374,35 +438,50 @@ negotiation.
Ring states
-----------
-Rings can be in one of three states:
+Rings have two independent states: started/stopped, and enabled/disabled.
-* stopped: the back-end must not process the ring at all.
+* While a ring is stopped, the back-end must not process the ring at
+ all, regardless of whether it is enabled or disabled. The
+ enabled/disabled state should still be tracked, though, so it can come
+ into effect once the ring is started.
-* started but disabled: the back-end must process the ring without
+* started and disabled: The back-end must process the ring without
causing any side effects. For example, for a networking device,
in the disabled state the back-end must not supply any new RX packets,
but must process and discard any TX packets.
-* started and enabled.
+* started and enabled: The back-end must process the ring normally, i.e.
+ process all requests and execute them.
-Each ring is initialized in a stopped state. The back-end must start
-ring upon receiving a kick (that is, detecting that file descriptor is
-readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
-or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
-and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
+Each ring is initialized in a stopped and disabled state. The back-end
+must start a ring upon receiving a kick (that is, detecting that the
+file descriptor is readable) on the descriptor specified by
+``VHOST_USER_SET_VRING_KICK`` or receiving the in-band message
+``VHOST_USER_VRING_KICK`` if negotiated, and stop a ring upon receiving
+``VHOST_USER_GET_VRING_BASE``.
Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
-If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
-ring starts directly in the enabled state.
-
-If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
-initialized in a disabled state and is enabled by
-``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
+In addition, upon receiving a ``VHOST_USER_SET_FEATURES`` message from
+the front-end without ``VHOST_USER_F_PROTOCOL_FEATURES`` set, the
+back-end must enable all rings immediately.
While processing the rings (whether they are enabled or not), the back-end
must support changing some configuration aspects on the fly.
+.. _suspended_device_state:
+
+Suspended device state
+^^^^^^^^^^^^^^^^^^^^^^
+
+While all vrings are stopped, the device is *suspended*. In addition to
+not processing any vring (because they are stopped), the device must:
+
+* not write to any guest memory regions,
+* not send any notifications to the guest,
+* not send any messages to the front-end,
+* still process and reply to messages from the front-end.
+
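Outside the patch itself, the two independent ring flags and the suspended condition described above can be modelled roughly as follows. The class, function, and mode names are hypothetical; this is a sketch of the state rules, not back-end code.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    # Each ring is initialized stopped and disabled.
    started: bool = False
    enabled: bool = False

    def mode(self) -> str:
        """Return how the back-end should treat this ring right now."""
        if not self.started:
            # Stopped wins over enabled/disabled, but `enabled` is still
            # tracked so it takes effect once the ring is started.
            return "stopped"
        # Started and disabled: process the ring without side effects.
        return "normal" if self.enabled else "no-side-effects"


def device_suspended(rings) -> bool:
    """A device is suspended while all of its vrings are stopped."""
    return all(not ring.started for ring in rings)
```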
Multiple queue support
----------------------
@@ -490,7 +569,8 @@ ancillary data, it may be used to inform the front-end that the log has
been modified.
Once the source has finished migration, rings will be stopped by the
-source. No further update must be done before rings are restarted.
+source (see :ref:`Suspended device state <suspended_device_state>`). No
+further updates may be made before the rings are restarted.
In postcopy migration the back-end is started before all the memory has
been received from the source host, and care must be taken to avoid
@@ -502,6 +582,80 @@ it performs WAKE ioctl's on the userfaultfd to wake the stalled
back-end. The front-end indicates support for this via the
``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature.
+.. _migrating_backend_state:
+
+Migrating back-end state
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Migrating device state involves transferring the state from one
+back-end, called the source, to another back-end, called the
+destination. After migration, the destination transparently resumes
+operation without requiring the driver to re-initialize the device at
+the VIRTIO level. If the migration fails, then the source can
+transparently resume operation until another migration attempt is made.
+
+Generally, the front-end is connected to a virtual machine guest (which
+contains the driver), which has its own state to transfer between source
+and destination, and therefore will have an implementation-specific
+mechanism to do so. The ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature
+provides functionality to have the front-end include the back-end's
+state in this transfer operation so the back-end does not need to
+implement its own mechanism, and so the virtual machine may have its
+complete state, including vhost-user devices' states, contained within a
+single stream of data.
+
+To do this, the back-end state is transferred from back-end to front-end
+on the source side, and vice versa on the destination side. This
+transfer happens over a channel that is negotiated using the
+``VHOST_USER_SET_DEVICE_STATE_FD`` message. This message has two
+parameters:
+
+* Direction of transfer: On the source, the data is saved, transferring
+ it from the back-end to the front-end. On the destination, the data
+ is loaded, transferring it from the front-end to the back-end.
+
+* Migration phase: Currently, the only supported phase is the period
+ after the transfer of memory-mapped regions before switch-over to the
+ destination, when both the source and destination devices are
+ suspended (:ref:`Suspended device state <suspended_device_state>`).
+ In the future, additional phases might be supported to allow iterative
+ migration while the device is running.
+
+The nature of the channel is implementation-defined, but it must
+generally behave like a pipe: The writing end will write all the data it
+has into it, signalling the end of data by closing its end. The reading
+end must read all of this data (until encountering the end of file) and
+process it.
+
+* When saving, the writing end is the source back-end, and the reading
+ end is the source front-end. After reading the state data from the
+ channel, the source front-end must transfer it to the destination
+ front-end through an implementation-defined mechanism.
+
+* When loading, the writing end is the destination front-end, and the
+ reading end is the destination back-end. After reading the state data
+ from the channel, the destination back-end must deserialize its
+ internal state from that data and set itself up to allow the driver to
+ seamlessly resume operation on the VIRTIO level.
+
+Seamlessly resuming operation means that the migration must be
+transparent to the guest driver, which operates on the VIRTIO level.
+This driver will not perform any re-initialization steps, but continue
+to use the device as if no migration had occurred. The vhost-user
+front-end, however, will re-initialize the vhost state on the
+destination, following the usual protocol for establishing a connection
+to a vhost-user back-end: This includes, for example, setting up memory
+mappings and kick and call FDs as necessary, negotiating protocol
+features, or setting the initial vring base indices (to the same value
+as on the source side, so that operation can resume).
+
+Both on the source and on the destination side, after the respective
+front-end has seen all data transferred (when the transfer FD has been
+closed), it sends the ``VHOST_USER_CHECK_DEVICE_STATE`` message to
+verify that data transfer was successful in the back-end, too. The
+back-end responds once it knows whether the transfer and processing was
+successful or not.
+
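Outside the patch itself, the pipe-like channel contract described above (the writer writes everything and closes, the reader reads until end of file) can be sketched with an ordinary pipe. The function names are hypothetical, and a real front-end and back-end would be separate processes rather than threads.

```python
import os
import threading


def write_state(fd: int, state: bytes) -> None:
    """Writing end: write all data, then close the FD to signal the end."""
    try:
        view = memoryview(state)
        while view:
            written = os.write(fd, view)   # may be a partial write
            view = view[written:]
    finally:
        os.close(fd)


def read_state(fd: int) -> bytes:
    """Reading end: read until end of file, then return the whole state."""
    chunks = []
    try:
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:                  # empty read means end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)


def transfer(state: bytes) -> bytes:
    """Move `state` through a pipe, with the writer in its own thread."""
    read_fd, write_fd = os.pipe()
    writer = threading.Thread(target=write_state, args=(write_fd, state))
    writer.start()
    received = read_state(read_fd)
    writer.join()
    return received
```

The writer runs concurrently because a pipe has limited capacity: writing a large state before anyone reads would block.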
Memory access
-------------
@@ -896,6 +1050,7 @@ Protocol features
#define VHOST_USER_PROTOCOL_F_STATUS 16
#define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
#define VHOST_USER_PROTOCOL_F_SHARED_OBJECT 18
+ #define VHOST_USER_PROTOCOL_F_DEVICE_STATE 19
Front-end message types
-----------------------
@@ -1042,18 +1197,54 @@ Front-end message types
``VHOST_USER_SET_VRING_BASE``
:id: 10
:equivalent ioctl: ``VHOST_SET_VRING_BASE``
- :request payload: vring state description
+ :request payload: vring descriptor index/indices
:reply payload: N/A
- Sets the base offset in the available vring.
+ Sets the next index to use for descriptors in this vring:
+
+ * For a split virtqueue, sets only the next descriptor index to
+ process in the *Available Ring*. The device is supposed to read the
+ next index in the *Used Ring* from the respective vring structure in
+ guest memory.
+
+ * For a packed virtqueue, both indices are supplied, as they are not
+ explicitly available in memory.
+
+ Consequently, the payload type is specific to the type of virtqueue
+ (*a vring descriptor index for split virtqueues* vs. *vring descriptor
+ indices for packed virtqueues*).
``VHOST_USER_GET_VRING_BASE``
:id: 11
:equivalent ioctl: ``VHOST_GET_VRING_BASE``
:request payload: vring state description
- :reply payload: vring state description
+ :reply payload: vring descriptor index/indices
+
+ Stops the vring and returns the current descriptor index or indices:
+
+ * For a split virtqueue, returns only the 16-bit next descriptor
+ index to process in the *Available Ring*. Note that this may
+ differ from the available ring index in the vring structure in
+ memory, which points to where the driver will put new available
+ descriptors. For the *Used Ring*, the device only needs the next
+ descriptor index at which to put new descriptors, which is the
+ value in the vring structure in memory, so this value is not
+ covered by this message.
+
+ * For a packed virtqueue, neither index is explicitly available to
+ read from memory, so both indices (as maintained by the device) are
+ returned.
+
+ Consequently, the payload type is specific to the type of virtqueue
+ (*a vring descriptor index for split virtqueues* vs. *vring descriptor
+ indices for packed virtqueues*).
- Get the available vring base offset.
+ While all of a device’s vrings are stopped, the device is
+ *suspended*; see :ref:`Suspended device state
+ <suspended_device_state>`.
+
+ The request payload’s *num* field is currently reserved and must be
+ set to 0.
``VHOST_USER_SET_VRING_KICK``
:id: 12
@@ -1464,6 +1655,76 @@ Front-end message types
the requested UUID. Back-end will reply passing the fd when the operation
is successful, or no fd otherwise.
+``VHOST_USER_SET_DEVICE_STATE_FD``
+ :id: 42
+ :equivalent ioctl: N/A
+ :request payload: device state transfer parameters
+ :reply payload: ``u64``
+
+ Front-end and back-end negotiate a channel over which to transfer the
+ back-end’s internal state during migration. Either side (front-end or
+ back-end) may create the channel. The nature of this channel is not
+ restricted or defined in this document, but whichever side creates it
+ must create a file descriptor and provide it to the other side,
+ giving that side access to the channel. This FD must behave as
+ follows:
+
+ * For the writing end, it must allow writing the whole back-end state
+ sequentially. Closing the file descriptor signals the end of
+ transfer.
+
+ * For the reading end, it must allow reading the whole back-end state
+ sequentially. The end of file signals the end of the transfer.
+
+ For example, the channel may be a pipe, in which case the two ends of
+ the pipe fulfill these requirements respectively.
+
+ Initially, the front-end creates a channel along with such an FD. It
+ passes the FD to the back-end as ancillary data of a
+ ``VHOST_USER_SET_DEVICE_STATE_FD`` message. The back-end may create a
+ different transfer channel, passing the respective FD back to the
+ front-end as ancillary data of the reply. If so, the front-end must
+ then discard its channel and use the one provided by the back-end.
+
+ Whether the back-end should use its own channel depends on
+ efficiency: If the channel is a pipe, both ends will most
+ likely need to copy data into and out of it. Any channel that allows
+ for more efficient processing on at least one end, e.g. through
+ zero-copy, is considered more efficient and thus preferred. If the
+ back-end can provide such a channel, it should decide to use it.
+
+ The request payload contains parameters for the subsequent data
+ transfer, as described in the :ref:`Migrating back-end state
+ <migrating_backend_state>` section.
+
+ The value returned indicates both success and whether a
+ file descriptor for a back-end-provided channel is returned: Bits 0–7
+ are 0 on success, and non-zero on error. Bit 8 is the invalid FD
+ flag; this flag is set when there is no file descriptor returned.
+ When this flag is not set, the front-end must use the returned file
+ descriptor as its end of the transfer channel. The back-end must not
+ both indicate an error and return a file descriptor.
+
+ Using this function requires prior negotiation of the
+ ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.
+
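Outside the patch itself, decoding the ``u64`` reply described above might look like the sketch below. The names are hypothetical; only the bit assignments (bits 0–7 error code, bit 8 invalid FD flag) come from the text.

```python
from typing import NamedTuple

ERROR_MASK = 0xFF          # bits 0-7: zero on success, non-zero on error
INVALID_FD_FLAG = 1 << 8   # bit 8: set when no file descriptor is returned


class StateFdReply(NamedTuple):
    error: int          # 0 means success
    backend_fd: bool    # True if the back-end returned its own channel FD


def decode_set_device_state_fd_reply(value: int) -> StateFdReply:
    """Decode the reply value of VHOST_USER_SET_DEVICE_STATE_FD."""
    error = value & ERROR_MASK
    backend_fd = not (value & INVALID_FD_FLAG)
    if error and backend_fd:
        # The back-end must not both indicate an error and return an FD.
        raise ValueError("reply signals an error but also carries an FD")
    return StateFdReply(error, backend_fd)
```

When ``backend_fd`` is true, the front-end discards its own channel and uses the FD received as ancillary data of the reply.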
+``VHOST_USER_CHECK_DEVICE_STATE``
+ :id: 43
+ :equivalent ioctl: N/A
+ :request payload: N/A
+ :reply payload: ``u64``
+
+ After transferring the back-end’s internal state during migration (see
+ the :ref:`Migrating back-end state <migrating_backend_state>`
+ section), check whether the back-end was able to process the state
+ completely and successfully.
+
+ The value returned indicates success or error; 0 is success, any
+ non-zero value is an error.
+
+ Using this function requires prior negotiation of the
+ ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.
+
Back-end message types
----------------------
diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index 1167f3a9f2..d1f3277cb0 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -93,6 +93,7 @@ Emulated Devices
devices/vhost-user.rst
devices/virtio-gpu.rst
devices/virtio-pmem.rst
+ devices/virtio-snd.rst
devices/vhost-user-rng.rst
devices/canokey.rst
devices/usb-u2f.rst
diff --git a/docs/system/devices/virtio-snd.rst b/docs/system/devices/virtio-snd.rst
new file mode 100644
index 0000000000..2a9187fd70
--- /dev/null
+++ b/docs/system/devices/virtio-snd.rst
@@ -0,0 +1,49 @@
+virtio sound
+============
+
+This document explains the setup and usage of the Virtio sound device.
+The Virtio sound device is a paravirtualized sound card device.
+
+Linux kernel support
+--------------------
+
+Virtio sound requires a guest Linux kernel built with the
+``CONFIG_SND_VIRTIO`` option.
+
+Description
+-----------
+
+Virtio sound implements capture and playback from inside a guest using the
+configured audio backend of the host machine.
+
+Device properties
+-----------------
+
+The Virtio sound device can be configured with the following properties:
+
+ * ``jacks`` number of physical jacks (Unimplemented).
+ * ``streams`` number of PCM streams. At the moment, no stream configuration is supported: the first one will always be a playback stream, an optional second will always be a capture stream. Adding more will cycle stream directions from playback to capture.
+ * ``chmaps`` number of channel maps (Unimplemented).
+
+All streams are stereo and have the default channel positions ``Front left, right``.
+
+Examples
+--------
+
+Add an audio device and an audio backend at once with ``-audio`` and ``model=virtio``:
+
+ * pulseaudio: ``-audio driver=pa,model=virtio``
+ or ``-audio driver=pa,model=virtio,server=/run/user/1000/pulse/native``
+ * sdl: ``-audio driver=sdl,model=virtio``
+ * coreaudio: ``-audio driver=coreaudio,model=virtio``
+
+etc.
+
+To specifically add virtualized sound devices, you have to specify a PCI device
+and an audio backend listed with ``-audio driver=help`` that works on your host
+machine, e.g.:
+
+::
+
+ -device virtio-sound-pci,audiodev=my_audiodev \
+ -audiodev alsa,id=my_audiodev