diff options
author | Stefan Hajnoczi <stefanha@redhat.com> | 2023-11-07 18:59:40 +0800 |
---|---|---|
committer | Stefan Hajnoczi <stefanha@redhat.com> | 2023-11-07 18:59:41 +0800 |
commit | f6b615b52d1d92f02103596a30df95f31138a2e4 (patch) | |
tree | b8d13f1b7e485177a8b6b470df30eaac268b3466 /docs | |
parent | 7eee58ae3bb15a2bceb368997ce1a48fd3c607e7 (diff) | |
parent | 94cd94f1c0137b56000c01208e03d0907ad34910 (diff) |
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes
virtio sound card support
vhost-user: back-end state migration
cxl:
line length reduction
enabling fabric management
vhost-vdpa:
shadow virtqueue hash calculation Support
shadow virtqueue RSS Support
tests:
CPU topology related smbios test cases
Fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
# GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
# nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
# D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
# BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
# 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
# =vEv+
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits)
acpi/tests/avocado/bits: enable console logging from bits VM
acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point
hw/cxl: Add tunneled command support to mailbox for switch cci.
hw/cxl: Add dummy security state get
hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions
hw/cxl/mbox: Add Get Background Operation Status Command
hw/cxl: Add support for device sanitation
hw/cxl/mbox: Wire up interrupts for background completion
hw/cxl/mbox: Add support for background operations
hw/cxl: Implement Physical Ports status retrieval
hw/pci-bridge/cxl_downstream: Set default link width and link speed
hw/cxl/mbox: Add Physical Switch Identify command.
hw/cxl/mbox: Add Information and Status / Identify command
hw/cxl: Add a switch mailbox CCI function
hw/pci-bridge/cxl_upstream: Move defintion of device to header.
hw/cxl/mbox: Generalize the CCI command processing
hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState
hw/cxl/mbox: Split mailbox command payload into separate input and output
hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant
hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue.
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Diffstat (limited to 'docs')
-rw-r--r-- | docs/interop/vhost-user.rst | 301 | ||||
-rw-r--r-- | docs/system/device-emulation.rst | 1 | ||||
-rw-r--r-- | docs/system/devices/virtio-snd.rst | 49 |
3 files changed, 331 insertions, 20 deletions
diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst index 768fb5c28c..9f1103f85a 100644 --- a/docs/interop/vhost-user.rst +++ b/docs/interop/vhost-user.rst @@ -108,6 +108,43 @@ A vring state description :num: a 32-bit number +A vring descriptor index for split virtqueues +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + ++-------------+---------------------+ +| vring index | index in avail ring | ++-------------+---------------------+ + +:vring index: 32-bit index of the respective virtqueue + +:index in avail ring: 32-bit value, of which currently only the lower 16 + bits are used: + + - Bits 0–15: Index of the next *Available Ring* descriptor that the + back-end will process. This is a free-running index that is not + wrapped by the ring size. + - Bits 16–31: Reserved (set to zero) + +Vring descriptor indices for packed virtqueues +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + ++-------------+--------------------+ +| vring index | descriptor indices | ++-------------+--------------------+ + +:vring index: 32-bit index of the respective virtqueue + +:descriptor indices: 32-bit value: + + - Bits 0–14: Index of the next *Available Ring* descriptor that the + back-end will process. This is a free-running index that is not + wrapped by the ring size. + - Bit 15: Driver (Available) Ring Wrap Counter + - Bits 16–30: Index of the entry in the *Used Ring* where the back-end + will place the next descriptor. This is a free-running index that + is not wrapped by the ring size. + - Bit 31: Device (Used) Ring Wrap Counter + A vring address description ^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -285,6 +322,32 @@ VhostUserShared :UUID: 16 bytes UUID, whose first three components (a 32-bit value, then two 16-bit values) are stored in big endian. +Device state transfer parameters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + ++--------------------+-----------------+ +| transfer direction | migration phase | ++--------------------+-----------------+ + +:transfer direction: a 32-bit enum, describing the direction in which + the state is transferred: + + - 0: Save: Transfer the state from the back-end to the front-end, + which happens on the source side of migration + - 1: Load: Transfer the state from the front-end to the back-end, + which happens on the destination side of migration + +:migration phase: a 32-bit enum, describing the state in which the VM + guest and devices are: + + - 0: Stopped (in the period after the transfer of memory-mapped + regions before switch-over to the destination): The VM guest is + stopped, and the vhost-user device is suspended (see + :ref:`Suspended device state <suspended_device_state>`). + + In the future, additional phases might be added e.g. to allow + iterative migration while the device is running. + C structure ----------- @@ -344,6 +407,7 @@ in the ancillary data: * ``VHOST_USER_SET_VRING_ERR`` * ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``) * ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``) +* ``VHOST_USER_SET_DEVICE_STATE_FD`` If *front-end* is unable to send the full message or receives a wrong reply it will close the connection. An optional reconnection mechanism @@ -374,35 +438,50 @@ negotiation. Ring states ----------- -Rings can be in one of three states: +Rings have two independent states: started/stopped, and enabled/disabled. -* stopped: the back-end must not process the ring at all. +* While a ring is stopped, the back-end must not process the ring at + all, regardless of whether it is enabled or disabled. The + enabled/disabled state should still be tracked, though, so it can come + into effect once the ring is started. -* started but disabled: the back-end must process the ring without +* started and disabled: The back-end must process the ring without causing any side effects. For example, for a networking device, in the disabled state the back-end must not supply any new RX packets, but must process and discard any TX packets. -* started and enabled. +* started and enabled: The back-end must process the ring normally, i.e. + process all requests and execute them. -Each ring is initialized in a stopped state. The back-end must start -ring upon receiving a kick (that is, detecting that file descriptor is -readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK`` -or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated, -and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``. +Each ring is initialized in a stopped and disabled state. The back-end +must start a ring upon receiving a kick (that is, detecting that file +descriptor is readable) on the descriptor specified by +``VHOST_USER_SET_VRING_KICK`` or receiving the in-band message +``VHOST_USER_VRING_KICK`` if negotiated, and stop a ring upon receiving +``VHOST_USER_GET_VRING_BASE``. Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``. -If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the -ring starts directly in the enabled state. - -If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is -initialized in a disabled state and is enabled by -``VHOST_USER_SET_VRING_ENABLE`` with parameter 1. +In addition, upon receiving a ``VHOST_USER_SET_FEATURES`` message from +the front-end without ``VHOST_USER_F_PROTOCOL_FEATURES`` set, the +back-end must enable all rings immediately. While processing the rings (whether they are enabled or not), the back-end must support changing some configuration aspects on the fly. +.. _suspended_device_state: + +Suspended device state +^^^^^^^^^^^^^^^^^^^^^^ + +While all vrings are stopped, the device is *suspended*. In addition to +not processing any vring (because they are stopped), the device must: + +* not write to any guest memory regions, +* not send any notifications to the guest, +* not send any messages to the front-end, +* still process and reply to messages from the front-end. + Multiple queue support ---------------------- @@ -490,7 +569,8 @@ ancillary data, it may be used to inform the front-end that the log has been modified. Once the source has finished migration, rings will be stopped by the -source. No further update must be done before rings are restarted. +source (:ref:`Suspended device state <suspended_device_state>`). No +further update must be done before rings are restarted. In postcopy migration the back-end is started before all the memory has been received from the source host, and care must be taken to avoid @@ -502,6 +582,80 @@ it performs WAKE ioctl's on the userfaultfd to wake the stalled back-end. The front-end indicates support for this via the ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature. +.. _migrating_backend_state: + +Migrating back-end state +^^^^^^^^^^^^^^^^^^^^^^^^ + +Migrating device state involves transferring the state from one +back-end, called the source, to another back-end, called the +destination. After migration, the destination transparently resumes +operation without requiring the driver to re-initialize the device at +the VIRTIO level. If the migration fails, then the source can +transparently resume operation until another migration attempt is made. + +Generally, the front-end is connected to a virtual machine guest (which +contains the driver), which has its own state to transfer between source +and destination, and therefore will have an implementation-specific +mechanism to do so. The ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature +provides functionality to have the front-end include the back-end's +state in this transfer operation so the back-end does not need to +implement its own mechanism, and so the virtual machine may have its +complete state, including vhost-user devices' states, contained within a +single stream of data. + +To do this, the back-end state is transferred from back-end to front-end +on the source side, and vice versa on the destination side. This +transfer happens over a channel that is negotiated using the +``VHOST_USER_SET_DEVICE_STATE_FD`` message. This message has two +parameters: + +* Direction of transfer: On the source, the data is saved, transferring + it from the back-end to the front-end. On the destination, the data + is loaded, transferring it from the front-end to the back-end. + +* Migration phase: Currently, the only supported phase is the period + after the transfer of memory-mapped regions before switch-over to the + destination, when both the source and destination devices are + suspended (:ref:`Suspended device state <suspended_device_state>`). + In the future, additional phases might be supported to allow iterative + migration while the device is running. + +The nature of the channel is implementation-defined, but it must +generally behave like a pipe: The writing end will write all the data it +has into it, signalling the end of data by closing its end. The reading +end must read all of this data (until encountering the end of file) and +process it. + +* When saving, the writing end is the source back-end, and the reading + end is the source front-end. After reading the state data from the + channel, the source front-end must transfer it to the destination + front-end through an implementation-defined mechanism. + +* When loading, the writing end is the destination front-end, and the + reading end is the destination back-end. After reading the state data + from the channel, the destination back-end must deserialize its + internal state from that data and set itself up to allow the driver to + seamlessly resume operation on the VIRTIO level. + +Seamlessly resuming operation means that the migration must be +transparent to the guest driver, which operates on the VIRTIO level. +This driver will not perform any re-initialization steps, but continue +to use the device as if no migration had occurred. The vhost-user +front-end, however, will re-initialize the vhost state on the +destination, following the usual protocol for establishing a connection +to a vhost-user back-end: This includes, for example, setting up memory +mappings and kick and call FDs as necessary, negotiating protocol +features, or setting the initial vring base indices (to the same value +as on the source side, so that operation can resume). + +Both on the source and on the destination side, after the respective +front-end has seen all data transferred (when the transfer FD has been +closed), it sends the ``VHOST_USER_CHECK_DEVICE_STATE`` message to +verify that data transfer was successful in the back-end, too. The +back-end responds once it knows whether the transfer and processing was +successful or not. + Memory access ------------- @@ -896,6 +1050,7 @@ Protocol features #define VHOST_USER_PROTOCOL_F_STATUS 16 #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17 #define VHOST_USER_PROTOCOL_F_SHARED_OBJECT 18 + #define VHOST_USER_PROTOCOL_F_DEVICE_STATE 19 Front-end message types ----------------------- @@ -1042,18 +1197,54 @@ Front-end message types ``VHOST_USER_SET_VRING_BASE`` :id: 10 :equivalent ioctl: ``VHOST_SET_VRING_BASE`` - :request payload: vring state description + :request payload: vring descriptor index/indices :reply payload: N/A - Sets the base offset in the available vring. + Sets the next index to use for descriptors in this vring: + + * For a split virtqueue, sets only the next descriptor index to + process in the *Available Ring*. The device is supposed to read the + next index in the *Used Ring* from the respective vring structure in + guest memory. + + * For a packed virtqueue, both indices are supplied, as they are not + explicitly available in memory. + + Consequently, the payload type is specific to the type of virt queue + (*a vring descriptor index for split virtqueues* vs. *vring descriptor + indices for packed virtqueues*). ``VHOST_USER_GET_VRING_BASE`` :id: 11 :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE`` :request payload: vring state description - :reply payload: vring state description + :reply payload: vring descriptor index/indices + + Stops the vring and returns the current descriptor index or indices: + + * For a split virtqueue, returns only the 16-bit next descriptor + index to process in the *Available Ring*. Note that this may + differ from the available ring index in the vring structure in + memory, which points to where the driver will put new available + descriptors. For the *Used Ring*, the device only needs the next + descriptor index at which to put new descriptors, which is the + value in the vring structure in memory, so this value is not + covered by this message. + + * For a packed virtqueue, neither index is explicitly available to + read from memory, so both indices (as maintained by the device) are + returned. + + Consequently, the payload type is specific to the type of virt queue + (*a vring descriptor index for split virtqueues* vs. *vring descriptor + indices for packed virtqueues*). - Get the available vring base offset. + When and as long as all of a device’s vrings are stopped, it is + *suspended*, see :ref:`Suspended device state + <suspended_device_state>`. + + The request payload’s *num* field is currently reserved and must be + set to 0. ``VHOST_USER_SET_VRING_KICK`` :id: 12 @@ -1464,6 +1655,76 @@ Front-end message types the requested UUID. Back-end will reply passing the fd when the operation is successful, or no fd otherwise. +``VHOST_USER_SET_DEVICE_STATE_FD`` + :id: 42 + :equivalent ioctl: N/A + :request payload: device state transfer parameters + :reply payload: ``u64`` + + Front-end and back-end negotiate a channel over which to transfer the + back-end’s internal state during migration. Either side (front-end or + back-end) may create the channel. The nature of this channel is not + restricted or defined in this document, but whichever side creates it + must create a file descriptor that is provided to the respectively + other side, allowing access to the channel. This FD must behave as + follows: + + * For the writing end, it must allow writing the whole back-end state + sequentially. Closing the file descriptor signals the end of + transfer. + + * For the reading end, it must allow reading the whole back-end state + sequentially. The end of file signals the end of the transfer. + + For example, the channel may be a pipe, in which case the two ends of + the pipe fulfill these requirements respectively. + + Initially, the front-end creates a channel along with such an FD. It + passes the FD to the back-end as ancillary data of a + ``VHOST_USER_SET_DEVICE_STATE_FD`` message. The back-end may create a + different transfer channel, passing the respective FD back to the + front-end as ancillary data of the reply. If so, the front-end must + then discard its channel and use the one provided by the back-end. + + Whether the back-end should decide to use its own channel is decided + based on efficiency: If the channel is a pipe, both ends will most + likely need to copy data into and out of it. Any channel that allows + for more efficient processing on at least one end, e.g. through + zero-copy, is considered more efficient and thus preferred. If the + back-end can provide such a channel, it should decide to use it. + + The request payload contains parameters for the subsequent data + transfer, as described in the :ref:`Migrating back-end state + <migrating_backend_state>` section. + + The value returned is both an indication for success, and whether a + file descriptor for a back-end-provided channel is returned: Bits 0–7 + are 0 on success, and non-zero on error. Bit 8 is the invalid FD + flag; this flag is set when there is no file descriptor returned. + When this flag is not set, the front-end must use the returned file + descriptor as its end of the transfer channel. The back-end must not + both indicate an error and return a file descriptor. + + Using this function requires prior negotiation of the + ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature. + +``VHOST_USER_CHECK_DEVICE_STATE`` + :id: 43 + :equivalent ioctl: N/A + :request payload: N/A + :reply payload: ``u64`` + + After transferring the back-end’s internal state during migration (see + the :ref:`Migrating back-end state <migrating_backend_state>` + section), check whether the back-end was able to successfully fully + process the state. + + The value returned indicates success or error; 0 is success, any + non-zero value is an error. + + Using this function requires prior negotiation of the + ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature. + Back-end message types ---------------------- diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst index 1167f3a9f2..d1f3277cb0 100644 --- a/docs/system/device-emulation.rst +++ b/docs/system/device-emulation.rst @@ -93,6 +93,7 @@ Emulated Devices devices/vhost-user.rst devices/virtio-gpu.rst devices/virtio-pmem.rst + devices/virtio-snd.rst devices/vhost-user-rng.rst devices/canokey.rst devices/usb-u2f.rst diff --git a/docs/system/devices/virtio-snd.rst b/docs/system/devices/virtio-snd.rst new file mode 100644 index 0000000000..2a9187fd70 --- /dev/null +++ b/docs/system/devices/virtio-snd.rst @@ -0,0 +1,49 @@ +virtio sound +============ + +This document explains the setup and usage of the Virtio sound device. +The Virtio sound device is a paravirtualized sound card device. + +Linux kernel support +-------------------- + +Virtio sound requires a guest Linux kernel built with the +``CONFIG_SND_VIRTIO`` option. + +Description +----------- + +Virtio sound implements capture and playback from inside a guest using the +configured audio backend of the host machine. + +Device properties +----------------- + +The Virtio sound device can be configured with the following properties: + + * ``jacks`` number of physical jacks (Unimplemented). + * ``streams`` number of PCM streams. At the moment, no stream configuration is supported: the first one will always be a playback stream, an optional second will always be a capture stream. Adding more will cycle stream directions from playback to capture. + * ``chmaps`` number of channel maps (Unimplemented). + +All streams are stereo and have the default channel positions ``Front left, right``. + +Examples +-------- + +Add an audio device and an audio backend at once with ``-audio`` and ``model=virtio``: + + * pulseaudio: ``-audio driver=pa,model=virtio`` + or ``-audio driver=pa,model=virtio,server=/run/user/1000/pulse/native`` + * sdl: ``-audio driver=sdl,model=virtio`` + * coreaudio: ``-audio driver=coreaudio,model=virtio`` + +etc. + +To specifically add virtualized sound devices, you have to specify a PCI device +and an audio backend listed with ``-audio driver=help`` that works on your host +machine, e.g.: + +:: + + -device virtio-sound-pci,audiodev=my_audiodev \ + -audiodev alsa,id=my_audiodev |