From 5400c02b90bb647a961f3210255178b68602bd5b Mon Sep 17 00:00:00 2001
From: Markus Armbruster
Date: Tue, 15 Mar 2016 19:34:51 +0100
Subject: ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ivshmem can be configured with and without interrupt capability
(a.k.a. "doorbell"). The two configurations have largely disjoint
options, which makes for a confusing (and badly checked) user
interface. Moreover, the device can't tell the guest whether its
doorbell is enabled.

Create two new device models ivshmem-plain and ivshmem-doorbell, and
deprecate the old one.

Changes from ivshmem:

* PCI revision is 1 instead of 0. The new revision is fully backwards
  compatible for guests. Guests may elect to require at least
  revision 1 to make sure they're not exposed to the funny "no shared
  memory, yet" state.

* Property "role" replaced by "master". role=master becomes
  master=on, role=peer becomes master=off. Default is off instead of
  auto.

* Property "use64" is gone. The new devices always have 64 bit BARs.

Changes from ivshmem to ivshmem-plain:

* The Interrupt Pin register in PCI config space is zero (does not use
  an interrupt pin) instead of one (uses INTA).

* Property "x-memdev" is renamed to "memdev".

* Properties "shm" and "size" are gone. Use property "memdev"
  instead.

* Property "msi" is gone. The new device can't have MSI-X capability.
  It can't interrupt anyway.

* Properties "ioeventfd" and "vectors" are gone. They're meaningless
  without interrupts anyway.

Changes from ivshmem to ivshmem-doorbell:

* Property "msi" is gone. The new device always has MSI-X capability.

* Property "ioeventfd" defaults to on instead of off.

* Property "size" is gone. The new device can only map all the shared
  memory received from the server.

Guests can easily find out whether the device is configured for
interrupts by checking for MSI-X capability.

Note: some code added in sub-optimal places to make the diff easier to
review. The next commit will move it to more sensible places.

Signed-off-by: Markus Armbruster
Reviewed-by: Marc-André Lureau
Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
---
 docs/specs/ivshmem-spec.txt | 66 ++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 31 deletions(-)

(limited to 'docs')

diff --git a/docs/specs/ivshmem-spec.txt b/docs/specs/ivshmem-spec.txt
index 4c33973552..f3912c0565 100644
--- a/docs/specs/ivshmem-spec.txt
+++ b/docs/specs/ivshmem-spec.txt
@@ -17,9 +17,10 @@ get interrupted by its peers.

 There are two basic configurations:

-- Just shared memory: -device ivshmem,shm=NAME,...
+- Just shared memory: -device ivshmem-plain,memdev=HMB,...

-  This uses shared memory object NAME.
+  This uses host memory backend HMB. It should have option "share"
+  set.

 - Shared memory plus interrupts: -device ivshmem,chardev=CHR,vectors=N,...

@@ -30,9 +31,8 @@ There are two basic configurations:
   Each peer gets assigned a unique ID by the server. IDs must be
   between 0 and 65535.

-  Interrupts are message-signaled by default (MSI-X). With msi=off
-  the device has no MSI-X capability, and uses legacy INTx instead.
-  vectors=N configures the number of vectors to use.
+  Interrupts are message-signaled (MSI-X). vectors=N configures the
+  number of vectors to use.

 For more details on ivshmem device properties, see The QEMU Emulator
 User Documentation (qemu-doc.*).
@@ -40,14 +40,15 @@ User Documentation (qemu-doc.*).

 == The ivshmem PCI device's guest interface ==

-The device has vendor ID 1af4, device ID 1110, revision 0.
+The device has vendor ID 1af4, device ID 1110, revision 1. Before
+QEMU 2.6.0, it had revision 0.

 === PCI BARs ===

 The ivshmem PCI device has two or three BARs:

 - BAR0 holds device registers (256 Byte MMIO)
-- BAR1 holds MSI-X table and PBA (only when using MSI-X)
+- BAR1 holds MSI-X table and PBA (only ivshmem-doorbell)
 - BAR2 maps the shared memory object

 There are two ways to use this device:
@@ -58,18 +59,19 @@ There are two ways to use this device:
   user space (see http://dpdk.org/browse/memnic).

 - If you additionally need the capability for peers to interrupt each
-  other, you need BAR0 and, if using MSI-X, BAR1. You will most
-  likely want to write a kernel driver to handle interrupts. Requires
-  the device to be configured for interrupts, obviously.
+  other, you need BAR0 and BAR1. You will most likely want to write a
+  kernel driver to handle interrupts. Requires the device to be
+  configured for interrupts, obviously.

 Before QEMU 2.6.0, BAR2 can initially be invalid if the device is
 configured for interrupts. It becomes safely accessible only after
-the ivshmem server provided the shared memory. Guest software should
-wait for the IVPosition register (described below) to become
-non-negative before accessing BAR2.
+the ivshmem server provided the shared memory. These devices have PCI
+revision 0 rather than 1. Guest software should wait for the
+IVPosition register (described below) to become non-negative before
+accessing BAR2.

-The device is not capable to tell guest software whether it is
-configured for interrupts.
+Revision 0 of the device is not capable to tell guest software whether
+it is configured for interrupts.

 === PCI device registers ===

@@ -77,10 +79,12 @@ BAR 0 contains the following registers:

     Offset  Size  Access      On reset  Function
          0     4  read/write         0  Interrupt Mask
-                                        bit 0: peer interrupt
+                                        bit 0: peer interrupt (rev 0)
+                                               reserved (rev 1)
                                         bit 1..31: reserved
          4     4  read/write         0  Interrupt Status
-                                        bit 0: peer interrupt
+                                        bit 0: peer interrupt (rev 0)
+                                               reserved (rev 1)
                                         bit 1..31: reserved
          8     4  read-only    0 or ID  IVPosition
         12     4  write-only       N/A  Doorbell
@@ -92,18 +96,18 @@ Software should only access the registers as specified in column
 "Access". Reserved bits should be ignored on read, and preserved on
 write.

-Interrupt Status and Mask Register together control the legacy INTx
-interrupt when the device has no MSI-X capability: INTx is asserted
-when the bit-wise AND of Status and Mask is non-zero and the device
-has no MSI-X capability. Interrupt Status Register bit 0 becomes 1
-when an interrupt request from a peer is received. Reading the
-register clears it.
+In revision 0 of the device, Interrupt Status and Mask Register
+together control the legacy INTx interrupt when the device has no
+MSI-X capability: INTx is asserted when the bit-wise AND of Status and
+Mask is non-zero and the device has no MSI-X capability. Interrupt
+Status Register bit 0 becomes 1 when an interrupt request from a peer
+is received. Reading the register clears it.

 IVPosition Register: if the device is not configured for interrupts,
 this is zero. Else, it is the device's ID (between 0 and 65535).

 Before QEMU 2.6.0, the register may read -1 for a short while after
-reset.
+reset. These devices have PCI revision 0 rather than 1.

 There is no good way for software to find out whether the device is
 configured for interrupts. A positive IVPosition means interrupts,
@@ -124,14 +128,14 @@ interrupt vectors connected, the write is ignored.
 The device is not capable to tell guest software what peers are
 connected, or how many interrupt vectors are connected.

-If the peer doesn't use MSI-X, its Interrupt Status register is set to
-1. This asserts INTx unless masked by the Interrupt Mask register.
-The device is not capable to communicate the interrupt vector to guest
-software then.
+The peer's interrupt for this vector then becomes pending. There is
+no way for software to clear the pending bit, and a polling mode of
+operation is therefore impossible.

-If the peer uses MSI-X, the interrupt for this vector becomes pending.
-There is no way for software to clear the pending bit, and a polling
-mode of operation is therefore impossible with MSI-X.
+If the peer is a revision 0 device without MSI-X capability, its
+Interrupt Status register is set to 1. This asserts INTx unless
+masked by the Interrupt Mask register. The device is not capable to
+communicate the interrupt vector to guest software then.

 With multiple MSI-X vectors, different vectors can be used to indicate
 different events have occurred. The semantics of interrupt vectors
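
Illustration (not part of the patch): the commit message says guests may
require at least PCI revision 1, and can tell whether the device is
configured for interrupts by checking for MSI-X capability. A minimal
sketch of that decision follows. The enum and function names are
hypothetical, and the sketch assumes the caller has already read the
revision ID and determined MSI-X presence from PCI config space. Treating
every revision 0 device as "cannot tell" is a simplification chosen here,
not something the patch prescribes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification; these names are illustrative only and
 * are not QEMU or kernel API. */
enum ivshmem_kind {
    IVSHMEM_LEGACY,    /* revision 0: interrupt setup cannot be told reliably */
    IVSHMEM_PLAIN,     /* revision >= 1 without MSI-X: shared memory only */
    IVSHMEM_DOORBELL,  /* revision >= 1 with MSI-X: shared memory plus interrupts */
};

/* revision: PCI revision ID as read by the caller.
 * has_msix: whether the caller found an MSI-X capability. */
enum ivshmem_kind ivshmem_classify(uint8_t revision, bool has_msix)
{
    if (revision < 1) {
        /* Old device model: may also expose the "no shared memory,
         * yet" state described in the commit message. */
        return IVSHMEM_LEGACY;
    }
    return has_msix ? IVSHMEM_DOORBELL : IVSHMEM_PLAIN;
}
```

A guest driver that only wants the new device models could refuse to bind
when the result is IVSHMEM_LEGACY, and set up BAR0/BAR1 interrupt handling
only for IVSHMEM_DOORBELL.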