diff options
author | Michael S. Tsirkin <mst@redhat.com> | 2012-04-23 15:46:22 +0300 |
---|---|---|
committer | Michael S. Tsirkin <mst@redhat.com> | 2012-04-25 10:53:47 +0300 |
commit | a821ce59338c79bb72dc844dd44ea53701965b2b (patch) | |
tree | d742c0d8a81491e83dcd80c58d9d80ee458216f4 | |
parent | 92045d80badc43c9f95897aad675dc7ef17a3b3f (diff) |
virtio: order index/descriptor reads
virtio has the equivalent of:
if (vq->last_avail_index != vring_avail_idx(vq)) {
read descriptor head at vq->last_avail_index;
}
In theory, processor can reorder descriptor head
read to happen speculatively before the index read.
this would trigger the following race:
host descriptor head read <- reads invalid head from ring
guest writes valid descriptor head
guest writes avail index
host avail index read <- observes valid index
as a result host will use an invalid head value.
This was not observed in the field by me but after
the experience with the previous two races
I think it is prudent to address this theoretical race condition.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-rw-r--r-- | hw/virtio.c | 5 | ||||
-rw-r--r-- | qemu-barrier.h | 14 |
2 files changed, 17 insertions, 2 deletions
diff --git a/hw/virtio.c b/hw/virtio.c index 5615b59a6c..168abe4864 100644 --- a/hw/virtio.c +++ b/hw/virtio.c @@ -287,6 +287,11 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx) idx, vring_avail_idx(vq)); exit(1); } + /* On success, callers read a descriptor at vq->last_avail_idx. + * Make sure descriptor read does not bypass avail index read. */ + if (num_heads) { + smp_rmb(); + } return num_heads; } diff --git a/qemu-barrier.h b/qemu-barrier.h index f0b842e5b2..7e11197814 100644 --- a/qemu-barrier.h +++ b/qemu-barrier.h @@ -7,12 +7,13 @@ #if defined(__i386__) /* - * Because of the strongly ordered x86 storage model, wmb() is a nop + * Because of the strongly ordered x86 storage model, wmb() and rmb() are nops * on x86(well, a compiler barrier only). Well, at least as long as * qemu doesn't do accesses to write-combining memory or non-temporal * load/stores from C code. */ #define smp_wmb() barrier() +#define smp_rmb() barrier() /* * We use GCC builtin if it's available, as that can use * mfence on 32 bit as well, e.g. if built with -march=pentium-m. @@ -27,6 +28,7 @@ #elif defined(__x86_64__) #define smp_wmb() barrier() +#define smp_rmb() barrier() #define smp_mb() asm volatile("mfence" ::: "memory") #elif defined(_ARCH_PPC) @@ -37,6 +39,13 @@ * each other */ #define smp_wmb() asm volatile("eieio" ::: "memory") + +#if defined(__powerpc64__) +#define smp_rmb() asm volatile("lwsync" ::: "memory") +#else +#define smp_rmb() asm volatile("sync" ::: "memory") +#endif + #define smp_mb() asm volatile("sync" ::: "memory") #else @@ -45,10 +54,11 @@ * For (host) platforms we don't have explicit barrier definitions * for, we use the gcc __sync_synchronize() primitive to generate a * full barrier. This should be safe on all platforms, though it may - * be overkill for wmb(). + * be overkill for wmb() and rmb(). */ #define smp_wmb() __sync_synchronize() #define smp_mb() __sync_synchronize() +#define smp_rmb() __sync_synchronize() #endif |