aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio
AgeCommit message (Collapse)Author
2016-03-03virtio-rng: ask for more data if queue is not fully drainedLadi Prosek
This commit effectively reverts: commit 4621c1768ef5d12171cca2aa1473595ecb9f1c9e Author: Amit Shah <amit.shah@redhat.com> Date: Wed Nov 21 11:21:19 2012 +0530 virtio-rng: remove extra request for entropy but instead of calling virtio_rng_process unconditionally, it first checks to see if the queue is empty as a little bit of optimization. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1456998514-19271-1-git-send-email-lprosek@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-25vring: removePaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25virtio: export vring_notify as virtio_should_notifyPaolo Bonzini
Virtio dataplane needs to trigger the irq manually through the guest notifier. Export virtio_should_notify so that it can be used around event_notifier_set. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25virtio: add AioContext-specific function for host notifiersPaolo Bonzini
This is used to register ioeventfd with a dataplane thread. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25vring: make vring_enable_notification return voidPaolo Bonzini
Make the API more similar to the regular virtqueue API. This will help when modifying the code to not use vring.c anymore. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25balloon: Use only 'pc-dimm' type dimm for ballooningVladimir Sementsov-Ogievskiy
For now there are only two dimm's: pc-dimm and nvdimm. This patch is actually needed to disable ballooning on nvdimm. But, to avoid future bugs, instead of disallowing nvdimm, we allow only pc-dimm. So, if someone adds new dimm which should be balloon-able, then this ability should be explicitly specified here. Why ballooning for nvdimm should be disabled for now: NVDIMM for now is planned to use as a backing store for DAX filesystem in the guest and thus this memory is excluded from guest memory management and LRUs. In this case libvirt running QEMU along with configured balloon almost immediately inflates balloon and effectively kill the guest as qemu counts nvdimm as part of the ram. Counting dimm devices as part of the ram for ballooning was started from commit 463756d03: virtio-balloon: Fix balloon not working correctly when hotplug memory Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-25virtio-balloon: rewrite get_current_ram_size()Vladimir Sementsov-Ogievskiy
Use pc_dimm_built_list() instead of qmp_pc_dimm_device_list() Actually, Qapi is not related to this internal helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23move get_current_ram_size to virtio-balloon.cVladimir Sementsov-Ogievskiy
get_current_ram_size() is used only in virtio-balloon.c This patch moves it into virtio-balloon and make it static, to allow some balloon-specific tuning. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23vhost-user: don't merge regions with different fdsMichael S. Tsirkin
vhost currently merges regions with contiguious virtual and physical addresses. This breaks for vhost-user since that also needs fds to match. Add a vhost_ops entry to compare the fds for vhost-user only. Cc: qemu-stable@nongnu.org Cc: Victor Kaplansky <victork@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-18vhost-user interrupt management fixesVictor Kaplansky
Since guest_mask_notifier can not be used in vhost-user mode due to buffering implied by unix control socket, force use_mask_notifier on virtio devices of vhost-user interfaces, and send correct callfd to the guest at vhost start. Using guest_notifier_mask function in vhost-user case may break interrupt mask paradigm, because mask/unmask is not really done when returning from guest_notifier_mask call, instead message is posted in a unix socket, and processed later. Add an option boolean flag 'use_mask_notifier' to disable the use of guest_notifier_mask in virtio pci. Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Signed-off-by: Victor Kaplansky <victork@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-16vhost: simplify vhost_needs_vring_endian()Greg Kurz
After the call to virtio_vdev_has_feature(), we only care for legacy devices, so we don't need the extra check in virtio_is_big_endian(). Also the device_endian field is always set (VIRTIO_DEVICE_ENDIAN_UNKNOWN may only happen on a virtio_load() path that cannot lead here), so we don't need the assert() either. This open codes the device_endian checking in vhost_needs_vring_endian(). It also adds a comment to explain the logic, as recent reviews showed the cross-endian tweaks aren't that obvious. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-16vhost: move virtio 1.0 check to cross-endian helperGreg Kurz
Indeed vhost doesn't need to ask for vring endian fixing if the device is virtio 1.0, since it is already handled by the in-kernel vhost driver. This patch simply consolidates the logic into the existing helper. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16virtio: move cross-endian helper to vhostGreg Kurz
If target is bi-endian (ppc64, arm), the virtio_legacy_is_cross_endian() indeed returns the runtime state of the virtio device. However, it returns false unconditionally in the general case. This sounds a bit strange given the name of the function. This helper is only useful for vhost actually, where indeed non bi-endian targets don't have to deal with cross-endian issues. This patch moves the helper to vhost.c and gives it a more appropriate name. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-08qapi: Drop unused 'kind' for struct/enum visitEric Blake
visit_start_struct() and visit_type_enum() had a 'kind' argument that was usually set to either the stringized version of the corresponding qapi type name, or to NULL (although some clients didn't even get that right). But nothing ever used the argument. It's even hard to argue that it would be useful in a debugger, as a stack backtrace also tells which type is being visited. Therefore, drop the 'kind' argument as dead. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-22-git-send-email-eblake@redhat.com> [Harmless rebase mistake cleaned up] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08qom: Swap 'name' next to visitor in ObjectPropertyAccessorEric Blake
Similar to the previous patch, it's nice to have all functions in the tree that involve a visitor and a name for conversion to or from QAPI to consistently stick the 'name' parameter next to the Visitor parameter. Done by manually changing include/qom/object.h and qom/object.c, then running this Coccinelle script and touching up the fallout (Coccinelle insisted on adding some trailing whitespace). @ rule1 @ identifier fn; typedef Object, Visitor, Error; identifier obj, v, opaque, name, errp; @@ void fn - (Object *obj, Visitor *v, void *opaque, const char *name, + (Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { ... } @@ identifier rule1.fn; expression obj, v, opaque, name, errp; @@ fn(obj, v, - opaque, name, + name, opaque, errp) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-20-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08qapi: Swap visit_* arguments for consistent 'name' placementEric Blake
JSON uses "name":value, but many of our visitor interfaces were called with visit_type_FOO(v, &value, name, errp). This can be a bit confusing to have to mentally swap the parameter order to match JSON order. It's particularly bad for visit_start_struct(), where the 'name' parameter is smack in the middle of the otherwise-related group of 'obj, kind, size' parameters! It's time to do a global swap of the parameter ordering, so that the 'name' parameter is always immediately after the Visitor argument. Additional reason in favor of the swap: the existing include/qjson.h prefers listing 'name' first in json_prop_*(), and I have plans to unify that file with the qapi visitors; listing 'name' first in qapi will minimize churn to the (admittedly few) qjson.h clients. Later patches will then fix docs, object.h, visitor-impl.h, and those clients to match. Done by first patching scripts/qapi*.py by hand to make generated files do what I want, then by running the following Coccinelle script to affect the rest of the code base: $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'` I then had to apply some touchups (Coccinelle insisted on TAB indentation in visitor.h, and botched the signature of visit_type_enum() by rewriting 'const char *const strings[]' to the syntactically invalid 'const char*const[] strings'). The movement of parameters is sufficient to provoke compiler errors if any callers were missed. // Part 1: Swap declaration order @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_start_struct -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type bool, TV, T1; identifier ARG1; @@ bool visit_optional -(TV v, T1 ARG1, const char *name) +(TV v, const char *name, T1 ARG1) { ... } @@ type TV, TErr, TObj, T1; identifier OBJ, ARG1; @@ void visit_get_next_type -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp) { ... } @@ type TV, TErr, TObj, T1, T2; identifier OBJ, ARG1, ARG2; @@ void visit_type_enum -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp) { ... } @@ type TV, TErr, TObj; identifier OBJ; identifier VISIT_TYPE =~ "^visit_type_"; @@ void VISIT_TYPE -(TV v, TObj OBJ, const char *name, TErr errp) +(TV v, const char *name, TObj OBJ, TErr errp) { ... } // Part 2: swap caller order @@ expression V, NAME, OBJ, ARG1, ARG2, ERR; identifier VISIT_TYPE =~ "^visit_type_"; @@ ( -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR) +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR) | -visit_optional(V, ARG1, NAME) +visit_optional(V, NAME, ARG1) | -visit_get_next_type(V, OBJ, ARG1, NAME, ERR) +visit_get_next_type(V, NAME, OBJ, ARG1, ERR) | -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR) +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR) | -VISIT_TYPE(V, OBJ, NAME, ERR) +VISIT_TYPE(V, NAME, OBJ, ERR) ) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08qom: Use typedef for VisitorEric Blake
No need to repeat 'struct Visitor' when we already have it in typedefs.h. Omitting the redundant 'struct' also makes a later patch easier to search for all object property callbacks that are associated with a Visitor. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1454075341-13658-18-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08balloon: Improve use of qapi visitorEric Blake
Rework the control flow of balloon_stats_get_all() to make it easier for a later patch to split visit_end_struct(). Also switch to the uint64 visitor to match the data type. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1454075341-13658-10-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-06virtio: combine write of an entry into used ringVincenzo Maffione
Fill in an element of the used ring with a single combined access to the guest physical memory, rather than using two separated accesses. This reduces the overhead due to expensive address translation. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <e4a89a767a4a92cbb6bcc551e151487eb36e1722.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: read avail_idx from VQ only when necessaryVincenzo Maffione
The virtqueue_pop() implementation needs to check if the avail ring contains some pending buffers. To perform this check, it is not always necessary to fetch the avail_idx in the VQ memory, which is expensive. This patch introduces a shadow variable tracking avail_idx and modifies virtio_queue_empty() to access avail_idx in physical memory only when necessary. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <b617d6459902773d9f4ab843bfaca764f5af8eda.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: cache used_idx in a VirtQueue fieldVincenzo Maffione
Accessing used_idx in the VQ requires an expensive access to guest physical memory. Before this patch, 3 accesses are normally done for each pop/push/notify call. However, since the used_idx is only written by us, we can track it in our internal data structure. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <3d062ec54e9a7bf9fb325c1fd693564951f2b319.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: combine the read of a descriptorPaolo Bonzini
Compared to vring, virtio has a performance penalty of 10%. Fix it by combining all the reads for a descriptor in a single address_space_read call. This also simplifies the code nicely. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06vring: slim down allocation of VirtQueueElementsPaolo Bonzini
Build the addresses and s/g lists on the stack, and then copy them to a VirtQueueElement that is just as big as required to contain this particular s/g list. The cost of the copy is minimal compared to that of a large malloc. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: slim down allocation of VirtQueueElementsPaolo Bonzini
Build the addresses and s/g lists on the stack, and then copy them to a VirtQueueElement that is just as big as required to contain this particular s/g list. The cost of the copy is minimal compared to that of a large malloc. When virtqueue_map is used on the destination side of migration or on loadvm, the iovecs have already been split at memory region boundary, so we can just reuse the out_num/in_num we find in the file. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: introduce virtqueue_alloc_elementPaolo Bonzini
Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the VirtQueueElement. For now, virtqueue_pop and vring_pop keep allocating a very large VirtQueueElement. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: introduce qemu_get/put_virtqueue_elementPaolo Bonzini
Move allocation to virtio functions also when loading/saving a VirtQueueElement. This will also let the load/save functions keep backwards compatibility when the VirtQueueElement layout is changed. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: move allocation to virtqueue_pop/vring_popPaolo Bonzini
The return code of virtqueue_pop/vring_pop is unused except to check for errors or 0. We can thus easily move allocation inside the functions and just return a pointer to the VirtQueueElement. The advantage is that we will be able to allocate only the space that is needed for the actual size of the s/g list instead of the full VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K of memory, and this kind of allocation puts a lot of stress on malloc. By cutting the size by two or three orders of magnitude, malloc can use much more efficient algorithms. The patch is pretty large, but changes to each device are testable more or less independently. Splitting it would mostly add churn. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-04Fix virtio migrationDr. David Alan Gilbert
I misunderstood the vmstate macro definition when I reworked the virtio .get/.put. The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a variable length array (i.e. _type *_field) but we know the length". However it actually specified operation for arrays embedded in the struct (i.e. _type _field[]) since it lacked the VMS_POINTER flag. This caused offset calculation to be completely off, examining and potentially sending random data instead of the VirtQueue content. Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag (so now actually doing what it advertises) and use it in the virtio migration code. Fixes and description as per Sascha's suggestions/debug. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Tested-By: Sascha Silbe <silbe@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com> Fixes: 50e5ae4dc3e4f21e874512f9e87b93b5472d26e0 Fixes: 2cf0148674430b6693c60d42b7eef721bfa9509f Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-01-29virtio: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-15-git-send-email-peter.maydell@linaro.org
2016-01-12Merge remote-tracking branch 'remotes/kvaneesh/tags/for-upstream-signed' ↵Peter Maydell
into staging VirtFS update: Cleanups mostly isolating virtio related details into separate files. This is done to enable easy addition of Xen transport for VirtFS. The changes include: 1. Rename a bunch of files and functions to make clear they are generic. 2. disentangle virtio transport code and generic 9pfs code. 3. Some function name clean-up. # gpg: Signature made Tue 12 Jan 2016 06:04:35 GMT using RSA key ID 04C4E23A # gpg: Good signature from "Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 4846 9DE7 1860 360F A6E9 968C DE41 A4FE 04C4 E23A * remotes/kvaneesh/tags/for-upstream-signed: (25 commits) 9pfs: introduce V9fsVirtioState 9pfs: factor out v9fs_device_{,un}realize_common 9pfs: rename virtio-9p.c to 9p.c 9pfs: rename virtio_9p_set_fd_limit to use v9fs_ prefix 9pfs: move handle_9p_output and make it static function 9pfs: export pdu_{submit,alloc,free} 9pfs: factor out virtio_9p_push_and_notify 9pfs: break out 9p.h from virtio-9p.h 9pfs: break out virtio_init_iov_from_pdu 9pfs: factor out pdu_push_and_notify 9pfs: factor out virtio_pdu_{,un}marshal 9pfs: make pdu_{,un}marshal proper functions 9pfs: PDU processing functions should start pdu_ prefix 9pfs: PDU processing functions don't need to take V9fsState as argument fsdev: rename virtio-9p-marshal.{c,h} to 9p-iov-marshal.{c,h} fsdev: break out 9p-marshal.{c,h} from virtio-9p-marshal.{c,h} 9pfs: remove dead code 9pfs: merge hw/virtio/virtio-9p.h into hw/9pfs/virtio-9p.h 9pfs: rename virtio-9p-xattr{,-user}.{c,h} to 9p-xattr{,-user}.{c,h} 9pfs: rename virtio-9p-synth.{c,h} to 9p-synth.{c,h} ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-129pfs: introduce V9fsVirtioStateWei Liu
V9fsState now only contains generic fields. Introduce V9fsVirtioState for virtio transport. Change virtio-pci and virtio-ccw to use V9fsVirtioState. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2016-01-09virtio: fix error message for number of queuesCornelia Huck
There's no such thing as "PCI queues" in the virtio core. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-01-09migration/virtio: Remove simple .get/.put useDr. David Alan Gilbert
The 'virtqueue_state' and 'ringsize' can be saved using VMSTATE macros rather than hand coded .get/.put Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com>
2016-01-089pfs: merge hw/virtio/virtio-9p.h into hw/9pfs/virtio-9p.hWei Liu
The deleted file only contained V9fsConf which wasn't virtio specific. Merge that to the general header of 9pfs. Fixed header inclusions as I went along. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2015-12-02virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass ↵Shmulik Ladkani
realize method In 1811e64 'hw/virtio: Add PCIe capability to virtio devices', the QEMU_PCI_CAP_EXPRESS capability was added to virtio's pci_dev, within 'virtio_pci_realize' - the pci device object realization method. This occurs to late, as 'pci_qdev_realize' (DeviceClass.realize of TYPE_PCI_DEVICE) has already been called, without knowing that the device instance is indeed an "express" instance, thus allocating insufficient pci config space. As a result, device may crash upon attempt to write to the PCIE config space. Fix, by arming the QEMU_PCI_CAP_EXPRESS capability early in virtio-pci's own DeviceClass realize method. This also makes code cleaner, as 'virtio_pci_realize' may now access the 'pci_is_express' predicate when needed. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-02virtio: handle non-virtio-1-capable backend for ccwCornelia Huck
If you run a qemu advertising VERSION_1 with an old kernel where vhost did not yet support VERSION_1, you'll end up with a device that is {modern pci|ccw revision 1} but does not advertise VERSION_1. This is not a sensible configuration and is rejected by the Linux guest drivers. To fix this, add a ->post_plugged() callback invoked after features have been queried that can handle the VERSION_1 bit being withdrawn and change ccw to fall back to revision 0 if VERSION_1 is gone. Note that pci is _not_ fixed; we'll need to rethink the approach for the next release but at least for pci it's not a regression. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-26Revert "vhost: send SET_VRING_ENABLE at start/stop"Michael S. Tsirkin
This reverts commit 3a12f32229a046f4d4ab0a3a52fb01d2d5a1ab76. In case of live migration several queues can be enabled and not only the first one. So informing backend that only the first queue is enabled is wrong. Reported-by: Thibaut Collet <thibaut.collet@6wind.com> Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-11-18vhost-user: fix log sizeMichael S. Tsirkin
commit 2b8819c6eee517c1582983773f8555bb3f9ed645 ("vhost-user: modify SET_LOG_BASE to pass mmap size and offset") passes log size in units of 4 byte chunks instead of the expected size in bytes. Fix this up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-16vhost-user: start/stop all ringsMichael S. Tsirkin
We are currently only sending VRING_ENABLE message for the first ring, that's wrong: we must start/stop them all. Reported-by: Victor Kaplansky <victork@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-16vhost-user: print original request on errorMichael S. Tsirkin
When we get an unexpected response, print out the original request. Helps debug protocol errors tremendously. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-16vhost: let SET_VRING_ENABLE message depends on protocol featureYuanhan Liu
But not depend on PROTOCOL_F_MQ feature bit. So that we could use SET_VRING_ENABLE to sign the backend on stop, even if MQ is disabled. That's reasonable, since we will have one queue pair at least. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12hw/virtio: Add PCIe capability to virtio devicesMarcel Apfelbaum
The virtio devices are converted to PCI-Express if they are plugged into a PCI-Express bus and the 'modern' protocol is enabled. Devices plugged directly into the Root Complex as Integrated Endpoints remain PCI. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12vhost: send SET_VRING_ENABLE at start/stopYuanhan Liu
Send SET_VRING_ENABLE at start/stop, to give the backend an explicit sign of our state. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12vhost: rename RESET_DEVICE backto RESET_OWNERYuanhan Liu
This patch basically reverts commit d1f8b30e. It turned out that it breaks stuff, so revert it: http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html CC: "Michael S. Tsirkin" <mst@redhat.com> Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12vhost-user: modify SET_LOG_BASE to pass mmap size and offsetVictor Kaplansky
Unlike the kernel, vhost-user application accesses log table by mmaping it to its user space. This change adds two new fields to VhostUserMsg payload: mmap_size, and mmap_offset and make QEMU to pass the to vhost-user application in VHOST_USER_SET_LOG_BASE request. Signed-off-by: Victor Kaplansky <victork@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12virtio-pci: unbreak queue_enable readJason Wang
Guest always get zero when reading queue_enable. This violates spec. Fixing this by setting the queue_enable to true during any guest writing and setting it to zero during reset. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12virtio-pci: introduce pio notification capability for modern deviceJason Wang
We used to use mmio for notification. This could be slow on some arch (e.g on x86 without EPT). So this patch introduces pio bar and a pio notification cap for modern device. This ability is enabled through property "modern-pio-notify" for virtio pci devices and was disabled by default. Management can enable when it thinks it was needed. Benchmarks shows almost no obvious difference compared to legacy device on machines without ept. Thanks Wenli Quan <wquan@redhat.com> for the benchmarking. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12virtio-pci: use zero length mmio eventfd for 1.0 notification cap when possibleJason Wang
We use data match eventfd for 1.0 notification currently. This could be slow since software decoding is needed for mmio exit. To speed this up, we can switch to use zero length mmio eventfd for 1.0 notification since we can examine the queue index directly from the writing address. KVM kernel module can utilize this by registering it to fast mmio bus which could be as fast as pio on ept capable machine when fast mmio is supported by host kernel. Lots of improvements were seen on a ept capable machine: Guest RX:(TCP) size/session/+throughput%/+cpu%/-+per cpu%/ 64/1/+1.6807%/[-16.2421%]/[+21.3984%]/ 64/2/+0.6091%/[-11.0187%]/[+13.0678%]/ 64/4/+0.0553%/[-5.9768%]/[+6.4155%]/ 64/8/+0.1206%/[-4.0057%]/[+4.2984%]/ 256/1/-0.0031%/[-10.1166%]/[+11.2517%]/ 256/2/-0.5058%/[-6.1656%]/+6.0317%]/ ... Guest TX:(TCP) size/session/+throughput%/+cpu%/-+per cpu%/ 64/1/[+18.9183%]/-0.2823%/[+19.2550%]/ 64/2/[+13.5714%]/[+2.2675%]/[+11.0533%]/ 64/4/[+13.1070%]/[+2.1817%]/[+10.6920%]/ 64/8/[+13.0426%]/[+2.0887%]/[+10.7299%]/ 256/1/[+36.2761%]/+6.3434%/[+28.1471%]/ ... 1024/1/[+44.8873%]/+2.0811%/[+41.9335%]/ ... 1024/4/+0.0228%/[-2.2044%]/[+2.2774%]/ ... 16384/2/+0.0127%/[-5.0346%]/[+5.3148%]/ ... 65535/1/[+0.0062%]/[-4.1183%]/[+4.3017%]/ 65535/2/+0.0004%/[-4.2311%]/[+4.4185%]/ 65535/4/+0.0107%/[-4.6106%]/[+4.8446%]/ 65535/8/-0.0090%/[-5.5178%]/[+5.8306%]/ Latency:(TCP_RR) size/session/+transaction rate%/+cpu%/-+per cpu%/ 64/1/[+6.5248%]/[-9.2882%]/[+17.4322%]/ 64/25/[+11.0854%]/[+0.8000%]/[+10.2038%]/ 64/50/[+12.1076%]/[+2.4627%]/[+9.4131%]/ 256/1/[+5.3677%]/[+10.5669%]/-4.7024%/ 256/25/[+5.6402%]/-0.8962%/[+6.5955%]/ 256/50/[+5.9685%]/[+1.7766%]/[+4.1188%]/ 4096/1/+0.2508%/[-10.4941%]/[+12.0047%]/ 4096/25/[+1.8533%]/-0.0273%/+1.8812%/ 4096/50/[+1.2156%]/-1.4134%/+2.6667%/ Notes: data with '[]' is the one whose significance is greater than 95%. Thanks Wenli Quan <wquan@redhat.com> for the benchmarking. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-12virtio-pci: fix 1.0 virtqueue migrationJason Wang
We don't migrate the followings fields for virtio-pci: uint32_t dfselect; uint32_t gfselect; uint32_t guest_features[2]; struct { uint16_t num; bool enabled; uint32_t desc[2]; uint32_t avail[2]; uint32_t used[2]; } vqs[VIRTIO_QUEUE_MAX]; This will confuse driver if migrating during initialization. Solves this issue by: - introduce transport specific callbacks to load and store extra virtqueue states. - add a new subsection for virtio to migrate transport specific modern device state. - implement pci specific callbacks. - add a new property for virtio-pci for whether or not to migrate extra state. - compat the migration for 2.4 and elder machine types Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-11-10Inhibit ballooning during postcopyDr. David Alan Gilbert
Postcopy detects accesses to pages that haven't been transferred yet using userfaultfd, and it causes exceptions on pages that are 'not present'. Ballooning also causes pages to be marked as 'not present' when the guest inflates the balloon. Potentially a balloon could be inflated to discard pages that are currently inflight during postcopy and that may be arriving at about the same time. To avoid this confusion, disable ballooning during postcopy. When disabled we drop balloon requests from the guest. Since ballooning is generally initiated by the host, the management system should avoid initiating any balloon instructions to the guest during migration, although it's not possible to know how long it would take a guest to process a request made prior to the start of migration. Guest initiated ballooning will not know if it's really freed a page of host memory or not. Queueing the requests until after migration would be nice, but is non-trivial, since the set of inflate/deflate requests have to be compared with the state of the page to know what the final outcome is allowed to be. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>