aboutsummaryrefslogtreecommitdiff
path: root/hw/net/virtio-net.c
AgeCommit message (Collapse)Author
2020-01-07Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio, pci, pc: fixes, features Bugfixes all over the place. HMAT support. New flags for vhost-user-blk utility. Auto-tuning of seg max for virtio storage. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (32 commits) intel_iommu: add present bit check for pasid table entries intel_iommu: a fix to vtd_find_as_from_bus_num() virtio-net: delete also control queue when TX/RX deleted virtio: reset region cache when on queue deletion virtio-mmio: update queue size on guest write tests: add virtio-scsi and virtio-blk seg_max_adjust test virtio: make seg_max virtqueue size dependent hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35 vhost-user-scsi: reset the device if supported vhost-user: add VHOST_USER_RESET_DEVICE to reset devices hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument hw/pci/pci_host: Remove redundant PCI_DPRINTF() virtio-mmio: Clear v2 transport state on soft reset ACPI: add expected files for HMAT tests (acpihmat) tests/bios-tables-test: add test cases for ACPI HMAT tests/numa: Add case for QMP build HMAT hmat acpi: Build Memory Side Cache Information Structure(s) hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) hmat acpi: Build Memory Proximity Domain Attributes Structure(s) numa: Extend CLI to provide memory side cache information ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-06virtio-net: delete also control queue when TX/RX deletedYuri Benditovich
https://bugzilla.redhat.com/show_bug.cgi?id=1708480 If the control queue is not deleted together with TX/RX, it later will be ignored in freeing cache resources and hot unplug will not be completed. Cc: qemu-stable@nongnu.org Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Message-Id: <20191226043649.14481-3-yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-06vmstate: replace DeviceState with VMStateIfMarc-André Lureau
Replace DeviceState dependency with VMStateIf on vmstate API. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com>
2019-12-02net/virtio: Fix failover error handling crash bugsMarkus Armbruster
Functions that take an Error ** parameter to pass an error to the caller expect the parameter to point to null. failover_replug_primary() violates this precondition in several places: * After qemu_opts_from_qdict() failed, *errp is no longer null. Passing it to error_setg() is wrong, and will trip the assertion in error_setv(). Messed up in commit 150ab54aa6 "net/virtio: fix re-plugging of primary device". Simply drop the error_setg(). * Passing @errp to qemu_opt_set_bool(), hotplug_handler_pre_plug(), and hotplug_handler_plug() is wrong. If one of the first two fails, *errp is no longer null. Risks tripping the same assertion. Moreover, continuing after such errors is unsafe. Messed up in commit 9711cd0dfc "net/virtio: add failover support". Fix by handling each error properly. failover_replug_primary() crashes when passed a null @errp. Also messed up in commit 9711cd0dfc. This bug can't bite as no caller actually passes null. Fix it anyway. Fixes: 9711cd0dfc3fa414f7f64935713c07134ae67971 Fixes: 150ab54aa6934583180f88a2bd540bc6fc4fbff3 Cc: Jens Freimann <jfreimann@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20191130194240.10517-3-armbru@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2019-12-02net/virtio: Drop useless n->primary_dev not null checksMarkus Armbruster
virtio_net_handle_migration_primary() returns early when it can't ensure n->primary_dev is non-null. Checking it again right after that early return is redundant. Drop. If n->primary_dev is null on entering failover_replug_primary(), @pdev will become null, and pdev->partially_hotplugged will crash. Checking n->primary_dev later is useless. It can't actually be null, because its caller virtio_net_handle_migration_primary() ensures it isn't. Drop the useless check. Cc: Jens Freimann <jfreimann@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20191130194240.10517-2-armbru@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2019-11-25net/virtio: return error when device_opts arg is NULLJens Freimann
This fixes CID 1407222. Fixes: 9711cd0dfc3f ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-11-25net/virtio: fix re-plugging of primary deviceJens Freimann
failover_replug_primary was returning true on failure which lead to re-plug not working when a migration failed. Fix this by returning success when hotplug worked. This is a bug that was missed in last round of testing but was tested succesfully with this version. Also make sure we don't pass NULL to qdev_set_parent_bus(). This fixes CID 1407224. Fixes: 9711cd0dfc3f ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-11-25net/virtio: return early when failover primary alread addedJens Freimann
Bail out when primary device was already added before. This avoids printing a wrong warning message during reboot. Fixes: 9711cd0dfc3f ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-11-25net/virtio: fix dev_unplug_pendingJens Freimann
.dev_unplug_pending is set up by virtio-net code indepent of failover support was set for the device or not. This gives a wrong result when we check for existing primary devices in migration code. Fix this by actually calling dev_unplug_pending() instead of just checking if the function pointer was set. When the feature was not negotiated dev_unplug_pending() will always return false. This prevents us from going into the wait-unplug state when there's no primary device present. Fixes: 9711cd0dfc3f ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-10-29virtio_net: use RCU_READ_LOCK_GUARDDr. David Alan Gilbert
Use RCU_READ_LOCK_GUARD rather than the manual rcu_read_(un)lock call. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191025103403.120616-3-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29net/virtio: add failover supportJens Freimann
This patch adds support to handle failover device pairs of a virtio-net device and a (vfio-)pci device, where the virtio-net acts as the standby device and the (vfio-)pci device as the primary. The general idea is that we have a pair of devices, a (vfio-)pci and a emulated (virtio-net) device. Before migration the vfio device is unplugged and data flows to the emulated device, on the target side another (vfio-)pci device is plugged in to take over the data-path. In the guest the net_failover module will pair net devices with the same MAC address. To achieve this we need: 1. Provide a callback function for the should_be_hidden DeviceListener. It is called when the primary device is plugged in. Evaluate the QOpt passed in to check if it is the matching primary device. It returns if the device should be hidden or not. When it should be hidden it stores the device options in the VirtioNet struct and the device is added once the VIRTIO_NET_F_STANDBY feature is negotiated during virtio feature negotiation. If the virtio-net devices are not realized at the time the (vfio-)pci devices are realized, we need to connect the devices later. This way we make sure primary and standby devices can be specified in any order. 2. Register a callback for migration status notifier. When called it will unplug its primary device before the migration happens. 3. Register a callback for the migration code that checks if a device needs to be unplugged from the guest. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-11-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Tue 29 Oct 2019 02:33:36 GMT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: COLO-compare: Fix incorrect `if` logic virtio-net: prevent offloads reset on migration virtio: new post_load hook net: add tulip (dec21143) driver Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-29virtio-net: prevent offloads reset on migrationMikhail Sennikovsky
Currently offloads disabled by guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration. Instead all offloads reported by guest features (via VIRTIO_PCI_GUEST_FEATURES) get enabled. What happens is: first the VirtIONet::curr_guest_offloads gets restored and offloads are getting set correctly: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334 #3 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:168 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 However later on the features are getting restored, and offloads get reset to everything supported by features: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474 #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 #2 virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773 #3 virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 Fix this by preserving the state in saved_guest_offloads field and pushing out offload initialization to the new post load hook. Cc: qemu-stable@nongnu.org Signed-off-by: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-10-28include: Move endof() up from hw/virtio/virtio.hMax Reitz
endof() is a useful macro, we can make use of it outside of virtio. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-08-16sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.hMarkus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 1800 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous commit). Several headers include sysemu/sysemu.h just to get typedef VMChangeStateEntry. Move it from sysemu/sysemu.h to qemu/typedefs.h. Spell its structure tag the same while there. Drop the now superfluous includes of sysemu/sysemu.h from headers. Touching sysemu/sysemu.h now recompiles some 1100 objects. qemu/uuid.h also drops from 1800 to 1100, and qapi/qapi-types-run-state.h from 5000 to 4400. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-29-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-07-02net/announce: Add optional IDDr. David Alan Gilbert
Previously there was a single instance of the timer used by monitor triggered announces, that's OK, but when combined with the previous change that lets you have announces for subsets of interfaces it's a bit restrictive if you want to do different things to different interfaces. Add an 'id' field to the announce, and maintain a list of the timers based on id. This allows you to for example: a) Start an announce going on interface eth0 for a long time b) Start an announce going on interface eth1 for a long time c) Kill the announce on eth0 while leaving eth1 going. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-04-02virtio-net: Fix typo in commentYuval Shaia
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20190321161832.10533-1-yuval.shaia@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-05virtio-net: Allow qemu_announce_self to trigger virtio announcementsDr. David Alan Gilbert
Expose the virtio-net self announcement capability and allow qemu_announce_self() to call it. These announces are caused by something external (i.e. the announce-self command); they won't trigger if the migration counter is triggering announces at the same time. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-05virtio-net: Switch to using announce timerDr. David Alan Gilbert
Switch virtio's self announcement to use the AnnounceTimer. It keeps it's own AnnounceTimer (per device), and starts running it using a migration post-load and a virtual clock; that way the announce happens once the guest is actually running. The timer uses the migration parameters to set the timing of the repeats. Based on earlier patches by myself and Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-02-22virtio-net: make VirtIOFeature usable for other virtio devicesStefano Garzarella
In order to use VirtIOFeature also in other virtio devices, we move its declaration and the endof() macro (renamed in virtio_endof()) in virtio.h. We add virtio_feature_get_config_size() function to iterate the array of VirtIOFeature and to return the config size depending on the features enabled. (as virtio_net_set_config_size() did) Suggested-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20190221103314.58500-5-sgarzare@redhat.com Message-Id: <20190221103314.58500-5-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-17virtio-net: changed VIRTIO_NET_F_RSC_EXT to be 61Yuri Benditovich
Allocated feature bit changed in spec draft per TC request. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-17virtio-net: support RSC v4/v6 tcp traffic for Windows HCKYuri Benditovich
This commit adds implementation of RX packets coalescing, compatible with requirements of Windows Hardware compatibility kit. The device enables feature VIRTIO_NET_F_RSC_EXT in host features if it supports extended RSC functionality as defined in the specification. This feature requires at least one of VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6. Windows guest driver acks this feature only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS is also present. If the guest driver acks VIRTIO_NET_F_RSC_EXT feature, the device coalesces TCPv4 and TCPv6 packets (if respective VIRTIO_NET_F_GUEST_TSO feature is on, populates extended RSC information in virtio header and sets VIRTIO_NET_HDR_F_RSC_INFO bit in header flags. The device does not recalculate checksums in the coalesced packet, so they are not valid. In this case: All the data packets in a tcp connection are cached to a single buffer in every receive interval, and will be sent out via a timer, the 'virtio_net_rsc_timeout' controls the interval, this value may impact the performance and response time of tcp connection, 50000(50us) is an experience value to gain a performance improvement, since the whql test sends packets every 100us, so '300000(300us)' passes the test case, it is the default value as well, tune it via the command line parameter 'rsc_interval' within 'virtio-net-pci' device, for example, to launch a guest with interval set as '500000': 'virtio-net-pci,netdev=hostnet1,bus=pci.0,id=net1,mac=00, guest_rsc_ext=on,rsc_interval=500000' The timer will only be triggered if the packets pool is not empty, and it'll drain off all the cached packets. 'NetRscChain' is used to save the segments of IPv4/6 in a VirtIONet device. A new segment becomes a 'Candidate' as well as it passed sanity check, the main handler of TCP includes TCP window update, duplicated ACK check and the real data coalescing. An 'Candidate' segment means: 1. Segment is within current window and the sequence is the expected one. 2. 'ACK' of the segment is in the valid window. Sanity check includes: 1. Incorrect version in IP header 2. An IP options or IP fragment 3. Not a TCP packet 4. Sanity size check to prevent buffer overflow attack. 5. An ECN packet Even though, there might more cases should be considered such as ip identification other flags, while it breaks the test because windows set it to the same even it's not a fragment. Normally it includes 2 typical ways to handle a TCP control flag, 'bypass' and 'finalize', 'bypass' means should be sent out directly, while 'finalize' means the packets should also be bypassed, but this should be done after search for the same connection packets in the pool and drain all of them out, this is to avoid out of order fragment. All the 'SYN' packets will be bypassed since this always begin a new' connection, other flags such 'URG/FIN/RST/CWR/ECE' will trigger a finalization, because this normally happens upon a connection is going to be closed, an 'URG' packet also finalize current coalescing unit. Statistics can be used to monitor the basic coalescing status, the 'out of order' and 'out of window' means how many retransmitting packets, thus describe the performance intuitively. Difference between ip v4 and v6 processing: Fragment length in ipv4 header includes itself, while it's not included for ipv6, thus means ipv6 can carry a real 65535 payload. Note that main goal of implementing this feature in software is to create reference setup for certification tests. In such setups guest migration is not required, so the coalesced packets not yet delivered to the guest will be lost in case of migration. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-11avoid TABs in files that only contain a fewPaolo Bonzini
Most files that have TABs only contain a handful of them. Change them to spaces so that we don't confuse people. disas, standard-headers, linux-headers and libdecnumber are imported from other projects and probably should be exempted from the check. Outside those, after this patch the following files still contain both 8-space and TAB sequences at the beginning of the line. Many of them have a majority of TABs, or were initially committed with all tabs. bsd-user/i386/target_syscall.h bsd-user/x86_64/target_syscall.h crypto/aes.c hw/audio/fmopl.c hw/audio/fmopl.h hw/block/tc58128.c hw/display/cirrus_vga.c hw/display/xenfb.c hw/dma/etraxfs_dma.c hw/intc/sh_intc.c hw/misc/mst_fpga.c hw/net/pcnet.c hw/sh4/sh7750.c hw/timer/m48t59.c hw/timer/sh_timer.c include/crypto/aes.h include/disas/bfd.h include/hw/sh4/sh.h libdecnumber/decNumber.c linux-headers/asm-generic/unistd.h linux-headers/linux/kvm.h linux-user/alpha/target_syscall.h linux-user/arm/nwfpe/double_cpdo.c linux-user/arm/nwfpe/fpa11_cpdt.c linux-user/arm/nwfpe/fpa11_cprt.c linux-user/arm/nwfpe/fpa11.h linux-user/flat.h linux-user/flatload.c linux-user/i386/target_syscall.h linux-user/ppc/target_syscall.h linux-user/sparc/target_syscall.h linux-user/syscall.c linux-user/syscall_defs.h linux-user/x86_64/target_syscall.h slirp/cksum.c slirp/if.c slirp/ip.h slirp/ip_icmp.c slirp/ip_icmp.h slirp/ip_input.c slirp/ip_output.c slirp/mbuf.c slirp/misc.c slirp/sbuf.c slirp/socket.c slirp/socket.h slirp/tcp_input.c slirp/tcpip.h slirp/tcp_output.c slirp/tcp_subr.c slirp/tcp_timer.c slirp/tftp.c slirp/udp.c slirp/udp.h target/cris/cpu.h target/cris/mmu.c target/cris/op_helper.c target/sh4/helper.c target/sh4/op_helper.c target/sh4/translate.c tcg/sparc/tcg-target.inc.c tests/tcg/cris/check_addo.c tests/tcg/cris/check_moveq.c tests/tcg/cris/check_swap.c tests/tcg/multiarch/test-mmap.c ui/vnc-enc-hextile-template.h ui/vnc-enc-zywrle.h util/envlist.c util/readline.c The following have only TABs: bsd-user/i386/target_signal.h bsd-user/sparc64/target_signal.h bsd-user/sparc64/target_syscall.h bsd-user/sparc/target_signal.h bsd-user/sparc/target_syscall.h bsd-user/x86_64/target_signal.h crypto/desrfb.c hw/audio/intel-hda-defs.h hw/core/uboot_image.h hw/sh4/sh7750_regnames.c hw/sh4/sh7750_regs.h include/hw/cris/etraxfs_dma.h linux-user/alpha/termbits.h linux-user/arm/nwfpe/fpopcode.h linux-user/arm/nwfpe/fpsr.h linux-user/arm/syscall_nr.h linux-user/arm/target_signal.h linux-user/cris/target_signal.h linux-user/i386/target_signal.h linux-user/linux_loop.h linux-user/m68k/target_signal.h linux-user/microblaze/target_signal.h linux-user/mips64/target_signal.h linux-user/mips/target_signal.h linux-user/mips/target_syscall.h linux-user/mips/termbits.h linux-user/ppc/target_signal.h linux-user/sh4/target_signal.h linux-user/sh4/termbits.h linux-user/sparc64/target_syscall.h linux-user/sparc/target_signal.h linux-user/x86_64/target_signal.h linux-user/x86_64/termbits.h pc-bios/optionrom/optionrom.h slirp/mbuf.h slirp/misc.h slirp/sbuf.h slirp/tcp.h slirp/tcp_timer.h slirp/tcp_var.h target/i386/svm.h target/sparc/asi.h target/xtensa/core-dc232b/xtensa-modules.inc.c target/xtensa/core-dc233c/xtensa-modules.inc.c target/xtensa/core-de212/core-isa.h target/xtensa/core-de212/xtensa-modules.inc.c target/xtensa/core-fsf/xtensa-modules.inc.c target/xtensa/core-sample_controller/core-isa.h target/xtensa/core-sample_controller/xtensa-modules.inc.c target/xtensa/core-test_kc705_be/core-isa.h target/xtensa/core-test_kc705_be/xtensa-modules.inc.c tests/tcg/cris/check_abs.c tests/tcg/cris/check_addc.c tests/tcg/cris/check_addcm.c tests/tcg/cris/check_addoq.c tests/tcg/cris/check_bound.c tests/tcg/cris/check_ftag.c tests/tcg/cris/check_int64.c tests/tcg/cris/check_lz.c tests/tcg/cris/check_openpf5.c tests/tcg/cris/check_sigalrm.c tests/tcg/cris/crisutils.h tests/tcg/cris/sys.c tests/tcg/i386/test-i386-ssse3.c ui/vgafont.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20181213223737.11793-3-pbonzini@redhat.com> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Acked-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Eric Blake <eblake@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19cpus hw target: Use warn_report() & friends to report warningsMarkus Armbruster
Calling error_report() in a function that takes an Error ** argument is suspicious. Convert a few that are actually warnings to warn_report(). While there, split a warning consisting of multiple sentences to conform to conventions spelled out in warn_report()'s contract. Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Fam Zheng <famz@redhat.com> Cc: Wei Huang <wei@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20181017082702.5581-5-armbru@redhat.com>
2018-08-28qapi: Drop qapi_event_send_FOO()'s Error ** argumentPeter Xu
The generated qapi_event_send_FOO() take an Error ** argument. They can't actually fail, because all they do with the argument is passing it to functions that can't fail: the QObject output visitor, and the @qmp_emit callback, which is either monitor_qapi_event_queue() or event_test_emit(). Drop the argument, and pass &error_abort to the QObject output visitor and @qmp_emit instead. Suggested-by: Eric Blake <eblake@redhat.com> Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180815133747.25032-4-peterx@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message rewritten, update to qapi-code-gen.txt corrected] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-06-27compiler: add a sizeof_field() macroStefan Hajnoczi
Determining the size of a field is useful when you don't have a struct variable handy. Open-coding this is ugly. This patch adds the sizeof_field() macro, which is similar to typeof_field(). Existing instances are updated to use the macro. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180614164431.29305-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-26virtio_net: flush uncompleted TX on resetGreg Kurz
If the backend could not transmit a packet right away for some reason, the packet is queued for asynchronous sending. The corresponding vq element is tracked in the async_tx.elem field of the VirtIONetQueue, for later freeing when the transmission is complete. If a reset happens before completion, virtio_net_tx_complete() will push async_tx.elem back to the guest anyway, and we end up with the inuse flag of the vq being equal to -1. The next call to virtqueue_pop() is then likely to fail with "Virtqueue size exceeded". This can be reproduced easily by starting a guest with an hubport backend that is not connected to a functional network, eg, -device virtio-net-pci,netdev=hub0 -netdev hubport,id=hub0,hubid=0 and no other -netdev hubport,hubid=0 on the command line. The appropriate fix is to ensure that such an asynchronous transmission cannot survive a device reset. So for all queues, we first try to send the packet again, and eventually we purge it if the backend still could not deliver it. CC: qemu-stable@nongnu.org Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://github.com/open-power-host-os/qemu/issues/37 Signed-off-by: Greg Kurz <groug@kaod.org> Tested-by: R. Nageswara Sastry <nasastry@in.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-03-13virtio-net: add linkspeed and duplex settings to virtio-netJason Baron
Although linkspeed and duplex can be set in a linux guest via 'ethtool -s', this requires custom ethtool commands for virtio-net by default. Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows the hypervisor to export a linkspeed and duplex setting. The user can subsequently overwrite it later if desired via: 'ethtool -s'. Linkspeed and duplex settings can be set as: '-device virtio-net,speed=10000,duplex=full' where speed is [0...INT_MAX], and duplex is ["half"|"full"]. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtio-dev@lists.oasis-open.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13virtio-net: use 64-bit values for feature flagsJason Baron
In prepartion for using some of the high order feature bits, make sure that virtio-net uses 64-bit values everywhere. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtio-dev@lists.oasis-open.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-02Include less of the generated modular QAPI headersMarkus Armbruster
In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-02-09Drop superfluous includes of qapi/qmp/qjson.hMarkus Armbruster
Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-19-armbru@redhat.com>
2018-02-09Include qapi/error.h exactly where neededMarkus Armbruster
This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2017-11-28virtio-net: don't touch virtqueue if vm is stoppedJason Wang
Guest state should not be touched if VM is stopped, unfortunately we didn't check running state and tried to drain tx queue unconditionally in virtio_net_set_status(). A crash was then noticed as a migration destination when user type quit after virtqueue state is loaded but before region cache is initialized. In this case, virtio_net_drop_tx_queue_data() tries to access the uninitialized region cache. Fix this by only dropping tx queue data when vm is running. Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down") Cc: Yuri Benditovich <yuri.benditovich@daynix.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-09-27migration: pre_save return intDr. David Alan Gilbert
Modify the pre_save method on VMStateDescription to return an int rather than void so that it potentially can fail. Changed zillions of devices to make them return 0; the only case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already had an error_report/return case. Note: If you add an error exit in your pre_save you must emit an error_report to say why. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170925112917.21340-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-07-17virtio-net: fix offload ctrl endianJason Wang
Spec said offloads should be le64, so use virtio_ldq_p() to guarantee valid endian. Fixes: 644c98587d4c ("virtio-net: dynamic network offloads configuration") Cc: qemu-stable@nongnu.org Cc: Dmitry Fleytman <dfleytma@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17virtion-net: Prefer is_power_of_2()Michal Privoznik
We have a function that checks if given number is power of two. We should prefer it instead of expanding the check on our own. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-03virtio-net: fix tx queue size for !vhost-userMichael S. Tsirkin
Current code segfaults when no nic peer is specified. Fix it up - fall back to default queue size. Fixes: 9b02e1618cf26a ("virtio-net: enable configurable tx queue size") Cc: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03virtio-net: enable configurable tx queue sizeWei Wang
This patch enables the virtio-net tx queue size to be configurable between 256 (the default queue size) and 1024 by the user when the vhost-user backend is used. Currently, the maximum tx queue size for other backends is 512 due to the following limitations: - QEMU backend: the QEMU backend implementation in some cases may send 1024+1 iovs to writev. - Vhost_net backend: there are possibilities that the guest sends a vring_desc of memory which crosses a MemoryRegion thereby generating more than 1024 iovs after translation from guest-physical address in the backend. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-13migration: Move self_announce_delay() to misc.hJuan Quintela
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-05-25virtio_net: Bypass backends for MTU feature negotiationMaxime Coquelin
This patch adds a new internal "x-mtu-bypass-backend" property to bypass backends for MTU feature negotiation. When this property is set, the MTU feature is negotiated as soon as supported by the guest and a MTU value is set via the host_mtu parameter. In case the backend advertises the feature (e.g. DPDK's vhost-user backend), the feature negotiation is propagated down to the backend. When this property is not set, the backend has to support the MTU feature for its negotiation to succeed. For compatibility purpose, this property is disabled for machine types v2.9 and older. Cc: Aaron Conole <aconole@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-23virtio-net: fix wild pointer when remove virtio-net queuesYunjian Wang
The tx_bh or tx_timer will free in virtio_net_del_queue() function, when removing virtio-net queues if the guest doesn't support multiqueue. But it might be still referenced by virtio_net_set_status(), which needs to be set NULL. And also the tx_waiting needs to be set zero to prevent virtio_net_set_status() accessing tx_bh or tx_timer. Cc: qemu-stable@nongnu.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-31virtio-net: avoid call tap_enable when there's only one queueJason Wang
We call tap_enable() even if for multiqueue is not enabled. This is wrong since it should be used for multiqueue codes to enable a disabled queue. Fixing this by only calling this when multiqueue is used. Fixes: 16dbaf905b72 ("tap: support enabling or disabling a queue") Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Tested-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-02-17virtio: use VRingMemoryRegionCaches for avail and used ringsPaolo Bonzini
The virtio-net change is necessary because it uses virtqueue_fill and virtqueue_flush instead of the more convenient virtqueue_push. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-13virtio/migration: Migrate virtio-net to VMStateDr. David Alan Gilbert
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20170203160651.19917-5-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Merge fix against Halil's removal of the '_start' field in VMSTATE_VBUFFER_MULTIPLY
2017-01-10virtio-net: Add MTU feature supportMaxime Coquelin
This patch allows advising guest with host MTU's by setting host_mtu parameter. If VIRTIO_NET_F_MTU has been successfully negotiated, MTU value is passed to the backend. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Aaron Conole <aconole@redhat.com Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10net: virtio-net discards TX data after link downYuri Benditovich
https://bugzilla.redhat.com/show_bug.cgi?id=1295637 Upon set_link monitor command or upon netdev deletion virtio-net sends link down indication to the guest and stops vhost if one is used. Guest driver can still submit data for TX until it recognizes link loss. If these packets not returned by the host, the Windows guest will never be able to finish disable/removal/shutdown. Now each packet sent by guest after NIC indicated link down will be completed immediately. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingStefan Hajnoczi
virtio, vhost, pc, pci: documentation, fixes and cleanups Lots of fixes all over the place. Unfortunately, this does not yet fix a regression with vhost introduced by the last pull, the issue is typically this error: kvm_mem_ioeventfd_add: error adding ioeventfd: File exists followed by QEMU aborting. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> * remotes/mst/tags/for_upstream: (28 commits) docs: add PCIe devices placement guidelines virtio: drop virtio_queue_get_ring_{size,addr}() vhost: drop legacy vring layout bits vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE nvdimm acpi: use aml_name_decl to define named object nvdimm acpi: rename nvdimm_dsm_reserved_root nvdimm acpi: fix two comments nvdimm acpi: define DSM return codes nvdimm acpi: rename nvdimm_acpi_hotplug nvdimm acpi: cleanup nvdimm_build_fit nvdimm acpi: rename nvdimm_plugged_device_list docs: improve the doc of Read FIT method nvdimm acpi: clean up nvdimm_build_acpi pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug pc: memhp: move nvdimm hotplug out of memory hotplug nvdimm acpi: drop the lock of fit buffer qdev: hotplug: drop HotplugHandler.post_plug callback vhost: migration blocker only if shared log is used virtio-net: mark VIRTIO_NET_F_GSO as legacy ... Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>