aboutsummaryrefslogtreecommitdiff
path: root/hw/net/e1000.c
AgeCommit message (Collapse)Author
2023-11-21net: Provide MemReentrancyGuard * to qemu_new_nic()Akihiko Odaki
Recently MemReentrancyGuard was added to DeviceState to record that the device is engaging in I/O. The network device backend needs to update it when delivering a packet to a device. In preparation for such a change, add MemReentrancyGuard * as a parameter of qemu_new_nic(). Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-29e1000: remove old compatibility codePaolo Bonzini
This code is not needed anymore in the supported machine types. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-07hw/net: e1000: Remove the logic of padding short frames in the receive pathBin Meng
Now that we have implemented unified short frames padding in the QEMU networking codes, remove the same logic in the NIC codes. This actually reverts commit 78aeb23eded2d0b765bf9145c71f80025b568acd. Signed-off-by: Bin Meng <bmeng@tinylab.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Share more Rx filtering logicAkihiko Odaki
This saves some code and enables tracepoint for e1000's VLAN filtering. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Fix BPRC and MPRCAkihiko Odaki
Before this change, e1000 and the common code updated BPRC and MPRC depending on the matched filter, but e1000e and igb decided to update those counters by deriving the packet type independently. This inconsistency caused a multicast packet to be counted twice. Updating BPRC and MPRC depending on are fundamentally flawed anyway as a filter can be used for different types of packets. For example, it is possible to filter broadcast packets with MTA. Always determine what counters to update by inspecting the packets. Fixes: 3b27430177 ("e1000: Implementing various counters") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000e: Fix tx/rx counterstimothee.cocault@gmail.com
The bytes and packets counter registers are cleared on read. Copying the "total counter" registers to the "good counter" registers has side effects. If the "total" register is never read by the OS, it only gets incremented. This leads to exponential growth of the "good" register. This commit increments the counters individually to avoid this. Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Split header filesAkihiko Odaki
Some definitions in the header files are invalid for igb so extract them to new header files to keep igb from referring to them. Signed-off-by: Gal Hammer <gal.hammer@sap.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Count CRC in Tx statisticsAkihiko Odaki
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes) Count" says: > This register counts the number of packets transmitted that are > exactly 64 bytes (from <Destination Address> through <CRC>, > inclusively) in length. It also says similar for the other Tx statistics registers. Add the number of bytes for CRC to those registers. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Configure ResettableClassAkihiko Odaki
This is part of recent efforts of refactoring e1000 and e1000e. DeviceClass's reset member is deprecated so migrate to ResettableClass. There is no behavioral difference. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Use memcpy to intialize registersAkihiko Odaki
Use memcpy instead of memmove to initialize registers. The initial register templates and register table instances will never overlap. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Use more constant definitionsAkihiko Odaki
The definitions for E1000_VFTA_ENTRY_SHIFT, E1000_VFTA_ENTRY_MASK, and E1000_VFTA_ENTRY_BIT_SHIFT_MASK were copied from: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n306 The definitions for E1000_NUM_UNICAST, E1000_MC_TBL_SIZE, and E1000_VLAN_FILTER_TBL_SIZE were copied from: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n707 Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Mask registers when writingAkihiko Odaki
When a register has effective bits fewer than their width, the old code inconsistently masked when writing or reading. Make the code consistent by always masking when writing, and remove some code duplication. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Use hw/net/mii.hAkihiko Odaki
hw/net/mii.h provides common definitions for MII. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Fix the code styleAkihiko Odaki
igb implementation first starts off by copying e1000e code. Correct the code style before that. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-01-08include/hw/pci: Split pci_device.h off pci.hMarkus Armbruster
PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-07-06e1000: set RX descriptor status in a separate operationDing Hui
The code of setting RX descriptor status field maybe work fine in previously, however with the update of glibc version, it shows two issues when guest using dpdk receive packets: 1. The dpdk has a certain probability getting wrong buffer_addr this impact may be not obvious, such as lost a packet once in a while 2. The dpdk may consume a packet twice when scan the RX desc queue over again this impact will lead a infinite wait in Qemu, since the RDT (tail pointer) be inscreased to equal to RDH by unexpected, which regard as the RX desc queue is full Write a whole of RX desc with DD flag on is not quite correct, because when the underlying implementation of memcpy using XMM registers to copy e1000_rx_desc (when AVX or something else CPU feature is usable), the bytes order of desc writing to memory is indeterminacy We can use full-scale test case to reproduce the issue-2 by https://github.com/BASM/qemu_dpdk_e1000_test (thanks to Leonid Myravjev) I also write a POC test case at https://github.com/cdkey/e1000_poc which can reproduce both of them, and easy to verify the patch effect. The hw watchpoint also shows that, when Qemu using XMM related instructions writing 16 bytes e1000_rx_desc, concurrent with DPDK using movb writing 1 byte status, the final result of writing to memory will be one of them, if it made by Qemu which DD flag is on, DPDK will consume it again. Setting DD status in a separate operation, can prevent the impact of disorder memory writing by memcpy, also avoid unexpected data when concurrent writing status by qemu and guest dpdk. Links: https://lore.kernel.org/qemu-devel/20200102110504.GG121208@stefanha-x1.localdomain/T/ Reported-by: Leonid Myravjev <asm@asm.pp.ru> Cc: Stefan Hajnoczi <stefanha@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: qemu-stable@nongnu.org Tested-by: Jing Zhang <zhangjing@sangfor.com.cn> Reviewed-by: Frank Lee <lifan38153@sangfor.com.cn> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-11-05e1000: fix tx re-entrancy problemJon Maloy
The fact that the MMIO handler is not re-entrant causes an infinite loop under certain conditions: Guest write to TDT -> Loopback -> RX (DMA to TDT) -> TX We now eliminate the effect of this problem locally in e1000, by adding a boolean in struct E1000State indicating when the TX side is busy. This will cause any entering new call to return early instead of interfering with the ongoing work, and eliminates any risk of looping. This is intended to address CVE-2021-20257. Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-08-02hw/net: e1000: Correct the initial value of VET registerChristina Wang
The initial value of VLAN Ether Type (VET) register is 0x8100, as per the manual and real hardware. While Linux e1000 driver always writes VET register to 0x8100, it is not always the case for everyone. Drivers relying on the reset value of VET won't be able to transmit and receive VLAN frames in QEMU. Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com> Signed-off-by: Christina Wang <christina.wang@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15e1000: switch to use qemu_receive_packet() for loopbackJason Wang
This patch switches to use qemu_receive_packet() which can detect reentrancy and return early. This is intended to address CVE-2021-3416. Cc: Prasad J Pandit <ppandit@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-03-15e1000: fail early for evil descriptorJason Wang
During procss_tx_desc(), driver can try to chain data descriptor with legacy descriptor, when will lead underflow for the following calculation in process_tx_desc() for bytes: if (tp->size + bytes > msh) bytes = msh - tp->size; This will lead a infinite loop. So check and fail early if tp->size if greater or equal to msh. Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr> Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de> Cc: Prasad J Pandit <ppandit@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-01-08Remove superfluous timer_del() callsPeter Maydell
This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
2020-11-15nomaintainer: Fix Lesser GPL version numberChetan Pant
There is no "version 2" of the "Lesser" General Public License. It is either "GPL version 2.0" or "Lesser GPL version 2.1". This patch replaces all occurrences of "Lesser GPL version 2" with "Lesser GPL version 2.1" in comment section. This patch contains all the files, whose maintainer I could not get from ‘get_maintainer.pl’ script. Signed-off-by: Chetan Pant <chetan4windows@gmail.com> Message-Id: <20201023124424.20177-1-chetan4windows@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com> [thuth: Adapted exec.c and qdev-monitor.c to new location] Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-09-09Use DECLARE_*CHECKER* macrosEduardo Habkost
Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-09-09Move QOM typedefs and add missing includesEduardo Habkost
Some typedefs and macros are defined after the type check macros. This makes it difficult to automatically replace their definitions with OBJECT_DECLARE_TYPE. Patch generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=QOMStructTypedefSplit $(git grep -l '' -- '*.[ch]') which will split "typdef struct { ... } TypedefName" declarations. Followed by: $ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \ $(git grep -l '' -- '*.[ch]') which will: - move the typedefs and #defines above the type check macros - add missing #include "qom/object.h" lines if necessary Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-9-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-10-ehabkost@redhat.com> Message-Id: <20200831210740.126168-11-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-27e1000: Rename QOM class cast macrosEduardo Habkost
Rename the E1000_DEVICE_CLASS() and E1000_DEVICE_GET_CLASS() macros to be consistent with the E1000() instance cast macro. This will allow us to register the type cast macros using OBJECT_DECLARE_TYPE later. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-By: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20200825192110.3528606-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-05-15Drop more @errp parameters after previous commitMarkus Armbruster
Several functions can't fail anymore: ich9_pm_add_properties(), device_add_bootindex_property(), ppc_compat_add_property(), spapr_caps_add_properties(), PropertyInfo.create(). Drop their @errp parameter. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-16-armbru@redhat.com>
2020-05-15e1000: Don't run e1000_instance_init() twiceMarkus Armbruster
QOM object initialization runs .instance_init() for the type and all its supertypes; see object_init_with_type(). Both TYPE_E1000_BASE and its concrete subtypes set .instance_init() to e1000_instance_init(). For the concrete subtypes, it duly gets run twice. The second run fails, but the error gets ignored (a later commit will change that). Remove it from the subtypes. Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-12-armbru@redhat.com>
2020-03-31hw/net: Make NetCanReceive() return a booleanPhilippe Mathieu-Daudé
The NetCanReceive handler return whether the device can or can not receive new packets. Make it obvious by returning a boolean type. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-03-09hw/net/e1000: Move macreg[] arrays to .rodata to save 1MiB of .dataPhilippe Mathieu-Daudé
Each array consumes 256KiB of .data. As we do not reassign entries, we can move it to the .rodata section, and save a total of 1MiB of .data (size reported on x86_64 host). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20200305010446.17029-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-03-09hw/net/e1000: Add readops/writeops typedefsPhilippe Mathieu-Daudé
Express the macreg[] arrays using typedefs. No logical changes introduced here. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20200305010446.17029-2-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-01-24qdev: set properties with device_class_set_props()Marc-André Lureau
The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-21hw/net/e1000: Fix erroneous commentPhilippe Mathieu-Daudé
Missed during the QOM convertion in 9af21dbee14. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190715102210.31365-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include migration/vmstate.h lessMarkus Armbruster
In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-07-29e1000: don't raise interrupt in pre_save()Jason Wang
We should not raise any interrupt after VM has been stopped but this is what e1000 currently did when mit timer is active in pre_save(). Fixing this by scheduling a timer in post_load() which can make sure the interrupt was raised when VM is running. Reported-and-tested-by: Longpeng <longpeng2@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-05-17e1000: Never increment the RX undersize count registerChris Kenna
In situations where e1000 receives an undersized Ethernet frame, QEMU increments the emulated "Receive Undersize Count (RUC)" register when padding the frame. This is incorrect because this an expected scenario (e.g. with VLAN tag stripping) and not an error. As such, QEMU should not increment the emulated RUC. Fixes: 3b2743017749 ("e1000: Implementing various counters") Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Chris Kenna <chris.kenna@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-29e1000: Delay flush queue when receive RCTLyuchenlin
Due to too early RCT0 interrput, win10x32 may hang on booting. This problem can be reproduced by doing power cycle on win10x32 guest. In our environment, we have 10 win10x32 and stress power cycle. The problem will happen about 20 rounds. Below shows some log with comment: The normal case: 22831@1551928392.984687:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928392.985655:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928392.985801:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.056710:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.077548:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 2, ICR 0, IMR 0 e1000: set_ics 2, ICR 2, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.102974:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928393.103267:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle RX now e1000: set_ics 0, ICR 2, IMR 9d <- unmask interrupt e1000: RCTL: 255, mac_reg[RCTL] = 0x48002 e1000: set_ics 80, ICR 2, IMR 9d <- interrupt and work! ... The bad case: 27744@1551930483.117766:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 27744@1551930483.118398:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.198063:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.218675:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 2, ICR 0, IMR 0 e1000: set_ics 2, ICR 2, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.241768:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 27744@1551930483.241979:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle RX now e1000: set_ics 80, ICR 2, IMR 0 <- flush queue (caused by setting RCTL) e1000: set_ics 0, ICR 82, IMR 9d <- unmask interrupt and because 0x82&0x9d != 0 generate interrupt, hang on here... To workaround this problem, simply delay flush queue. Also stop receiving when timer is going to run. Tested on CentOS, Win7SP1x64 and Win10x32. Signed-off-by: yuchenlin <yuchenlin@synology.com> Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19e1000: indicate dropped packets in HW countersJason Wang
The e1000 emulation silently discards RX packets if there's insufficient space in the ring buffer. This leads to errors on higher-level protocols in the guest, with no indication about the error cause. This patch increments the "Missed Packets Count" (MPC) and "Receive No Buffers Count" (RNBC) HW counters in this case. As the emulation has no FIFO for buffering packets that can't immediately be pushed to the guest, these two registers are practically equivalent (see 10.2.7.4, 10.2.7.33 in https://www.intel.com/content/www/us/en/embedded/products/networking/82574l-gbe-controller-datasheet.html). On a Linux guest, the register content will be reflected in the "rx_missed_errors" and "rx_no_buffer_count" stats from "ethtool -S", and in the "missed" stat from "ip -s -s link show", giving at least some hint about the error cause inside the guest. If the cause is known, problems like this can often be avoided easily, by increasing the number of RX descriptors in the guest e1000 driver (e.g under Linux, "e1000.RxDescriptors=1024"). The patch also adds a qemu trace message for this condition. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-04-10e1000: Choose which set of props to migrateDr. David Alan Gilbert
When we're using the subsection we migrate both the 'props' and 'tso_props' data; when we're not using the subsection (to migrate to 2.11 or old machine types) we've got to choose what to migrate in the main structure. If we're using the subsection migrate 'props' in the main structure. If we're not using the subsection then migrate the last one that changed, which gives behaviour similar to the old behaviour. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-04-10e1000: Migrate props via a temporary structureDr. David Alan Gilbert
Swing the tx.props out via a temporary structure, so in future patches we can select what we're going to send. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-04-10e1000: wire new subsection to propertyDr. David Alan Gilbert
Wire the new subsection from the previous commit to a property so we can turn it off easily. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-04-10e1000: Dupe offload data on reading old streamDr. David Alan Gilbert
Old QEMUs only had one set of offload data; when we only receive one lot, dupe the received data - that should give us about the same bug level as the old version. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-04-10e1000: Convert v3 fields to subsectionDr. David Alan Gilbert
A bunch of new TSO fields were introduced by d62644b4 and this bumped the VMState version; however it's easier for those trying to keep backwards migration compatibility if these fields are added in a subsection instead. Move the new fields to a subsection. Since this was added after 2.11, this change will only affect compatbility with 2.12-rc0. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-03-05hw/net: Remove unnecessary header includesThomas Huth
Headers like "hw/loader.h" and "qemu/sockets.h" are not needed in the hw/net/*.c files. And Some other headers are included via other headers already, so we can drop them, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-12-22e1000: Separate TSO and non-TSO contexts, fixing UDP TX corruptionEd Swierk via Qemu-devel
The device is supposed to maintain two distinct contexts for transmit offloads: one has parameters for both segmentation and checksum offload, the other only for checksum offload. The guest driver can send two context descriptors, one for each context (the TSE flag specifies which). Then the guest can refer to one or the other context in subsequent transmit data descriptors, depending on what offloads it wants applied to each packet. Currently the e1000 device stores just one context, and misinterprets the TSE flags in the context and data descriptors. This is often okay: Linux happens to send a fresh context descriptor before every data descriptor, so forgetting the other context doesn't matter. Windows does rely on separate contexts for TSO vs. non-TSO packets, but for mostly-TCP traffic the two contexts have identical TCP-specific offload parameters so confusing them doesn't matter. One case where this confusion matters is when a Windows guest sets up a TSO context for TCP and a non-TSO context for UDP, and then transmits both TCP and UDP traffic in parallel. The e1000 device sometimes ends up using TCP-specific parameters while doing checksum offload on a UDP datagram: it writes the checksum to offset 16 (the correct location for a TCP checksum), stomping on two bytes of UDP data, and leaving the wrong value in the actual UDP checksum field at offset 6. (Even worse, the host network stack may then recompute the UDP checksum, "correcting" it to match the corrupt data before sending it out a physical interface.) Correct this by tracking the TSO context independently of the non-TSO context, and selecting the appropriate context based on the TSE flag in each transmit data descriptor. Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-12-22e1000, e1000e: Move per-packet TX offload flags out of context stateEd Swierk via Qemu-devel
sum_needed and cptse flags are received from the guest within each transmit data descriptor. They are not part of the offload context; instead, they determine how to apply a previously received context to the packet being transmitted: - If cptse is set, perform both segmentation and checksum offload using the parameters in the TSO context; otherwise just do checksum offload. (Currently the e1000 device incorrectly stores only one context, which will be fixed in a subsequent patch.) - Depending on the bits set in sum_needed, possibly perform L4 checksum offload and/or IP checksum offload, using the parameters in the appropriate context. Move these flags out of struct e1000x_txd_props, which is otherwise dedicated to storing values from a context descriptor, and into the per-packet TX struct. Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20net: Transmit zero UDP checksum as 0xFFFFEd Swierk
The checksum algorithm used by IPv4, TCP and UDP allows a zero value to be represented by either 0x0000 and 0xFFFF. But per RFC 768, a zero UDP checksum must be transmitted as 0xFFFF because 0x0000 is a special value meaning no checksum. Substitute 0xFFFF whenever a checksum is computed as zero when modifying a UDP datagram header. Doing this on IPv4 and TCP checksums is unnecessary but legal. Add a wrapper for net_checksum_finish() that makes the substitution. (We can't just change net_checksum_finish(), as that function is also used by receivers to verify checksums, and in that case the expected value is always 0x0000.) Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-10-15pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devicesEduardo Habkost
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of TYPE_PCI_DEVICE, except: 1) The ones that already have INTERFACE_PCIE_DEVICE set: * base-xhci * e1000e * nvme * pvscsi * vfio-pci * virtio-pci * vmxnet3 2) base-pci-bridge Not all PCI bridges are Conventional PCI devices, so INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes that are actually Conventional PCI: * dec-21154-p2p-bridge * i82801b11-bridge * pbm-bridge * pci-bridge The direct subtypes of base-pci-bridge not touched by this patch are: * xilinx-pcie-root: Already marked as PCIe-only. * pcie-pci-bridge: Already marked as PCIe-only. * pcie-port: all non-abstract subtypes of pcie-port are already marked as PCIe-only devices. 3) megasas-base Not all megasas devices are Conventional PCI devices, so the interface names are added to the subclasses registered by megasas_register_types(), according to information in the megasas_devices[] array. "megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas". Acked-by: Alberto Garcia <berto@igalia.com> Acked-by: John Snow <jsnow@redhat.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>