aboutsummaryrefslogtreecommitdiff
path: root/hw/net
AgeCommit message (Collapse)Author
2023-05-23e1000e: Always log status after building rx metadataAkihiko Odaki
Without this change, the status flags may not be traced e.g. if checksum offloading is disabled. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Rename TcpIpv6 into TcpIpv6ExAkihiko Odaki
e1000e and igb employs NetPktRssIpV6TcpEx for RSS hash if TcpIpv6 MRQC bit is set. Moreover, igb also has a MRQC bit for NetPktRssIpV6Tcp though it is not implemented yet. Rename it to TcpIpv6Ex to avoid confusion. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Take CRC into consideration for size checkAkihiko Odaki
Section 13.7.15 Receive Length Error Count says: > Packets over 1522 bytes are oversized if LongPacketEnable is 0b > (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet > is considered oversized if it exceeds 16384 bytes. > These lengths are based on bytes in the received packet from > <Destination Address> through <CRC>, inclusively. As QEMU processes packets without CRC, the number of bytes for CRC need to be subtracted. This change adds some size definitions to be used to derive the new size thresholds to eth.h. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Share more Rx filtering logicAkihiko Odaki
This saves some code and enables tracepoint for e1000's VLAN filtering. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23net/eth: Rename eth_setup_vlan_headers_exAkihiko Odaki
The old eth_setup_vlan_headers has no user so remove it and rename eth_setup_vlan_headers_ex. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23hw/net/net_tx_pkt: Remove net_rx_pkt_get_l4_infoAkihiko Odaki
This function is not used. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23igb: Always copy ethernet headerAkihiko Odaki
igb_receive_internal() used to check the iov length to determine copy the iovs to a contiguous buffer, but the check is flawed in two ways: - It does not ensure that iovcnt > 0. - It does not take virtio-net header into consideration. The size of this copy is just 22 octets, which can be even less than the code size required for checks. This (wrong) optimization is probably not worth so just remove it. Removing this also allows igb to assume aligned accesses for the ethernet header. Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000e: Always copy ethernet headerAkihiko Odaki
e1000e_receive_internal() used to check the iov length to determine copy the iovs to a contiguous buffer, but the check is flawed in two ways: - It does not ensure that iovcnt > 0. - It does not take virtio-net header into consideration. The size of this copy is just 18 octets, which can be even less than the code size required for checks. This (wrong) optimization is probably not worth so just remove it. Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()Akihiko Odaki
igb does not properly ensure the buffer passed to net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header. Allow it to pass scattered data to net_rx_pkt_set_protocols(). Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23igb: Clear IMS bits when committing ICR accessAkihiko Odaki
The datasheet says contradicting statements regarding ICR accesses so it is not reliable to determine the behavior of ICR accesses. However, e1000e does clear IMS bits when reading ICR accesses and Linux also expects ICR accesses will clear IMS bits according to: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048 Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23igb: Do not require CTRL.VME for tx VLAN taggingAkihiko Odaki
While the datasheet of e1000e says it checks CTRL.VME for tx VLAN tagging, igb's datasheet has no such statements. It also says for "CTRL.VLE": > This register only affects the VLAN Strip in Rx it does not have any > influence in the Tx path in the 82576. (Appendix A. Changes from the 82575) There is no "CTRL.VLE" so it is more likely that it is a mistake of CTRL.VME. Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23igb: Fix Rx packet type encodingAkihiko Odaki
igb's advanced descriptor uses a packet type encoding different from one used in e1000e's extended descriptor. Fix the logic to encode Rx packet type accordingly. Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000x: Fix BPRC and MPRCAkihiko Odaki
Before this change, e1000 and the common code updated BPRC and MPRC depending on the matched filter, but e1000e and igb decided to update those counters by deriving the packet type independently. This inconsistency caused a multicast packet to be counted twice. Updating BPRC and MPRC depending on are fundamentally flawed anyway as a filter can be used for different types of packets. For example, it is possible to filter broadcast packets with MTA. Always determine what counters to update by inspecting the packets. Fixes: 3b27430177 ("e1000: Implementing various counters") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23hw/net/net_tx_pkt: Decouple interface from PCIAkihiko Odaki
This allows to use the network packet abstractions even if PCI is not used. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23hw/net/net_tx_pkt: Decouple implementation from PCIAkihiko Odaki
This is intended to be followed by another change for the interface. It also fixes the leak of memory mapping when the specified memory is partially mapped. Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23e1000e: Fix tx/rx counterstimothee.cocault@gmail.com
The bytes and packets counter registers are cleared on read. Copying the "total counter" registers to the "good counter" registers has side effects. If the "total" register is never read by the OS, it only gets incremented. This leads to exponential growth of the "good" register. This commit increments the counters individually to avoid this. Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-19virtio-net: not enable vq reset feature unconditionallyEugenio Pérez
The commit 93a97dc5200a ("virtio-net: enable vq reset feature") enables unconditionally vq reset feature as long as the device is emulated. This makes impossible to actually disable the feature, and it causes migration problems from qemu version previous than 7.2. The entire final commit is unneeded as device system already enable or disable the feature properly. This reverts commit 93a97dc5200a95e63b99cb625f20b7ae802ba413. Fixes: 93a97dc5200a ("virtio-net: enable vq reset feature") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230504101447.389398-1-eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-16hw/net: Move xilinx_ethlite.c to the target-independent source setThomas Huth
Now that the tswap() functions are available for target-independent code, too, we can move xilinx_ethlite.c from specific_ss to softmmu_ss to avoid that we have to compile this file multiple times. Message-Id: <20230508120314.59274-1-thuth@redhat.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-05-02hw/net/allwinner-sun8i-emac: Correctly byteswap descriptor fieldsPeter Maydell
In allwinner-sun8i-emac we just read directly from guest memory into a host FrameDescriptor struct and back. This only works on little-endian hosts. Reading and writing of descriptors is already abstracted into functions; make those functions also handle the byte-swapping so that TransferDescriptor structs as seen by the rest of the code are always in host-order, and fix two places that were doing ad-hoc descriptor reading without using the functions. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230424165053.1428857-3-peter.maydell@linaro.org
2023-05-02hw/net/msf2-emac: Don't modify descriptor in-place in emac_store_desc()Peter Maydell
The msf2-emac ethernet controller has functions emac_load_desc() and emac_store_desc() which read and write the in-memory descriptor blocks and handle conversion between guest and host endianness. As currently written, emac_store_desc() does the endianness conversion in-place; this means that it effectively consumes the input EmacDesc struct, because on a big-endian host the fields will be overwritten with the little-endian versions of their values. Unfortunately, in all the callsites the code continues to access fields in the EmacDesc struct after it has called emac_store_desc() -- specifically, it looks at the d.next field. The effect of this is that on a big-endian host networking doesn't work because the address of the next descriptor is corrupted. We could fix this by making the callsite avoid using the struct; but it's more robust to have emac_store_desc() leave its input alone. (emac_load_desc() also does an in-place conversion, but here this is fine, because the function is supposed to be initializing the struct.) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-id: 20230424151919.1333299-1-peter.maydell@linaro.org
2023-05-02hw/net: npcm7xx_emc: set MAC in register spacePatrick Venture
The MAC address set from Qemu wasn't being saved into the register space. Reviewed-by: Hao Wu <wuhaotsh@google.com> Signed-off-by: Patrick Venture <venture@google.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: moved variable declaration to top of function] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-04-28hw: replace most qemu_bh_new calls with qemu_bh_new_guardedAlexander Bulekov
This protects devices from bh->mmio reentrancy issues. Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure. Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230427211013.2994127-5-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-04-20hw/net/imx_fec: Support two Ethernet interfaces connected to single MDIO busGuenter Roeck
The SOC on i.MX6UL and i.MX7 has 2 Ethernet interfaces. The PHY on each may be connected to separate MDIO busses, or both may be connected on the same MDIO bus using different PHY addresses. Commit 461c51ad4275 ("Add a phy-num property to the i.MX FEC emulator") added support for specifying PHY addresses, but it did not provide support for linking the second PHY on a given MDIO bus to the other Ethernet interface. To be able to support two PHY instances on a single MDIO bus, two properties are needed: First, there needs to be a flag indicating if the MDIO bus on a given Ethernet interface is connected. If not, attempts to read from this bus must always return 0xffff. Implement this property as phy-connected. Second, if the MDIO bus on an interface is active, it needs a link to the consumer interface to be able to provide PHY access for it. Implement this property as phy-consumer. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Message-id: 20230315145248.1639364-2-linux@roeck-us.net Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-28igb: respect VMVIR and VMOLR for VLANSriram Yagnaraman
Add support for stripping/inserting VLAN for VFs. Had to move CSUM calculation back into the for loop, since packet data is pulled inside the loop based on strip VLAN decision for every VF. net_rx_pkt_fix_l4_csum should be extended to accept a buffer instead for igb. Work for a future patch. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: implement VF Tx and Rx statsSriram Yagnaraman
Please note that loopback counters for VM to VM traffic is not implemented yet: VFGOTLBC, VFGPTLBC, VFGORLBC and VFGPRLBC. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: respect E1000_VMOLR_RSSESriram Yagnaraman
RSS for VFs is only enabled if VMOLR[n].RSSE is set. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: check oversized packets for VMDqSriram Yagnaraman
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: implement VFRE and VFTE registersSriram Yagnaraman
Also introduce: - Checks for RXDCTL/TXDCTL queue enable bits - IGB_NUM_VM_POOLS enum (Sec 1.5: Table 1-7) Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: add ICR_RXDWSriram Yagnaraman
IGB uses RXDW ICR bit to indicate that rx descriptor has been written back. This is the same as RXT0 bit in older HW. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: handle PF/VF reset properlySriram Yagnaraman
Use PFRSTD to reset RSTI bit for VFs, and raise VFLRE interrupt when VF is reset. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28hw/net/net_tx_pkt: Align l3_hdrAkihiko Odaki
Align the l3_hdr member of NetTxPkt by defining it as a union of ip_header, ip6_header, and an array of octets. Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1544 Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28hw/net/net_tx_pkt: Ignore ECN bitAkihiko Odaki
No segmentation should be performed if gso type is VIRTIO_NET_HDR_GSO_NONE even if ECN bit is set. Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1544 Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: Fix DMA requester specification for Tx packetAkihiko Odaki
igb used to specify the PF as DMA requester when reading Tx packets. This made Tx requests from VFs to be performed on the address space of the PF, defeating the purpose of SR-IOV. Add some logic to change the requester depending on the queue, which can be assigned to a VF. Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-28igb: Save more Tx statesAkihiko Odaki
The current implementation of igb uses only part of a advanced Tx context descriptor and first data descriptor because it misses some features and sniffs the trait of the packet instead of respecting the packet type specified in the descriptor. However, we will certainly need the entire Tx context descriptor when we update igb to respect these ignored fields. Save the entire context descriptor and first data descriptor except the buffer address to prepare for such a change. This also introduces the distinction of contexts with different indexes, which was not present in e1000e but in igb. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10Intrdocue igb device emulationAkihiko Odaki
This change introduces emulation for the Intel 82576 adapter, AKA igb. The details of the device will be provided by the documentation that will follow this change. This initial implementation of igb does not cover the full feature set, but it selectively implements changes necessary to pass tests of Linut Test Project, and Windows HLK. The below is the list of the implemented changes; anything not listed here is not implemented: New features: - igb advanced descriptor handling - Support of 16 queues - SRRCTL.BSIZEPACKET register field - SRRCTL.RDMTS register field - Tx descriptor completion writeback - Extended RA registers - VMDq feature - MRQC "Multiple Receive Queues Enable" register field - DTXSWC.Loopback_en register field - VMOLR.ROMPE register field - VMOLR.AUPE register field - VLVF.VLAN_id register field - VLVF.VI_En register field - VF - Mailbox - Reset - Extended interrupt registers - Default values for IGP01E1000 PHY registers Removed features: - e1000e extended descriptor - e1000e packet split descriptor - Legacy descriptor - PHY register paging - MAC Registers - Legacy interrupt timer registers - Legacy EEPROM registers - PBA/POEM registers - RSRPD register - RFCTL.ACKDIS - RCTL.DTYPE - Copper PHY registers Misc: - VET register format - ICR register format Signed-off-by: Gal Hammer <gal.hammer@sap.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> [Jason: don't abort on msi(x)_init()] Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Split header filesAkihiko Odaki
Some definitions in the header files are invalid for igb so extract them to new header files to keep igb from referring to them. Signed-off-by: Gal Hammer <gal.hammer@sap.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10net/eth: Introduce EthL4HdrProtoAkihiko Odaki
igb, a new network device emulation, will need SCTP checksum offloading. Currently eth_get_protocols() has a bool parameter for each protocol currently it supports, but there will be a bit too many parameters if we add yet another protocol. Introduce an enum type, EthL4HdrProto to represent all L4 protocols eth_get_protocols() support with one parameter. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Implement system clockAkihiko Odaki
The system clock is necessary to implement PTP features. While we are not implementing PTP features for e1000e yet, we do have a plan to implement them for igb, a new network device derived from e1000e, so add system clock to the common base first. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10net/eth: Report if headers are actually presentAkihiko Odaki
The values returned by eth_get_protocols() are used to perform RSS, checksumming and segmentation. Even when a packet signals the use of the protocols which these operations can be applied to, the headers for them may not be present because of too short packet or fragmentation, for example. In such a case, the operations cannot be applied safely. Report the presence of headers instead of whether the use of the protocols are indicated with eth_get_protocols(). This also makes corresponding changes to the callers of eth_get_protocols() to match with its new signature and to remove redundant checks for fragmentation. Fixes: 75020a7021 ("Common definitions for VMWARE devices") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Count CRC in Tx statisticsAkihiko Odaki
The datasheet 8.19.29 "Good Packets Transmitted Count - GPTC (0x04080; RC)" says: > This register counts the number of good (no errors) packets > transmitted. A good transmit packet is considered one that is 64 or > more bytes in length (from <Destination Address> through <CRC>, > inclusively) in length. It also says similar for the other Tx statistics registers. Add the number of bytes for CRC to those registers. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000: Count CRC in Tx statisticsAkihiko Odaki
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes) Count" says: > This register counts the number of packets transmitted that are > exactly 64 bytes (from <Destination Address> through <CRC>, > inclusively) in length. It also says similar for the other Tx statistics registers. Add the number of bytes for CRC to those registers. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Combine rx tracesAkihiko Odaki
Whether a packet will be written back to the guest depends on the remaining space of the queue. Therefore, e1000e_rx_written_to_guest and e1000e_rx_not_written_to_guest should log the index of the queue instead of generated interrupts. This also removes the need of e1000e_rx_rss_dispatched_to_queue, which logs the queue index. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Do not assert when MSI-X is disabled laterAkihiko Odaki
Assertions will fail if MSI-X gets disabled while a timer for MSI-X interrupts is running so remove them to avoid abortions. Fortunately, nothing bad happens even if the assertions won't trigger as msix_notify(), called by timer handlers, does nothing when MSI-X is disabled. This bug was found by Alexander Bulekov when fuzzing igb, a new device implementation derived from e1000e: https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/ The fixed test case is: fuzz/crash_aea040166819193cf9fedb810c6d100221da721a Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10hw/net/net_tx_pkt: Check the payload lengthAkihiko Odaki
Check the payload length if checksumming to ensure the payload contains the space for the resulting value. This bug was found by Alexander Bulekov with the fuzzer: https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/ The fixed test case is: fuzz/crash_6aeaa33e7211ecd603726c53e834df4c6d1e08bc Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10hw/net/net_tx_pkt: Implement TCP segmentationAkihiko Odaki
There was no proper implementation of TCP segmentation before this change, and net_tx_pkt relied solely on IPv4 fragmentation. Not only this is not aligned with the specification, but it also resulted in corrupted IPv6 packets. This is particularly problematic for the igb, a new proposed device implementation; igb provides loopback feature for VMDq and the feature relies on software segmentation. Implement proper TCP segmentation in net_tx_pkt to fix such a scenario. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000e: Perform software segmentation for loopbackAkihiko Odaki
e1000e didn't perform software segmentation for loopback if virtio-net header is enabled, which is wrong. To fix the problem, introduce net_tx_pkt_send_custom(), which allows the caller to specify whether offloading should be assumed or not. net_tx_pkt_send_custom() also allows the caller to provide a custom sending function. Packets with virtio-net headers and ones without virtio-net headers will be provided at the same time so the function can choose the preferred version. In case of e1000e loopback, it prefers to have virtio-net headers as they allows to skip the checksum verification if VIRTIO_NET_HDR_F_DATA_VALID is set. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10hw/net/net_rx_pkt: Remove net_rx_pkt_has_virt_hdrAkihiko Odaki
When virtio-net header is not set, net_rx_pkt_get_vhdr() returns zero-filled virtio_net_hdr, which is actually valid. In fact, tap device uses zero-filled virtio_net_hdr when virtio-net header is not provided by the peer. Therefore, we can just remove net_rx_pkt_has_virt_hdr() and always assume NetTxPkt has a valid virtio-net header. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10hw/net/net_tx_pkt: Automatically determine if virtio-net header is usedAkihiko Odaki
The new function qemu_get_using_vnet_hdr() allows to automatically determine if virtio-net header is used. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10e1000x: Alter the signature of e1000x_is_vlan_packetAkihiko Odaki
e1000x_is_vlan_packet() had a pointer to uint8_t as a parameter, but it does not have to be uint8_t. Change the type to void *. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10net: Check L4 header sizeAkihiko Odaki
net_tx_pkt_build_vheader() inspects TCP header but had no check for the header size, resulting in an undefined behavior. Check the header size and drop the packet if the header is too small. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>