aboutsummaryrefslogtreecommitdiff
path: root/hw/pci
AgeCommit message (Collapse)Author
2023-10-04pcie_sriov: unregister_vfs(): fix error pathVladimir Sementsov-Ogievskiy
local_err must be NULL before calling object_property_set_bool(), so we must clear it on each iteration. Let's also use more convenient error_reportf_err(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230925194040.68592-8-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04pci: SLT must be ROMichael S. Tsirkin
current code sets PCI_SEC_LATENCY_TIMER to RW, but for pcie to pcie bridges it must be RO 0 according to pci express spec which says: This register does not apply to PCI Express. It must be read-only and hardwired to 00h. For PCI Express to PCI/PCI-X Bridges, refer to the [PCIe-to-PCI-PCI-X-Bridge] for requirements for this register. also, fix typo in comment where it's made writeable - this typo is likely what prevented us noticing we violate this requirement in the 1st place. Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> Message-Id: <de9d05366a70172e1789d10591dbe59e39c3849c.1693432039.git.mst@redhat.com> Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-20hw/pci: spelling fixesMichael Tokarev
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-08-11pci: Fix the update of interrupt disable bit in PCI_COMMAND registerGuoyi Tu
The PCI_COMMAND register is located at offset 4 within the PCI configuration space and occupies 2 bytes. The interrupt disable bit is at the 10th bit, which corresponds to the byte at offset 5 in the PCI configuration space. In our testing environment, the guest driver may directly updates the byte at offset 5 in the PCI configuration space. The backtrace looks like as following: at hw/pci/pci.c:1442 at hw/virtio/virtio-pci.c:605 val=5, len=1) at hw/pci/pci_host.c:81 In this situation, the range_covers_byte function called by the pci_default_write_config function will return false, resulting in the inability to handle the interrupt disable update event. To fix this issue, we can use the ranges_overlap function instead of range_covers_byte to determine whether the interrupt bit has been updated. Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Signed-off-by: yuanminghao <yuanmh12@chinatelecom.cn> Message-Id: <ce2d0437-8faa-4d61-b536-4668f645a959@chinatelecom.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: b6981cb57be5 ("pci: interrupt disable bit support")
2023-08-03pci: do not respond config requests after PCI device ejectYuri Benditovich
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224964 In migration with VF failover, Windows guest and ACPI hot unplug we do not need to satisfy config requests, otherwise the guest immediately detects the device and brings up its driver. Many network VF's are stuck on the guest PCI bus after the migration. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Message-Id: <20230728084049.191454-1-yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-25hw/pci: add comment to explain checking for available function 0 in pci hotplugAni Sinha
This change is cosmetic. A comment is added explaining why we need to check for the availability of function 0 when we hotplug a device. CC: mst@redhat.com CC: mjt@tls.msk.ru Signed-off-by: Ani Sinha <anisinha@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-14kconfig: Add PCIe devices to s390x machinesCédric Le Goater
It is useful to extend the number of available PCIe devices to KVM guests for passthrough scenarios and also to expose these models to a different (big endian) architecture. Introduce a new config PCIE_DEVICES to select models, Intel Ethernet adapters and one USB controller. These devices all support MSI-X which is a requirement on s390x as legacy INTx are not supported. Cc: Matthew Rosato <mjrosato@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Huth <thuth@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230712080146.839113-1-clg@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-10pcie: Specify 0 for ARI next function numbersAkihiko Odaki
The current implementers of ARI are all SR-IOV devices. The ARI next function number field is undefined for VF according to PCI Express Base Specification Revision 5.0 Version 1.0 section 9.3.7.7. The PF still requires some defined value so end the linked list formed with the field by specifying 0 as required for any ARI implementation according to section 7.8.7.2. For migration, the field will keep having 1 as its value on the old QEMU machine versions. Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt") Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV") Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-3-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10pcie: Use common ARI next function numberAkihiko Odaki
Currently the only implementers of ARI is SR-IOV devices, and they behave similar. Share the ARI next function number. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-2-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10pcie: Add hotplug detect state register to cmaskLeonardo Bras
When trying to migrate a machine type pc-q35-6.0 or lower, with this cmdline options, -device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \ -device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1 the following bug happens after all ram pages were sent: qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19 qemu-kvm: Failed to load PCIDevice:config qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port' qemu-kvm: load of migration failed: Invalid argument This happens on pc-q35-6.0 or lower because of: { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" } In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(), which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI hotplug for the guest. After a while the guest will deal with this hotplug and qemu will clear the above bit. Then, during migration, get_pci_config_device() will compare the configs of both the freshly created device and the one that is being received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit and cause the bug to reproduce. To avoid this fake incompatibility, there are tree fields in PCIDevice that can help: - wmask: Used to implement R/W bytes, and - w1cmask: Used to implement RW1C(Write 1 to Clear) bytes - cmask: Used to enable config checks on load. According to PCI Express® Base Specification Revision 5.0 Version 1.0, table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is listed as RO (read-only), so it only makes sense to make use of the cmask field. So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on get_pci_config_device() does not abort the migration. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819 Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20230706045546.593605-3-leobras@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2023-07-10hw/pci: warn when PCIe device is plugged into non-zero slot of downstream portAni Sinha
PCIe downstream ports only have a single device 0, so PCI Express devices can only be plugged into slot 0 on a PCIe port. Add a warning to let users know when the invalid configuration is used. We may enforce this more strongly later once we get more clarity on whether we are introducing a bad regression for users currently using the wrong configuration. The change has been tested to not break or alter behaviors of ARI capable devices by instantiating seven vfs on an emulated igb device (the maximum number of vfs the igb device supports). The vfs are instantiated correctly and are seen to have non-zero device/slot numbers in the conventional PCI BDF representation. CC: jusual@redhat.com CC: imammedo@redhat.com CC: mst@redhat.com CC: akihiko.odaki@daynix.com Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 Signed-off-by: Ani Sinha <anisinha@redhat.com> Reviewed-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20230705115925.5339-6-anisinha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
2023-07-10pcie: Release references of virtual functionsAkihiko Odaki
pci_new() automatically retains a reference to a virtual function when registering it so we need to release the reference when unregistering. Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230411090408.48366-1-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2023-07-10hw/pci/pci: Remove multifunction parameter from pci_new_multifunction()Bernhard Beschow
There is also pci_new() which creates non-multifunction PCI devices. Accordingly the parameter is always set to true when a multi function PCI device is to be created. The reason for the parameter's existence seems to be that it is used in the internal PCI code as well which is the only location where it gets set to false. This one usage can be resolved by factoring out an internal helper function. Remove this redundant, error-prone parameter. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20230304114043.121024-6-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10hw/pci/pci: Remove multifunction parameter from ↵Bernhard Beschow
pci_create_simple_multifunction() There is also pci_create_simple() which creates non-multifunction PCI devices. Accordingly the parameter is always set to true when a multi function PCI device is to be created. The reason for the parameter's existence seems to be that it is used in the internal PCI code as well which is the only location where it gets set to false. This one usage can be replaced by trivial code. Remove this redundant, error-prone parameter. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20230304114043.121024-5-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10hw/pci/pci_host: Introduce PCI_HOST_BYPASS_IOMMU macroBernhard Beschow
Introduce a macro to avoid copy and pasting strings which can easily cause typos. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230630073720.21297-5-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10pcie: Add a PCIe capability version helperAlex Williamson
Report the PCIe capability version for a device Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-23pci: ROM preallocation for incoming migrationVladimir Sementsov-Ogievskiy
On incoming migration we have the following sequence to load option ROM: 1. On device realize we do normal load ROM from the file 2. Than, on incoming migration we rewrite ROM from the incoming RAM block. If sizes mismatch we fail, like this: Size mismatch: 0000:00:03.0/virtio-net-pci.rom: 0x40000 != 0x80000: Invalid argument This is not ideal when we migrate to updated distribution: we have to keep old ROM files in new distribution and be careful around romfile property to load correct ROM file. Which is loaded actually just to allocate the ROM with correct length. Note, that romsize property doesn't really help: if we try to specify it when default romfile is larger, it fails with something like: romfile "efi-virtio.rom" (160768 bytes) is too large for ROM size 65536 Let's just ignore ROM file when romsize is specified and we are in incoming migration state. In other words, we need only to preallocate ROM of specified size, local ROM file is unrelated. This way: If romsize was specified on source, we just use same commandline as on source, and migration will work independently of local ROM files on target. If romsize was not specified on source (and we have mismatching local ROM file on target host), we have to specify romsize on target to match source romsize. romfile parameter may be kept same as on source or may be dropped, the file is not loaded anyway. As a bonus we avoid extra reading from ROM file on target. Note: when we don't have romsize parameter on source command line and need it for target, it may be calculated as aligned up to power of two size of ROM file on source (if we know, which file is it) or, alternatively it may be retrieved from source QEMU by QMP qom-get command, like { "execute": "qom-get", "arguments": { "path": "/machine/peripheral/CARD_ID/virtio-net-pci.rom[0]", "property": "size" } } Note: we have extra initialization of size variable to zero in pci_add_option_rom to avoid false-positive "error: ‘size’ may be used uninitialized" Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230522201740.88960-2-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-20meson: Replace softmmu_ss -> system_ssPhilippe Mathieu-Daudé
We use the user_ss[] array to hold the user emulation sources, and the softmmu_ss[] array to hold the system emulation ones. Hold the latter in the 'system_ss[]' array for parity with user emulation. Mechanical change doing: $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-09hw/pci/pci: Simplify pci_bar_address() using MACHINE_GET_CLASS() macroPhilippe Mathieu-Daudé
Remove unnecessary intermediate variables. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07hw/pci/pci.c: Don't leak PCIBus::irq_count[] in pci_bus_irqs()Bernhard Beschow
When calling pci_bus_irqs() multiple times on the same object without calling pci_bus_irqs_cleanup() in between PCIBus::irq_count[] is currently leaked. Let's fix this because Xen will do just that in a few commits, and because calling pci_bus_irqs_cleanup() in between seems fragile and cumbersome. Note that pci_bus_irqs_cleanup() now has to NULL irq_count such that pci_bus_irqs() doesn't do a double free. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Chuck Zmudzinski <brchuckz@aol.com> Message-Id: <20230403074124.3925-3-shentey@gmail.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-05-19hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0Leonardo Bras
Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK set for machine types < 8.0 will cause migration to fail if the target QEMU version is < 8.0.0 : qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0 qemu-system-x86_64: Failed to load PCIDevice:config qemu-system-x86_64: Failed to load e1000e:parent_obj qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e' qemu-system-x86_64: load of migration failed: Invalid argument The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0, with this cmdline: ./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX] In order to fix this, property x-pcie-err-unc-mask was introduced to control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by default, but is disabled if machine type <= 7.2. Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register") Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20230503002701.854329-1-leobras@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576 Tested-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19pci: pci_add_option_rom(): refactor: use g_autofree for path variableVladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20230515125229.44836-3-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2023-05-19pci: pci_add_option_rom(): improve styleVladimir Sementsov-Ogievskiy
Fix over-80 lines and missing curly brackets for if-operators, which are required by QEMU coding style. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20230515125229.44836-2-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
2023-05-16hw/pci-bridge: Fix release ordering by embedding PCIBridgeWindows within ↵Jonathan Cameron
PCIBridge The lifetime of the PCIBridgeWindows instance accessed via the windows pointer in struct PCIBridge is managed separately from the PCIBridge itself. Triggered by ./qemu-system-x86_64 -M x-remote -display none -monitor stdio QEMU monitor: device_add cxl-downstream In some error handling paths (such as the above due to attaching a cxl-downstream port anything other than a cxl-upstream port) the g_free() of the PCIBridge windows in pci_bridge_region_cleanup() is called before the final call of flatview_uref() in address_space_set_flatview() ultimately from drain_call_rcu() At one stage this resulted in a crash, currently can still be observed using valgrind which records a use after free. When present, only one instance is allocated. pci_bridge_update_mappings() can operate directly on an instance rather than creating a new one and swapping it in. Thus there appears to be no reason to not directly couple the lifetimes of the two structures by embedding the PCIBridgeWindows within the PCIBridge removing the need for the problematic separate free. Patch is same as was posted deep in the discussion. https://lore.kernel.org/qemu-devel/20230403171232.000020bb@huawei.com/ Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230421122550.28234-1-Jonathan.Cameron@huawei.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-04-21pci: avoid accessing slot_reserved_mask directly outside of pci.cChuck Zmudzinski
This patch provides accessor functions as replacements for direct access to slot_reserved_mask according to the comment at the top of include/hw/pci/pci_bus.h which advises that data structures for PCIBus should not be directly accessed but instead be accessed using accessor functions in pci.h. Three accessor functions can conveniently replace all direct accesses of slot_reserved_mask. With this patch, the new accessor functions are used in hw/sparc64/sun4u.c and hw/xen/xen_pt.c and pci_bus.h is removed from the included header files of the same two files. No functional change intended. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> Message-Id: <b1b7f134883cbc83e455abbe5ee225c71aa0e8d0.1678888385.git.brchuckz@aol.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [sun4u]
2023-03-11Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into stagingPeter Maydell
# -----BEGIN PGP SIGNATURE----- # Version: GnuPG v1 # # iQEcBAABAgAGBQJkCvgFAAoJEO8Ells5jWIRHiUH/jhydpJHIqnAPxHQAwGtmyhb # 9Z52UOzW5V6KxfZJ+bQ4RPFkS2UwcxmeadPHY4zvvJTVBLAgG3QVgP4igj8CXKCI # xRnwMgTNeu655kZQ5P/elTwdBTCJFODk7Egg/bH3H1ZiUhXBhVRhK7q/wMgtlZkZ # Kexo6txCK4d941RNzEh45ZaGhdELE+B+D7cRuQgBs/DXZtJpsyEzBbP8KYSMHuER # AXfWo0YIBYj7X3ek9D6j0pbOkB61vqtYd7W6xV4iDrJCcFBIOspJbbBb1tGCHola # AXo5/OhRmiQnp/c/HTbJIDbrj0sq/r7LxYK4zY1x7UPbewHS9R+wz+FfqSmoBF0= # =056y # -----END PGP SIGNATURE----- # gpg: Signature made Fri 10 Mar 2023 09:27:33 GMT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * tag 'net-pull-request' of https://github.com/jasowang/qemu: (44 commits) ebpf: fix compatibility with libbpf 1.0+ docs/system/devices/igb: Add igb documentation tests/avocado: Add igb test igb: Introduce qtest for igb device tests/qtest/libqos/e1000e: Export macreg functions tests/qtest/e1000e-test: Fabricate ethernet header Intrdocue igb device emulation e1000: Split header files pcie: Introduce pcie_sriov_num_vfs net/eth: Introduce EthL4HdrProto e1000e: Implement system clock net/eth: Report if headers are actually present e1000e: Count CRC in Tx statistics e1000: Count CRC in Tx statistics e1000e: Combine rx traces MAINTAINERS: Add e1000e test files MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer e1000e: Do not assert when MSI-X is disabled later hw/net/net_tx_pkt: Check the payload length hw/net/net_tx_pkt: Implement TCP segmentation ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-10pcie: Introduce pcie_sriov_num_vfsAkihiko Odaki
igb can use this function to change its behavior depending on the number of virtual functions currently enabled. Signed-off-by: Gal Hammer <gal.hammer@sap.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-07hw/pci: Add pcie_count_ds_port() and pcie_find_port_first() helpersJonathan Cameron
These two helpers enable host bridges to operate differently depending on the number of downstream ports, in particular if there is only a single port. Useful for CXL where HDM address decoders are allowed to be implicit in the host bridge if there is only a single root port. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230227153128.8164-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07hw/pci/aer: Make PCIE AER error injection facility available for other ↵Jonathan Cameron
emulation to use. This infrastructure will be reused for CXL RAS error injection in patches that follow. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230302133709.30373-8-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07hw/pci/aer: Add missing routing for AER errorsJonathan Cameron
PCIe r6.0 Figure 6-3 "Pseudo Logic Diagram for Selected Error Message Control and Status Bits" includes a right hand branch under "All PCI Express devices" that allows for messages to be generated or sent onwards without SERR# being set as long as the appropriate per error class bit in the PCIe Device Control Register is set. Implement that branch thus enabling routing of ERR_COR, ERR_NONFATAL and ERR_FATAL under OSes that set these bits appropriately (e.g. Linux) Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Message-Id: <20230302133709.30373-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07hw/pci/aer: Implement PCI_ERR_UNCOR_MASK registerJonathan Cameron
This register in AER should be both writeable and should have a default value with a couple of the errors masked including the Uncorrectable Internal Error used by CXL for it's error reporting. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Message-Id: <20230302133709.30373-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com>
2023-03-07pci: move acpi-index uniqueness check to generic PCI device codeIgor Mammedov
acpi-index is now working with non-hotpluggable buses (pci/q35 machine hostbridge), it can be used even if ACPI PCI hotplug is disabled and as result acpi-index uniqueness check will be omitted (since the check is done by ACPI PCI hotplug handler, which isn't wired when ACPI PCI hotplug is disabled). Move check and related code to generic PCIDevice so it would be independent of ACPI PCI hotplug. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230302161543.286002-30-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07pci: fix 'hotplugglable' property behaviorIgor Mammedov
Currently the property may flip its state during VM bring up or just doesn't work as the name implies. In particular with PCIE root port that has 'hotplug={on|off}' property, and when it's turned off, one would expect 'hotpluggable' == false for any devices attached to it. Which is not the case since qbus_is_hotpluggable() used by the property just checks for presence of any hotplug_handler set on bus. The problem is that name BusState::hotplug_handler from its inception is misnomer, as it handles not only hotplug but also in many cases coldplug as well (i.e. generic wiring interface), and it's fine to have hotplug_handler set on bus while it doesn't support hotplug (ex. pcie-slot with hotplug=off). Another case of root port flipping 'hotpluggable' state when ACPI PCI hotplug is enabled in this case root port with 'hotplug=off' starts as hotpluggable and then later on, pcihp hotplug_handler clears hotplug_handler explicitly after checking root port's 'hotplug' property. So root-port hotpluggablity check sort of works if pcihp is enabled but is broken if pcihp is disabled. One way to deal with the issue is to ask hotplug_handler if bus it controls is hotpluggable or not. To do that add is_hotpluggable_bus() hook to HotplugHandler interface and use it in 'hotpluggable' property + teach pcie-slot to actually look into 'hotplug' property state before deciding if bus is hotpluggable. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230302161543.286002-13-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-03Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Peter Maydell
into staging virtio,pc,pci: features, cleanups, fixes vhost-user support without ioeventfd word replacements in vhost user spec shpc improvements cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQBO8QPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpMUMH/3/FVp4qaF4CDwCHn7xWFRJpOREIhX/iWfUu # lGkwxnB7Lfyqdg7i4CAfgMf2emWKZchEE2DamfCo5bIX0IgRU3DWcOdR9ePvJ29J # cKwIYpxZcB4RYSoWL5OUakQLCT3JOu4XWaXeVjyHABjQhf3lGpwN4KmIOBGOy/N6 # 0YHOQScW2eW62wIOwhAEuYQceMt6KU32Uw3tLnMbJliiBf3a/hPctVNM9TFY9pcd # UYHGfBx/zD45owf1lTVEQFDg0eqPZKWW29g5haiOd5oAyXHHolzu+bt3bU7lH46b # f7iP12LqDudyrgoF5YWv3NJ4HaGm5V3kPqNqLLF/mjF7alxG+N8= # =hN3h # -----END PGP SIGNATURE----- # gpg: Signature made Fri 03 Mar 2023 00:13:56 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits) tests/data/acpi/virt: drop (most) duplicate files. hw/cxl/mailbox: Use new UUID network order define for cel_uuid qemu/uuid: Add UUID static initializer qemu/bswap: Add const_le64() tests: acpi: Update q35/DSDT.cxl for removed duplicate UID hw/i386/acpi: Drop duplicate _UID entry for CXL root bridge tests/acpi: Allow update of q35/DSDT.cxl hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL hw/pci-bridge/cxl_downstream: Fix type naming mismatch hw/mem/cxl_type3: Improve error handling in realize() MAINTAINERS: Add Fan Ni as Compute eXpress Link QEMU reviewer intel-iommu: send UNMAP notifications for domain or global inv desc smmu: switch to use memory_region_unmap_iommu_notifier_range() memory: introduce memory_region_unmap_iommu_notifier_range() intel-iommu: fail DEVIOTLB_UNMAP without dt mode intel-iommu: fail MAP notifier without caching mode memory: Optimize replay of guest mapping chardev/char-socket: set s->listener = NULL in char_socket_finalize hw/pci: Trace IRQ routing on PCI topology ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-02hw/pci: Trace IRQ routing on PCI topologyPhilippe Mathieu-Daudé
Trace how IRQ are rooted from EP to RC. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230211152239.88106-3-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: set power indicator to off on reset by defaultVladimir Sementsov-Ogievskiy
It should not be zero, the only valid values are ON, OFF and BLINK. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-13-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: introduce pcie_sltctl_powered_off() helperVladimir Sementsov-Ogievskiy
In pcie_cap_slot_write_config() we check for PCI_EXP_SLTCTL_PWR_OFF in a bad form. We should distinguish PCI_EXP_SLTCTL_PWR which is a "mask" and PCI_EXP_SLTCTL_PWR_OFF which is value for that mask. Better code is in pcie_cap_slot_unplug_request_cb() and in pcie_cap_update_power(). Let's use same pattern everywhere. To simplify things add also a helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-12-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: pcie_cap_slot_enable_power() use correct helperVladimir Sementsov-Ogievskiy
*_by_mask() helpers shouldn't be used here (and that's the only one). *_by_mask() helpers do shift their value argument, but in pcie.c code we use values that are already shifted appropriately. Happily, PCI_EXP_SLTCTL_PWR_ON is zero, so shift doesn't matter. But if we apply same helper for PCI_EXP_SLTCTL_PWR_OFF constant it will do wrong thing. So, let's use instead pci_word_test_and_clear_mask() which is already used in the file to clear PCI_EXP_SLTCTL_PWR_OFF bit in pcie_cap_slot_init() and pcie_cap_slot_reset(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-11-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie_regs: drop duplicated indicator value macrosVladimir Sementsov-Ogievskiy
We already have indicator values in include/standard-headers/linux/pci_regs.h , no reason to reinvent them in include/hw/pci/pcie_regs.h. (and we already have usage of PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in hw/pci/pcie.c, so let's be consistent) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-9-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: pcie_cap_slot_write_config(): use correct macroVladimir Sementsov-Ogievskiy
PCI_EXP_SLTCTL_PIC_OFF is a value, and PCI_EXP_SLTCTL_PIC is a mask. Happily PCI_EXP_SLTCTL_PIC_OFF is a maximum value for this mask and is equal to the mask itself. Still the code looks like a bug. Let's make it more reader-friendly. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-8-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: refactor shpc_device_plug_common()Vladimir Sementsov-Ogievskiy
Rename it to shpc_device_get_slot(), to mention what it does rather than how it is used. It also helps to reuse it in further commit. Also, add a return value and get rid of local_err. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-7-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: pass PCIDevice pointer to shpc_slot_command()Vladimir Sementsov-Ogievskiy
We'll need it in further patch to report bridge in QAPI event. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-6-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: more generic handle hot-unplug in shpc_slot_command()Vladimir Sementsov-Ogievskiy
Free slot if both conditions (power-led = OFF and state = DISABLED) becomes true regardless of the sequence. It is similar to how PCIe hotplug works. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-5-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: shpc_slot_command(): handle PWRONLY -> ENABLED transitionVladimir Sementsov-Ogievskiy
ENABLED -> PWRONLY transition is not allowed and we handle it by shpc_invalid_command(). But PWRONLY -> ENABLED transition is silently ignored, which seems wrong. Let's handle it as correct. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-4-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: change shpc_get_status() return type to uint8_tVladimir Sementsov-Ogievskiy
The result of the function is always one byte. The result is always assigned to uint8_t variable. Also, shpc_get_status() should be symmetric to shpc_set_status() which has uint8_t value argument. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-3-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: set attention led to OFF on resetVladimir Sementsov-Ogievskiy
0 is not a valid state for the led. Let's start with OFF. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-2-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-01hw/xen: Support MSI mapping to PIRQDavid Woodhouse
The way that Xen handles MSI PIRQs is kind of awful. There is a special MSI message which targets a PIRQ. The vector in the low bits of data must be zero. The low 8 bits of the PIRQ# are in the destination ID field, the extended destination ID field is unused, and instead the high bits of the PIRQ# are in the high 32 bits of the address. Using the high bits of the address means that we can't intercept and translate these messages in kvm_send_msi(), because they won't be caught by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range. So we catch them in pci_msi_trigger() instead, and deliver the event channel directly. That isn't even the worst part. The worst part is that Xen snoops on writes to devices' MSI vectors while they are *masked*. When a MSI message is written which looks like it targets a PIRQ, it remembers the device and vector for later. When the guest makes a hypercall to bind that PIRQ# (snooped from a marked MSI vector) to an event channel port, Xen *unmasks* that MSI vector on the device. Xen guests using PIRQ delivery of MSI don't ever actually unmask the MSI for themselves. Now that this is working we can finally enable XENFEAT_hvm_pirqs and let the guest use it all. Tested with passthrough igb and emulated e1000e + AHCI. CPU0 CPU1 0: 65 0 IO-APIC 2-edge timer 1: 0 14 xen-pirq 1-ioapic-edge i8042 4: 0 846 xen-pirq 4-ioapic-edge ttyS0 8: 1 0 xen-pirq 8-ioapic-edge rtc0 9: 0 0 xen-pirq 9-ioapic-level acpi 12: 257 0 xen-pirq 12-ioapic-edge i8042 24: 9600 0 xen-percpu -virq timer0 25: 2758 0 xen-percpu -ipi resched0 26: 0 0 xen-percpu -ipi callfunc0 27: 0 0 xen-percpu -virq debug0 28: 1526 0 xen-percpu -ipi callfuncsingle0 29: 0 0 xen-percpu -ipi spinlock0 30: 0 8608 xen-percpu -virq timer1 31: 0 874 xen-percpu -ipi resched1 32: 0 0 xen-percpu -ipi callfunc1 33: 0 0 xen-percpu -virq debug1 34: 0 1617 xen-percpu -ipi callfuncsingle1 35: 0 0 xen-percpu -ipi spinlock1 36: 8 0 xen-dyn -event xenbus 37: 0 6046 xen-pirq -msi ahci[0000:00:03.0] 38: 1 0 xen-pirq -msi-x ens4 39: 0 73 xen-pirq -msi-x ens4-rx-0 40: 14 0 xen-pirq -msi-x ens4-rx-1 41: 0 32 xen-pirq -msi-x ens4-tx-0 42: 47 0 xen-pirq -msi-x ens4-tx-1 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-02-27hw/pci: Fix a typoPhilippe Mathieu-Daudé
Fix 'interrutp' typo. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230211152239.88106-2-philmd@linaro.org>
2023-02-17net: Move the code to collect available NIC models to a separate functionThomas Huth
The code that collects the available NIC models is not really specific to PCI anymore and will be required in the next patch, too, so let's move this into a new separate function in net.c instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-01-28pci: make sure pci_bus_is_express() won't error out with "discards ↵Igor Mammedov
‘const’ qualifier" function doesn't need RW aceess to passed in bus pointer, make it const. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230112140312.3096331-31-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>