aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-23target/riscv: Simplify probing in vext_ldffRichard Henderson
The current pairing of tlb_vaddr_to_host with extra is either inefficient (user-only, with page_check_range) or incorrect (system, with probe_pages). For proper non-fault behaviour, use probe_access_flags with its nonfault parameter set to true. Reviewed-by: Max Chou <max.chou@sifive.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/s390x: Use set/clear_helper_retaddr in mem_helper.cRichard Henderson
Avoid a race condition with munmap in another thread. For access_memset and access_memmove, manage the value within the helper. For uses of access_{get,set}_byte, manage the value across the for loops. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/s390x: Use user_or_likely in access_memmoveRichard Henderson
Invert the conditional, indent the block, and use the macro that expands to true for user-only. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/s390x: Use user_or_likely in do_access_memsetRichard Henderson
Eliminate the ifdef by using a predicate that is always true with CONFIG_USER_ONLY. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/ppc: Improve helper_dcbz for user-onlyRichard Henderson
Mark the reserve_addr check unlikely. Use tlb_vaddr_to_host instead of probe_write, relying on the memset itself to test for page writability. Use set/clear_helper_retaddr so that we can properly unwind on segfault. With this, a trivial loop around guest memset will no longer spend nearly 25% of runtime within page_get_flags. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/ppc: Merge helper_{dcbz,dcbzep}Richard Henderson
Merge the two and pass the mmu_idx directly from translation. Swap the argument order in dcbz_common to avoid extra swaps. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/ppc: Split out helper_dbczl for 970Richard Henderson
We can determine at translation time whether the insn is or is not dbczl. We must retain a runtime check against the HID5 register, but we can move that to a separate function that never affects other ppc models. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/ppc: Hoist dcbz_size out of dcbz_commonRichard Henderson
The 970 logic does not apply to dcbzep, which is an e500 insn. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/ppc/mem_helper.c: Remove a conditional from dcbz_common()BALATON Zoltan
Instead of passing a bool and select a value within dcbz_common() let the callers pass in the right value to avoid this conditional statement. On PPC dcbz is often used to zero memory and some code uses it a lot. This change improves the run time of a test case that copies memory with a dcbz call in every iteration from 6.23 to 5.83 seconds. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-Id: <20240622204833.5F7C74E6000@zero.eik.bme.hu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2024-07-23target/arm: Use set/clear_helper_retaddr in SVE and SME helpersRichard Henderson
Avoid a race condition with munmap in another thread. Use around blocks that exclusively use "host_fn". Keep the blocks as small as possible, but without setting and clearing for every operation on one page. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23target/arm: Use set/clear_helper_retaddr in helper-a64.cRichard Henderson
Use these in helper_dc_dva and the FEAT_MOPS routines to avoid a race condition with munmap in another thread. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.hRichard Henderson
Use of these in helpers goes hand-in-hand with tlb_vaddr_to_host and other probing functions. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-22hw/nvme: Add SPDM over DOE supportWilfred Mallawa
Setup Data Object Exchange (DOE) as an extended capability for the NVME controller and connect SPDM to it (CMA) to it. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20240703092027.644758-4-alistair.francis@wdc.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22backends: Initial support for SPDM socket supportHuai-Cheng Kuo
SPDM enables authentication, attestation and key exchange to assist in providing infrastructure security enablement. It's a standard published by the DMTF [1]. SPDM supports multiple transports, including PCIe DOE and MCTP. This patch adds support to QEMU to connect to an external SPDM instance. SPDM support can be added to any QEMU device by exposing a TCP socket to a SPDM server. The server can then implement the SPDM decoding/encoding support, generally using libspdm [2]. This is similar to how the current TPM implementation works and means that the heavy lifting of setting up certificate chains, capabilities, measurements and complex crypto can be done outside QEMU by a well supported and tested library. 1: https://www.dmtf.org/standards/SPDM 2: https://github.com/DMTF/libspdm Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw> Signed-off-by: Chris Browy <cbrowy@avery-design.com> Co-developed-by: Jonathan Cameron <Jonathan.cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [ Changes by WM - Bug fixes from testing ] Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> [ Changes by AF: - Convert to be more QEMU-ified - Move to backends as it isn't PCIe specific ] Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20240703092027.644758-3-alistair.francis@wdc.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/pci: Add all Data Object Types defined in PCIe r6.0Alistair Francis
Add all of the defined protocols/features from the PCIe-SIG r6.0 "Table 6-32 PCI-SIG defined Data Object Types (Vendor ID = 0001h)" table. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Message-Id: <20240703092027.644758-2-alistair.francis@wdc.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/acpi: Add expected ACPI AML files for RISC-VSunil V L
As per the step 5 in the process documented in bios-tables-test.c, generate the expected ACPI AML data files for RISC-V using the rebuild-expected-aml.sh script and update the bios-tables-test-allowed-diff.h. These are all new files being added for the first time. Hence, iASL diff output is not added. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-10-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/qtest/bios-tables-test.c: Enable basic testing for RISC-VSunil V L
Add basic ACPI table test case for RISC-V. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-9-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/acpi: Add empty ACPI data files for RISC-VSunil V L
As per process documented (steps 1-3) in bios-tables-test.c, add empty AML data files for RISC-V ACPI tables and add the entries in bios-tables-test-allowed-diff.h. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-8-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/qtest/bios-tables-test.c: Remove the fall back pathSunil V L
The expected ACPI AML files are moved now under ${arch}/{machine} path. Hence, there is no need to search in old path which didn't have ${arch}. Remove the code which searches for the expected AML files under old path as well. Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-7-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/acpi: update expected DSDT blob for aarch64 and microvmSunil V L
After PCI link devices are moved out of the scope of PCI root complex, the DSDT files of machines which use GPEX, will change. So, update the expected AML files with these changes for these machines. Mainly, there are 2 changes. 1) Since the link devices are created now directly under _SB for all PCI root bridges in the system, they should have unique names. So, instead of GSIx, named those devices as LXXY where L means link, XX will have PCI bus number and Y will have the INTx number (ex: L000 or L001). The _PRT entries will also be updated to reflect this name change. 2) PCI link devices are moved from the scope of each PCI root bridge to directly under _SB. Below is the sample iASL difference for one such link device. Scope (\_SB) { Name (_HID, "LNRO0005") // _HID: Hardware ID Name (_UID, 0x1F) // _UID: Unique ID Name (_CCA, One) // _CCA: Cache Coherency Attribute Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings { Memory32Fixed (ReadWrite, 0x0A003E00, // Address Base 0x00000200, // Address Length ) Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, ) { 0x0000004F, } }) + Device (L000) + { + Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */) + Name (_UID, Zero) // _UID: Unique ID + Name (_PRS, ResourceTemplate () + { + Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, ) + { + 0x00000023, + } + }) + Name (_CRS, ResourceTemplate () + { + Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, ) + { + 0x00000023, + } + }) + Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings + { + } + } + Device (PCI0) { Name (_HID, "PNP0A08" /* PCI Express Bus */) // _HID: Hardware ID Name (_CID, "PNP0A03" /* PCI Bus */) // _CID: Compatible ID Name (_SEG, Zero) // _SEG: PCI Segment Name (_BBN, Zero) // _BBN: BIOS Bus Number Name (_UID, Zero) // _UID: Unique ID Name (_STR, Unicode ("PCIe 0 Device")) // _STR: Description String Name (_CCA, One) // _CCA: Cache Coherency Attribute Name (_PRT, Package (0x80) // _PRT: PCI Routing Table { Package (0x04) { 0xFFFF, Zero, - GSI0, + L000, Zero }, ..... }) Device (GSI0) { Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */) Name (_UID, Zero) // _UID: Unique ID Name (_PRS, ResourceTemplate () { Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, ) { 0x00000023, } }) Name (_CRS, ResourceTemplate () { Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, ) { 0x00000023, } }) Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings { } } } } Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Message-Id: <20240716144306.2432257-6-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22acpi/gpex: Create PCI link devices outside PCI root bridgeSunil V L
Currently, PCI link devices (PNP0C0F) are always created within the scope of the PCI root bridge. However, RISC-V needs these link devices to be created outside to ensure the probing order in the OS. This matches the example given in the ACPI specification [1] as well. Hence, create these link devices directly under _SB instead of under the PCI root bridge. To keep these link device names unique for multiple PCI bridges, change the device name from GSIx to LXXY format where XX is the PCI bus number and Y is the INTx. GPEX is currently used by riscv, aarch64/virt and x86/microvm machines. So, this change will alter the DSDT for those systems. [1] - ACPI 5.1: 6.2.13.1 Example: Using _PRT to Describe PCI IRQ Routing Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-5-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22tests/acpi: Allow DSDT acpi table changes for aarch64Sunil V L
so that CI tests don't fail when those ACPI tables are updated in the next patch. This is as per the documentation in bios-tables-tests.c. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-4-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UARTSunil V L
The requirement ACPI_060 in the RISC-V BRS specification [1], requires NS16550 compatible UART to have the HID RSCV0003. So, update the HID for the UART. [1] - https://github.com/riscv-non-isa/riscv-brs/releases/download/v0.0.2/riscv-brs-spec.pdf (Chapter 6) Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-3-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLICSunil V L
As per the requirement ACPI_080 in the RISC-V Boot and Runtime Services (BRS) specification [1], PLIC and APLIC should be in namespace as well. So, add them using the defined HID. [1] - https://github.com/riscv-non-isa/riscv-brs/releases/download/v0.0.2/riscv-brs-spec.pdf (Chapter 6) Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716144306.2432257-2-sunilvl@ventanamicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domainEric Auger
Add a trace point on virtio_iommu_detach_endpoint_from_domain(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-7-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/vfio/common: Add vfio_listener_region_del_iommu trace eventEric Auger
Trace when VFIO gets notified about the deletion of an IOMMU MR. Also trace the name of the region in the add_iommu trace message. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-6-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Remove the end point on detachEric Auger
We currently miss the removal of the endpoint in case of detach. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-5-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Free [host_]resv_ranges on unset_iommu_devicesEric Auger
We are currently missing the deallocation of the [host_]resv_regions in case of hot unplug. Also to make things more simple let's rule out the case where multiple HostIOMMUDevices would be aliased and attached to the same IOMMUDevice. This allows to remove the handling of conflicting Host reserved regions. Anyway this is not properly supported at guest kernel level. On hotunplug the reserved regions are reset to the ones set by virtio-iommu property. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-4-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Remove probe_doneEric Auger
Now we have switched to PCIIOMMUOps to convey host IOMMU information, the host reserved regions are transmitted when the PCIe topology is built. This happens way before the virtio-iommu driver calls the probe request. So let's remove the probe_done flag that allowed to check the probe was not done before the IOMMU MR got enabled. Besides this probe_done flag had a flaw wrt migration since it was not saved/restored. The only case at risk is if 2 devices were plugged to a PCIe to PCI bridge and thus aliased. First of all we discovered in the past this case was not properly supported for neither SMMU nor virtio-iommu on guest kernel side: see [RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/ If this were supported by the guest kernel, it is unclear what the call sequence would be from a virtio-iommu driver point of view. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-3-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"Eric Auger
This reverts commit 1b889d6e39c32d709f1114699a014b381bcf1cb1. There are different problems with that tentative fix: - Some resources are left dangling (resv_regions, host_resv_ranges) and memory subregions are left attached to the root MR although freed as embedded in the sdev IOMMUDevice. Finally the sdev->as is not destroyed and associated listeners are left. - Even when fixing the above we observe a memory corruption associated with the deallocation of the IOMMUDevice. This can be observed when a VFIO device is hotplugged, hot-unplugged and a system reset is issued. At this stage we have not been able to identify the root cause (IOMMU MR or as structs beeing overwritten and used later on?). - Another issue is HostIOMMUDevice are indexed by non aliased BDF whereas the IOMMUDevice is indexed by aliased BDF - yes the current naming is really misleading -. Given the state of the code I don't think the virtio-iommu device works in non singleton group case though. So let's revert the patch for now. This means the IOMMU MR/as survive the hotunplug. This is what is done in the intel_iommu for instance. It does not sound very logical to keep those but currently there is no symetric function to pci_device_iommu_address_space(). probe_done issue will be handled in a subsequent patch. Also resv_regions and host_resv_regions will be deallocated separately. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-2-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22gdbstub: Add helper function to unregister GDB register spaceSalil Mehta
Add common function to help unregister the GDB register space. This shall be done in context to the CPU unrealization. Note: These are common functions exported to arch specific code. For example, for ARM this code is being referred in associated arch specific patch-set: Link: https://lore.kernel.org/qemu-devel/20230926103654.34424-1-salil.mehta@huawei.com/ Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-8-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22physmem: Add helper function to destroy CPU AddressSpaceSalil Mehta
Virtual CPU Hot-unplug leads to unrealization of a CPU object. This also involves destruction of the CPU AddressSpace. Add common function to help destroy the CPU AddressSpace. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-7-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/acpi: Update CPUs AML with cpu-(ctrl)dev changeSalil Mehta
CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is IO port based and existing CPUs AML code assumes _CRS objects would evaluate to a system resource which describes IO Port address. But on ARM arch CPUs control device(\\_SB.PRES) register interface is memory-mapped hence _CRS object should evaluate to system resource which describes memory-mapped base address. Update build CPUs AML function to accept both IO/MEMORY region spaces and accordingly update the _CRS object. Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-6-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/acpi: Update GED _EVT method AML with CPU scanSalil Mehta
OSPM evaluates _EVT method to map the event. The CPU hotplug event eventually results in start of the CPU scan. Scan figures out the CPU and the kind of event(plug/unplug) and notifies it back to the guest. Update the GED AML _EVT method with the call to method \\_SB.CPUS.CSCN (via \\_SB.GED.CSCN) Architecture specific code [1] might initialize its CPUs AML code by calling common function build_cpus_aml() like below for ARM: build_cpus_aml(scope, ms, opts, xx_madt_cpu_entry, memmap[VIRT_CPUHP_ACPI].base, "\\_SB", "\\_SB.GED.CSCN", AML_SYSTEM_MEMORY); [1] https://lore.kernel.org/qemu-devel/20240613233639.202896-13-salil.mehta@huawei.com/ Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-5-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/acpi: Update ACPI GED framework to support vCPU HotplugSalil Mehta
ACPI GED (as described in the ACPI 6.4 spec) uses an interrupt listed in the _CRS object of GED to intimate OSPM about an event. Later then demultiplexes the notified event by evaluating ACPI _EVT method to know the type of event. Use ACPI GED to also notify the guest kernel about any CPU hot(un)plug events. Note, GED interface is used by many hotplug events like memory hotplug, NVDIMM hotplug and non-hotplug events like system power down event. Each of these can be selected using a bit in the 32 bit GED IO interface. A bit has been reserved for the CPU hotplug event. ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG support has been enabled for particular architecture. Add cpu_hotplug_hw_init() stub to avoid compilation break. Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20240716111502.202344-4-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com>
2024-07-22hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header fileSalil Mehta
CPU ctrl-dev MMIO region length could be used in ACPI GED and various other architecture specific places. Move ACPI_CPU_HOTPLUG_REG_LEN macro to more appropriate common header file. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-3-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22accel/kvm: Extract common KVM vCPU {creation,parking} codeSalil Mehta
KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread is spawned. This is common to all the architectures as of now. Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't support vCPU removal. Therefore, its representative KVM vCPU object/context in Qemu is parked. Refactor architecture common logic so that some APIs could be reused by vCPU Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally called. No functional change is intended here. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Xianglai Li <lixianglai@loongson.cn> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240716111502.202344-2-salil.mehta@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22smbios: make memory device size configurable per MachineIgor Mammedov
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of virtual DIMMs (capped at 16Gb max) using type 17 structure entries. Which is fine for the most cases. However when starting guest with terabytes of RAM this leads to too many memory device structures, which eventually upsets linux kernel as it reserves only 64K for these entries and when that border is crossed out it runs out of reserved memory. Instead of partitioning initial RAM on 16Gb DIMMs, use maximum possible chunk size that SMBIOS spec allows[2]. Which lets encode RAM in lower 31 bits of 32bit field (which amounts upto 2047Tb per DIMM). As result initial RAM will generate only one type 17 structure until host/guest reach ability to use more RAM in the future. Compat changes: We can't unconditionally change chunk size as it will break QEMU<->guest ABI (and migration). Thus introduce a new machine class field that would let older versioned machines to use legacy 16Gb chunks, while new(er) machine type[s] use maximum possible chunk size. PS: While it might seem to be risky to rise max entry size this large (much beyond of what current physical RAM modules support), I'd not expect it causing much issues, modulo uncovering bugs in software running within guest. And those should be fixed on guest side to handle SMBIOS spec properly, especially if guest is expected to support so huge RAM configs. In worst case, QEMU can reduce chunk size later if we would care enough about introducing a workaround for some 'unfixable' guest OS, either by fixing up the next machine type or giving users a CLI option to customize it. 1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/ implicitly defined by machine. It doesn't include memory configured with help of '-device' option[s] (pcdimm,nvdimm,...) 2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size PS: * tested on 8Tb host with RHEL6 guest, which seems to parse type 17 SMBIOS table entries correctly (according to 'dmidecode'). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240715122417.4059293-1-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22docs: Document composable SR-IOV deviceAkihiko Odaki
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-8-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-net: Implement SR-IOV VFAkihiko Odaki
A virtio-net device can be added as a SR-IOV VF to another virtio-pci device that will be the PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-7-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-pci: Implement SR-IOV PFAkihiko Odaki
Allow user to attach SR-IOV VF to a virtio-pci PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-6-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22pcie_sriov: Allow user to create SR-IOV deviceAkihiko Odaki
A user can create a SR-IOV device by specifying the PF with the sriov-pf property of the VFs. The VFs must be added before the PF. A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable set. Such a VF cannot refer to the PF because it is created before the PF. A PF that user-creatable VFs can be attached calls pcie_sriov_pf_init_from_user_created_vfs() during realization and pcie_sriov_pf_exit() when exiting. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-5-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22pcie_sriov: Check PCI Express for SR-IOV PFAkihiko Odaki
SR-IOV requires PCI Express. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-4-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22pcie_sriov: Ensure PF and VF are mutually exclusiveAkihiko Odaki
A device cannot be a SR-IOV PF and a VF at the same time. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-3-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22hw/pci: Fix SR-IOV VF number calculationAkihiko Odaki
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs to be non-zero when there are multiple VFs, but the specification does not prohibit to make it zero when there is only one VF. Do not perform the division for the first VF to avoid division by zero. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-2-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-23Merge tag 'pull-request-2024-07-22' of https://gitlab.com/thuth/qemu into ↵Richard Henderson
staging * Minor clean-ups and fixes for the qtests and Avocado tests * Fix crash that happens when introspecting scsi-block on older machine types * s390x: filter deprecated properties based on model expansion type # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmaeSUMRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbVdQw/8DvGymXKwpS0F2aSHg3AZvjSCpkv3Y+fK # myQrzh30cv9Vhe/Y9do47HpfJ6Ug9SK6xG64K2o+BIW+G3+ZUwSHk24PoiALsrJf # 9qqya1upBJkEC5B4PhqRPS3GlbvBnKKEk8W6BMpUa2BToFV9MsG256cBVhUrRpGc # 6u80DgTNxCI1czsNkWVGJAt1oVLYYJIjz7UZ4VbZCH48o6r0iSUV6C01wccOFmNy # IXbspyyUftWFh9lO0i8PiYlXG2YEAmFry3gqD5vc+6BsFT4lMeoRFFxbVCddGKFc # iNwlH4ayjeISlEJeClImIdbHyZ+sDhPyy5x4cpQqmZudEPn+GVnZ0arm7OvXW/k8 # Yog4n7/cUz7GHnWbqYIFZMS1g1wmqm/9VPsVTzXAlTva4dTTs2p0tKAADHIAtPCI # jxSPpbuCuukDzUZGsNZyRGbex6g4B0tP4TMHRFxo5LVy9dKn2BLOHBWuzPevD9OO # FphZHUuGngcPi4GSFmlv7aCS0pqyWsCO+5EqoYUgO8yadyfiXN9pwjB6OnBZux0U # kbJOkkBJwEalhsiHmPFMnS8rkWa4Ye4ZJjj8XHRiecxSZOcNOcxyE+l2x8CV2aFB # UBR83nm86vXXpu86Yod3E+txDEUzKN5+B8X0q7Se0YvsWbB+1Dq/Co0Bdh/Wp70E # EPk5eqaSp8k= # =zB5F # -----END PGP SIGNATURE----- # gpg: Signature made Mon 22 Jul 2024 09:57:55 PM AEST # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] * tag 'pull-request-2024-07-22' of https://gitlab.com/thuth/qemu: target/s390x: filter deprecated properties based on model expansion type tests: increase timeout per instance of bios-tables-test qtest/fuzz: make range overlap check more readable hw: Fix crash that happens when introspecting scsi-block on older machine types tests/avocado/machine_aspeed.py: Increase timeout for TPM test tests/avocado: Remove the remainders of the virtiofs_submounts test tests/avocado/mem-addr-space-check: Remove unused "import signal" tests/avocado: Move LinuxTest related code into a separate file tests/avocado: Allow overwriting AVOCADO_SHOW env variable tests/avocado/boot_xen.py: use class attribute tests/avocado/boot_xen.py: unify tags tests/avocado/boot_xen.py: merge base classes Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-22chardev/char-win-stdio.c: restore old console modesongziming
If I use `-serial stdio` on Windows, after QEMU exits, the terminal could not handle arrow keys and tab any more. Because stdio backend on Windows sets console mode to virtual terminal input when starts, but does not restore the old mode when finalize. This small patch saves the old console mode and set it back. Signed-off-by: Ziming Song <s.ziming@hotmail.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-ID: <ME3P282MB25488BE7C39BF0C35CD0DA5D8CA82@ME3P282MB2548.AUSP282.PROD.OUTLOOK.COM>
2024-07-22hpet: avoid timer storms on periodic timersPaolo Bonzini
If the period is set to a value that is too low, there could be no time left to run the rest of QEMU. Do not trigger interrupts faster than 1 MHz. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22hpet: store full 64-bit target value of the counterPaolo Bonzini
Store the full 64-bit value at which the timer should fire. This makes it possible to skip the imprecise hpet_calculate_diff() step, and to remove the clamping of the period to 31 or 63 bits. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22hpet: accept 64-bit reads and writesPaolo Bonzini
Declare the MemoryRegionOps so that 64-bit reads and writes to the HPET are received directly. This makes it possible to unify the code to process low and high parts: for 32-bit reads, extract the desired word; for 32-bit writes, just merge the desired part into the old value and proceed as with a 64-bit write. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>