aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2023-03-02hw/misc: add a toy i2c echo deviceKlaus Jensen
Add an example I2C device to demonstrate how a slave may master the bus and send data asynchronously to another slave. The device will echo whatever it is sent to the device identified by the first byte received. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> [ clg: integrated fixes : https://lore.kernel.org/qemu-devel/Y3yMKAhOkYGtnkOp@cormorant.local/ ] Message-Id: <20220601210831.67259-7-its@irrelevant.dk> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-02hw/i2c: only schedule pending master when bus is idleKlaus Jensen
It is not given that the current master will release the bus after a transfer ends. Only schedule a pending master if the bus is idle. Fixes: 37fa5ca42623 ("hw/i2c: support multiple masters") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-Id: <20221116084312.35808-2-its@irrelevant.dk> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-02pcie: introduce pcie_sltctl_powered_off() helperVladimir Sementsov-Ogievskiy
In pcie_cap_slot_write_config() we check for PCI_EXP_SLTCTL_PWR_OFF in a bad form. We should distinguish PCI_EXP_SLTCTL_PWR which is a "mask" and PCI_EXP_SLTCTL_PWR_OFF which is value for that mask. Better code is in pcie_cap_slot_unplug_request_cb() and in pcie_cap_update_power(). Let's use same pattern everywhere. To simplify things add also a helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-12-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: pcie_cap_slot_enable_power() use correct helperVladimir Sementsov-Ogievskiy
*_by_mask() helpers shouldn't be used here (and that's the only one). *_by_mask() helpers do shift their value argument, but in pcie.c code we use values that are already shifted appropriately. Happily, PCI_EXP_SLTCTL_PWR_ON is zero, so shift doesn't matter. But if we apply same helper for PCI_EXP_SLTCTL_PWR_OFF constant it will do wrong thing. So, let's use instead pci_word_test_and_clear_mask() which is already used in the file to clear PCI_EXP_SLTCTL_PWR_OFF bit in pcie_cap_slot_init() and pcie_cap_slot_reset(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-11-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie_regs: drop duplicated indicator value macrosVladimir Sementsov-Ogievskiy
We already have indicator values in include/standard-headers/linux/pci_regs.h , no reason to reinvent them in include/hw/pci/pcie_regs.h. (and we already have usage of PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in hw/pci/pcie.c, so let's be consistent) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-9-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pcie: pcie_cap_slot_write_config(): use correct macroVladimir Sementsov-Ogievskiy
PCI_EXP_SLTCTL_PIC_OFF is a value, and PCI_EXP_SLTCTL_PIC is a mask. Happily PCI_EXP_SLTCTL_PIC_OFF is a maximum value for this mask and is equal to the mask itself. Still the code looks like a bug. Let's make it more reader-friendly. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-8-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: refactor shpc_device_plug_common()Vladimir Sementsov-Ogievskiy
Rename it to shpc_device_get_slot(), to mention what it does rather than how it is used. It also helps to reuse it in further commit. Also, add a return value and get rid of local_err. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-7-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: pass PCIDevice pointer to shpc_slot_command()Vladimir Sementsov-Ogievskiy
We'll need it in further patch to report bridge in QAPI event. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-6-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: more generic handle hot-unplug in shpc_slot_command()Vladimir Sementsov-Ogievskiy
Free slot if both conditions (power-led = OFF and state = DISABLED) becomes true regardless of the sequence. It is similar to how PCIe hotplug works. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-5-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: shpc_slot_command(): handle PWRONLY -> ENABLED transitionVladimir Sementsov-Ogievskiy
ENABLED -> PWRONLY transition is not allowed and we handle it by shpc_invalid_command(). But PWRONLY -> ENABLED transition is silently ignored, which seems wrong. Let's handle it as correct. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-4-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: change shpc_get_status() return type to uint8_tVladimir Sementsov-Ogievskiy
The result of the function is always one byte. The result is always assigned to uint8_t variable. Also, shpc_get_status() should be symmetric to shpc_set_status() which has uint8_t value argument. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-3-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02pci/shpc: set attention led to OFF on resetVladimir Sementsov-Ogievskiy
0 is not a valid state for the led. Let's start with OFF. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-2-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02vdpa: stop all svq on device deletionEugenio Pérez
Not stopping them leave the device in a bad state when virtio-net fronted device is unplugged with device_del monitor command. This is not triggable in regular poweroff or qemu forces shutdown because cleanup is called right after vhost_vdpa_dev_start(false). But devices hot unplug does not call vdpa device cleanups. This lead to all the vhost_vdpa devices without stop the SVQ but the last. Fix it and clean the code, making it symmetric with vhost_vdpa_svqs_start. Fixes: dff4426fa656 ("vhost: Add Shadow VirtQueue kick forwarding capabilities") Reported-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230209170004.899472-1-eperezma@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2023-03-02vhost-user: Adopt new backend namingMaxime Coquelin
The Vhost-user specification changed feature and request naming from _SLAVE_ to _BACKEND_. This patch adopts the new naming convention. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20230208203259.381326-4-maxime.coquelin@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02hw/timer/hpet: Fix expiration time overflowAkihiko Odaki
The expiration time provided for timer_mod() can overflow if a ridiculously large value is set to the comparator register. The resulting value can represent a past time after rounded, forcing the timer to fire immediately. If the timer is configured as periodic, it will rearm the timer again, and form an endless loop. Check if the expiration value will overflow, and if it will, stop the timer instead of rearming the timer with the overflowed time. This bug was found by Alexander Bulekov when fuzzing igb, a new network device emulation: https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/ The fixed test case is: fuzz/crash_2d7036941dcda1ad4380bb8a9174ed0c949bcefd Fixes: 16b29ae180 ("Add HPET emulation to qemu (Beth Kon)") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230131030037.18856-1-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02virtio-rng-pci: fix transitional migration compat for vectorsDr. David Alan Gilbert
In bad9c5a516 ("virtio-rng-pci: fix migration compat for vectors") I fixed the virtio-rng-pci migration compatibility, but it was discovered that we also need to fix the other aliases of the device for the transitional cases. Fixes: 9ea02e8f1 ('virtio-rng-pci: Allow setting nvectors, so we can use MSI-X') bz: https://bugzilla.redhat.com/show_bug.cgi?id=2162569 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20230207174944.138255-1-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02vhost-user-rng: Back up vqs before cleaning up vhost_devAkihiko Odaki
vhost_dev_cleanup() clears vhost_dev so back up its vqs member to free the memory pointed by the member. Fixes: 821d28b88f ("vhost-user-rng: Add vhost-user-rng implementation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230130140516.78078-1-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02vhost-user-i2c: Back up vqs before cleaning up vhost_devAkihiko Odaki
vhost_dev_cleanup() clears vhost_dev so back up its vqs member to free the memory pointed by the member. Fixes: 7221d3b634 ("hw/virtio: add boilerplate for vhost-user-i2c device") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230130140435.78049-1-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02vhost-user-gpio: Configure vhost_dev when connectingAkihiko Odaki
vhost_dev_cleanup(), called from vu_gpio_disconnect(), clears vhost_dev so vhost-user-gpio must set the members of vhost_dev each time connecting. do_vhost_user_cleanup() should also acquire the pointer to vqs directly from VHostUserGPIO instead of referring to vhost_dev as it can be called after vhost_dev_cleanup(). Fixes: 27ba7b027f ("hw/virtio: add boilerplate for vhost-user-gpio device") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230130140320.77999-1-akihiko.odaki@daynix.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02virtio-net: clear guest_announce feature if no cvq backendEugenio Pérez
Since GUEST_ANNOUNCE is emulated the feature bit could be set without backend support. This happens in the vDPA case. However, backend vDPA parent may not have CVQ support. This causes an incoherent feature set, and the driver may refuse to start. This happens in virtio-net Linux driver. This may be solved differently in the future. Qemu is able to emulate a CVQ just for guest_announce purposes, helping guest to notify the new location with vDPA devices that does not support it. However, this is left as a TODO as it is way more complex to backport. Tested with vdpa_net_sim, toggling manually VIRTIO_NET_F_CTRL_VQ in the driver and migrating it with x-svq=on. Fixes: 980003debddd ("vdpa: do not handle VIRTIO_NET_F_GUEST_ANNOUNCE in vhost-vdpa") Reported-by: Dawar, Gautam <gautam.dawar@amd.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230124161159.2182117-1-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Gautam Dawar <gautam.dawar@amd.com> Tested-by: Gautam Dawar <gautam.dawar@amd.com> Tested-by: Lei Yang <leiyang@redhat.com>
2023-03-02Revert "hw/i386: pass RNG seed via setup_data entry"Michael S. Tsirkin
This reverts commit 67f7e426e53833a5db75b0d813e8d537b8a75bd2. Additionally to the automatic revert, I went over the code and dropped all mentions of legacy_no_rng_seed manually, effectively reverting a combination of 2 additional commits: commit ffe2d2382e5f1aae1abc4081af407905ef380311 Author: Jason A. Donenfeld <Jason@zx2c4.com> Date: Wed Sep 21 11:31:34 2022 +0200 x86: re-enable rng seeding via SetupData commit 3824e25db1a84fadc50b88dfbe27047aa2f7f85d Author: Gerd Hoffmann <kraxel@redhat.com> Date: Wed Aug 17 10:39:40 2022 +0200 x86: disable rng seeding via setup_data Fixes: 67f7e426e5 ("hw/i386: pass RNG seed via setup_data entry") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: return modified setup_data only if read as memory, not as file"Michael S. Tsirkin
This reverts commit e935b735085dfa61d8e6d276b6f9e7687796a3c7. Fixes: e935b73508 ("x86: return modified setup_data only if read as memory, not as file") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: use typedef for SetupData struct"Michael S. Tsirkin
This reverts commit eebb38a5633a77f5fa79d6486d5b2fcf8fbe3c07. Fixes: eebb38a563 ("x86: use typedef for SetupData struct") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: reinitialize RNG seed on system reboot"Michael S. Tsirkin
This reverts commit 763a2828bf313ed55878b09759dc435355035f2e. Fixes: 763a2828bf ("x86: reinitialize RNG seed on system reboot") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: re-initialize RNG seed when selecting kernel"Michael S. Tsirkin
This reverts commit cc63374a5a7c240b7d3be734ef589dabbefc7527. Fixes: cc63374a5a ("x86: re-initialize RNG seed when selecting kernel") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: do not re-randomize RNG seed on snapshot load"Michael S. Tsirkin
This reverts commit 14b29fea742034186403914b4d013d0e83f19e78. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 14b29fea74 ("x86: do not re-randomize RNG seed on snapshot load") Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02Revert "x86: don't let decompressed kernel image clobber setup_data"Michael S. Tsirkin
This reverts commit eac7a7791bb6d719233deed750034042318ffd56. Fixes: eac7a7791b ("x86: don't let decompressed kernel image clobber setup_data") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-03-02hw/smbios: fix field corruption in type 4 tableJulia Suvorova
Since table type 4 of SMBIOS version 2.6 is shorter than 3.0, the strings which follow immediately after the struct fields have been overwritten by unconditional filling of later fields such as core_count2. Make these fields dependent on the SMBIOS version. Fixes: 05e27d74c7 ("hw/smbios: add core_count2 to smbios table type 4") Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2169904 Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20230223125747.254914-1-jusual@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <ani@anisinha.ca> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-01hw/riscv: Move the dtb load bits outside of create_fdt()Bin Meng
Move the dtb load bits outside of create_fdt(), and put it explicitly in sifive_u_machine_init() and virt_machine_init(). With such change create_fdt() does exactly what its function name tells us. Suggested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Signed-off-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Message-ID: <20230228074522.1845007-2-bmeng@tinylab.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-01hw/riscv: Skip re-generating DT nodes for a given DTBBin Meng
Launch qemu-system-riscv64 with a given dtb for 'sifive_u' and 'virt' machines, QEMU complains: qemu_fdt_add_subnode: Failed to create subnode /soc: FDT_ERR_EXISTS The whole DT generation logic should be skipped when a given DTB is present. Fixes: b1f19f238cae ("hw/riscv: write bootargs 'chosen' FDT after riscv_load_kernel()") Signed-off-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Message-ID: <20230228074522.1845007-1-bmeng@tinylab.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-01hw/riscv/virt.c: do not use RISCV_FEATURE_MMU in create_fdt_socket_cpus()Daniel Henrique Barboza
Read cpu_ptr->cfg.mmu directly. As a bonus, use cpu_ptr in riscv_isa_string(). Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Reviewed-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Message-ID: <20230222185205.355361-9-dbarboza@ventanamicro.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-01Merge branch 'xenfv-kvm-15' of git://git.infradead.org/users/dwmw2/qemu into ↵Paolo Bonzini
HEAD This adds support for emulating Xen under Linux/KVM, based on kernel patches which have been present since Linux v5.12. As with the kernel support, it's derived from work started by João Martins of Oracle in 2018. This series just adds the basic platform support — CPUID, hypercalls, event channels, a stub of XenStore. A full single-tenant internal implementation of XenStore, and patches to make QEMU's Xen PV drivers work with this Xen emulation, are waiting in the wings to be submitted in a follow-on patch series. As noted in the documentation, it's enabled by setting the xen-version property on the KVM accelerator, e.g.: qemu-system-x86_64 -serial mon:stdio -M q35 -display none -m 1G -smp 2 \ -accel kvm,xen-version=0x4000e,kernel-irqchip=split \ -kernel vmlinuz-6.0.7-301.fc37.x86_64 \ -append "console=ttyS0 root=/dev/sda1" \ -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \ -device ahci,id=ahci -device ide-hd,drive=disk,bus=ahci.0 Even before this was merged, we've already been using it to find and fix bugs in the Linux kernel Xen guest support: https://lore.kernel.org/all/4bffa69a949bfdc92c4a18e5a1c3cbb3b94a0d32.camel@infradead.org/ https://lore.kernel.org/all/871qnunycr.ffs@tglx/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-01qapi: Add 'acpi' field to 'query-machines' outputPeter Krempa
Report which machine types support ACPI so that management applications can properly use the 'acpi' property even on platforms such as ARM where support for ACPI depends on the machine type and thus checking presence of '-machine acpi=' in 'query-command-line-options' is insufficient. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <537625d3e25d345052322c42ca19812b98b4f49a.1677571792.git.pkrempa@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-01hw/xen: Subsume xen_be_register_common() into xen_be_init()David Woodhouse
Every caller of xen_be_init() checks and exits on error, then calls xen_be_register_common(). Just make xen_be_init() abort for itself and return void, and register the common devices too. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01kvm/i386: Add xen-evtchn-max-pirq propertyDavid Woodhouse
The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI devices. Allow it to be increased if the user desires. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Support MSI mapping to PIRQDavid Woodhouse
The way that Xen handles MSI PIRQs is kind of awful. There is a special MSI message which targets a PIRQ. The vector in the low bits of data must be zero. The low 8 bits of the PIRQ# are in the destination ID field, the extended destination ID field is unused, and instead the high bits of the PIRQ# are in the high 32 bits of the address. Using the high bits of the address means that we can't intercept and translate these messages in kvm_send_msi(), because they won't be caught by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range. So we catch them in pci_msi_trigger() instead, and deliver the event channel directly. That isn't even the worst part. The worst part is that Xen snoops on writes to devices' MSI vectors while they are *masked*. When a MSI message is written which looks like it targets a PIRQ, it remembers the device and vector for later. When the guest makes a hypercall to bind that PIRQ# (snooped from a marked MSI vector) to an event channel port, Xen *unmasks* that MSI vector on the device. Xen guests using PIRQ delivery of MSI don't ever actually unmask the MSI for themselves. Now that this is working we can finally enable XENFEAT_hvm_pirqs and let the guest use it all. Tested with passthrough igb and emulated e1000e + AHCI. CPU0 CPU1 0: 65 0 IO-APIC 2-edge timer 1: 0 14 xen-pirq 1-ioapic-edge i8042 4: 0 846 xen-pirq 4-ioapic-edge ttyS0 8: 1 0 xen-pirq 8-ioapic-edge rtc0 9: 0 0 xen-pirq 9-ioapic-level acpi 12: 257 0 xen-pirq 12-ioapic-edge i8042 24: 9600 0 xen-percpu -virq timer0 25: 2758 0 xen-percpu -ipi resched0 26: 0 0 xen-percpu -ipi callfunc0 27: 0 0 xen-percpu -virq debug0 28: 1526 0 xen-percpu -ipi callfuncsingle0 29: 0 0 xen-percpu -ipi spinlock0 30: 0 8608 xen-percpu -virq timer1 31: 0 874 xen-percpu -ipi resched1 32: 0 0 xen-percpu -ipi callfunc1 33: 0 0 xen-percpu -virq debug1 34: 0 1617 xen-percpu -ipi callfuncsingle1 35: 0 0 xen-percpu -ipi spinlock1 36: 8 0 xen-dyn -event xenbus 37: 0 6046 xen-pirq -msi ahci[0000:00:03.0] 38: 1 0 xen-pirq -msi-x ens4 39: 0 73 xen-pirq -msi-x ens4-rx-0 40: 14 0 xen-pirq -msi-x ens4-rx-1 41: 0 32 xen-pirq -msi-x ens4-tx-0 42: 47 0 xen-pirq -msi-x ens4-tx-1 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Support GSI mapping to PIRQDavid Woodhouse
If I advertise XENFEAT_hvm_pirqs then a guest now boots successfully as long as I tell it 'pci=nomsi'. [root@localhost ~]# cat /proc/interrupts CPU0 0: 52 IO-APIC 2-edge timer 1: 16 xen-pirq 1-ioapic-edge i8042 4: 1534 xen-pirq 4-ioapic-edge ttyS0 8: 1 xen-pirq 8-ioapic-edge rtc0 9: 0 xen-pirq 9-ioapic-level acpi 11: 5648 xen-pirq 11-ioapic-level ahci[0000:00:04.0] 12: 257 xen-pirq 12-ioapic-edge i8042 ... Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Implement emulated PIRQ hypercall supportDavid Woodhouse
This wires up the basic infrastructure but the actual interrupts aren't there yet, so don't advertise it to the guest. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01i386/xen: Implement HYPERVISOR_physdev_opDavid Woodhouse
Just hook up the basic hypercalls to stubs in xen_evtchn.c for now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Automatically add xen-platform PCI device for emulated Xen guestsDavid Woodhouse
It isn't strictly mandatory but Linux guests at least will only map their grant tables over the dummy BAR that it provides, and don't have sufficient wit to map them in any other unused part of their guest address space. So include it by default for minimal surprise factor. As I come to document "how to run a Xen guest in QEMU", this means one fewer thing to tell the user about, according to the mantra of "if it needs documenting, fix it first, then document what remains". Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Add basic ring handling to xenstoreDavid Woodhouse
Extract requests, return ENOSYS to all of them. This is enough to allow older Linux guests to boot, as they need *something* back but it doesn't matter much what. A full implementation of a single-tentant internal XenStore copy-on-write tree with transactions and watches is waiting in the wings to be sent in a subsequent round of patches along with hooking up the actual PV disk back end in qemu, but this is enough to get guests booting for now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Add xen_xenstore device for xenstore emulationDavid Woodhouse
Just the basic shell, with the event channel hookup. It only dumps the buffer for now; a real ring implmentation will come in a subsequent patch. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Add backend implementation of interdomain event channel supportDavid Woodhouse
The provides the QEMU side of interdomain event channels, allowing events to be sent to/from the guest. The API mirrors libxenevtchn, and in time both this and the real Xen one will be available through ops structures so that the PV backend drivers can use the correct one as appropriate. For now, this implementation can be used directly by our XenStore which will be for emulated mode only. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01i386/xen: handle PV timer hypercallsJoao Martins
Introduce support for one shot and periodic mode of Xen PV timers, whereby timer interrupts come through a special virq event channel with deadlines being set through: 1) set_timer_op hypercall (only oneshot) 2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer hypercalls Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Implement GNTTABOP_query_sizeDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_versonDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Support mapping grant framesDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Add xen_gnttab device for grant table emulationDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callbackDavid Woodhouse
The guest is permitted to specify an arbitrary domain/bus/device/function and INTX pin from which the callback IRQ shall appear to have come. In QEMU we can only easily do this for devices that actually exist, and even that requires us "knowing" that it's a PCMachine in order to find the PCI root bus — although that's OK really because it's always true. We also don't get to get notified of INTX routing changes, because we can't do that as a passive observer; if we try to register a notifier it will overwrite any existing notifier callback on the device. But in practice, guests using PCI_INTX will only ever use pin A on the Xen platform device, and won't swizzle the INTX routing after they set it up. So this is just fine. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callbackDavid Woodhouse
The GSI callback (and later PCI_INTX) is a level triggered interrupt. It is asserted when an event channel is delivered to vCPU0, and is supposed to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0 is cleared again. Thankfully, Xen does *not* assert the GSI if the guest sets its own evtchn_upcall_pending field; we only need to assert the GSI when we have delivered an event for ourselves. So that's the easy part, kind of. There's a slight complexity in that we need to hold the BQL before we can call qemu_set_irq(), and we definitely can't do that while holding our own port_lock (because we'll need to take that from the qemu-side functions that the PV backend drivers will call). So if we end up wanting to set the IRQ in a context where we *don't* already hold the BQL, defer to a BH. However, we *do* need to poll for the evtchn_upcall_pending flag being cleared. In an ideal world we would poll that when the EOI happens on the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd pairs — one is used to trigger the interrupt, and the other works in the other direction to 'resample' on EOI, and trigger the first eventfd again if the line is still active. However, QEMU doesn't seem to do that. Even VFIO level interrupts seem to be supported by temporarily unmapping the device's BARs from the guest when an interrupt happens, then trapping *all* MMIO to the device and sending the 'resample' event on *every* MMIO access until the IRQ is cleared! Maybe in future we'll plumb the 'resample' concept through QEMU's irq framework but for now we'll do what Xen itself does: just check the flag on every vmexit if the upcall GSI is known to be asserted. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>