aboutsummaryrefslogtreecommitdiff
path: root/include/sysemu
AgeCommit message (Collapse)Author
2021-07-21iothread: add aio-max-batch parameterStefano Garzarella
The `aio-max-batch` parameter will be propagated to AIO engines and it will be used to control the maximum number of queued requests. When there are in queue a number of requests equal to `aio-max-batch`, the engine invokes the system call to forward the requests to the kernel. This parameter allows us to control the maximum batch size to reduce the latency that requests might accumulate while queued in the AIO engine queue. If `aio-max-batch` is equal to 0 (default value), the AIO engine will use its default maximum batch size value. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20210721094211.69853-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-06-25block: add max_hw_transfer to BlockLimitsPaolo Bonzini
For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split and merge I/O in smaller pieces as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-17Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell
staging * avoid deprecation warnings for SASL on macOS 10.11 or newer * fix -readconfig when config blocks have an id (like [chardev "qmp"]) * Error* initialization fixes * Improvements to ESP emulation (Mark) * Allow creating noreserve memory backends (David) * Improvements to query-memdev (David) * Bump compiler to C11 (Richard) * First round of SVM fixes from GSoC project (Lara) # gpg: Signature made Wed 16 Jun 2021 16:37:49 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (45 commits) configure: Remove probe for _Static_assert qemu/compiler: Remove QEMU_GENERIC include/qemu/lockable: Use _Generic instead of QEMU_GENERIC util: Use unique type for QemuRecMutex in thread-posix.h util: Pass file+line to qemu_rec_mutex_unlock_impl util: Use real functions for thread-posix QemuRecMutex softfloat: Use _Generic instead of QEMU_GENERIC configure: Use -std=gnu11 target/i386: Added Intercept CR0 writes check target/i386: Added consistency checks for CR0 target/i386: Added consistency checks for VMRUN intercept and ASID target/i386: Refactored intercept checks into cpu_svm_has_intercept configure: map x32 to cpu_family x86_64 for meson hmp: Print "reserve" property of memory backends with "info memdev" qmp: Include "reserve" property of memory backends hmp: Print "share" property of memory backends with "info memdev" qmp: Include "share" property of memory backends qmp: Clarify memory backend properties returned via query-memdev hostmem: Wire up RAM_NORESERVE via "reserve" property util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-15hostmem: Wire up RAM_NORESERVE via "reserve" propertyDavid Hildenbrand
Let's provide a way to control the use of RAM_NORESERVE via memory backends using the "reserve" property which defaults to true (old behavior). Only Linux currently supports clearing the flag (and support is checked at runtime, depending on the setting of "/proc/sys/vm/overcommit_memory"). Windows and other POSIX systems will bail out with "reserve=false". The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. This essentially allows avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using virtio-mem and also supporting hugetlbfs in the future. As really only Linux implements RAM_NORESERVE right now, let's expose the property only with CONFIG_LINUX. Setting the property to "false" will then only fail in corner cases -- for example on very old kernels or when memory overcommit was completely disabled by the admin. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Cc: Markus Armbruster <armbru@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-11-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15sysemu: Make TPM structures inaccessible if CONFIG_TPM is not definedStefan Berger
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210614191335.1968807-5-stefanb@linux.ibm.com> [PMD: Remove tpm_init() / tpm_cleanup() stubs] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-06-11accel/tcg: Merge tcg_exec_init into tcg_init_machineRichard Henderson
There is only one caller, and shortly we will need access to the MachineState, which tcg_init_machine already has. Reviewed-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-03hvf: Introduce hvf vcpu structAlexander Graf
We will need more than a single field for hvf going forward. To keep the global vcpu struct uncluttered, let's allocate a special hvf vcpu struct, similar to how hax does it. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Tested-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-12-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Remove hvf-accel-ops.hAlexander Graf
We can move the definition of hvf_vcpu_exec() into our internal hvf header, obsoleting the need for hvf-accel-ops.h. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-11-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Split out common code on vcpu init and destroyAlexander Graf
Until now, Hypervisor.framework has only been available on x86_64 systems. With Apple Silicon shipping now, it extends its reach to aarch64. To prepare for support for multiple architectures, let's start moving common code out into its own accel directory. This patch splits the vcpu init and destroy functions into a generic and an architecture specific portion. This also allows us to move the generic functions into the generic hvf code, removing exported functions. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-8-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Make hvf_set_phys_mem() staticAlexander Graf
The hvf_set_phys_mem() function is only called within the same file. Make it static. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-6-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Move hvf internal definitions into common headerAlexander Graf
Until now, Hypervisor.framework has only been available on x86_64 systems. With Apple Silicon shipping now, it extends its reach to aarch64. To prepare for support for multiple architectures, let's start moving common code out into its own accel directory. This patch moves a few internal struct and constant defines over. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-5-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Move cpu functions into common directoryAlexander Graf
Until now, Hypervisor.framework has only been available on x86_64 systems. With Apple Silicon shipping now, it extends its reach to aarch64. To prepare for support for multiple architectures, let's start moving common code out into its own accel directory. This patch moves CPU and memory operations over. While at it, make sure the code is consumable on non-i386 systems. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-4-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03hvf: Move assert_hvf_ok() into common directoryAlexander Graf
Until now, Hypervisor.framework has only been available on x86_64 systems. With Apple Silicon shipping now, it extends its reach to aarch64. To prepare for support for multiple architectures, let's start moving common code out into its own accel directory. This patch moves assert_hvf_ok() and introduces generic build infrastructure. Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20210519202253.76782-2-agraf@csgraf.de Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-02block-backend: add drained_pollSergio Lopez
Allow block backends to poll their devices/users to check if they have been quiesced when entering a drained section. This will be used in the next patch to wait for the NBD server to be completely quiesced. Suggested-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210602060552.17433-2-slp@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-26KVM: Cache kvm slot dirty bitmap sizePeter Xu
Cache it too because we'll reference it more frequently in the future. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-8-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26KVM: Provide helper to sync dirty bitmap from slot to ramblockPeter Xu
kvm_physical_sync_dirty_bitmap() calculates the ramblock offset in an awkward way from the MemoryRegionSection that passed in from the caller. The truth is for each KVMSlot the ramblock offset never change for the lifecycle. Cache the ramblock offset for each KVMSlot into the structure when the KVMSlot is created. With that, we can further simplify kvm_physical_sync_dirty_bitmap() with a helper to sync KVMSlot dirty bitmap to the ramblock dirty bitmap of a specific KVMSlot. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-6-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26KVM: Provide helper to get kvm dirty logPeter Xu
Provide a helper kvm_slot_get_dirty_log() to make the function kvm_physical_sync_dirty_bitmap() clearer. We can even cache the as_id into KVMSlot when it is created, so that we don't even need to pass it down every time. Since at it, remove return value of kvm_physical_sync_dirty_bitmap() because it should never fail. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-5-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26KVM: Use a big lock to replace per-kml slots_lockPeter Xu
Per-kml slots_lock will bring some trouble if we want to take all slots_lock of all the KMLs, especially when we're in a context that we could have taken some of the KML slots_lock, then we even need to figure out what we've taken and what we need to take. Make this simple by merging all KML slots_lock into a single slots lock. Per-kml slots_lock isn't anything that helpful anyway - so far only x86 has two address spaces (so, two slots_locks). All the rest archs will be having one address space always, which means there's actually one slots_lock so it will be the same as before. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210506160549.130416-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-14include/sysemu: Poison all accelerator CONFIG switches in common codeThomas Huth
We are already poisoning CONFIG_KVM since this switch is not working in common code. Do the same with the other accelerator switches, too (except for CONFIG_TCG, which is special, since it is also defined in config-host.h). Message-Id: <20210414112004.943383-2-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-12Drop the deprecated unicore32 targetMarkus Armbruster
Target unicore32 was deprecated in commit 8e4ff4a8d2b, v5.2.0. See there for rationale. Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210503084034.3804963-3-armbru@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com>
2021-05-12Drop the deprecated lm32 targetMarkus Armbruster
Target lm32 was deprecated in commit d8498005122, v5.2.0. See there for rationale. Some of its code lives on in device models derived from milkymist ones: hw/char/digic-uart.c and hw/display/bcm2835_fb.c. Cc: Michael Walle <michael@walle.cc> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210503084034.3804963-2-armbru@redhat.com> Acked-by: Michael Walle <michael@walle.cc> [Trivial conflicts resolved, reST markup fixed]
2021-05-12Remove the deprecated moxie targetThomas Huth
There are no known users of this CPU anymore, and there are no binaries available online which could be used for regression tests, so the code has likely completely bit-rotten already. It's been marked as deprecated since two releases now and nobody spoke up that there is still a need to keep it, thus let's remove it now. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210430160355.698194-1-thuth@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [Commit message typos fixed, trivial conflicts resolved] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-05-10osdep: Make os-win32.h and os-posix.h handle 'extern "C"' themselvesPeter Maydell
Both os-win32.h and os-posix.h include system header files. Instead of having osdep.h include them inside its 'extern "C"' block, make these headers handle that themselves, so that we don't include the system headers inside 'extern "C"'. This doesn't fix any current problems, but it's conceptually the right way to handle system headers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-04Add NVMM accelerator: acceleration enlightenmentsReinoud Zandijk
Signed-off-by: Kamil Rytarowski <kamil@NetBSD.org> Signed-off-by: Reinoud Zandijk <reinoud@NetBSD.org> Message-Id: <20210402202535.11550-4-reinoud@NetBSD.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-24m68k: add the virtio devices aliasesLaurent Vivier
Similarly to 5f629d943cb0 ("s390x: fix s390 virtio aliases"), define the virtio aliases. This allows to start machines with virtio devices without knowledge of the implementation type. For instance, we can use "-device virtio-scsi" on m68k, s390x or PC, and the device will be respectively "virtio-scsi-device", "virtio-scsi-ccw" or "virtio-scsi-pci". This already exists for s390x and -ccw interfaces, add them for m68k and MMIO (-device) interfaces. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210319202335.2397060-3-laurent@vivier.eu> Message-Id: <20210323165308.15244-18-alex.bennee@linaro.org>
2021-03-24qdev: define list of archs with virtio-pci or virtio-ccwLaurent Vivier
This is used to define virtio-*-pci and virtio-*-ccw aliases rather than substracting the CCW architecture from all the others. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210319202335.2397060-2-laurent@vivier.eu> Message-Id: <20210323165308.15244-17-alex.bennee@linaro.org>
2021-03-19blockdev: Drop deprecated bogus -drive interface typeMarkus Armbruster
Drop the crap deprecated in commit a1b40bda08 "blockdev: Deprecate -drive with bogus interface type" (v5.1.0). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20210309161214.1402527-5-armbru@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
2021-03-15exec: Get rid of phys_mem_set_alloc()David Hildenbrand
As the last user is gone, we can get rid of phys_mem_set_alloc() and simplify. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20210303130916.22553-3-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-03-12dma: Introduce dma_aligned_pow2_mask()Eric Auger
Currently get_naturally_aligned_size() is used by the intel iommu to compute the maximum invalidation range based on @size which is a power of 2 while being aligned with the @start address and less than the maximum range defined by @gaw. This helper is also useful for other iommu devices (virtio-iommu, SMMUv3) to make sure IOMMU UNMAP notifiers only are called with power of 2 range sizes. Let's move this latter into dma-helpers.c and rename it into dma_aligned_pow2_mask(). Also rewrite the helper so that it accomodates UINT64_MAX values for the size mask and max mask. It now returns a mask instead of a size. Change the caller. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-id: 20210309102742.30442-3-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-11Merge remote-tracking branch ↵Peter Maydell
'remotes/vivier2/tags/trivial-branch-for-6.0-pull-request' into staging Pull request # gpg: Signature made Wed 10 Mar 2021 21:56:09 GMT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-for-6.0-pull-request: (22 commits) sysemu: Let VMChangeStateHandler take boolean 'running' argument sysemu/runstate: Let runstate_is_running() return bool hw/lm32/Kconfig: Have MILKYMIST select LM32_DEVICES hw/lm32/Kconfig: Rename CONFIG_LM32 -> CONFIG_LM32_DEVICES hw/lm32/Kconfig: Introduce CONFIG_LM32_EVR for lm32-evr/uclinux boards qemu-common.h: Update copyright string to 2021 tests/fp/fp-test: Replace the word 'blacklist' qemu-options: Replace the word 'blacklist' seccomp: Replace the word 'blacklist' scripts/tracetool: Replace the word 'whitelist' ui: Replace the word 'whitelist' virtio-gpu: Adjust code space style exec/memory: Use struct Object typedef fuzz-test: remove unneccessary debugging flags net: Use id_generate() in the network subsystem, too MAINTAINERS: Fix the location of tools manuals vhost_user_gpu: Drop dead check for g_malloc() failure backends/dbus-vmstate: Fix short read error handling target/hexagon/gen_tcg_funcs: Fix a typo hw/elf_ops: Fix a typo ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-10device_tree: add qemu_fdt_setprop_string_array helperAlex Bennée
A string array in device tree is simply a series of \0 terminated strings next to each other. As libfdt doesn't support that directly we need to build it ourselves. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20210303173642.3805-4-alex.bennee@linaro.org>
2021-03-09sysemu: Let VMChangeStateHandler take boolean 'running' argumentPhilippe Mathieu-Daudé
The 'running' argument from VMChangeStateHandler does not require other value than 0 / 1. Make it a plain boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210111152020.1422021-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09sysemu/runstate: Let runstate_is_running() return boolPhilippe Mathieu-Daudé
runstate_check() returns a boolean. runstate_is_running() returns what runstate_check() returns, also a boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210111152020.1422021-2-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-02-16replay: fix icount request when replaying clock accessPavel Dovgalyuk
Record/replay provides REPLAY_CLOCK_LOCKED macro to access the clock when vm_clock_seqlock is locked. This macro is needed because replay internals operate icount. In locked case replay use icount_get_raw_locked for icount request, which prevents excess locking which leads to deadlock. But previously only record code used *_locked function and replay did not. Therefore sometimes clock access lead to deadlocks. This patch fixes clock access for replay too and uses *_locked icount access function. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <161347990483.1313189.8371838968343494161.stgit@pasha-ThinkPad-X280> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16sev/i386: Don't allow a system reset under an SEV-ES guestTom Lendacky
An SEV-ES guest does not allow register state to be altered once it has been measured. When an SEV-ES guest issues a reboot command, Qemu will reset the vCPU state and resume the guest. This will cause failures under SEV-ES. Prevent that from occuring by introducing an arch-specific callback that returns a boolean indicating whether vCPUs are resettable. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16sev/i386: Allow AP booting under SEV-ESPaolo Bonzini
When SEV-ES is enabled, it is not possible modify the guests register state after it has been initially created, encrypted and measured. Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the hypervisor cannot emulate this because it cannot update the AP register state. For the very first boot by an AP, the reset vector CS segment value and the EIP value must be programmed before the register has been encrypted and measured. Search the guest firmware for the guest for a specific GUID that tells Qemu the value of the reset vector to use. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16pc: add parser for OVMF reset blockJames Bottomley
OVMF is developing a mechanism for depositing a GUIDed table just below the known location of the reset vector. The table goes backwards in memory so all entries are of the form <data>|len|<GUID> Where <data> is arbtrary size and type, <len> is a uint16_t and describes the entire length of the entry from the beginning of the data to the end of the guid. The foot of the table is of this form and <len> for this case describes the entire size of the table. The table foot GUID is defined by OVMF as 96b582de-1fb2-45f7-baea-a366c55a082d and if the table is present this GUID is just below the reset vector, 48 bytes before the end of the firmware file. Add a parser for the ovmf reset block which takes a copy of the block, if the table foot guid is found, minus the footer and a function for later traversal to return the data area of any specified GUIDs. Signed-off-by: James Bottomley <jejb@linux.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210204193939.16617-2-jejb@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-10multi-process: define MPQemuMsg format and transmission functionsElena Ufimtseva
Defines MPQemuMsg, which is the message that is sent to the remote process. This message is sent over QIOChannel and is used to command the remote process to perform various tasks. Define transmission functions used by proxy and by remote. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 56ca8bcf95195b2b195b08f6b9565b6d7410bce5.1611938319.git.jag.raman@oracle.com [Replace struct iovec send[2] = {0} with {} to make clang happy as suggested by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-02-08sev: Add Error ** to sev_kvm_init()David Gibson
This allows failures to be reported richly and idiomatically. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08confidential guest support: Rework the "memory-encryption" propertyDavid Gibson
Currently the "memory-encryption" property is only looked at once we get to kvm_init(). Although protection of guest memory from the hypervisor isn't something that could really ever work with TCG, it's not conceptually tied to the KVM accelerator. In addition, the way the string property is resolved to an object is almost identical to how a QOM link property is handled. So, create a new "confidential-guest-support" link property which sets this QOM interface link directly in the machine. For compatibility we keep the "memory-encryption" property, but now implemented in terms of the new property. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08sev: Remove false abstraction of flash encryptionDavid Gibson
When AMD's SEV memory encryption is in use, flash memory banks (which are initialed by pc_system_flash_map()) need to be encrypted with the guest's key, so that the guest can read them. That's abstracted via the kvm_memcrypt_encrypt_data() callback in the KVM state.. except, that it doesn't really abstract much at all. For starters, the only call site is in code specific to the 'pc' family of machine types, so it's obviously specific to those and to x86 to begin with. But it makes a bunch of further assumptions that need not be true about an arbitrary confidential guest system based on memory encryption, let alone one based on other mechanisms: * it assumes that the flash memory is defined to be encrypted with the guest key, rather than being shared with hypervisor * it assumes that that hypervisor has some mechanism to encrypt data into the guest, even though it can't decrypt it out, since that's the whole point * the interface assumes that this encrypt can be done in place, which implies that the hypervisor can write into a confidential guests's memory, even if what it writes isn't meaningful So really, this "abstraction" is actually pretty specific to the way SEV works. So, this patch removes it and instead has the PC flash initialization code call into a SEV specific callback. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-05accel: replace struct CpusAccel with AccelOpsClassClaudio Fontana
This will allow us to centralize the registration of the cpus.c module accelerator operations (in accel/accel-softmmu.c), and trigger it automatically using object hierarchy lookup from the new accel_init_interfaces() initialization step, depending just on which accelerators are available in the code. Rename all tcg-cpus.c, kvm-cpus.c, etc to tcg-accel-ops.c, kvm-accel-ops.c, etc, matching the object type names. Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-18-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-02-05accel: extend AccelState and AccelClass to user-modeClaudio Fontana
Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [claudio: rebased on Richard's splitwx work] Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-17-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-01-27block: Separate blk_is_writable() and blk_supports_write_perm()Kevin Wolf
Currently, blk_is_read_only() tells whether a given BlockBackend can only be used in read-only mode because its root node is read-only. Some callers actually try to answer a slightly different question: Is the BlockBackend configured to be writable, by taking write permissions on the root node? This can differ, for example, for CD-ROM devices which don't take write permissions, but may be backed by a writable image file. scsi-cd allows write requests to the drive if blk_is_read_only() returns false. However, the write request will immediately run into an assertion failure because the write permission is missing. This patch introduces separate functions for both questions. blk_supports_write_perm() answers the question whether the block node/image file can support writable devices, whereas blk_is_writable() tells whether the BlockBackend is currently configured to be writable. All calls of blk_is_read_only() are converted to one of the two new functions. Fixes: https://bugs.launchpad.net/bugs/1906693 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210118123448.307825-2-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-01-12whpx: move internal definitions to whpx-internal.hPaolo Bonzini
Only leave the external interface in sysemu/whpx.h. whpx_apic_in_platform is moved to a .c file because it needs whpx_state. Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201219090637.1700900-3-pbonzini@redhat.com>
2021-01-07tcg: Add --accel tcg,split-wx propertyRichard Henderson
Plumb the value through to alloc_code_gen_buffer. This is not supported by any os or tcg backend, so for now enabling it will result in an error. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-12-15vl: Add option to avoid stopping VM upon guest panicAlejandro Jimenez
The current default action of pausing a guest after a panic event is received leaves the responsibility to resume guest execution to the management layer. The reasons for this behavior are discussed here: https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/ However, in instances like the case of older guests (Linux and Windows) using a pvpanic device but missing support for the PVPANIC_CRASHLOADED event, and Windows guests using the hv-crash enlightenment, it is desirable to allow the guests to continue running after sending a PVPANIC_PANICKED event. This allows such guests to proceed to capture a crash dump and automatically reboot without intervention of a management layer. Add an option to avoid stopping a VM after a panic event is received, by passing: -action panic=none in the command line arguments, or during runtime by using an upcoming QMP command. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Message-Id: <1607705564-26264-3-git-send-email-alejandro.j.jimenez@oracle.com> [Do not fix panic action in the variable, instead modify -no-shutdown. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15qmp: generalize watchdog-set-action to -no-reboot/-no-shutdownAlejandro Jimenez
Add a QMP command to allow for the behaviors specified by the -no-reboot and -no-shutdown command line option to be set at runtime. The new command is named set-action and takes optional arguments, named after an event, that provide a corresponding action to take. Example: -> { "execute": "set-action", "arguments": { "reboot": "none", "shutdown": "poweroff", "watchdog": "debug" } } <- { "return": {} } Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Message-Id: <1607705564-26264-4-git-send-email-alejandro.j.jimenez@oracle.com> [Split the series differently, with -action based on the QMP command. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15vl: make qemu_get_machine_opts staticPaolo Bonzini
Machine options can be retrieved as properties of the machine object. Encourage that by removing the "easy" accessor to machine options. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15chardev: do not use machine_init_donePaolo Bonzini
machine_init_done is not the right flag to check when preconfig is taken into account; for example "./qemu-system-x86_64 -serial mon:stdio -preconfig" does not print the QEMU monitor header until after exit_preconfig. Add back a custom bool for mux character devices. This partially undoes commit c7278b4355 ("chardev: introduce chr_machine_done hook", 2018-03-12), but it keeps the cleaner logic using a function pointer in ChardevClass. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>