Age | Commit message (Collapse) | Author |
|
The acc_flag check for write should have been against PAGE_WRITE_ORG,
not PAGE_WRITE. But it is better to combine two acc_flag checks
to a single check against access_type. This matches the system code
in cputlb.c.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2647
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: 20241111145002.144995-1-richard.henderson@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
|
|
https://gitlab.com/peterx/qemu into staging
Migration pull request for softfreeze
v2:
- Patch "migration: Move cpu-throttle.c from system to migration",
fix build on MacOS, and subject spelling
NOTE: checkpatch.pl could report a false positive on this branch:
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#21:
{include/sysemu => migration}/cpu-throttle.h | 0
That's covered by "F: migration/" entry.
Changelog:
- Peter's cleanup patch on migrate_fd_cleanup()
- Peter's cleanup patch to introduce thread name macros
- Hanna's error path fix for vmstate subsection save()s
- Hyman's auto converge enhancement on background dirty sync
- Peter's additional tracepoints for save state entries
- Thomas's build fix for OpenBSD in dirtyrate.c
- Peter's deprecation of query-migrationthreads command
- Peter's cleanup/fixes from the "export misc.h" series
- Maciej's two small patches from multifd+vfio series
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZyTbVRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wan3wD+L4TVNDc34Hy4mvWu7u1lCOePX0GBdUEc
# oEeBGblwbrcBAIR8d+5z9O5YcWH1coozG1aUC4qCtSHHk5TGbJk4/UUD
# =XB5Q
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Nov 2024 13:44:53 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu:
migration/multifd: Zero p->flags before starting filling a packet
migration/ram: Add load start trace event
migration: Drop migration_is_idle()
migration: Drop migration_is_setup_or_active()
migration: Unexport ram_mig_init()
migration: Unexport dirty_bitmap_mig_init()
migration: Take migration object refcount earlier for threads
migration: Deprecate query-migrationthreads command
migration/dirtyrate: Silence warning about strcpy() on OpenBSD
tests/migration: Add case for periodic ramblock dirty sync
migration: Support periodic RAMBlock dirty bitmap sync
migration: Remove "rs" parameter in migration_bitmap_sync_precopy
migration: Move cpu-throttle.c from system to migration
migration: Stop CPU throttling conditionally
accel/tcg/icount-common: Remove the reference to the unused header file
migration: Ensure vmstate_save() sets errp
migration: Put thread names together with macros
migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/5e33b423d0b8506e5cb33fff42b50aa301b7731b.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
ops is assigned again just below, and the result of the assignment must
be non-NULL.
Originally, the check for NULL was meant to be a check for the existence
of the ops class:
ops = ACCEL_OPS_CLASS(object_class_by_name(ops_name));
...
g_assert(ops != NULL);
(where the ops assignment begot the one that I am removing); but this is
meaningless now that oc is checked to be non-NULL before ops is assigned
(commit 5141e9a23fc, "accel: abort if we fail to load the accelerator
plugin", 2022-11-06).
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
34e5e1 refactored the plugin context initialization. After this change,
tcg_ctx->plugin_insn is not reset inconditionnally anymore, but only if
one plugin at least is active.
When uninstalling the last plugin active, we stopped reinitializing
tcg_ctx->plugin_insn, which leads to memory callbacks being emitted.
This results in an error as they don't appear in a plugin op sequence as
expected.
The correct fix is to make sure we reset plugin translation variables
after current block translation ends. This way, we can catch any
potential misuse of those after a given block, in more than fixing the
current bug.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2570
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Tested-by: Robbin Ehn <rehn@rivosinc.com>
Message-Id: <20241015003819.984601-1-pierrick.bouvier@linaro.org>
[AJB: trim patch version details from commit msg]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20241023113406.1284676-19-alex.bennee@linaro.org>
|
|
We try to avoid using cpu_loop_exit_atomic as it brings in an all-core
sync point. However on some cpu/kernel/benchmark combinations it is
starting to show up in the performance profile. To make it easier to
see whats going on add tracepoints for the slow path so we can see
what is triggering the wait.
It seems for a modern CPU it can be quite a bit, for example:
./qemu-system-aarch64 \
-machine type=virt,virtualization=on,pflash0=rom,pflash1=efivars,gic-version=max \
-smp 4 \
-accel tcg \
-device virtio-net-pci,netdev=unet \
-device virtio-scsi-pci \
-device scsi-hd,drive=hd \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-blockdev driver=raw,node-name=hd,file.driver=host_device,file.filename=/dev/zen-ssd2/trixie-arm64,discard=unmap \
-serial mon:stdio \
-blockdev node-name=rom,driver=file,filename=(pwd)/pc-bios/edk2-aarch64-code.fd,read-only=true \
-blockdev node-name=efivars,driver=file,filename=$HOME/images/qemu-arm64-efivars \
-m 8192 \
-object memory-backend-memfd,id=mem,size=8G,share=on \
-kernel /home/alex/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image -append "root=/dev/sda2 console=ttyAMA0 systemd.unit=benchmark-stress-ng.service" \
-display none -d trace:load_atom\*_fallback,trace:store_atom\*_fallback
With:
-cpu neoverse-v1,pauth-impdef=on => 2203343
With:
-cpu cortex-a76 => 0
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20241023113406.1284676-9-alex.bennee@linaro.org>
|
|
The exact set of available memory attributes can vary by VM. In the
future it might vary depending on enabled capabilities, too. Query the
extension on the VM level instead of on the KVM level, and only after
architecture-specific initialization.
Inspired by an analogous patch by Tom Dohrmann.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM_CAP_MULTI_ADDRESS_SPACE used to be a global capability, but with the
introduction of AMD SEV-SNP confidential VMs, the number of address spaces
can vary by VM type.
Query the extension on the VM level instead of on the KVM level.
Inspired by an analogous patch by Tom Dohrmann.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM_CAP_READONLY_MEM used to be a global capability, but with the
introduction of AMD SEV-SNP confidential VMs, this extension is not
always available on all VM types [1,2].
Query the extension on the VM level instead of on the KVM level.
[1] https://patchwork.kernel.org/project/kvm/patch/20240809190319.1710470-2-seanjc@google.com/
[2] https://patchwork.kernel.org/project/kvm/patch/20240902144219.3716974-1-erbse.13@gmx.de/
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
Link: https://lore.kernel.org/r/20240903062953.3926498-1-erbse.13@gmx.de
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This value used to reflect the maximum supported memslots from KVM kernel.
Rename it to be clearer.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-5-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This will make all nr_slots counters to be named in the same manner.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-4-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Make the default max nr_slots a macro, it's only used when KVM reports
nothing.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-3-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Zhiyi reported an infinite loop issue in VFIO use case. The cause of that
was a separate discussion, however during that I found a regression of
dirty sync slowness when profiling.
Each KVMMemoryListerner maintains an array of kvm memslots. Currently it's
statically allocated to be the max supported by the kernel. However after
Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"),
the max supported memslots reported now grows to some number large enough
so that it may not be wise to always statically allocate with the max
reported.
What's worse, QEMU kvm code still walks all the allocated memslots entries
to do any form of lookups. It can drastically slow down all memslot
operations because each of such loop can run over 32K times on the new
kernels.
Fix this issue by making the memslots to be allocated dynamically.
Here the initial size was set to 16 because it should cover the basic VM
usages, so that the hope is the majority VM use case may not even need to
grow at all (e.g. if one starts a VM with ./qemu-system-x86_64 by default
it'll consume 9 memslots), however not too large to waste memory.
There can also be even better way to address this, but so far this is the
simplest and should be already better even than before we grow the max
supported memslots. For example, in the case of above issue when VFIO was
attached on a 32GB system, there are only ~10 memslots used. So it could
be good enough as of now.
In the above VFIO context, measurement shows that the precopy dirty sync
shrinked from ~86ms to ~3ms after this patch applied. It should also apply
to any KVM enabled VM even without VFIO.
NOTE: we don't have a FIXES tag for this patch because there's no real
commit that regressed this in QEMU. Such behavior existed for a long time,
but only start to be a problem when the kernel reports very large
nr_slots_max value. However that's pretty common now (the kernel change
was merged in 2021) so we attached cc:stable because we'll want this change
to be backported to stable branches.
Cc: qemu-stable <qemu-stable@nongnu.org>
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917163835.194664-2-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently the QemuLockCnt data structure and associated functions are
in the include/qemu/thread.h header. Move them to their own
qemu/lockcnt.h. The main reason for doing this is that it means we
can autogenerate the documentation comments into the docs/devel
documentation.
The copyright/author in the new header is drawn from lockcnt.c,
since the header changes were added in the same commit as
lockcnt.c; since neither thread.h nor lockcnt.c state an explicit
license, the standard default of GPL-2-or-later applies.
We include the new header (and the .c file, which was accidentally
omitted previously) in the "RCU" part of MAINTAINERS, since that
is where the lockcnt.rst documentation is categorized.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20240816132212.3602106-7-peter.maydell@linaro.org
|
|
When we have a tlb miss, defer the alignment check to
the new tlb_fill_align hook. Move the existing alignment
check so that we only perform it with a tlb hit.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Add a new callback to handle softmmu paging. Return the page
details directly, instead of passing them indirectly to
tlb_set_page. Handle alignment simultaneously with paging so
that faults are handled with target-specific priority.
Route all calls of the two hooks through a tlb_fill_align
function local to cputlb.c.
As yet no targets implement the new hook.
As yet cputlb.c does not use the new alignment check.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Split out of mmu_lookup.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to use "memop_" prefix, like other functions
that operate on MemOp.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
There should be no "just in case"; the page is already
in the tlb, and known to be not readable.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
It is used in a couple of places only, both within the same target.
Those can use the cflags just as well, so remove the separate field.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241010083641.1785069-1-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Commit e505a063ba ("translate-all: Add assert_(memory|tb)_lock
annotations") states page_set_flags() is "public APIs and [is]
documented as needing them held for linux-user mode".
Document the prototype.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240822095045.72643-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
This is necessary to provide discernible error messages to the caller.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-2-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Refactor setting up of dirty ring code in kvm_init() so that is can be
reused in the future patchsets.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240912061838.4501-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Refactoring the core logic around KVM_CREATE_VM into its own separate function
so that it can be called from other functions in subsequent patches. There is
no functional change in this patch.
CC: pbonzini@redhat.com
CC: zhao1.liu@intel.com
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240808113838.1697366-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
error_report() is more appropriate for error situations. Replace fprintf with
error_report() and error_printf() as appropriate. Some improvement in error
reporting also happens as a part of this change. For example:
From:
$ ./qemu-system-x86_64 --accel kvm
Could not access KVM kernel module: No such file or directory
To:
$ ./qemu-system-x86_64 --accel kvm
qemu-system-x86_64: --accel kvm: Could not access KVM kernel module: No such file or directory
CC: qemu-trivial@nongnu.org
CC: zhao1.liu@intel.com
CC: armbru@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240828124539.62672-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
staging
* Convert more Avocado tests to the new functional test framework
* Clean up assert() statements, use g_assert_not_reached() when possible
* Improve output of the gitlab CI jobs
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbz7xgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWm6A//eVn+tzyyKCX/xdXlf7XyVpezvRpTFPOS
# HyO0WMkCf2kGmu6qYKx/fDZg86opdQzPLH2gPkuVrGOMZ0Z2630DjH0jNih8lL9Q
# J1oRX5YlU92chlzNmq59WB/j9CKd91ILtOoaPBuZkDob57yGEYVzCPqetVvF7L2+
# +rbnccrNPumGJFt035fxUGiGfgsmp28MHQzDwQdyr38uGjyNlqvqidfC8Vj1qzqP
# B7HvhGB/vkF0eHaanMt2el/ZuLKf+qeCi//F/CiXGMYnuKXyShA/Db6xvMElw1jB
# aQdwphP71IO+cxjJLaNjDHKGFstArsM/E21qlaSTBi+FTmPiwVULpVTiBmWsjhOh
# /klpdgRHf0hL2MciYKyOWgjlTocx3rEKjCTe2U5tpta9fp9CrlgMQotjDZIbohGI
# ULNahrW3Zmg4EmXDApfhYMXsQsSgWas9QSkmxzJzDp0VC7tf2Oq7RxeySrlw9MCx
# OG2qQY+rNcJ3NnpATjfAJpT1kg/IahDOCNHfLEaj1u13XVQIthVADvHwy5WxbwRP
# mwp3V9e9sUoznkM2eV646lzmkMim/WdYBF0YpT7eBs80+GoXZ0thx9IqWmwzX/ox
# rndBczVN+RY6PydJP40yljdvS7ArRT73wHqL6yKHfDpvFc4/p5mxTWwLQ3yJbXbE
# T3I+wtgfBU8=
# =FH7b
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 25 Sep 2024 12:08:08 BST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2024-09-25' of https://gitlab.com/thuth/qemu: (44 commits)
.gitlab-ci.d: Make separate collapsible log sections for build and test
.gitlab-ci.d: Split build and test in cross build job templates
scripts/checkpatch.pl: emit error when using assert(false)
tests/qtest: remove return after g_assert_not_reached()
qom: remove return after g_assert_not_reached()
qobject: remove return after g_assert_not_reached()
migration: remove return after g_assert_not_reached()
hw/ppc: remove return after g_assert_not_reached()
hw/pci: remove return after g_assert_not_reached()
hw/net: remove return after g_assert_not_reached()
hw/hyperv: remove return after g_assert_not_reached()
include/qemu: remove return after g_assert_not_reached()
tcg/loongarch64: remove break after g_assert_not_reached()
fpu: remove break after g_assert_not_reached()
target/riscv: remove break after g_assert_not_reached()
target/arm: remove break after g_assert_not_reached()
hw/tpm: remove break after g_assert_not_reached()
hw/scsi: remove break after g_assert_not_reached()
hw/net: remove break after g_assert_not_reached()
hw/acpi: remove break after g_assert_not_reached()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240919044641.386068-16-pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
|
|
https://gitlab.com/stsquad/qemu into staging
TCG plugin memory instrumentation updates
- deprecate plugins on 32 bit hosts
- deprecate plugins with TCI
- extend memory API to save value
- add check-tcg tests to exercise new memory API
- fix timer deadlock with non-changing timer
- add basic block vector plugin to contrib
- add cflow plugin to contrib
- extend syscall plugin to dump write memory
- validate ips plugin arguments meet minimum slice value
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmbsPCUACgkQ+9DbCVqe
# KkTm1gf9Hs5Zfdng0E+7sr5Dpa5F+cJOXU9QJhoTWJ4XC16CygWByqMXbyeX/kvm
# HXJEm6OnkADJhikIUCoBko8uK4/96iWSrDL0sEdzASX4SM/tXu684KeL+j9G/Ql8
# iqxm6tIjaJqmbSZRMp0l5jD+ZBltRMCzBNdK1suJR2ppQgqfKj3qMLVLtq2hhqPH
# qPgwKm44hk9BEpHYqXaivzSWN5GKCgvp5ECcFXCBhDcM+8W7Dl3Mv6X0pWOpYcKZ
# d2a5KUt+Xp7WB2jkOgJYr0zKCOQCiCjGSfm/30qRDOUnwiLRWbfamRI9jUDNUtfy
# RYR+GaspurGCwSkwICdlvj+vFp/16Q==
# =5wfo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Sep 2024 15:58:45 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-tcg-plugin-memory-190924-1' of https://gitlab.com/stsquad/qemu:
contrib/plugins: avoid hanging program
plugins: add option to dump write argument to syscall plugin
plugins: add plugin API to read guest memory
contrib/plugins: Add a plugin to generate basic block vectors
util/timer: avoid deadlock when shutting down
tests/tcg: add a system test to check memory instrumentation
tests/tcg: ensure s390x-softmmu output redirected
tests/tcg: only read/write 64 bit words on 64 bit systems
tests/tcg: clean up output of memory system test
tests/tcg/multiarch: add test for plugin memory access
tests/tcg/plugins/mem: add option to print memory accesses
tests/tcg: allow to check output of plugins
tests/tcg: add mechanism to run specific tests with plugins
plugins: extend API to get latest memory value accessed
plugins: save value during memory accesses
contrib/plugins: control flow plugin
deprecation: don't enable TCG plugins by default with TCI
deprecation: don't enable TCG plugins by default on 32 bit hosts
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Different code paths handle memory accesses:
- tcg generated code
- load/store helpers
- atomic helpers
This value is saved in cpu->neg.plugin_mem_value_{high,low}. Values are
written only for accessed word size (upper bits are not set).
Atomic operations are doing read/write at the same time, so we generate
two memory callbacks instead of one, to allow plugins to access distinct
values.
For now, we can have access only up to 128 bits, thus split this in two
64 bits words. When QEMU will support wider operations, we'll be able to
reconsider this.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240724194708.1843704-2-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240916085400.1046925-5-alex.bennee@linaro.org>
|
|
The code at the tail end of the loop in kvm_dirty_ring_reaper_thread()
is unreachable, because there is no way for execution to leave the
loop. Replace it with a g_assert_not_reached().
(The code has always been unreachable, right from the start
when the function was added in commit b4420f198dd8.)
Resolves: Coverity CID 1547687
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240815131206.3231819-3-peter.maydell@linaro.org
|
|
In kvm_init_vcpu()and do_kvm_destroy_vcpu(), the return value from
kvm_ioctl(..., KVM_GET_VCPU_MMAP_SIZE, ...)
is an 'int', but we put it into a 'long' logal variable mmap_size.
Coverity then complains that there might be a truncation when we copy
that value into the 'int ret' which we use for returning a value in
an error-exit codepath. This can't ever actually overflow because
the value was in an 'int' to start with, but it makes more sense
to use 'int' for mmap_size so we don't do the widen-then-narrow
sequence in the first place.
Resolves: Coverity CID 1547515
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240815131206.3231819-2-peter.maydell@linaro.org
|
|
This patch's main focus is to use the previously added
hvf_get_physical_address_range to inform VM creation
about the IPA size we need for the VM, so we can extend
the default 36b IPA size and support VMs with 64+GB of
RAM. This is done by freezing the memory map, computing
the highest GPA and then (depending on if the platform
supports an IPA size that large) telling the kernel to
use a size >= for the VM. In pursuit of this a couple of
things related to how we handle the physical address range
we expose to guests were altered, but for an explanation of
what we were doing:
Today, to get the IPA size we were reading id_aa64mmfr0_el1's
PARange field from a newly made vcpu. Unfortunately, HVF just
returns the hosts PARange directly for the initial value and
not the IPA size that will actually back the VM, so we believe
we have much more address space than we actually do today it seems.
Starting in macOS 13.0 some APIs were introduced to be able to
query the maximum IPA size the kernel supports, and to set the IPA
size for a given VM. However, this still has a couple of issues
on < macOS 15. Up until macOS 15 (and if the hardware supported
it) the max IPA size was 39 bits which is not a valid PARange
value, so we can't clamp down what we advertise in the vcpu's
id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
the maximum IPA size is 40 bits (if it's supported in the hardware
as well) which is also a valid PARange value so we can set our IPA
size to the maximum as well as clamp down the PARange we advertise
to the guest. This allows VMs with 64+ GB of RAM and should fix the
oddness of the PARange situation as well.
Signed-off-by: Danny Canter <danny_canter@apple.com>
Message-id: 20240828111552.93482-4-danny_canter@apple.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
This is preliminary work to split up hv_vm_create
logic per platform so we can support creating VMs
with > 64GB of RAM on Apple Silicon machines. This
is done via ARM HVF's hv_vm_config_create() (and
other APIs that modify this config that will be
coming in future patches). This should have no
behavioral difference at all as hv_vm_config_create()
just assigns the same default values as if you just
passed NULL to the function.
Signed-off-by: Danny Canter <danny_canter@apple.com>
Message-id: 20240828111552.93482-3-danny_canter@apple.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Change the data type of the ioctl _request_ argument from 'int' to
'unsigned long' for the various accel/kvm functions which are
essentially wrappers around the ioctl() syscall.
The correct type for ioctl()'s 'request' argument is confused:
* POSIX defines the request argument as 'int'
* glibc uses 'unsigned long' in the prototype in sys/ioctl.h
* the glibc info documentation uses 'int'
* the Linux manpage uses 'unsigned long'
* the Linux implementation of the syscall uses 'unsigned int'
If we wrap ioctl() with another function which uses 'int' as the
type for the request argument, then requests with the 0x8000_0000
bit set will be sign-extended when the 'int' is cast to
'unsigned long' for the call to ioctl().
On x86_64 one such example is the KVM_IRQ_LINE_STATUS request.
Bit requests with the _IOC_READ direction bit set, will have the high
bit set.
Fortunately the Linux Kernel truncates the upper 32bit of the request
on 64bit machines (because it uses 'unsigned int', and see also Linus
Torvalds' comments in
https://sourceware.org/bugzilla/show_bug.cgi?id=14362 )
so this doesn't cause active problems for us. However it is more
consistent to follow the glibc ioctl() prototype when we define
functions that are essentially wrappers around ioctl().
This resolves a Coverity issue where it points out that in
kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2)
to an 'int' variable which can't hold it without overflow.
Resolves: Coverity CID 1547759
Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org
[PMM: Rebased patch, adjusted commit message, included note about
Coverity fix, updated the type of the local var in kvm_get_xsave,
updated the comment in the KVMState struct definition]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
The main loop in rr_cpu_thread_fn() can never terminate, so the
code at the end of the function to clean up the RCU subsystem is
dead code. Replace it with g_assert_not_reached().
(This is different from the other cpu_thread_fn for e.g. MTTCG or
for the KVM accelerator -- those can exit, if the vCPU they
are responsible for is unplugged. But the RR cpu thread fn
handles all CPUs in the system in a round-robin way, so even
if one is unplugged it keeps looping.)
Resolves: Coverity CID 1547782
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240815143634.3413679-1-peter.maydell@linaro.org
|
|
This reverts commit 1f881ea4a444ef36a8b6907b0b82be4b3af253a2.
That commit causes reverse_debugging.py test failures, and does
not seem to solve the root cause of the problem x86-64 still
hangs in record/replay tests.
The problem with short-cutting the iowait that was taken during
record phase is that related events will not get consumed at the
same points (e.g., reading the clock).
A hang with zero icount always seems to be a symptom of an earlier
problem that has caused the recording to become out of synch with
the execution and consumption of events by replay.
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-6-npiggin@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-14-alex.bennee@linaro.org>
|
|
Loop should exit prematurely on successfully finding out the parked vCPU (struct
KVMParkedVcpu) in the 'struct KVMState' maintained 'kvm_parked_vcpus' list of
parked vCPUs.
Fixes: Coverity CID 1558552
Fixes: 08c3286822 ("accel/kvm: Extract common KVM vCPU {creation,parking} code")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20240725145132.99355-1-salil.mehta@huawei.com
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <CAFEAcA-3_d1c7XSXWkFubD-LsW5c5i95e6xxV09r2C9yGtzcdA@mail.gmail.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
There are distinct helpers for creating and parking a KVM vCPU.
However, there can be cases where a platform needs to create and
immediately park the vCPU during early stages of vcpu init which
can later be reused when vcpu thread gets initialized. This would
help detect failures with kvm_create_vcpu at an early stage.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
|
|
Misc HW patch queue
- Restrict probe_access*() functions to TCG (Phil)
- Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
- Fixes in Loongson IPI model (Bibo & Phil)
- Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
- Correct MPC I2C MMIO region size (Zoltan)
- Remove useless cast in Loongson3 Virt machine (Yao)
- Various uses of range overlap API (Yao)
- Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
- Use DMA memory API in Goldfish UART model (Phil)
- Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
- MAINTAINERS updates (Zhao, Phil)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
# wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
# XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
# zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
# ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
# eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
# U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
# edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
# ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
# Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
# F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
# +dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
# =J/I2
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 06:36:47 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240723' of https://github.com/philmd/qemu: (28 commits)
MAINTAINERS: Add myself as a reviewer of machine core
MAINTAINERS: Cover guest-agent in QAPI schema
util/fifo8: Introduce fifo8_drop()
util/fifo8: Expose fifo8_pop_buf()
util/fifo8: Rename fifo8_pop_buf() -> fifo8_pop_bufptr()
util/fifo8: Rename fifo8_peek_buf() -> fifo8_peek_bufptr()
util/fifo8: Use fifo8_reset() in fifo8_create()
util/fifo8: Fix style
chardev/char-fe: Document returned value on error
hw/char/goldfish: Use DMA memory API
hw/nubus/virtio-mmio: Fix missing ERRP_GUARD() in realize handler
dump: make range overlap check more readable
crypto/block-luks: make range overlap check more readable
system/memory_mapping: make range overlap check more readable
sparc/ldst_helper: make range overlap check more readable
cxl/mailbox: make range overlap check more readable
util/range: Make ranges_overlap() return bool
hw/mips/loongson3_virt: remove useless type cast
hw/i2c/mpc_i2c: Fix mmio region size
docs/interop/firmware.json: convert "Example" section
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
* target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
# qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
# Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
# jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
# jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
# 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
# =00mB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 03:19:58 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
hpet: avoid timer storms on periodic timers
hpet: store full 64-bit target value of the counter
hpet: accept 64-bit reads and writes
hpet: place read-only bits directly in "new_val"
hpet: remove unnecessary variable "index"
hpet: ignore high bits of comparator in 32-bit mode
hpet: fix and cleanup persistence of interrupt status
Add support for RAPL MSRs in KVM/Qemu
tools: build qemu-vmsr-helper
qio: add support for SO_PEERCRED for socket channel
target/i386: do not crash if microvm guest uses SGX CPUID leaves
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
into staging
virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
hw/nvme: Add SPDM over DOE support
backends: Initial support for SPDM socket support
hw/pci: Add all Data Object Types defined in PCIe r6.0
tests/acpi: Add expected ACPI AML files for RISC-V
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
tests/acpi: Add empty ACPI data files for RISC-V
tests/qtest/bios-tables-test.c: Remove the fall back path
tests/acpi: update expected DSDT blob for aarch64 and microvm
acpi/gpex: Create PCI link devices outside PCI root bridge
tests/acpi: Allow DSDT acpi table changes for aarch64
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
virtio-iommu: Remove the end point on detach
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
virtio-iommu: Remove probe_done
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
gdbstub: Add helper function to unregister GDB register space
physmem: Add helper function to destroy CPU AddressSpace
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
This API is specific to TCG (already handled by hardware
accelerators), so restrict it with #ifdef'ry. Remove
unnecessary stubs.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240529155918.6221-1-philmd@linaro.org>
|
|
accel/tcg: Export set/clear_helper_retaddr
target/arm: Use set_helper_retaddr for dc_zva, sve and sme
target/ppc: Tidy dcbz helpers
target/ppc: Use set_helper_retaddr for dcbz
target/s390x: Use set_helper_retaddr in mem_helper.c
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmafJKIdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+FBAf7Bup+karxeGHZx2rN
# cPeF248bcCWTxBWHK7dsYze4KqzsrlNIJlPeOKErU2bbbRDZGhOp1/N95WVz+P8V
# 6Ny63WTsAYkaFWKxE6Jf0FWJlGw92btk75pTV2x/TNZixg7jg0vzVaYkk0lTYc5T
# m5e4WycYEbzYm0uodxI09i+wFvpd+7WCnl6xWtlJPWZENukvJ36Ss43egFMDtuMk
# vTJuBkS9wpwZ9MSi6EY6M+Raieg8bfaotInZeDvE/yRPNi7CwrA7Dgyc1y626uBA
# joGkYRLzhRgvT19kB3bvFZi1AXa0Pxr+j0xJqwspP239Gq5qezlS5Bv/DrHdmGHA
# jaqSwg==
# =XgUE
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 01:33:54 PM AEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20240723' of https://gitlab.com/rth7680/qemu:
target/riscv: Simplify probing in vext_ldff
target/s390x: Use set/clear_helper_retaddr in mem_helper.c
target/s390x: Use user_or_likely in access_memmove
target/s390x: Use user_or_likely in do_access_memset
target/ppc: Improve helper_dcbz for user-only
target/ppc: Merge helper_{dcbz,dcbzep}
target/ppc: Split out helper_dbczl for 970
target/ppc: Hoist dcbz_size out of dcbz_common
target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
target/arm: Use set/clear_helper_retaddr in helper-a64.c
accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Use of these in helpers goes hand-in-hand with tlb_vaddr_to_host
and other probing functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
is spawned. This is common to all the architectures as of now.
Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
Qemu is parked.
Refactor architecture common logic so that some APIs could be reused by vCPU
Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs
with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally
called. No functional change is intended here.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716111502.202344-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).
The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.
For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.
To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.
Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.
All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.
To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock
Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
adresses.
- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
the moment.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
data was correctly copied, but size of array was not set
(g_array_sized_new only reserves memory, but does not set size).
As a result, callbacks were not called for code path relying on
plugin_register_vcpu_mem_cb().
Found when trying to trigger mem access callbacks for atomic
instructions.
Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240706191335.878142-2-pierrick.bouvier@linaro.org>
Message-Id: <20240718094523.1198645-6-alex.bennee@linaro.org>
|
|
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
|
The TCGCPUOps::cpu_exec_interrupt hook is currently not mandatory; if
it is left NULL then we treat it as if it had returned false. However
since pretty much every architecture needs to handle interrupts,
almost every target we have provides the hook. The one exception is
Tricore, which doesn't currently implement the architectural
interrupt handling.
Add a "do nothing" implementation of cpu_exec_hook for Tricore,
assert on startup that the CPU does provide the hook, and remove
the runtime NULL check before calling it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240712113949.4146855-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Now that all targets set TCGCPUOps::cpu_exec_halt, we can make it
mandatory and remove the fallback handling that calls cpu_has_work.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|