aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)Author
2018-10-02timer: introduce new virtual clockPavel Dovgalyuk
Slirp and VNC modules use virtual clock for processing some events that are related to the guest execution speed. But virtual clock-related events are consideres to be deterministic and are recorded/replayed by icount mechanism. But slirp and VNC lie outside the recorded guest core (which includes CPU and peripherals). Therefore slirp and VNC are external for the guest, but should work at guest speed. This patch introduces new virtual clock which can be used for external subsystems for running timers that are synchronized with the guest. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180912082002.3228.82417.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02util: use fcntl() for qemu_write_pidfile() lockingMarc-André Lureau
Daniel Berrangé suggested to use fcntl() locks rather than lockf(). 'man lockf': On Linux, lockf() is just an interface on top of fcntl(2) locking. Many other systems implement lockf() in this way, but note that POSIX.1 leaves the relationship between lockf() and fcntl(2) locks unspecified. A portable application should probably avoid mixing calls to these interfaces. IOW, if its just a shim around fcntl() on many systems, it is clearer if we just use fcntl() directly, as we then know how fcntl() locks will behave if they're on a network filesystem like NFS. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180831145314.14736-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02util: add qemu_write_pidfile()Marc-André Lureau
There are variants of qemu_create_pidfile() in qemu-pr-helper and qemu-ga. Let's have a common implementation in libqemuutil. The code is initially based from pr-helper write_pidfile(), with various improvements and suggestions from Daniel Berrangé: QEMU will leave the pidfile existing on disk when it exits which initially made me think it avoids the deletion race. The app managing QEMU, however, may well delete the pidfile after it has seen QEMU exit, and even if the app locks the pidfile before deleting it, there is still a race. eg consider the following sequence QEMU 1 libvirtd QEMU 2 1. lock(pidfile) 2. exit() 3. open(pidfile) 4. lock(pidfile) 5. open(pidfile) 6. unlink(pidfile) 7. close(pidfile) 8. lock(pidfile) IOW, at step 8 the new QEMU has successfully acquired the lock, but the pidfile no longer exists on disk because it was deleted after the original QEMU exited. While we could just say no external app should ever delete the pidfile, I don't think that is satisfactory as people don't read docs, and admins don't like stale pidfiles being left around on disk. To make this robust, I think we might want to copy libvirt's approach to pidfile acquisition which runs in a loop and checks that the file on disk /after/ acquiring the lock matches the file that was locked. Then we could in fact safely let QEMU delete its own pidfiles on clean exit.. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180831145314.14736-2-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02hostmem-memfd: add checks before adding hostmem-memfd & propertiesMarc-André Lureau
Run some memfd-related checks before registering hostmem-memfd & various properties. This will help libvirt to figure out what the host is supposed to be capable of. qemu_memfd_check() is changed to a less optimized version, since it is used with various flags, it no longer caches the result. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180906161415.8543-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02qsp: use atomic64 accessorsEmilio G. Cota
With the seqlock, we either have to use atomics to remain within defined behaviour (and note that 64-bit atomics aren't always guaranteed to compile, irrespective of __nocheck), or drop the atomics and be in undefined behaviour territory. Fix it by dropping the seqlock and using atomic64 accessors. This will limit scalability when !CONFIG_ATOMIC64, but those machines (1) don't have many users and (2) are unlikely to have many cores. - With CONFIG_ATOMIC64: $ tests/atomic_add-bench -n 1 -m -p Throughput: 13.00 Mops/s - Forcing !CONFIG_ATOMIC64: $ tests/atomic_add-bench -n 1 -m -p Throughput: 10.89 Mops/s Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20180910232752.31565-5-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02util: add atomic64Emilio G. Cota
This introduces read/set accessors for int64_t and uint64_t. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20180910232752.31565-3-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02cacheinfo: add i/d cache_linesize_logEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20180910232752.31565-2-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-28Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180926' into stagingPeter Maydell
Queued tcg patches # gpg: Signature made Wed 26 Sep 2018 19:27:22 BST # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20180926: tcg/i386: fix vector operations on 32-bit hosts qht-bench: add -p flag to precompute hash values qht: constify arguments to some internal functions qht: constify qht_statistics_init qht: constify qht_lookup qht: fix comment in qht_bucket_remove_entry qht: drop ht argument from qht iterators test-qht: speed up + test qht_resize test-qht: test deletion of the last entry in a bucket test-qht: test removal of non-existent entries test-qht: test qht_iter_remove qht: add qht_iter_remove qht: remove unused map param from qht_remove__locked Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-28Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into ↵Peter Maydell
staging Block and testing patches - Paolo's AIO fixes. - VMDK streamOptimized corner case fix - VM testing improvment on -cpu # gpg: Signature made Wed 26 Sep 2018 03:54:08 BST # gpg: using RSA key CA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/staging-pull-request: vmdk: align end of file to a sector boundary tests/vm: Use -cpu max rather than -cpu host aio-posix: do skip system call if ctx->notifier polling succeeds aio-posix: compute timeout before polling aio-posix: fix concurrent access to poll_disable_cnt Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-26qht: constify arguments to some internal functionsEmilio G. Cota
These functions do not modify their @ht or @bucket arguments. Constify those arguments. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: constify qht_statistics_initEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: constify qht_lookupEmilio G. Cota
seqlock_read_begin takes a const param since c04649eeea ("seqlock: constify seqlock_read_begin", 2018-08-23), so we can constify the entire lookup. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: fix comment in qht_bucket_remove_entryEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: drop ht argument from qht iteratorsEmilio G. Cota
Accessing the HT from an iterator results almost always in a deadlock. Given that only one qht-internal function uses this argument, drop it from the interface. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: add qht_iter_removeEmilio G. Cota
This currently has no users, but the use case is so common that I think we must support it. Note that without the appended we cannot safely remove a set of elements; a 2-step approach (i.e. qht_iter first, keep track of the to-be-deleted elements, and then a bunch of qht_remove calls) would be racy, since between the iteration and the removals other threads might insert additional elements. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26qht: remove unused map param from qht_remove__lockedEmilio G. Cota
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-09-26aio-posix: do skip system call if ctx->notifier polling succeedsPaolo Bonzini
Commit 70232b5253 ("aio-posix: Don't count ctx->notifier as progress when 2018-08-15), by not reporting progress, causes aio_poll to execute the system call when polling succeeds because of ctx->notifier. This introduces latency before the call to aio_bh_poll() and negates the advantages of polling, unfortunately. The fix builds on the previous patch, separating the effect of polling on the timeout from the progress reported to aio_poll(). ctx->notifier does zero the timeout, causing the caller to skip the system call, but it does not report progress, so that the bug fix of commit 70232b5253 still stands. Fixes: 70232b5253a3c4e03ed1ac47ef9246a8ac66c6fa Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180912171040.1732-4-pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-09-26aio-posix: compute timeout before pollingPaolo Bonzini
This is a preparation for the next patch, and also a very small optimization. Compute the timeout only once, before invoking try_poll_mode, and adjust it in run_poll_handlers. The adjustment is the polling time when polling fails, or zero (non-blocking) if polling succeeds. Fixes: 70232b5253a3c4e03ed1ac47ef9246a8ac66c6fa Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180912171040.1732-3-pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-09-26aio-posix: fix concurrent access to poll_disable_cntPaolo Bonzini
It is valid for an aio_set_fd_handler to happen concurrently with aio_poll. In that case, poll_disable_cnt can change under the heels of aio_poll, and the assertion on poll_disable_cnt can fail in run_poll_handlers. Therefore, this patch simply checks the counter on every polling iteration. There are no particular needs for ordering, since the polling loop is terminated anyway by aio_notify at the end of aio_set_fd_handler. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180912171040.1732-2-pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-09-25Merge remote-tracking branch ↵Peter Maydell
'remotes/huth-gitlab/tags/pull-request-2018-09-25' into staging - Deprecate the usage of a network backend via "name" instead of "id" - Deprecate the "enforce-config-section" machine parameter - Re-enable the wdt_ib700, endianness and vmxnet3 qtests - Some trivial fixes and doc update patches that crossed my way # gpg: Signature made Tue 25 Sep 2018 16:58:42 BST # gpg: using RSA key 2ED9D774FE702DB5 # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" # gpg: aka "Thomas Huth <thuth@redhat.com>" # gpg: aka "Thomas Huth <huth@tuxfamily.org>" # gpg: aka "Thomas Huth <th.huth@posteo.de>" # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/huth-gitlab/tags/pull-request-2018-09-25: Revert "check: Move VMXNET3 test to common" Revert "check: Move endianess test to common" Revert "check: Move wdt_ib700 test to common" tests/migration: Speed up the test on ppc64 hw/qdev-core: Fix description of instance_init qdev: fix a typo in comment docs: Fix some typos (most found by codespell) trivial: Make bios files and source files non-executable memfd: fix possible usage of the uninitialized file descriptor hw/core/machine: Officially deprecate the enforce-config-section parameter net/slirp: Deprecate the [hub_id name] parameter tuple net: Deprecate the "name" parameter of -net Makefile: Add missing dependency for qemu-deprecated.texi Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25memfd: fix possible usage of the uninitialized file descriptorDima Stepanov
The qemu_memfd_alloc_check() routine allocates the fd variable on stack. This variable is initialized inside the qemu_memfd_alloc() function. There are several cases when *fd will be left unintialized which can lead to the unexpected close() in the qemu_memfd_free() call. Set file descriptor to -1 before calling the qemu_memfd_alloc routine. Signed-off-by: Dima Stepanov <dimastep@yandex-team.ru> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2018-09-25block: Use a single global AioWaitKevin Wolf
When draining a block node, we recurse to its parent and for subtree drains also to its children. A single AIO_WAIT_WHILE() is then used to wait for bdrv_drain_poll() to become true, which depends on all of the nodes we recursed to. However, if the respective child or parent becomes quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait of the original node. Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE(). This may mean that the draining thread gets a few more unnecessary wakeups because an unrelated operation got completed, but we already wake it up when something _could_ have changed rather than only if it has certainly changed. Apart from that, drain is a slow path anyway. In theory it would be possible to use wakeups more selectively and still correctly, but the gains are likely not worth the additional complexity. In fact, this patch is a nice simplification for some places in the code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25block: Add missing locking in bdrv_co_drain_bh_cb()Kevin Wolf
bdrv_do_drained_begin/end() assume that they are called with the AioContext lock of bs held. If we call drain functions from a coroutine with the AioContext lock held, we yield and schedule a BH to move out of coroutine context. This means that the lock for the home context of the coroutine is released and must be re-acquired in the bottom half. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cbSergio Lopez
AIO Coroutines shouldn't by managed by an AioContext different than the one assigned when they are created. aio_co_enter avoids entering a coroutine from a different AioContext, calling aio_co_schedule instead. Scheduled coroutines are then entered by co_schedule_bh_cb using qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the current AioContext obtained with qemu_get_current_aio_context. Eventually, co->ctx will be set to the AioContext passed as an argument to qemu_aio_coroutine_enter. This means that, if an IO Thread's AioConext is being processed by the Main Thread (due to aio_poll being called with a BDS AioContext, as it happens in AIO_WAIT_WHILE among other places), the AioContext from some coroutines may be wrongly replaced with the one from the Main Thread. This is the root cause behind some crashes, mainly triggered by the drain code at block/io.c. The most common are these abort and failed assertion: util/async.c:aio_co_schedule 456 if (scheduled) { 457 fprintf(stderr, 458 "%s: Co-routine was already scheduled in '%s'\n", 459 __func__, scheduled); 460 abort(); 461 } util/qemu-coroutine-lock.c: 286 assert(mutex->holder == self); But it's also known to cause random errors at different locations, and even SIGSEGV with broken coroutine backtraces. By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can pass the correct AioContext as an argument, making sure co->ctx is not wrongly altered. Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-24qemu-error: add {error, warn}_report_once_condCornelia Huck
Add two functions to print an error/warning report once depending on a passed-in condition variable and flip it if printed. This is useful if you want to print a message not once-globally, but e.g. once-per-device. Inspired by warn_once() in hw/vfio/ccw.c, which has been replaced with warn_report_once_cond(). Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20180830145902.27376-2-cohuck@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Function comments reworded] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-27Merge remote-tracking branch ↵Peter Maydell
'remotes/kraxel/tags/ui-20180827-v4-pull-request' into staging ui: misc fixes which piled up during 3.0 release freeze # gpg: Signature made Mon 27 Aug 2018 09:53:07 BST # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/ui-20180827-v4-pull-request: util: promote qemu_egl_rendernode_open() to libqemuutil dmabuf: add y0_top, pass it to spice ui/vnc: Remove useless parenthesis around DIV_ROUND_UP macro ui/sdl2: Fix broken -full-screen CLI option spice-display: fix qemu_spice_cursor_refresh_bh locking spice-display: access ptr_x/ptr_y under Mutex vnc: remove support for deprecated tls, x509, x509verify options doc: switch to modern syntax for VNC TLS setup sdl2: redraw correctly when scanout_mode enabled. ui: use enum to string helpers vnc: fix memleak of the "vnc-worker-output" name ui/sdl2: Remove the obsolete SDL_INIT_NOPARACHUTE flag Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-27util: promote qemu_egl_rendernode_open() to libqemuutilMarc-André Lureau
vhost-user-gpu will share the same code to open a DRM node. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180713130916.4153-20-marcandre.lureau@redhat.com> [ kraxel: buildfix: util/drm.o must be CONFIG_OPENGL not CONFIG_LINUX ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-08-24json: Reject invalid UTF-8 sequencesMarkus Armbruster
We reject bytes that can't occur in valid UTF-8 (\xC0..\xC1, \xF5..\xFF in the lexer. That's insufficient; there's plenty of invalid UTF-8 not containing these bytes, as demonstrated by check-qjson: * Malformed sequences - Unexpected continuation bytes - Missing continuation bytes after start bytes other than \xC0..\xC1, \xF5..\xFD. * Overlong sequences with start bytes other than \xC0..\xC1, \xF5..\xFD. * Invalid code points Fixing this in the lexer would be bothersome. Fixing it in the parser is straightforward, so do that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-23-armbru@redhat.com>
2018-08-23util/oslib-win32: indicate alignment for qemu_anon_ram_alloc()David Hildenbrand
Let's set the alignment just like for the posix variant. This will implicitly set the alignment of the underlying memory region and therefore make memory_region_get_alignment(mr) return something > 0 for all memory backends applicable to PCDIMM/NVDIMM. The allocation granularity is ususally 64k, while the page size is 4k. The documentation of VirtualAlloc is not really comprehensible in case only MEM_COMMIT is specified without an address. We'll detect the actual values and then go for the bigger one. The expection is, that it will always be 64k aligned. (The assumption is that MEM_COMMIT does an implicit MEM_RESERVE, so the address will always be aligned to the allocation granularity. And the allocation granularity is always bigger than the page size). This will allow us to drop special handling in pc.c for memory_region_get_alignment(mr) == 0, as we can then assume that it is always set (and AFAICS >= getpagesize()). For pc in pc_memory_plug(), under Windows TARGET_PAGE_SIZE == getpagesize(), therefore alignment of DIMMs will not change, and therefore also not the guest physical memory layout. For spapr in spapr_memory_plug(), an alignment of 0 would have been used until now. As QEMU_ALIGN_UP will crash with the alignment being 0, this never worked, so we don't have to care about compatibility handling. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180801133444.11269-3-david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23module: Use QEMU_MODULE_DIR as a search pathryang
The current paths for modules are CONFIG_QEMU_MODDIR and paths relative to the executable. Qemu and its modules can be installed and executed in paths that are different from these search paths. This change allows a search path to be specified by environment variable. An example usage for this is postmarketOS[1]. This is a build environment for Alpine Linux. It sets up Alpine Linux in a chroot environment. Alpine's Qemu packages are installed in the chroot. The Alpine Linux Qemu package is used to test compiled Alpine Linux system images. This way there isn't a reliance on the which ever version of Qemu the host system / distro provides. postmarketOS executes Qemu on host system outside of the chroot The Qemu module search path needs to point to the location of the chroot relative to the host system. e.g. The root of the Alpine Linux chroot is: ~/.local/var/pmbootstrap/chroot_native/ Alpine's Qemu is installed at ~/.local/var/pmbootstrap/chroot_native/usr/bin/ The Qemu module search path needs to be: QEMU_MODULE_DIR=~/.local/var/pmbootstrap/chroot_native/usr/lib/qemu/ [1] https://postmarketos.org/ Signed-off-by: ryang <decatf@gmail.com> Message-Id: <20180704181010.GA918@computer> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: track BQL callers explicitlyEmilio G. Cota
The BQL is acquired via qemu_mutex_lock_iothread(), which makes the profiler assign the associated wait time (i.e. most of BQL wait time) entirely to that function. This loses the original call site information, which does not help diagnose BQL contention. Fix it by tracking the callers explicitly. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: support call site coalescingEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: add qsp_resetEmilio G. Cota
I first implemented this by deleting all entries in the global hash table. But doing that safely slows down profiling, since we'd need to introduce rcu_read_lock/unlock in the fast path. What's implemented here avoids messing with the thread-local data in the global hash table. It achieves this by taking a snapshot of the current state, so that subsequent reports present the delta wrt to the snapshot. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: add sort_by option to qsp_reportEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: QEMU's Synchronization ProfilerEmilio G. Cota
The goal of this module is to profile synchronization primitives (i.e. mutexes, recursive mutexes and condition variables) so that scalability issues can be quickly diagnosed. Sync primitives are profiled by QSP based on the vaddr of the object accessed as well as the call site (file:line_nr). That means the same object called from two different call sites will be tracked in separate entries, which might be reported together or separately (see subsequent commit on call site coalescing). Some perf numbers: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz Command: taskset -c 0 tests/atomic_add-bench -d 5 -m - Before: 54.80 Mops/s - After: 54.75 Mops/s That is, a negligible slowdown due to the now indirect call to qemu_mutex_lock. Note that using a branch instead of an indirect call introduces a more severe slowdown (53.65 Mops/s, i.e. 2% slowdown). Enabling the profiler (with -p, added in this series) is more interesting: - No profiling: 54.75 Mops/s - W/ profiling: 12.53 Mops/s That is, a 4.36X slowdown. We can break down this slowdown by removing the get_clock calls or the entry lookup: - No profiling: 54.75 Mops/s - W/o get_clock: 25.37 Mops/s - W/o entry lookup: 19.30 Mops/s - W/ profiling: 12.53 Mops/s Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-15aio-posix: Improve comment around marking node deletedFam Zheng
The counter is for qemu_lockcnt_inc/dec sections (read side), qemu_lockcnt_lock/unlock is for the write side. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180803063917.30292-1-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15aio: Do aio_notify_accept only during blocking aio_pollFam Zheng
An aio_notify() pairs with an aio_notify_accept(). The former should happen in the main thread or a vCPU thread, and the latter should be done in the IOThread. There is one rare case that the main thread or vCPU thread may "steal" the aio_notify() event just raised by itself, in bdrv_set_aio_context() [1]. The sequence is like this: main thread IO Thread =============================================================== bdrv_drained_begin() aio_disable_external(ctx) aio_poll(ctx, true) ctx->notify_me += 2 ... bdrv_drained_end() ... aio_notify() ... bdrv_set_aio_context() aio_poll(ctx, false) [1] aio_notify_accept(ctx) ppoll() /* Hang! */ [1] is problematic. It will clear the ctx->notifier event so that the blocked ppoll() will not return. (For the curious, this bug was noticed when booting a number of VMs simultaneously in RHV. One or two of the VMs will hit this race condition, making the VIRTIO device unresponsive to I/O commands. When it hangs, Seabios is busy waiting for a read request to complete (read MBR), right after initializing the virtio-blk-pci device, using 100% guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750 for the original bug analysis.) aio_notify() only injects an event when ctx->notify_me is set, correspondingly aio_notify_accept() is only useful when ctx->notify_me _was_ set. Move the call to it into the "blocking" branch. This will effectively skip [1] and fix the hang. Furthermore, blocking aio_poll is only allowed on home thread (in_aio_context_home_thread), because otherwise two blocking aio_poll()'s can steal each other's ctx->notifier event and cause hanging just like described above. Cc: qemu-stable@nongnu.org Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180809132259.18402-3-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15aio-posix: Don't count ctx->notifier as progress when pollingFam Zheng
The same logic exists in fd polling. This change is especially important to avoid busy loop once we limit aio_notify_accept() to blocking aio_poll(). Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180809132259.18402-2-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-07-30timer: remove replay clock probe in deadline calculationPavel Dovgalyuk
Ciro Santilli reported that commit a5ed352596a8b7eb2f9acce34371b944ac3056c4 breaks the execution replay. It happens due to the probing the clock for the new instances of iothread. However, this probing was made in replay mode for the timer lists that are empty. This patch removes clock probing in replay mode. It is an artifact of the old version with another thread model. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180725121526.12867.17866.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-17opts: remove redundant check for NULL parameterDaniel P. Berrangé
No callers of get_opt_value() pass in a NULL for the "value" parameter, so the check is redundant. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-4-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-17i386: fix regression parsing multiboot initrd modulesDaniel P. Berrangé
The logic for parsing the multiboot initrd modules was messed up in commit 950c4e6c94b15cd0d8b63891dddd7a8dbf458e6a Author: Daniel P. Berrangé <berrange@redhat.com> Date: Mon Apr 16 12:17:43 2018 +0100 opts: don't silently truncate long option values Causing the length to be undercounter, and the number of modules over counted. It also passes NULL to get_opt_value() which was not robust at accepting a NULL value. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-2-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-29Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into stagingPeter Maydell
The Darwin host support still needs some more work. It won't make it for soft-freeze, but I'd like these preparatory patches to be merged anyway. # gpg: Signature made Fri 29 Jun 2018 11:39:04 BST # gpg: using RSA key 71D4D5E5822F73D6 # gpg: Good signature from "Greg Kurz <groug@kaod.org>" # gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" # gpg: aka "[jpeg image of size 3330]" # Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6 * remotes/gkurz/tags/for-upstream: 9p: darwin: Explicitly cast comparisons of mode_t with -1 cutils: Provide strchrnul Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29Merge remote-tracking branch 'remotes/berrange/tags/min-glib-pull-request' ↵Peter Maydell
into staging glib: update the min required version This updates the minimum required glib version to 2.40 # gpg: Signature made Fri 29 Jun 2018 12:24:58 BST # gpg: using RSA key BE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/min-glib-pull-request: glib: enforce the minimum required version and warn about old APIs glib: bump min required glib library version to 2.40 util: remove redundant include of glib.h and add osdep.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* "info mtree" improvements (Alexey) * fake VPD block limits for SCSI passthrough (Daniel Barboza) * chardev and main loop fixes (Daniel Berrangé, Sergio, Stefan) * help fixes (Eduardo) * pc-dimm refactoring (David) * tests improvements and fixes (Emilio, Thomas) * SVM emulation fixes (Jan) * MemoryRegionCache fix (Eric) * WHPX improvements (Justin) * ESP cleanup (Mark) * -overcommit option (Michael) * qemu-pr-helper fixes (me) * "info pic" improvements for x86 (Peter) * x86 TCG emulation fixes (Richard) * KVM slot handling fix (Shannon) * Next round of deprecation (Thomas) * Windows dump format support (Viktor) # gpg: Signature made Fri 29 Jun 2018 12:03:05 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (60 commits) tests/boot-serial: Do not delete the output file in case of errors hw/scsi: add VPD Block Limits emulation hw/scsi: centralize SG_IO calls into single function hw/scsi: cleanups before VPD BL emulation dump: add Windows live system dump dump: add fallback KDBG using in Windows dump dump: use system context in Windows dump dump: add Windows dump format to dump-guest-memory i386/cpu: make -cpu host support monitor/mwait kvm: support -overcommit cpu-pm=on|off hmp: obsolete "info ioapic" ioapic: support "info irq" ioapic: some proper indents when dump info ioapic: support "info pic" doc: another fix to "info pic" target-i386: Mark cpu_vmexit noreturn target-i386: Allow interrupt injection after STGI target-i386: Add NMI interception to SVM memory/hmp: Print owners/parents in "info mtree" WHPX: register for unrecognized MSR exits ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29glib: bump min required glib library version to 2.40Daniel P. Berrangé
Per supported platforms doc[1], the various min glib on relevant distros is: RHEL-7: 2.50.3 Debian (Stretch): 2.50.3 Debian (Jessie): 2.42.1 OpenBSD (Ports): 2.54.3 FreeBSD (Ports): 2.50.3 OpenSUSE Leap 15: 2.54.3 SLE12-SP2: 2.48.2 Ubuntu (Xenial): 2.48.0 macOS (Homebrew): 2.56.0 This suggests that a minimum glib of 2.42 is a reasonable target. The GLibC compile farm, however, uses Ubuntu 14.04 (Trusty) which only has glib 2.40.0, and this is needed for testing during merge. Thus an exception is made to the documented platform support policy to allow for all three current LTS releases to be supported. Docker jobs that not longer satisfy this new min version are removed. [1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-06-29util: remove redundant include of glib.h and add osdep.hDaniel P. Berrangé
Code must only ever include glib.h indirectly via the glib-compat.h header file, because we will need some macros set before glib.h is pulled in. Adding extra includes of glib.h will (soon) cause compile failures such as: In file included from /home/berrange/src/virt/qemu/include/qemu/osdep.h:107, from /home/berrange/src/virt/qemu/include/qemu/iova-tree.h:26, from util/iova-tree.c:13: /home/berrange/src/virt/qemu/include/glib-compat.h:22: error: "GLIB_VERSION_MIN_REQUIRED" redefined [-Werror] #define GLIB_VERSION_MIN_REQUIRED GLIB_VERSION_2_40 In file included from /usr/include/glib-2.0/glib/gtypes.h:34, from /usr/include/glib-2.0/glib/galloca.h:32, from /usr/include/glib-2.0/glib.h:30, from util/iova-tree.c:12: /usr/include/glib-2.0/glib/gversionmacros.h:237: note: this is the location of the previous definition # define GLIB_VERSION_MIN_REQUIRED (GLIB_VERSION_CUR_STABLE) Furthermore, the osdep.h include should always be done directly from the .c file rather than indirectly via any .h file. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-06-29cutils: Provide strchrnulKeno Fischer
strchrnul is a GNU extension and thus unavailable on a number of targets. In the review for a commit removing strchrnul from 9p, I was asked to create a qemu_strchrnul helper to factor out this functionality. Do so, and use it in a number of other places in the code base that inlined the replacement pattern in a place where strchrnul could be used. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-28QemuMutex: support --enable-debug-mutexPaolo Bonzini
We have had some tracing tools for mutex but it's not easy to use them for e.g. dead locks. Let's provide "--enable-debug-mutex" parameter when configure to allow QemuMutex to store the last owner that took specific lock. It will be easy to use this tool to debug deadlocks since we can directly know who took the lock then as long as we can have a debugger attached to the process. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180425025459.5258-4-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28qemu-thread: introduce qemu-thread-common.hPeter Xu
Introduce some hooks for the shared part of qemu thread between POSIX and Windows implementations. Note that in qemu_mutex_unlock_impl() we moved the call before unlock operation which should make more sense. And we don't need qemu_mutex_post_unlock() hook. Put all these shared hooks into the header files. It should be internal to qemu-thread but not for qemu-thread users, hence put into util/ directory. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180425025459.5258-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-27linux-aio: properly bubble up errors from initializationNishanth Aravamudan
laio_init() can fail for a couple of reasons, which will lead to a NULL pointer dereference in laio_attach_aio_context(). To solve this, add a aio_setup_linux_aio() function which is called early in raw_open_common. If this fails, propagate the error up. The signature of aio_get_linux_aio() was not modified, because it seems preferable to return the actual errno from the possible failing initialization calls. Additionally, when the AioContext changes, we need to associate a LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context callback and call the new aio_setup_linux_aio(), which will allocate a new AioContext if needed, and return errors on failures. If it fails for any reason, fallback to threaded AIO with an error message, as the device is already in-use by the guest. Add an assert that aio_get_linux_aio() cannot return NULL. Signed-off-by: Nishanth Aravamudan <naravamudan@digitalocean.com> Message-id: 20180622193700.6523-1-naravamudan@digitalocean.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>