aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)Author
2018-08-24json: Reject invalid UTF-8 sequencesMarkus Armbruster
We reject bytes that can't occur in valid UTF-8 (\xC0..\xC1, \xF5..\xFF in the lexer. That's insufficient; there's plenty of invalid UTF-8 not containing these bytes, as demonstrated by check-qjson: * Malformed sequences - Unexpected continuation bytes - Missing continuation bytes after start bytes other than \xC0..\xC1, \xF5..\xFD. * Overlong sequences with start bytes other than \xC0..\xC1, \xF5..\xFD. * Invalid code points Fixing this in the lexer would be bothersome. Fixing it in the parser is straightforward, so do that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-23-armbru@redhat.com>
2018-08-23util/oslib-win32: indicate alignment for qemu_anon_ram_alloc()David Hildenbrand
Let's set the alignment just like for the posix variant. This will implicitly set the alignment of the underlying memory region and therefore make memory_region_get_alignment(mr) return something > 0 for all memory backends applicable to PCDIMM/NVDIMM. The allocation granularity is ususally 64k, while the page size is 4k. The documentation of VirtualAlloc is not really comprehensible in case only MEM_COMMIT is specified without an address. We'll detect the actual values and then go for the bigger one. The expection is, that it will always be 64k aligned. (The assumption is that MEM_COMMIT does an implicit MEM_RESERVE, so the address will always be aligned to the allocation granularity. And the allocation granularity is always bigger than the page size). This will allow us to drop special handling in pc.c for memory_region_get_alignment(mr) == 0, as we can then assume that it is always set (and AFAICS >= getpagesize()). For pc in pc_memory_plug(), under Windows TARGET_PAGE_SIZE == getpagesize(), therefore alignment of DIMMs will not change, and therefore also not the guest physical memory layout. For spapr in spapr_memory_plug(), an alignment of 0 would have been used until now. As QEMU_ALIGN_UP will crash with the alignment being 0, this never worked, so we don't have to care about compatibility handling. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180801133444.11269-3-david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23module: Use QEMU_MODULE_DIR as a search pathryang
The current paths for modules are CONFIG_QEMU_MODDIR and paths relative to the executable. Qemu and its modules can be installed and executed in paths that are different from these search paths. This change allows a search path to be specified by environment variable. An example usage for this is postmarketOS[1]. This is a build environment for Alpine Linux. It sets up Alpine Linux in a chroot environment. Alpine's Qemu packages are installed in the chroot. The Alpine Linux Qemu package is used to test compiled Alpine Linux system images. This way there isn't a reliance on the which ever version of Qemu the host system / distro provides. postmarketOS executes Qemu on host system outside of the chroot The Qemu module search path needs to point to the location of the chroot relative to the host system. e.g. The root of the Alpine Linux chroot is: ~/.local/var/pmbootstrap/chroot_native/ Alpine's Qemu is installed at ~/.local/var/pmbootstrap/chroot_native/usr/bin/ The Qemu module search path needs to be: QEMU_MODULE_DIR=~/.local/var/pmbootstrap/chroot_native/usr/lib/qemu/ [1] https://postmarketos.org/ Signed-off-by: ryang <decatf@gmail.com> Message-Id: <20180704181010.GA918@computer> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: track BQL callers explicitlyEmilio G. Cota
The BQL is acquired via qemu_mutex_lock_iothread(), which makes the profiler assign the associated wait time (i.e. most of BQL wait time) entirely to that function. This loses the original call site information, which does not help diagnose BQL contention. Fix it by tracking the callers explicitly. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: support call site coalescingEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: add qsp_resetEmilio G. Cota
I first implemented this by deleting all entries in the global hash table. But doing that safely slows down profiling, since we'd need to introduce rcu_read_lock/unlock in the fast path. What's implemented here avoids messing with the thread-local data in the global hash table. It achieves this by taking a snapshot of the current state, so that subsequent reports present the delta wrt to the snapshot. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: add sort_by option to qsp_reportEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23qsp: QEMU's Synchronization ProfilerEmilio G. Cota
The goal of this module is to profile synchronization primitives (i.e. mutexes, recursive mutexes and condition variables) so that scalability issues can be quickly diagnosed. Sync primitives are profiled by QSP based on the vaddr of the object accessed as well as the call site (file:line_nr). That means the same object called from two different call sites will be tracked in separate entries, which might be reported together or separately (see subsequent commit on call site coalescing). Some perf numbers: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz Command: taskset -c 0 tests/atomic_add-bench -d 5 -m - Before: 54.80 Mops/s - After: 54.75 Mops/s That is, a negligible slowdown due to the now indirect call to qemu_mutex_lock. Note that using a branch instead of an indirect call introduces a more severe slowdown (53.65 Mops/s, i.e. 2% slowdown). Enabling the profiler (with -p, added in this series) is more interesting: - No profiling: 54.75 Mops/s - W/ profiling: 12.53 Mops/s That is, a 4.36X slowdown. We can break down this slowdown by removing the get_clock calls or the entry lookup: - No profiling: 54.75 Mops/s - W/o get_clock: 25.37 Mops/s - W/o entry lookup: 19.30 Mops/s - W/ profiling: 12.53 Mops/s Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-15aio-posix: Improve comment around marking node deletedFam Zheng
The counter is for qemu_lockcnt_inc/dec sections (read side), qemu_lockcnt_lock/unlock is for the write side. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180803063917.30292-1-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15aio: Do aio_notify_accept only during blocking aio_pollFam Zheng
An aio_notify() pairs with an aio_notify_accept(). The former should happen in the main thread or a vCPU thread, and the latter should be done in the IOThread. There is one rare case that the main thread or vCPU thread may "steal" the aio_notify() event just raised by itself, in bdrv_set_aio_context() [1]. The sequence is like this: main thread IO Thread =============================================================== bdrv_drained_begin() aio_disable_external(ctx) aio_poll(ctx, true) ctx->notify_me += 2 ... bdrv_drained_end() ... aio_notify() ... bdrv_set_aio_context() aio_poll(ctx, false) [1] aio_notify_accept(ctx) ppoll() /* Hang! */ [1] is problematic. It will clear the ctx->notifier event so that the blocked ppoll() will not return. (For the curious, this bug was noticed when booting a number of VMs simultaneously in RHV. One or two of the VMs will hit this race condition, making the VIRTIO device unresponsive to I/O commands. When it hangs, Seabios is busy waiting for a read request to complete (read MBR), right after initializing the virtio-blk-pci device, using 100% guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750 for the original bug analysis.) aio_notify() only injects an event when ctx->notify_me is set, correspondingly aio_notify_accept() is only useful when ctx->notify_me _was_ set. Move the call to it into the "blocking" branch. This will effectively skip [1] and fix the hang. Furthermore, blocking aio_poll is only allowed on home thread (in_aio_context_home_thread), because otherwise two blocking aio_poll()'s can steal each other's ctx->notifier event and cause hanging just like described above. Cc: qemu-stable@nongnu.org Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180809132259.18402-3-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-08-15aio-posix: Don't count ctx->notifier as progress when pollingFam Zheng
The same logic exists in fd polling. This change is especially important to avoid busy loop once we limit aio_notify_accept() to blocking aio_poll(). Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180809132259.18402-2-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2018-07-30timer: remove replay clock probe in deadline calculationPavel Dovgalyuk
Ciro Santilli reported that commit a5ed352596a8b7eb2f9acce34371b944ac3056c4 breaks the execution replay. It happens due to the probing the clock for the new instances of iothread. However, this probing was made in replay mode for the timer lists that are empty. This patch removes clock probing in replay mode. It is an artifact of the old version with another thread model. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180725121526.12867.17866.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-17opts: remove redundant check for NULL parameterDaniel P. Berrangé
No callers of get_opt_value() pass in a NULL for the "value" parameter, so the check is redundant. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-4-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-17i386: fix regression parsing multiboot initrd modulesDaniel P. Berrangé
The logic for parsing the multiboot initrd modules was messed up in commit 950c4e6c94b15cd0d8b63891dddd7a8dbf458e6a Author: Daniel P. Berrangé <berrange@redhat.com> Date: Mon Apr 16 12:17:43 2018 +0100 opts: don't silently truncate long option values Causing the length to be undercounter, and the number of modules over counted. It also passes NULL to get_opt_value() which was not robust at accepting a NULL value. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-2-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-29Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into stagingPeter Maydell
The Darwin host support still needs some more work. It won't make it for soft-freeze, but I'd like these preparatory patches to be merged anyway. # gpg: Signature made Fri 29 Jun 2018 11:39:04 BST # gpg: using RSA key 71D4D5E5822F73D6 # gpg: Good signature from "Greg Kurz <groug@kaod.org>" # gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" # gpg: aka "[jpeg image of size 3330]" # Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6 * remotes/gkurz/tags/for-upstream: 9p: darwin: Explicitly cast comparisons of mode_t with -1 cutils: Provide strchrnul Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29Merge remote-tracking branch 'remotes/berrange/tags/min-glib-pull-request' ↵Peter Maydell
into staging glib: update the min required version This updates the minimum required glib version to 2.40 # gpg: Signature made Fri 29 Jun 2018 12:24:58 BST # gpg: using RSA key BE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/min-glib-pull-request: glib: enforce the minimum required version and warn about old APIs glib: bump min required glib library version to 2.40 util: remove redundant include of glib.h and add osdep.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* "info mtree" improvements (Alexey) * fake VPD block limits for SCSI passthrough (Daniel Barboza) * chardev and main loop fixes (Daniel Berrangé, Sergio, Stefan) * help fixes (Eduardo) * pc-dimm refactoring (David) * tests improvements and fixes (Emilio, Thomas) * SVM emulation fixes (Jan) * MemoryRegionCache fix (Eric) * WHPX improvements (Justin) * ESP cleanup (Mark) * -overcommit option (Michael) * qemu-pr-helper fixes (me) * "info pic" improvements for x86 (Peter) * x86 TCG emulation fixes (Richard) * KVM slot handling fix (Shannon) * Next round of deprecation (Thomas) * Windows dump format support (Viktor) # gpg: Signature made Fri 29 Jun 2018 12:03:05 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (60 commits) tests/boot-serial: Do not delete the output file in case of errors hw/scsi: add VPD Block Limits emulation hw/scsi: centralize SG_IO calls into single function hw/scsi: cleanups before VPD BL emulation dump: add Windows live system dump dump: add fallback KDBG using in Windows dump dump: use system context in Windows dump dump: add Windows dump format to dump-guest-memory i386/cpu: make -cpu host support monitor/mwait kvm: support -overcommit cpu-pm=on|off hmp: obsolete "info ioapic" ioapic: support "info irq" ioapic: some proper indents when dump info ioapic: support "info pic" doc: another fix to "info pic" target-i386: Mark cpu_vmexit noreturn target-i386: Allow interrupt injection after STGI target-i386: Add NMI interception to SVM memory/hmp: Print owners/parents in "info mtree" WHPX: register for unrecognized MSR exits ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-29glib: bump min required glib library version to 2.40Daniel P. Berrangé
Per supported platforms doc[1], the various min glib on relevant distros is: RHEL-7: 2.50.3 Debian (Stretch): 2.50.3 Debian (Jessie): 2.42.1 OpenBSD (Ports): 2.54.3 FreeBSD (Ports): 2.50.3 OpenSUSE Leap 15: 2.54.3 SLE12-SP2: 2.48.2 Ubuntu (Xenial): 2.48.0 macOS (Homebrew): 2.56.0 This suggests that a minimum glib of 2.42 is a reasonable target. The GLibC compile farm, however, uses Ubuntu 14.04 (Trusty) which only has glib 2.40.0, and this is needed for testing during merge. Thus an exception is made to the documented platform support policy to allow for all three current LTS releases to be supported. Docker jobs that not longer satisfy this new min version are removed. [1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-06-29util: remove redundant include of glib.h and add osdep.hDaniel P. Berrangé
Code must only ever include glib.h indirectly via the glib-compat.h header file, because we will need some macros set before glib.h is pulled in. Adding extra includes of glib.h will (soon) cause compile failures such as: In file included from /home/berrange/src/virt/qemu/include/qemu/osdep.h:107, from /home/berrange/src/virt/qemu/include/qemu/iova-tree.h:26, from util/iova-tree.c:13: /home/berrange/src/virt/qemu/include/glib-compat.h:22: error: "GLIB_VERSION_MIN_REQUIRED" redefined [-Werror] #define GLIB_VERSION_MIN_REQUIRED GLIB_VERSION_2_40 In file included from /usr/include/glib-2.0/glib/gtypes.h:34, from /usr/include/glib-2.0/glib/galloca.h:32, from /usr/include/glib-2.0/glib.h:30, from util/iova-tree.c:12: /usr/include/glib-2.0/glib/gversionmacros.h:237: note: this is the location of the previous definition # define GLIB_VERSION_MIN_REQUIRED (GLIB_VERSION_CUR_STABLE) Furthermore, the osdep.h include should always be done directly from the .c file rather than indirectly via any .h file. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-06-29cutils: Provide strchrnulKeno Fischer
strchrnul is a GNU extension and thus unavailable on a number of targets. In the review for a commit removing strchrnul from 9p, I was asked to create a qemu_strchrnul helper to factor out this functionality. Do so, and use it in a number of other places in the code base that inlined the replacement pattern in a place where strchrnul could be used. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-28QemuMutex: support --enable-debug-mutexPaolo Bonzini
We have had some tracing tools for mutex but it's not easy to use them for e.g. dead locks. Let's provide "--enable-debug-mutex" parameter when configure to allow QemuMutex to store the last owner that took specific lock. It will be easy to use this tool to debug deadlocks since we can directly know who took the lock then as long as we can have a debugger attached to the process. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180425025459.5258-4-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28qemu-thread: introduce qemu-thread-common.hPeter Xu
Introduce some hooks for the shared part of qemu thread between POSIX and Windows implementations. Note that in qemu_mutex_unlock_impl() we moved the call before unlock operation which should make more sense. And we don't need qemu_mutex_post_unlock() hook. Put all these shared hooks into the header files. It should be internal to qemu-thread but not for qemu-thread users, hence put into util/ directory. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180425025459.5258-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-27linux-aio: properly bubble up errors from initializationNishanth Aravamudan
laio_init() can fail for a couple of reasons, which will lead to a NULL pointer dereference in laio_attach_aio_context(). To solve this, add a aio_setup_linux_aio() function which is called early in raw_open_common. If this fails, propagate the error up. The signature of aio_get_linux_aio() was not modified, because it seems preferable to return the actual errno from the possible failing initialization calls. Additionally, when the AioContext changes, we need to associate a LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context callback and call the new aio_setup_linux_aio(), which will allocate a new AioContext if needed, and return errors on failures. If it fails for any reason, fallback to threaded AIO with an error message, as the device is already in-use by the guest. Add an assert that aio_get_linux_aio() cannot return NULL. Signed-off-by: Nishanth Aravamudan <naravamudan@digitalocean.com> Message-id: 20180622193700.6523-1-naravamudan@digitalocean.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-06-21Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180615' into stagingPeter Maydell
TCG patch queue: Workaround macos assembler lossage. Eliminate tb_lock. Fix TB code generation overflow. # gpg: Signature made Fri 15 Jun 2018 20:40:56 BST # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20180615: tcg: Reduce max TB opcode count tcg: remove tb_lock translate-all: remove tb_lock mention from cpu_restore_state_from_tb cputlb: remove tb_lock from tlb_flush functions translate-all: protect TB jumps with a per-destination-TB lock translate-all: discard TB when tb_link_page returns an existing matching TB translate-all: introduce assert_no_pages_locked translate-all: add page_locked assertions translate-all: use per-page locking in !user-mode translate-all: move tb_invalidate_phys_page_range up in the file translate-all: work page-by-page in tb_invalidate_phys_range_1 translate-all: remove hole in PageDesc translate-all: make l1_map lockless translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx tcg: track TBs with per-region BST's qht: return existing entry when qht_insert fails qht: require a default comparison function tcg/i386: Use byte form of xgetbv instruction Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-19Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches: - Active mirror (blockdev-mirror copy-mode=write-blocking) - bdrv_drain_*() fixes and test cases - Fix crash with scsi-hd and drive_del # gpg: Signature made Mon 18 Jun 2018 17:44:10 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (35 commits) iotests: Add test for active mirroring block/mirror: Add copy mode QAPI interface block/mirror: Add active mirroring job: Add job_progress_increase_remaining() block/mirror: Add MirrorBDSOpaque block/dirty-bitmap: Add bdrv_dirty_iter_next_area test-hbitmap: Add non-advancing iter_next tests hbitmap: Add @advance param to hbitmap_iter_next() block: Generalize should_update_child() rule block/mirror: Use source as a BdrvChild block/mirror: Wait for in-flight op conflicts block/mirror: Use CoQueue to wait on in-flight ops block/mirror: Convert to coroutines block/mirror: Pull out mirror_perform() block: fix QEMU crash with scsi-hd and drive_del test-bdrv-drain: Test graph changes in drain_all section block: Allow graph changes in bdrv_drain_all_begin/end sections block: ignore_bds_parents parameter for drain functions block: Move bdrv_drain_all_begin() out of coroutine context block: Allow AIO_WAIT_WHILE with NULL ctx ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-18hbitmap: Add @advance param to hbitmap_iter_next()Max Reitz
This new parameter allows the caller to just query the next dirty position without moving the iterator. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180613181823.13618-8-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-06-18monitor: add lock to protect mon_fdsetsPeter Xu
Introduce a new global big lock for mon_fdsets. Take it where needed. The monitor_fdset_get_fd() handling is a bit tricky: now we need to call qemu_mutex_unlock() which might pollute errno, so we need to make sure the correct errno be passed up to the callers. To make things simpler, we let monitor_fdset_get_fd() return the -errno directly when error happens, then in qemu_open() we move it back into errno. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180608035511.7439-8-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-06-15qht: return existing entry when qht_insert failsEmilio G. Cota
The meaning of "existing" is now changed to "matches in hash and ht->cmp result". This is saner than just checking the pointer value. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15qht: require a default comparison functionEmilio G. Cota
qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided. qht_insert will gain use of the default cmp in the next patch. Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd): 0000000000010750 <qht_lookup>: 10750: 89 d1 mov %edx,%ecx 10752: 48 89 f2 mov %rsi,%rdx 10755: 48 8b 77 08 mov 0x8(%rdi),%rsi 10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom> 1075e: 66 90 xchg %ax,%ax We get: 0000000000010740 <qht_lookup>: 10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx 10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom> 10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15block: Add block-specific QDict headerMax Reitz
There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-13Purge uses of banned g_assert_FOO()Markus Armbruster
We banned use of certain g_assert_FOO() functions outside tests, and made checkpatch.pl flag them (commit 6e9389563e5). We neglected to purge existing uses. Do that now. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180608170231.27912-1-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: John Snow <jsnow@redhat.com>
2018-06-11qemu-option: Pull out "Supported options" printMax Reitz
It really is up to the caller to decide what this list of options means. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20180509210023.20283-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-06-04Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell
staging Pull request * Copy offloading for qemu-img convert (iSCSI, raw, and qcow2) If the underlying storage supports copy offloading, qemu-img convert will use it instead of performing reads and writes. This avoids data transfers and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY and Linux copy_file_range(2) are used to implement this optimization. * Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning # gpg: Signature made Mon 04 Jun 2018 12:20:08 BST # gpg: using RSA key 9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: main-loop: drop spin_counter qemu-img: Convert with copy offloading block-backend: Add blk_co_copy_range iscsi: Implement copy offloading iscsi: Create and use iscsi_co_wait_for_task iscsi: Query and save device designator when opening file-posix: Implement bdrv_co_copy_range qcow2: Implement copy offloading raw: Implement copy offloading raw: Check byte range uniformly block: Introduce API for copy offloading Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01main-loop: drop spin_counterStefan Hajnoczi
Commit d759c951f3287fad04210a52f2dc93f94cf58c7f ("replay: push replay_mutex_lock up the call tree") removed the !timeout lock optimization in the main loop. The idea of the optimization was to avoid ping-pongs between threads by keeping the Big QEMU Lock held across non-blocking (!timeout) main loop iterations. A warning is printed when the main loop spins without releasing BQL for long periods of time. These warnings were supposed to aid debugging but in practice they just alarm users. They are considered noise because the cause of spinning is not shown and is hard to find. Now that the lock optimization has been removed, there is no danger of hogging the BQL. Drop the spin counter and the infamous warning. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-06-01memfd: Avoid Coverity warning about integer overflowPeter Maydell
Coverity complains about qemu_memfd_create() (CID 1385858) because we calculate a bit position htsize which could be up to 63, but then use it in "1 << htsize" which is a 32-bit integer calculation and could push the 1 off the top of the value. Silence the complaint bu using "1ULL"; this isn't a bug in practice since a hugetlbsize of 4GB is not very plausible. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180515172729.24564-1-peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-23util: implement simple iova treePeter Xu
Introduce a simplest iova tree implementation based on GTree. CC: QEMU Stable <qemu-stable@nongnu.org> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-20replace functions which are only available in glib-2.24Olaf Hering
Currently the minimal supported version of glib is 2.22. Since testing is done with a glib that claims to be 2.22, but in fact has APIs from newer version of glib, this bug was not caught during submit of the patch referenced below. Replace g_realloc_n, which is available only since 2.24, with g_renew. Fixes commit 418026ca43 ("util: Introduce vfio helpers") Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> CC: qemu-stable@nongnu.org
2018-05-20Remove unnecessary variables for function return valueLaurent Vivier
Re-run Coccinelle script scripts/coccinelle/return_directly.cocci Signed-off-by: Laurent Vivier <lvivier@redhat.com> ppc part Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2018-05-18iothread: fix epollfd leak in the process of delIOThreadJie Wang
When we call addIOThread, the epollfd created in aio_context_setup, but not close it in the process of delIOThread, so the epollfd will leak. Reorder the code in aio_epoll_disable and reuse it. Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1526517763-11108-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> [Mention change to aio_epoll_disable in commit message. - Fam] Signed-off-by: Fam Zheng <famz@redhat.com>
2018-05-15tcg: Optionally log FPU state in TCG -d cpu loggingPeter Maydell
Usually the logging of the CPU state produced by -d cpu is sufficient to diagnose problems, but sometimes you want to see the state of the floating point registers as well. We don't want to enable that by default as it adds a lot of extra data to the log; instead, allow it to be optionally enabled via -d fpu. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
2018-05-09opts: don't silently truncate long option valuesDaniel P. Berrangé
The existing QemuOpts parsing code uses a fixed size 1024 byte buffer for storing the option values. If a value exceeded this size it was silently truncated and no error reported to the user. Long option values is not a common scenario, but it is conceivable that they will happen. eg if the user has a very deeply nested filesystem it would be possible to come up with a disk path that was > 1024 bytes. Most of the time if such data was silently truncated, the user would get an error about opening a non-existant disk. If they're unlucky though, QEMU might use a completely different disk image from another VM, which could be considered a security issue. Another example program was in using the -smbios command line arg with very large data blobs. In this case the silent truncation will be providing semantically incorrect data to the guest OS for SMBIOS tables. If the operating system didn't limit the user's argv when spawning QEMU, the code should honour whatever length arguments were given without imposing its own length restrictions. This patch thus changes the code to use a heap allocated buffer for storing the values during parsing, lifting the arbitrary length restriction. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180416111743.8473-4-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-05-09opts: don't silently truncate long parameter keysDaniel P. Berrangé
The existing QemuOpts parsing code uses a fixed size 128 byte buffer for storing the parameter keys. If a key exceeded this size it was silently truncate and no error reported to the user. This behaviour was reasonable & harmless because traditionally the key names are all statically declared, and it was known that no code was declaring a key longer than 127 bytes. This assumption, however, ceased to be valid once the block layer added support for dot-separate compound keys. This syntax allows for keys that can be arbitrarily long, limited only by the number of block drivers you can stack up. With this usage, silently truncating the key name can never lead to correct behaviour. Hopefully such truncation would turn into an error, when the block code then tried to extract options later, but there's no guarantee that will happen. It is conceivable that an option specified by the user may be truncated and then ignored. This could have serious consequences, possibly even leading to security problems if the ignored option set a security relevant parameter. If the operating system didn't limit the user's argv when spawning QEMU, the code should honour whatever length arguments were given without imposing its own length restrictions. This patch thus changes the code to use a heap allocated buffer for storing the keys during parsing, lifting the arbitrary length restriction. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180416111743.8473-3-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-05-09accel: use g_strsplit for parsing accelerator namesDaniel P. Berrangé
Instead of re-using the get_opt_name() method from QemuOpts to split a string on ':', just use g_strsplit(). Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180416111743.8473-2-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-05-09qemu-thread: always keep the posix wrapper layerPeter Xu
We will conditionally have a wrapper layer depending on whether the host has the PTHREAD_SETNAME capability. It complicates stuff. Let's keep the wrapper there; we opt out the pthread_setname_np() call only. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180412053444.17801-1-peterx@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-04qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREFMarc-André Lureau
Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-04-27Make qemu_mempath_getpagesize() accept NULLDavid Gibson
qemu_mempath_getpagesize() gets the effective (host side) page size for a block of memory backed by an mmap()ed file on the host. It requires the mem_path parameter to be non-NULL. This ends up meaning all the callers need a different case for handling anonymous memory (for memory-backend-ram or default memory with -mem-path is not specified). We can make all those callers a little simpler by having qemu_mempath_getpagesize() accept NULL, and treat that as the anonymous memory case. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2018-04-09memfd: fix vhost-user-test on non-memfd capable hostMarc-André Lureau
On RHEL7, memfd is not supported, and vhost-user-test fails: TEST: tests/vhost-user-test... (pid=10248) /x86_64/vhost-user/migrate: qemu-system-x86_64: -object memory-backend-memfd,id=mem,size=2M,: failed to create memfd FAIL There is a qemu_memfd_check() to prevent running memfd path, but it also checks for fallback implementation. Let's specialize qemu_memfd_check() to check memfd only, while qemu_memfd_alloc_check() checks for the qemu_memfd_alloc() API. Reported-by: Miroslav Rezanina <mrezanin@redhat.com> Tested-by: Miroslav Rezanina <mrezanin@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180328121804.16203-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-04-05sys_membarrier: fix up include directivesBruce Rogers
Our rule right now is to use <> for external headers only. util/sys_membarrier.c violates that. Fix it up. Signed-off-by: Bruce Rogers <brogers@suse.com> Message-Id: <20180329151018.15319-1-brogers@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-27coroutine: avoid co_queue_wakeup recursionStefan Hajnoczi
qemu_aio_coroutine_enter() is (indirectly) called recursively when processing co_queue_wakeup. This can lead to stack exhaustion. This patch rewrites co_queue_wakeup in an iterative fashion (instead of recursive) with bounded memory usage to prevent stack exhaustion. qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter() and the qemu_coroutine_enter() call is turned into a loop to avoid recursion. There is one change that is worth mentioning: Previously, when coroutine A queued coroutine B, qemu_co_queue_run_restart() entered coroutine B from coroutine A. If A was terminating then it would still stay alive until B yielded. After this patch B is entered by A's parent so that a A can be deleted immediately if it is terminating. It is safe to make this change since B could never interact with A if it was terminating anyway. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20180322152834.12656-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-26iothread: fix breakage on windowsPeter Xu
OOB can enable iothread for parsing even on Windows. We need some tunes to enable that on Windows otherwise it'll break Windows users. This patch fixes the breakage on Windows with qemu-system-ppc.exe. Reported-by: Howard Spoelstra <hsp.cat7@gmail.com> Tested-by: Howard Spoelstra <hsp.cat7@gmail.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180322085630.23654-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>