slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2015-08-14	Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵	Peter Maydell
	staging # gpg: Signature made Fri 14 Aug 2015 15:41:14 BST using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/block-pull-request: throttle: add throttle_max_is_missing_limit() test throttle: refuse bps_max/iops_max without bps/iops Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging	Peter Maydell
	virtio,pc,acpi fixes, cleanups Mostly cleanups, notably Eduardo's compat code rework, and smbios rearrangement for use by ARM. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 13 Aug 2015 12:59:16 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: (24 commits) MAINTAINERS: list smbios maintainers smbios: move smbios code into a common folder smbios: remove dependency on x86 e820 tables smbios: extract x86 smbios building code into a function acpi: avoid potential uninitialized access to cpu_hp_io_base virtio-net: remove useless codes pci: allow 0 address for PCI IO/MEM regions pc: Remove redundant arguments from pc_memory_init() pc: Remove redundant arguments from pc_cmos_init() pc: Remove redundant arguments from *load_linux() pc: Use PCMachineState as pc_guest_info_init() argument pc: Move {above,below}_4g_mem_size variables to PCMachineState pc: Use PCMachineState for pc_memory_init() argument pc: Use PCMachineState for pc_cmos_init() argument pc: Eliminate pc_default_machine_options() pc: Eliminate pc_common_machine_options() pc: Move PCMachineClass, PCMachineState to qemu/typedefs.h pc: Rename pc_machine variables to pcms pc: Use error_abort when registering properties target-i386: Remove x86_cpu_compat_set_features() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	smbios: move smbios code into a common folder	Wei Huang
	To share smbios among different architectures, this patch moves SMBIOS code (smbios.c and smbios.h) from x86 specific folders into new hw/smbios directories. As a result, CONFIG_SMBIOS=y is defined in x86 default config files. Acked-by: Gabriel Somlo <somlo@cmu.edu> Tested-by: Gabriel Somlo <somlo@cmu.edu> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Leif Lindholm <leif.lindholm@linaro.org> Signed-off-by: Wei Huang <wei@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	smbios: remove dependency on x86 e820 tables	Wei Huang
	Current smbios builds type 19 table from e820, which is x86 specific. This patch removes smbios' dependency on e820 by passing an array of memory area to smbios_get_tables(). Acked-by: Gabriel Somlo <somlo@cmu.edu> Tested-by: Gabriel Somlo <somlo@cmu.edu> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Leif Lindholm <leif.lindholm@linaro.org> Signed-off-by: Wei Huang <wei@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	virtio-net: remove useless codes	Jason Wang
	After commit 40bad8f3deba15e2074ff34cfe923c12916b1cc5("virtio-net: fix used len for tx"), async_tx.len was no longer used afterwards. So remove useless codes with it. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pci: allow 0 address for PCI IO/MEM regions	Laurent Vivier
	Some kernels program a 0 address for io regions. PCI 3.0 spec section 6.2.5.1 doesn't seem to disallow this. based on patch by Michael Roth <mdroth@linux.vnet.ibm.com> Add pci_allow_0_addr in MachineClass to conditionally allow addr 0 for pseries, as this can break other architectures. This patch allows to hotplug PCI card in pseries machine, as the first added card BAR0 is always set to 0 address. This as a temporary hack, waiting to fix PCI memory priorities for more machine types... Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Remove redundant arguments from pc_memory_init()	Eduardo Habkost
	Remove arguments that can be found in PCMachineState. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Remove redundant arguments from pc_cmos_init()	Eduardo Habkost
	Remove arguments that can be found in PCMachineState. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Remove redundant arguments from *load_linux()	Eduardo Habkost
	Remove arguments that can be found in PCMachineState. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Use PCMachineState as pc_guest_info_init() argument	Eduardo Habkost
	Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Move {above,below}_4g_mem_size variables to PCMachineState	Eduardo Habkost
	This will make the info readily available for the other initialization functions, and will allow us to simplify their argument list. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Use PCMachineState for pc_memory_init() argument	Eduardo Habkost
	pc_memory_init() already expects a PCMachineState object, there's no point in upcasting it to MachineState before calling the function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Use PCMachineState for pc_cmos_init() argument	Eduardo Habkost
	pc_cmos_init() already expects a PCMachineState object, there's no point in upcasting it to MachineState before calling the function. While doing it, reorder the arguments so PCMachineState is the first function argument. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Eliminate pc_default_machine_options()	Eduardo Habkost
	The only PC machines that didn't call pc_default_machine_options() were isaps and xenfv. Both were already overwriting max_cpus, and only isapc was not overwriting hot_add_cpu. After making isapc set hot_add_cpu to NULL, we can move the pc_default_machine_options() code the PC common class_init. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Eliminate pc_common_machine_options()	Eduardo Habkost
	All TYPE_PC_MACHINE subclasses call pc_common_machine_options(). TYPE_PC_MACHINE can simply initialize the common options on class_init directly. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Move PCMachineClass, PCMachineState to qemu/typedefs.h	Eduardo Habkost
	They will be used inside hw/xen/xen.h, which doesn't include hw/i386/pc.h. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	pc: Use PC_COMPAT_* for CPUID feature compatibility	Eduardo Habkost
	Now we can use compat_props to keep CPUID feature compatibility, using the boolean QOM properties for CPUID feature flags. This simplifies the compatibility code, and reduces duplication between pc_piix.c and pc_q35.c. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-13	hw/arm/gic: Kill code duplication	Pavel Fedin
	Extracted duplicated initialization code from SW-emulated and KVM GIC implementations and put into gic_init_irqs_and_mmio() Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Message-id: 8ea5b2781ef39cb5989420987fc73c70e377687d.1438758065.git.p.fedin@samsung.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	Merge memory_region_init_reservation() into memory_region_init_io()	Pavel Fedin
	Just specifying ops = NULL in some cases can be more convenient than having two functions. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 78a379ab1b6b30ab497db7971ad336dad1dbee76.1438758065.git.p.fedin@samsung.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	i.MX: Split GPT emulator in a header file and a source file	Jean-Christophe Dubois
	Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: e32fba56b9dae3cc7c83726550514b2d0c890ae0.1437080501.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	i.MX: Split EPIT emulator in a header file and a source file	Jean-Christophe Dubois
	Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: 948927cab0c85da9a753c5f6d5501323d5604c8e.1437080501.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	i.MX: Split CCM emulator in a header file and a source file	Jean-Christophe Dubois
	Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: b1d6f990229b2608bbaba24f4ff359571c0b07da.1437080501.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	i.MX: Split AVIC emulator in a header file and a source file	Jean-Christophe Dubois
	Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: 06829257e845d693be05c7d491134313c1615d1a.1437080501.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13	i.MX: Split UART emulator in a header file and a source file	Jean-Christophe Dubois
	Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: a51ef50fa222a614169056d5389a6d3ed6a63b04.1437080501.git.jcd@tribudubois.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-05	virtio: fix 1.0 virtqueue migration	Jason Wang
	1.0 does not requires physically-contiguous pages layout for a virtqueue. So we could not infer avail and used from desc. This means we need to migrate vring.avail and vring.used when host support virtio 1.0. This fixes malfunction of virtio 1.0 device after migration. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-08-05	throttle: refuse bps_max/iops_max without bps/iops	Stefan Hajnoczi
	The bps_max/iops_max values are meaningless without corresponding bps/iops values. Reported an error if bps_max/iops_max is given without bps/iops. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1438683733-21111-2-git-send-email-stefanha@redhat.com
2015-08-04	target-i386: fix IvyBridge xlevel in PC_COMPAT_2_3	Radim Krčmář
	Previous patch changed xlevel and missed the compatibility code. Fixes: 3046bb5debc8 ("target-i386: emulate CPUID level of real hardware") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-08-03	migration: Fix global state with Xen.	Anthony PERARD
	When doing migration via the QMP command xen_save_devices_state, the current runstate is not store into the global state section. Also the current runstate is not the one we want on the receiver side. During migration, the Xen toolstack paused QEMU before save the devices state. Also, the toolstack expect QEMU to autostart when the migration is finished. So this patch store "running" as it's current runstate. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-07-29	AioContext: force event loop iteration using BH	Stefan Hajnoczi
	The notify_me optimization introduced in commit eabc97797310 ("AioContext: fix broken ctx->dispatching optimization") skips event_notifier_set() calls when the event loop thread is not blocked in ppoll(2). This optimization causes a deadlock if two aio_context_acquire() calls race. notify_me = 0 during the race so the winning thread can enter ppoll(2) unaware that the other thread is waiting its turn to acquire the AioContext. This patch forces ppoll(2) to return by scheduling a BH instead of calling aio_notify(). The following deadlock with virtio-blk dataplane is fixed: qemu ... -object iothread,id=iothread0 \ -drive if=none,id=drive0,file=test.img,... \ -device virtio-blk-pci,iothread=iothread0,drive=drive0 This command-line results in a hang early on without this patch. Thanks to Paolo Bonzini <pbonzini@redhat.com> for investigating this bug with me. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1438101249-25166-4-git-send-email-pbonzini@redhat.com Message-Id: <1438014819-18125-3-git-send-email-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-28	Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging	Peter Maydell
	virtio fixes for 2.4 Mostly virtio 1 spec compliance fixes. We are unlikely to make it perfectly compliant in the first release, but it seems worth it to try. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon Jul 27 21:55:48 2015 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: virtio: minor cleanup acpi: fix pvpanic device is not shown in ui virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device virtio-blk: fail get_features when both scsi and 1.0 were set virtio: get_features() can fail virtio-pci: fix memory MR cleanup for modern virtio: set any_layout in virtio core virtio-9p: fix any_layout virtio-serial: fix ANY_LAYOUT virtio: hide legacy features from modern guests Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-27	Fix Cortex-A9 global timer	Johannes Schlatow
	The auto increment bit of the timer control register was wrongly defined. See Cortex-A9 MPcore Technical Reference Manual, Section 4.4.2. Signed-off-by: Johannes Schlatow <schlatow@ida.ing.tu-bs.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-07-27	vmstate: remove unused declaration	Marc-André Lureau
	Since 38e0735e, register_device_unmigratable() has been removed Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-07-27	qemu-common.h: Document cutils.c string functions	Peter Maydell
	Add documentation comments for various utility string functions which we have implemented in util/cutils.c: pstrcpy() strpadcpy() pstrcat() strstart() stristart() qemu_strnlen() qemu_strsep() Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-07-27	virtio: get_features() can fail	Jason Wang
	Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-27	hw/net: add simple phy support to mcf_fec driver	Greg Ungerer
	The Linux fec driver needs at least basic phy support to probe and work. The current qemu mcf_fec emulation has no support for the reading or writing of the MDIO lines to access an attached phy. This code adds a very simple set of register results for a fixed phy setup - very similar to that used on an m5208evb board. This is enough to probe and identify an emulated attached phy. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435296436-12152-4-git-send-email-gerg@uclinux.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-27	hw/net: add ANLPAR bit definitions to generic mii	Greg Ungerer
	Add a base set of bit definitions for the standard MII phy "Auto-Negotiation Link Partner Ability Register" (ANLPAR). The original definitions moved into mii.h from the allwinner_emac driver did not define these. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435296436-12152-3-git-send-email-gerg@uclinux.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-27	hw/net: create common collection of MII definitions	Greg Ungerer
	Create a common set of definitions of address and register values for ethernet MII phys. A few of the current ethernet drivers have at least a partial set of these definitions. Others just use hard coded raw constant numbers. This initial set is copied directly from the allwinner_emac code. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435296436-12152-2-git-send-email-gerg@uclinux.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-27	virtio: set any_layout in virtio core	Michael S. Tsirkin
	Exceptions: - virtio-blk - compat machine types Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-27	virtio: hide legacy features from modern guests	Michael S. Tsirkin
	NOTIFY_ON_EMPTY, ANY_LAYOUT and BAD are only valid on the legacy interface. Hide them from modern guests. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-24	memory: count number of active VGA logging clients	Paolo Bonzini
	For a board that has multiple framebuffer devices, both of them might want to use DIRTY_MEMORY_VGA on the same memory region. The lack of reference counting in memory_region_set_log makes this very awkward to implement. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-22	AioContext: optimize clearing the EventNotifier	Paolo Bonzini
	It is pretty rare for aio_notify to actually set the EventNotifier. It can happen with worker threads such as thread-pool.c's, but otherwise it should never be set thanks to the ctx->notify_me optimization. The previous patch, unfortunately, added an unconditional call to event_notifier_test_and_clear; now add a userspace fast path that avoids the call. Note that it is not possible to do the same with event_notifier_set; it would break, as proved (again) by the included formal model. This patch survived over 3000 reboots on aarch64 KVM. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 1437487673-23740-7-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-22	AioContext: fix broken ctx->dispatching optimization	Paolo Bonzini
	This patch rewrites the ctx->dispatching optimization, which was the cause of some mysterious hangs that could be reproduced on aarch64 KVM only. The hangs were indirectly caused by aio_poll() and in particular by flash memory updates's call to blk_write(), which invokes aio_poll(). Fun stuff: they had an extremely short race window, so much that adding all kind of tracing to either the kernel or QEMU made it go away (a single printf made it half as reproducible). On the plus side, the failure mode (a hang until the next keypress) made it very easy to examine the state of the process with a debugger. And there was a very nice reproducer from Laszlo, which failed pretty often (more than half of the time) on any version of QEMU with a non-debug kernel; it also failed fast, while still in the firmware. So, it could have been worse. For some unknown reason they happened only with virtio-scsi, but that's not important. It's more interesting that they disappeared with io=native, making thread-pool.c a likely suspect for where the bug arose. thread-pool.c is also one of the few places which use bottom halves across threads, by the way. I hope that no other similar bugs exist, but just in case :) I am going to describe how the successful debugging went... Since the likely culprit was the ctx->dispatching optimization, which mostly affects bottom halves, the first observation was that there are two qemu_bh_schedule() invocations in the thread pool: the one in the aio worker and the one in thread_pool_completion_bh. The latter always causes the optimization to trigger, the former may or may not. In order to restrict the possibilities, I introduced new functions qemu_bh_schedule_slow() and qemu_bh_schedule_fast(): /* qemu_bh_schedule_slow: / ctx = bh->ctx; bh->idle = 0; if (atomic_xchg(&bh->scheduled, 1) == 0) { event_notifier_set(&ctx->notifier); } / qemu_bh_schedule_fast: / ctx = bh->ctx; bh->idle = 0; assert(ctx->dispatching); atomic_xchg(&bh->scheduled, 1); Notice how the atomic_xchg is still in qemu_bh_schedule_slow(). This was already debated a few months ago, so I assumed it to be correct. In retrospect this was a very good idea, as you'll see later. Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't trigger the assertion (as expected). Changing the worker's invocation to qemu_bh_schedule_slow() didn't hide the bug (another assumption which luckily held). This already limited heavily the amount of interaction between the threads, hinting that the problematic events must have triggered around thread_pool_completion_bh(). As mentioned early, invoking a debugger to examine the state of a hung process was pretty easy; the iothread was always waiting on a poll(..., -1) system call. Infinite timeouts are much rarer on x86, and this could be the reason why the bug was never observed there. With the buggy sequence more or less resolved to an interaction between thread_pool_completion_bh() and poll(..., -1), my "tracing" strategy was to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and qemu_bh_schedule_fast() would provide some hint. The output was: (gdb) p last_prepare $3 = 103885451 (gdb) p last_dispatch $4 = 103876492 (gdb) p last_poll $5 = 115909333 (gdb) p last_schedule $6 = 115925212 Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch(). This makes little sense unless there is an aio_poll() call involved, and indeed with a slightly different instrumentation you can see that there is one: (gdb) p last_prepare $3 = 107569679 (gdb) p last_dispatch $4 = 107561600 (gdb) p last_aio_poll $5 = 110671400 (gdb) p last_schedule $6 = 110698917 So the scenario becomes clearer: iothread VCPU thread -------------------------------------------------------------------------- aio_ctx_prepare aio_ctx_check qemu_poll_ns(timeout=-1) aio_poll aio_dispatch thread_pool_completion_bh qemu_bh_schedule() At this point bh->scheduled = 1 and the iothread has not been woken up. The solution must be close, but this alone should not be a problem, because the bottom half is only rescheduled to account for rare situations (see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll() calls, 2014-07-15). Introducing a third thread---a thread pool worker thread, which also does qemu_bh_schedule()---does bring out the problematic case. The third thread must be awakened after* the callback is complete and thread_pool_completion_bh has redone the whole loop, explaining the short race window. And then this is what happens: thread pool worker -------------------------------------------------------------------------- <I/O completes> qemu_bh_schedule() Tada, bh->scheduled is already 1, so qemu_bh_schedule() does nothing and the iothread is never woken up. This is where the bh->scheduled optimization comes into play---it is correct, but removing it would have masked the bug. So, what is the bug? Well, the question asked by the ctx->dispatching optimization ("is any active aio_poll dispatching?") was wrong. The right question to ask instead is "is any active aio_poll not dispatching", i.e. in the prepare or poll phases? In that case, the aio_poll is sleeping or might go to sleep anytime soon, and the EventNotifier must be invoked to wake it up. In any other case (including if there is no active aio_poll at all!) we can just wait for the next prepare phase to pick up the event (e.g. a bottom half); the prepare phase will avoid the blocking and service the bottom half. Expressing the invariant with a logic formula, the broken one looked like: !(exists(thread): in_dispatching(thread)) => !optimize or equivalently: !(exists(thread): in_aio_poll(thread) && in_dispatching(thread)) => !optimize In the correct one, the negation is in a slightly different place: (exists(thread): in_aio_poll(thread) && !in_dispatching(thread)) => !optimize or equivalently: (exists(thread): in_prepare_or_poll(thread)) => !optimize Even if the difference boils down to moving an exclamation mark :) the implementation is quite different. However, I think the new one is simpler to understand. In the old implementation, the "exists" was implemented with a boolean value. This didn't really support well the case of multiple concurrent event loops, but I thought that this was okay: aio_poll holds the AioContext lock so there cannot be concurrent aio_poll invocations, and I was just considering nested event loops. However, aio_poll _could_ indeed be concurrent with the GSource. This is why I came up with the wrong invariant. In the new implementation, "exists" is computed simply by counting how many threads are in the prepare or poll phases. There are some interesting points to consider, but the gist of the idea remains: 1) AioContext can be used through GSource as well; as mentioned in the patch, bit 0 of the counter is reserved for the GSource. 2) the counter need not be updated for a non-blocking aio_poll, because it won't sleep forever anyway. This is just a matter of checking the "blocking" variable. This requires some changes to the win32 implementation, but is otherwise not too complicated. 3) as mentioned above, the new implementation will not call aio_notify when there is no active aio_poll at all. The tests have to be adjusted for this change. The calls to aio_notify in async.c are fine; they only want to kick aio_poll out of a blocking wait, but need not do anything if aio_poll is not running. 4) nested aio_poll: these just work with the new implementation; when a nested event loop is invoked, the outer event loop is never in the prepare or poll phases. The outer event loop thus has already decremented the counter. Reported-by: Richard W. M. Jones <rjones@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 1437487673-23740-5-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-20	timer: rename NSEC_PER_SEC due to Mac OS X header clash	Stefan Hajnoczi
	Commit e0cf11f31c24cfb17f44ed46c254d84c78e7f6e9 ("timer: Use a single definition of NSEC_PER_SEC for the whole codebase") renamed NANOSECONDS_PER_SECOND to NSEC_PER_SEC. On Mac OS X there is a <dispatch/time.h> system header which also defines NSEC_PER_SEC. This causes compiler warnings. Let's use the old name instead. It's longer but it doesn't clash. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1436364609-7929-1-git-send-email-stefanha@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-20	Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging	Peter Maydell
	virtio, vhost, pc fixes for 2.4 The only notable thing here is vhost-user multiqueue revert. We'll work on making it stable in 2.5, reverting now means we won't have to maintain bug for bug compability forever. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon Jul 20 12:24:00 2015 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: virtio-net: remove virtio queues if the guest doesn't support multiqueue virtio-net: Flush incoming queues when DRIVER_OK is being set pci_add_capability: remove duplicate comments virtio-net: unbreak any layout Revert "vhost-user: add multi queue support" ich9: fix skipped vmstate_memhp_state subsection Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-20	virtio-net: unbreak any layout	Jason Wang
	Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 ("virtio-net: byteswap virtio-net header") breaks any layout by requiring out_sg[0].iov_len >= n->guest_hdr_len. Fixing this by copying header to temporary buffer if swap is needed, and then use this buffer as part of out_sg. Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 ("virtio-net: byteswap virtio-net header") Cc: qemu-stable@nongnu.org Cc: clg@fr.ibm.com Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-07-17	virtio-rng: trigger timer only when guest requests for entropy	Pankaj Gupta
	This patch triggers timer only when guest requests for entropy. As soon as first request from guest for entropy comes we set the timer. Timer bumps up the quota value when it gets triggered. Signed-off-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1436962608-9961-2-git-send-email-pagupta@redhat.com> [Re-worded patch subject, removed extra whitespace -- Amit] Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-07-16	virtio-input: move sys/ioctl.h include	Gerd Hoffmann
	Drop from include/standard-headers/linux/input.h Add to hw/input/virtio-input-host.c instead. That allows to build virtio-input (except pass-through) on windows. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-07-15	target-i386: emulate CPUID level of real hardware	Radim Krčmář
	W10 insider has a bug where it ignores CPUID level and interprets CPUID.(EAX=07H, ECX=0H) incorrectly, because CPUID in fact returned CPUID.(EAX=04H, ECX=0H); this resulted in execution of unsupported instructions. While it's a Windows bug, there is no reason to emulate incorrect level. I used http://instlatx64.atw.hu/ as a source of CPUID and checked that it matches Penryn Xeon X5472, Westmere Xeon W3520, SandyBridge i5-2540M, and Haswell i5-4670T. kvm64 and qemu64 were bumped to 0xD to allow all available features for them (and to avoid the same Windows bug). Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-07-15	migration: We also want to store the global state for savevm	Juan Quintela
	Commit df4b1024526cae3479da3492d6371fd4a7324a03 introduced global_state section. But it only filled the state while doing migration. While doing a savevm, we stored an empty string as state. So when we did a loadvm, it complained that state was invalid. Fedora 21, 4.1.1, qemu 2.4.0-rc0 > ../../configure --target-list="x86_64-softmmu" 068 2s ... - output mismatch (see 068.out.bad) --- /home/bos/jhuston/src/qemu/tests/qemu-iotests/068.out 2015-07-08 17:56:18.588164979 -0400 +++ 068.out.bad 2015-07-09 17:39:58.636651317 -0400 @@ -6,6 +6,8 @@ QEMU X.Y.Z monitor - type 'help' for more information (qemu) savevm 0 (qemu) quit +qemu-system-x86_64: Unknown savevm section or instance 'globalstate' 0 +qemu-system-x86_64: Error -22 while loading VM state QEMU X.Y.Z monitor - type 'help' for more information (qemu) quit *** done Failures: 068 Failed 1 of 1 tests Actually, there were two problems here: - we registered global_state too late for load_vm (fixed on another patch on the list) - we didn't store a valid state for savevm (fixed by this patch). Reported-by: John Snow <jsnow@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-07-14	block: Fix backing file child when modifying graph	Kevin Wolf
	This patch moves bdrv_attach_child() from the individual places that add a backing file to a BDS to bdrv_set_backing_hd(), which is called by all of them. It also adds bdrv_detach_child() there. For normal operation (starting with one backing file chain and not changing it until the topmost image is closed) and live snapshots, this constitutes no change in behaviour. For all other cases, this is a fix for the bug that the old backing file was still referenced as a child, and the new one wasn't referenced. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>