aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-09-26nbd/server: send more than one extent of base:allocation contextVladimir Sementsov-Ogievskiy
This is necessary for efficient block-status export, for clients which support it. (qemu is not yet such a client, but could become one.) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180704112302.471456-3-vsementsov@virtuozzo.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-09-26qapi: bitmap-merge: document name changeJohn Snow
We named these using underscores instead of the preferred dash, document this nearby so we cannot possibly forget to rectify this when we remove the 'x-' prefixes when the feature becomes stable. We do not implement the change ahead of time to avoid more work for libvirt to do in order to figure out how to use the beta version of the API needlessly. Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20180919190934.16284-1-jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: typo fix] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-09-26nbd/server: fix bitmap exportVladimir Sementsov-Ogievskiy
bitmap_to_extents function is broken: it switches dirty variable after every iteration, however it can process only part of dirty (or zero) area during one iteration in case when this area is too large for one extent. Fortunately, the bug doesn't produce wrong extent flags: it just inserts a zero-length extent between sequential extents representing large dirty (or zero) area. However, zero-length extents are forbidden by the NBD protocol. So, a careful client should consider such a reply as a server fault, while a less-careful will likely ignore zero-length extents. The bug can only be triggered by a client that requests block status for nearly 4G at once (a request of 4G and larger is impossible per the protocol, and requests smaller than 4G less the bitmap granularity cause the loop to quit iterating rather than revisit the tail of the large area); it also cannot trigger if the client used the NBD_CMD_FLAG_REQ_ONE flag. Since qemu 3.0 as client (using the x-dirty-bitmap extension) always passes the flag, it is immune; and we are not aware of other open-source clients that know how to request qemu:dirty-bitmap:FOO contexts. Clients that want to avoid the bug could cap block status requests to a smaller length, such as 2G or 3G. Fix this by more careful handling of dirty variable. Bug was introduced in 3d068aff16 "nbd/server: implement dirty bitmap export", with the whole function. and is present in v3.0.0 release. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180914165116.23182-1-vsementsov@virtuozzo.com> CC: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: improved commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-09-25Merge remote-tracking branch 'remotes/xanclic/tags/pull-block-2018-09-25' ↵Peter Maydell
into staging Block layer patches: - Drain fixes - node-name parameters for block-commit - Refactor block jobs to use transactional callbacks for exiting # gpg: Signature made Tue 25 Sep 2018 16:12:44 BST # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/xanclic/tags/pull-block-2018-09-25: (42 commits) test-bdrv-drain: Test draining job source child and parent block: Use a single global AioWait test-bdrv-drain: Fix outdated comments test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort job: Avoid deadlocks in job_completed_txn_abort() test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level() block: Remove aio_poll() in bdrv_drain_poll variants blockjob: Lie better in child_job_drained_poll() block-backend: Decrease in_flight only after callback block-backend: Fix potential double blk_delete() block-backend: Add .drained_poll callback block: Add missing locking in bdrv_co_drain_bh_cb() test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback job: Use AIO_WAIT_WHILE() in job_finish_sync() test-blockjob: Acquire AioContext around job_cancel_sync() test-bdrv-drain: Drain with block jobs in an I/O thread aio-wait: Increase num_waiters even in home thread blockjob: Wake up BDS when job becomes idle job: Fix missing locking due to mismerge job: Fix nested aio_poll() hanging in job_txn_apply ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20180925' into ↵Peter Maydell
staging HMP pull 2018-09-25 # gpg: Signature made Tue 25 Sep 2018 15:11:09 BST # gpg: using RSA key 0516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-hmp-20180925: qmp, hmp: add PCI subsystem id and vendor id to PCI info hmp: fix migrate status timer leak monitor: print message when using 'help' with an unknown command Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20180925-1' into staging target-arm queue: * target/arm: Fix cpu_get_tb_cpu_state() for non-SVE CPUs * hw/arm/exynos4210: fix Exynos4210 UART support * hw/arm/virt-acpi-build: Add a check for memory-less NUMA nodes * arm: Add BBC micro:bit machine * aspeed/i2c: Fix interrupt handling bugs * hw/arm/smmu-common: Fix the name of the iommu memory regions * hw/arm/smmuv3: fix eventq recording and IRQ triggerring * hw/intc/arm_gic: Document QEMU interface * hw/intc/arm_gic: Drop GIC_BASE_IRQ macro * hw/net/pcnet-pci: Convert away from old_mmio accessors * hw/timer/cmsdk-apb-dualtimer: Add missing 'break' statements * aspeed/timer: fix compile breakage with clang 3.4.2 * hw/arm/aspeed: change the FMC flash model of the AST2500 evb * hw/arm/aspeed: Minor code cleanups * target/arm: Start AArch32 CPUs with EL2 but not EL3 in Hyp mode # gpg: Signature made Tue 25 Sep 2018 15:23:11 BST # gpg: using RSA key 3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20180925-1: (21 commits) target/arm: Start AArch32 CPUs with EL2 but not EL3 in Hyp mode aspeed/smc: fix some alignment issues hw/arm/aspeed: Add an Aspeed machine class hw/arm/aspeed: change the FMC flash model of the AST2500 evb aspeed/timer: fix compile breakage with clang 3.4.2 hw/timer/cmsdk-apb-dualtimer: Add missing 'break' statements hw/net/pcnet-pci: Unify pcnet_ioport_read/write and pcnet_mmio_read/write hw/net/pcnet-pci: Convert away from old_mmio accessors hw/intc/arm_gic: Drop GIC_BASE_IRQ macro hw/intc/arm_gic: Document QEMU interface hw/arm/smmuv3: fix eventq recording and IRQ triggerring hw/arm/smmu-common: Fix the name of the iommu memory regions aspeed/i2c: Fix receive done interrupt handling aspeed/i2c: Handle receive command in separate function aspeed/i2c: interrupts should be cleared by software only arm: Add BBC micro:bit machine arm: Add Nordic Semiconductor nRF51 SoC MAINTAINERS: Add NRF51 entry hw/arm/virt-acpi-build: Add a check for memory-less NUMA nodes hw/arm/exynos4210: fix Exynos4210 UART support ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25target/arm: Start AArch32 CPUs with EL2 but not EL3 in Hyp modePeter Maydell
The ARMv8 architecture defines that an AArch32 CPU starts in SVC mode, unless EL2 is the highest available EL, in which case it starts in Hyp mode. (In ARMv7 a CPU with EL2 but not EL3 was not a valid configuration, but we don't specifically reject this if the user asks for one.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20180823135047.16525-1-peter.maydell@linaro.org
2018-09-25aspeed/smc: fix some alignment issuesCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180921161939.822-6-clg@kaod.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25hw/arm/aspeed: Add an Aspeed machine classCédric Le Goater
The code looks better, it removes duplicated lines and it will ease the introduction of common properties for the Aspeed machines. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180921161939.822-4-clg@kaod.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25hw/arm/aspeed: change the FMC flash model of the AST2500 evbCédric Le Goater
The AST2500 evb is shipped with a W25Q256 which has a non volatile bit to make the chip operate in 4 Byte address mode at power up. This should be an interesting feature to model as it will exercise a bit more the SMC controllers and MMIO execution at boot time. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180921161939.822-3-clg@kaod.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25aspeed/timer: fix compile breakage with clang 3.4.2Cédric Le Goater
In file included from /home/thuth/devel/qemu/hw/timer/aspeed_timer.c:16: /home/thuth/devel/qemu/include/hw/misc/aspeed_scu.h:37:3: error: redefinition of typedef 'AspeedSCUState' is a C11 feature [-Werror,-Wtypedef-redefinition] } AspeedSCUState; ^ /home/thuth/devel/qemu/include/hw/timer/aspeed_timer.h:27:31: note: previous definition is here typedef struct AspeedSCUState AspeedSCUState; Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180921161939.822-2-clg@kaod.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25hw/timer/cmsdk-apb-dualtimer: Add missing 'break' statementsPeter Maydell
Add 'break' statements missing from a switch in the APB dual-timer write function. Spotted by Coverity as CID 1395626 and 1395633. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180924123122.14549-1-peter.maydell@linaro.org
2018-09-25hw/net/pcnet-pci: Unify pcnet_ioport_read/write and pcnet_mmio_read/writePeter Maydell
The only difference between our implementation of the pcnet ioport accessors and the mmio accessors is that the former check BCR_DWIO to see what access widths are permitted for addresses in the aprom range (0x0..0xf). In fact our failure to do this in the mmio accessors is a bug (one which was fixed for the ioport accessors in commit 7ba79741970 in 2011). The data sheet for the Am79C970A does not describe the DWIO bit as only applying for I/O space mapped I/O resources and not memory mapped I/O resources, and our MMIO accessors already honour DWIO for accesses in the 0x10..0x1f range (since the pcnet_ioport_{read,write}{w,l} functions check it). The data sheet for the later but compatible Am79C976 is clearer: it states specifically "DWIO mode applies to both I/O- and memory-mapped acceses." This seems to be reasonable evidence in favour of interpretating the Am79C970A spec as being the same. (NB: Linux's pcnet driver only supports I/O accesses, so the MMIO access part of this device is probably untested anyway.) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25hw/net/pcnet-pci: Convert away from old_mmio accessorsPeter Maydell
Convert the pcnet-pci device away from using the old_mmio MemoryRegionOps accessor functions. This commit is a no-behaviour-change API conversion. (Since PCNET_PNPMMIO_SIZE is 0x20, the old "addr & 0x10" check and the new "addr < 0x10" check are exact opposites; the new code is phrased to be parallel with the pcnet_io_read/write functions.) I have left a TODO comment marker because the similarity between the MMIO and IO accessor behaviour is suspicious and they could be combined, but this will be left to a different patch. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25hw/intc/arm_gic: Drop GIC_BASE_IRQ macroPeter Maydell
The GIC_BASE_IRQ macro is a leftover from when we shared code between the GICv2 and the v7M NVIC. Since the NVIC is now split off, GIC_BASE_IRQ is always 0, and we can just delete it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Message-id: 20180824161819.11085-1-peter.maydell@linaro.org
2018-09-25hw/intc/arm_gic: Document QEMU interfacePeter Maydell
The GICv2's QEMU interface (sysbus MMIO regions, IRQs, etc) is now quite complicated with the addition of the virtualization extensions. Add a comment in the header file which documents it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Message-id: 20180823103818.31189-1-peter.maydell@linaro.org
2018-09-25hw/arm/smmuv3: fix eventq recording and IRQ triggerringEric Auger
The event queue management is broken today. Event records are not properly written as EVT_SET_* macro was not updating the actual event record. Also the event queue interrupt is not correctly triggered. Fixes: bb981004eaf4 ("hw/arm/smmuv3: Event queue recording helper") Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-id: 20180921070138.10114-3-eric.auger@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-25Merge remote-tracking branch 'kevin/tags/for-upstream' into blockMax Reitz
Block layer patches: - Fix some jobs/drain/aio_poll related hangs - commit: Add top-node/base-node options - linux-aio: Fix locking for qemu_laio_process_completions() - Fix use after free error in bdrv_open_inherit # gpg: Signature made Tue Sep 25 15:54:01 2018 CEST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * kevin/tags/for-upstream: (26 commits) test-bdrv-drain: Test draining job source child and parent block: Use a single global AioWait test-bdrv-drain: Fix outdated comments test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort job: Avoid deadlocks in job_completed_txn_abort() test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level() block: Remove aio_poll() in bdrv_drain_poll variants blockjob: Lie better in child_job_drained_poll() block-backend: Decrease in_flight only after callback block-backend: Fix potential double blk_delete() block-backend: Add .drained_poll callback block: Add missing locking in bdrv_co_drain_bh_cb() test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback job: Use AIO_WAIT_WHILE() in job_finish_sync() test-blockjob: Acquire AioContext around job_cancel_sync() test-bdrv-drain: Drain with block jobs in an I/O thread aio-wait: Increase num_waiters even in home thread blockjob: Wake up BDS when job becomes idle job: Fix missing locking due to mismerge job: Fix nested aio_poll() hanging in job_txn_apply ... Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-09-25test-bdrv-drain: Test draining job source child and parentKevin Wolf
For the block job drain test, don't only test draining the source and the target node, but create a backing chain for the source (source_backing <- source <- source_overlay) and test draining each of the nodes in it. When using iothreads, the source node (and therefore the job) is in a different AioContext than the drain, which happens from the main thread. This way, the main thread waits in AIO_WAIT_WHILE() for the iothread to make process and aio_wait_kick() is required to notify it. The test validates that calling bdrv_wakeup() for a child or a parent node will actually notify AIO_WAIT_WHILE() instead of letting it hang. Increase the sleep time a bit (to 1 ms) because the test case is racy and with the shorter sleep, it didn't reproduce the bug it is supposed to test for me under 'rr record -n'. This was because bdrv_drain_invoke_entry() (in the main thread) was only called after the job had already reached the pause point, so we got a bdrv_dec_in_flight() from the main thread and the additional aio_wait_kick() when the job becomes idle (that we really wanted to test here) wasn't even necessary any more to make progress. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25block: Use a single global AioWaitKevin Wolf
When draining a block node, we recurse to its parent and for subtree drains also to its children. A single AIO_WAIT_WHILE() is then used to wait for bdrv_drain_poll() to become true, which depends on all of the nodes we recursed to. However, if the respective child or parent becomes quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait of the original node. Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE(). This may mean that the draining thread gets a few more unnecessary wakeups because an unrelated operation got completed, but we already wake it up when something _could_ have changed rather than only if it has certainly changed. Apart from that, drain is a slow path anyway. In theory it would be possible to use wakeups more selectively and still correctly, but the gains are likely not worth the additional complexity. In fact, this patch is a nice simplification for some places in the code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25test-bdrv-drain: Fix outdated commentsKevin Wolf
Commit 89bd030533e changed the test case from using job_sleep_ns() to using qemu_co_sleep_ns() instead. Also, block_job_sleep_ns() became job_sleep_ns() in commit 5d43e86e11f. In both cases, some comments in the test case were not updated. Do that now. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-09-25test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abortKevin Wolf
This adds tests for calling AIO_WAIT_WHILE() in the .commit and .abort callbacks. Both reasons why .abort could be called for a single job are tested: Either .run or .prepare could return an error. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25job: Avoid deadlocks in job_completed_txn_abort()Kevin Wolf
Amongst others, job_finalize_single() calls the .prepare/.commit/.abort callbacks of the individual job driver. Recently, their use was adapted for all block jobs so that they involve code calling AIO_WAIT_WHILE() now. Such code must be called under the AioContext lock for the respective job, but without holding any other AioContext lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level()Kevin Wolf
This is a regression test for a deadlock that could occur in callbacks called from the aio_poll() in bdrv_drain_poll_top_level(). The AioContext lock wasn't released and therefore would be taken a second time in the callback. This would cause a possible AIO_WAIT_WHILE() in the callback to hang. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2018-09-25block: Remove aio_poll() in bdrv_drain_poll variantsKevin Wolf
bdrv_drain_poll_top_level() was buggy because it didn't release the AioContext lock of the node to be drained before calling aio_poll(). This way, callbacks called by aio_poll() would possibly take the lock a second time and run into a deadlock with a nested AIO_WAIT_WHILE() call. However, it turns out that the aio_poll() call isn't actually needed any more. It was introduced in commit 91af091f923, which is effectively reverted by this patch. The cases it was supposed to fix are now covered by bdrv_drain_poll(), which waits for block jobs to reach a quiescent state. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25blockjob: Lie better in child_job_drained_poll()Kevin Wolf
Block jobs claim in .drained_poll() that they are in a quiescent state as soon as job->deferred_to_main_loop is true. This is obviously wrong, they still have a completion BH to run. We only get away with this because commit 91af091f923 added an unconditional aio_poll(false) to the drain functions, but this is bypassing the regular drain mechanisms. However, just removing this and telling that the job is still active doesn't work either: The completion callbacks themselves call drain functions (directly, or indirectly with bdrv_reopen), so they would deadlock then. As a better lie, tell that the job is active as long as the BH is pending, but falsely call it quiescent from the point in the BH when the completion callback is called. At this point, nested drain calls won't deadlock because they ignore the job, and outer drains will wait for the job to really reach a quiescent state because the callback is already running. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25block-backend: Decrease in_flight only after callbackKevin Wolf
Request callbacks can do pretty much anything, including operations that will yield from the coroutine (such as draining the backend). In that case, a decreased in_flight would be visible to other code and could lead to a drain completing while the callback hasn't actually completed yet. Note that reordering these operations forbids calling drain directly inside an AIO callback. As Paolo explains, indirectly calling it is okay: - Calling it through a coroutine is okay, because then bdrv_drained_begin() goes through bdrv_co_yield_to_drain() and you have in_flight=2 when bdrv_co_yield_to_drain() yields, then soon in_flight=1 when the aio_co_wake() in the AIO callback completes, then in_flight=0 after the bottom half starts. - Calling it through a bottom half would be okay too, as long as the AIO callback remembers to do inc_in_flight/dec_in_flight just like bdrv_co_yield_to_drain() and bdrv_co_drain_bh_cb() do A few more important cases that come to mind: - A coroutine that yields because of I/O is okay, with a sequence similar to bdrv_co_yield_to_drain(). - A coroutine that yields with no I/O pending will correctly decrease in_flight to zero before yielding. - Calling more AIO from the callback won't overflow the counter just because of mutual recursion, because AIO functions always yield at least once before invoking the callback. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-25block-backend: Fix potential double blk_delete()Kevin Wolf
blk_unref() first decreases the refcount of the BlockBackend and calls blk_delete() if the refcount reaches zero. Requests can still be in flight at this point, they are only drained during blk_delete(): At this point, arbitrary callbacks can run. If any callback takes a temporary BlockBackend reference, it will first increase the refcount to 1 and then decrease it to 0 again, triggering another blk_delete(). This will cause a use-after-free crash in the outer blk_delete(). Fix it by draining the BlockBackend before decreasing to refcount to 0. Assert in blk_ref() that it never takes the first refcount (which would mean that the BlockBackend is already being deleted). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25block-backend: Add .drained_poll callbackKevin Wolf
A bdrv_drain operation must ensure that all parents are quiesced, this includes BlockBackends. Otherwise, callbacks called by requests that are completed on the BDS layer, but not quite yet on the BlockBackend layer could still create new requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25block: Add missing locking in bdrv_co_drain_bh_cb()Kevin Wolf
bdrv_do_drained_begin/end() assume that they are called with the AioContext lock of bs held. If we call drain functions from a coroutine with the AioContext lock held, we yield and schedule a BH to move out of coroutine context. This means that the lock for the home context of the coroutine is released and must be re-acquired in the bottom half. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callbackKevin Wolf
This is a regression test for a deadlock that occurred in block job completion callbacks (via job_defer_to_main_loop) because the AioContext lock was taken twice: once in job_finish_sync() and then again in job_defer_to_main_loop_bh(). This would cause AIO_WAIT_WHILE() to hang. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2018-09-25job: Use AIO_WAIT_WHILE() in job_finish_sync()Kevin Wolf
job_finish_sync() needs to release the AioContext lock of the job before calling aio_poll(). Otherwise, callbacks called by aio_poll() would possibly take the lock a second time and run into a deadlock with a nested AIO_WAIT_WHILE() call. Also, job_drain() without aio_poll() isn't necessarily enough to make progress on a job, it could depend on bottom halves to be executed. Combine both open-coded while loops into a single AIO_WAIT_WHILE() call that solves both of these problems. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25test-blockjob: Acquire AioContext around job_cancel_sync()Kevin Wolf
All callers in QEMU proper hold the AioContext lock when calling job_finish_sync(). test-blockjob should do the same when it calls the function indirectly through job_cancel_sync(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2018-09-25test-bdrv-drain: Drain with block jobs in an I/O threadKevin Wolf
This extends the existing drain test with a block job to include variants where the block job runs in a different AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2018-09-25aio-wait: Increase num_waiters even in home threadKevin Wolf
Even if AIO_WAIT_WHILE() is called in the home context of the AioContext, we still want to allow the condition to change depending on other threads as long as they kick the AioWait. Specfically block jobs can be running in an I/O thread and should then be able to kick a drain in the main loop context. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2018-09-25blockjob: Wake up BDS when job becomes idleKevin Wolf
In the context of draining a BDS, the .drained_poll callback of block jobs is called. If this returns true (i.e. there is still some activity pending), the drain operation may call aio_poll() with blocking=true to wait for completion. As soon as the pending activity is completed and the job finally arrives in a quiescent state (i.e. its coroutine either yields with busy=false or terminates), the block job must notify the aio_poll() loop to wake up, otherwise we get a deadlock if both are running in different threads. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-09-25job: Fix missing locking due to mismergeKevin Wolf
job_completed() had a problem with double locking that was recently fixed independently by two different commits: "job: Fix nested aio_poll() hanging in job_txn_apply" "jobs: add exit shim" One fix removed the first aio_context_acquire(), the other fix removed the other one. Now we have a bug again and the code is run without any locking. Add it back in one of the places. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>
2018-09-25job: Fix nested aio_poll() hanging in job_txn_applyFam Zheng
All callers have acquired ctx already. Doing that again results in aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in the callback cannot make progress because ctx is recursively locked, for example, when drive-backup finishes. There are two callers of job_finalize(): fam@lemon:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize' blockdev.c: job_finalize(&job->job, errp); blockdev.c- aio_context_release(aio_context); -- job-qmp.c: job_finalize(job, errp); job-qmp.c- aio_context_release(aio_context); -- tests/test-blockjob.c: job_finalize(&job->job, &error_abort); tests/test-blockjob.c- assert(job->job.status == JOB_STATUS_CONCLUDED); Ignoring the test, it's easy to see both callers to job_finalize (and job_do_finalize) have acquired the context. Cc: qemu-stable@nongnu.org Reported-by: Gu Nini <ngu@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cbSergio Lopez
AIO Coroutines shouldn't by managed by an AioContext different than the one assigned when they are created. aio_co_enter avoids entering a coroutine from a different AioContext, calling aio_co_schedule instead. Scheduled coroutines are then entered by co_schedule_bh_cb using qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the current AioContext obtained with qemu_get_current_aio_context. Eventually, co->ctx will be set to the AioContext passed as an argument to qemu_aio_coroutine_enter. This means that, if an IO Thread's AioConext is being processed by the Main Thread (due to aio_poll being called with a BDS AioContext, as it happens in AIO_WAIT_WHILE among other places), the AioContext from some coroutines may be wrongly replaced with the one from the Main Thread. This is the root cause behind some crashes, mainly triggered by the drain code at block/io.c. The most common are these abort and failed assertion: util/async.c:aio_co_schedule 456 if (scheduled) { 457 fprintf(stderr, 458 "%s: Co-routine was already scheduled in '%s'\n", 459 __func__, scheduled); 460 abort(); 461 } util/qemu-coroutine-lock.c: 286 assert(mutex->holder == self); But it's also known to cause random errors at different locations, and even SIGSEGV with broken coroutine backtraces. By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can pass the correct AioContext as an argument, making sure co->ctx is not wrongly altered. Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25qemu-iotests: Test snapshot=on with nonexistent TMPDIRAlberto Garcia
We just fixed a bug that was causing a use-after-free when QEMU was unable to create a temporary snapshot. This is a test case for this scenario. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25block: Fix use after free error in bdrv_open_inherit()Alberto Garcia
When a block device is opened with BDRV_O_SNAPSHOT and the bdrv_append_temp_snapshot() call fails then the error code path tries to unref the already destroyed 'options' QDict. This can be reproduced easily by setting TMPDIR to a location where the QEMU process can't write: $ TMPDIR=/nonexistent $QEMU -drive driver=null-co,snapshot=on Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25block/linux-aio: acquire AioContext before qemu_laio_process_completionsSergio Lopez
In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 #2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 #3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 #4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 #5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 #6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 #2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 #3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 #4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 #5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 #6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 #30 0x00005631fbd81412 in main_loop () at vl.c:1866 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25qemu-iotests: Test commit with top-node/base-nodeKevin Wolf
This adds some tests for block-commit with the new options top-node and base-node (taking node names) instead of top and base (taking file names). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25commit: Add top-node/base-node optionsKevin Wolf
The block-commit QMP command required specifying the top and base nodes of the commit jobs using the file name of that node. While this works in simple cases (local files with absolute paths), the file names generated for more complicated setups can be hard to predict. The block-commit command has more problems than just this, so we want to replace it altogether in the long run, but libvirt needs a reliable way to address nodes now. So we don't want to wait for a new, cleaner command, but just add the minimal thing needed right now. This adds two new options top-node and base-node to the command, which allow specifying node names instead. They are mutually exclusive with the old options. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-09-25qmp, hmp: add PCI subsystem id and vendor id to PCI infoDenis V. Lunev
This is a long story. Red Hat has relicensed Windows KVM device drivers in 2018 and there was an agreement that to avoid WHQL driver conflict software manufacturers should set proper PCI subsystem vendor ID in their distributions. Thus PCI subsystem vendor id becomes actively used. The problem is that this field is applied by us via hardware compats. Thus technically it could be lost. This patch adds PCI susbsystem id and vendor id to exportable parameters for validation. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> CC: Eric Blake <eblake@redhat.com> CC: Markus Armbruster <armbru@redhat.com> Message-Id: <20180918095852.28422-1-den@openvz.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-09-25hmp: fix migrate status timer leakMarc-André Lureau
Spotted by ASAN doing some manual testing: Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561 #3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630 #4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038 #5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498 #6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371 #7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393 #8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354 #9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175 #10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187 #11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180901134652.25884-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-09-25monitor: print message when using 'help' with an unknown commandCollin Walling
When typing 'help' followed by an unknown command, QEMU will not print anything to the command line to let the user know they typed a bad command. Let's fix this by printing a message to the monitor when this happens. For example: (qemu) help xyz unknown command: 'xyz' Reported-by: Stefan Zimmermann <stzi@linux.ibm.com> Signed-off-by: Collin Walling <walling@linux.ibm.com> Message-Id: <1532115624-27568-1-git-send-email-walling@linux.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-09-25blockdev: document transactional shortcomingsJohn Snow
Presently only the backup job really guarantees what one would consider transactional semantics. To guard against someone helpfully adding them in the future, document that there are shortcomings in the model that would need to be audited at that time. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20180906130225.5118-17-jsnow@redhat.com Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-09-25block/backup: qapi documentation fixupJohn Snow
Fix documentation to match the other jobs amended for 3.1. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20180906130225.5118-16-jsnow@redhat.com Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-09-25qapi/block-stream: expose new job propertiesJohn Snow
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20180906130225.5118-15-jsnow@redhat.com Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>