Age | Commit message (Collapse) | Author |
|
Rework qemu_io() to be analogous to qemu_img(); a function that requires
a return code of zero by default unless disabled explicitly.
Tests that use qemu_io():
030 040 041 044 055 056 093 124 129 132 136 148 149 151 152 163 165 205
209 219 236 245 248 254 255 257 260 264 280 298 300 302 304
image-fleecing migrate-bitmaps-postcopy-test migrate-bitmaps-test
migrate-during-backup migration-permissions
Test that use qemu_io_log():
242 245 255 274 303 307 nbd-reconnect-on-open
Copy-pastables for testing/verification:
./check -qcow2 030 040 041 044 055 056 124 129 132 151 152 163 165 209 \
219 236 242 245 248 254 255 257 260 264 274 \
280 298 300 302 303 304 307 image-fleecing \
migrate-bitmaps-postcopy-test migrate-bitmaps-test \
migrate-during-backup nbd-reconnect-on-open
./check -raw 093 136 148 migration-permissions
./check -nbd 205
# ./configure configure --disable-gnutls --enable-gcrypt
# this ALSO requires passwordless sudo.
./check -luks 149
# Just the tests that were edited in this commit:
./check -qcow2 030 040 242 245
./check -raw migration-permissions
./check -nbd 205
./check -luks 149
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20220418211504.943969-8-jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
|
|
An iotest's 'paused' condition is fickle; it will be reported as true
whenever the job is drained, for example, or when it is in the process
of completing.
030 and 041 contain such checks, we should replace them by checking the
job status instead. (As was done for 129 in commit f9a6256b48f29c2816
for the 'busy' condition.)
Additionally, when we want to test that a job is paused on error, we
might want to give it some time to actually switch to the paused state.
Do that by waiting on the corresponding JOB_STATUS_CHANGE event. (But
only if they are not already paused; the loops these places are in fetch
all VM events, so they may have already fetched that event from the
queue.)
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220324180221.24508-1-hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
When the stream block job cuts out the nodes between top and base in
stream_prepare(), it does not drain the subtree manually; it fetches the
base node, and tries to insert it as the top node's backing node with
bdrv_set_backing_hd(). bdrv_set_backing_hd() however will drain, and so
the actual base node might change (because the base node is actually not
part of the stream job) before the old base node passed to
bdrv_set_backing_hd() is installed.
This has two implications:
First, the stream job does not keep a strong reference to the base node.
Therefore, if it is deleted in bdrv_set_backing_hd()'s drain (e.g.
because some other block job is drained to finish), we will get a
use-after-free. We should keep a strong reference to that node.
Second, even with such a strong reference, the problem remains that the
base node might change before bdrv_set_backing_hd() actually runs and as
a result the wrong base node is installed.
Both effects can be seen in 030's TestParallelOps.test_overlapping_5()
case, which has five nodes, and simultaneously streams from the middle
node to the top node, and commits the middle node down to the base node.
As it is, this will sometimes crash, namely when we encounter the
above-described use-after-free.
Taking a strong reference to the base node, we no longer get a crash,
but the resuling block graph is less than ideal: The expected result is
obviously that all middle nodes are cut out and the base node is the
immediate backing child of the top node. However, if stream_prepare()
takes a strong reference to its base node (the middle node), and then
the commit job finishes in bdrv_set_backing_hd(), supposedly dropping
that middle node, the stream job will just reinstall it again.
Therefore, we need to keep the whole subtree drained in
stream_prepare(), so that the graph modification it performs is
effectively atomic, i.e. that the base node it fetches is still the base
node when bdrv_set_backing_hd() sets it as the top node's backing node.
Verify this by asserting in said 030's test case that the base node is
always the top node's immediate backing child when both jobs are done.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220324140907.17192-1-hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
|
|
See the comment for why this is necessary.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-11-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-11-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
|
|
Some error messages contain ambiguous representations of the 'node-name'
parameter. This can be particularly confusing when exchanging QMP
messages (C = client, S = server):
C: {"execute": "block_resize", "arguments": { "device": "my_file", "size": 26843545600 }}
S: {"error": {"class": "GenericError", "desc": "Cannot find device=my_file nor node_name="}}
^^^^^^^^^
This error message suggests one could send a message with a key called
'node_name':
C: {"execute": "block_resize", "arguments": { "node_name": "my_file", "size": 26843545600 }}
^^^^^^^^^
but using the underscore is actually incorrect, the parameter should be
'node-name':
S: {"error": {"class": "GenericError", "desc": "Parameter 'node_name' is unexpected"}}
This behavior was uncovered in bz1651437, but I ended up going down a
rabbit hole looking for other areas where this miscommunication might
occur and changing those accordingly as well.
Fixes: https://bugzilla.redhat.com/1651437
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Message-Id: <20210305151929.1947331-2-ckuehl@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
I reproduced the following crash fast enough:
0 raise () at /lib64/libc.so.6
1 abort () at /lib64/libc.so.6
2 _nl_load_domain.cold () at /lib64/libc.so.6
3 annobin_assert.c_end () at /lib64/libc.so.6
4 bdrv_reopen_multiple (bs_queue=0x55de75fa9b70, errp=0x0)
at ../block.c:3820
5 bdrv_reopen_set_read_only (bs=0x55de760fc020, read_only=true,
errp=0x0) at ../block.c:3870
6 stream_clean (job=0x55de75fa9410) at ../block/stream.c:99
7 job_clean (job=0x55de75fa9410) at ../job.c:680
8 job_finalize_single (job=0x55de75fa9410) at ../job.c:696
9 job_txn_apply (job=0x55de75fa9410,
fn=0x55de741eee27 <job_finalize_single>) at ../job.c:158
10 job_do_finalize (job=0x55de75fa9410) at ../job.c:805
11 job_completed_txn_success (job=0x55de75fa9410) at ../job.c:855
12 job_completed (job=0x55de75fa9410) at ../job.c:868
13 job_exit (opaque=0x55de75fa9410) at ../job.c:888
14 aio_bh_call (bh=0x55de76b9b4e0) at ../util/async.c:136
15 aio_bh_poll (ctx=0x55de75bc5300) at ../util/async.c:164
16 aio_dispatch (ctx=0x55de75bc5300) at ../util/aio-posix.c:381
17 aio_ctx_dispatch (source=0x55de75bc5300, callback=0x0,
user_data=0x0) at ../util/async.c:306
18 g_main_context_dispatch () at /lib64/libglib-2.0.so.0
19 glib_pollfds_poll () at ../util/main-loop.c:232
20 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:255
21 main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
22 qemu_main_loop () at ../softmmu/runstate.c:722
23 main (argc=20, argv=0x7ffe218f0268, envp=0x7ffe218f0310) at
../softmmu/main.c:50
(gdb) fr 4
4 bdrv_reopen_multiple (bs_queue=0x55de75fa9b70, errp=0x0) at
../block.c:3820
3820 assert(perm == state->perm);
(gdb) list
3815
3816 if (ret == 0) {
3817 uint64_t perm, shared;
3818
3819 bdrv_get_cumulative_perm(state->bs, &perm,
&shared);
3820 assert(perm == state->perm);
3821 assert(shared == state->shared_perm);
3822
3823 bdrv_set_perm(state->bs);
3824 } else {
(gdb) p perm
$1 = 1
(gdb) p state->perm
$2 = 0
Then I had 38 successful iterations and another crash:
0 bdrv_check_update_perm (bs=0x5631ac97bc50, q=0x0, new_used_perm=1,
new_shared_perm=31, ignore_children=0x0, errp=0x7ffd9d477cf8) at
../block.c:2197
1 bdrv_root_attach_child
(child_bs=0x5631ac97bc50, child_name=0x5631aaf6b1f9 "backing",
child_class=0x5631ab280ca0 <child_of_bds>, child_role=8,
ctx=0x5631ab757300, perm=1, shared_perm=31, opaque=0x5631abb8c020,
errp=0x7ffd9d477cf8)
at ../block.c:2642
2 bdrv_attach_child (parent_bs=0x5631abb8c020,
child_bs=0x5631ac97bc50, child_name=0x5631aaf6b1f9 "backing",
child_class=0x5631ab280ca0 <child_of_bds>, child_role=8,
errp=0x7ffd9d477cf8)
at ../block.c:2719
3 bdrv_set_backing_hd (bs=0x5631abb8c020, backing_hd=0x5631ac97bc50,
errp=0x7ffd9d477cf8) at ../block.c:2854
4 stream_prepare (job=0x5631ac751eb0) at ../block/stream.c:74
5 job_prepare (job=0x5631ac751eb0) at ../job.c:784
6 job_txn_apply (job=0x5631ac751eb0, fn=0x5631aacb1156 <job_prepare>)
at ../job.c:158
7 job_do_finalize (job=0x5631ac751eb0) at ../job.c:801
8 job_completed_txn_success (job=0x5631ac751eb0) at ../job.c:855
9 job_completed (job=0x5631ac751eb0) at ../job.c:868
10 job_exit (opaque=0x5631ac751eb0) at ../job.c:888
11 aio_bh_call (bh=0x7f3d9c007680) at ../util/async.c:136
12 aio_bh_poll (ctx=0x5631ab757300) at ../util/async.c:164
13 aio_dispatch (ctx=0x5631ab757300) at ../util/aio-posix.c:381
14 aio_ctx_dispatch (source=0x5631ab757300, callback=0x0,
user_data=0x0) at ../util/async.c:306
15 g_main_context_dispatch () at /lib64/libglib-2.0.so.0
16 glib_pollfds_poll () at ../util/main-loop.c:232
17 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:255
18 main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
19 qemu_main_loop () at ../softmmu/runstate.c:722
20 main (argc=20, argv=0x7ffd9d478198, envp=0x7ffd9d478240) at
../softmmu/main.c:50
(gdb) list
2192 QLIST_FOREACH(c, &bs->parents, next_parent) {
2193 if (g_slist_find(ignore_children, c)) {
2194 continue;
2195 }
2196
2197 if ((new_used_perm & c->shared_perm) != new_used_perm)
{
2198 char *user = bdrv_child_user_desc(c);
2199 char *perm_names = bdrv_perm_names(new_used_perm &
~c->shared_perm);
2200
2201 error_setg(errp, "Conflicts with use by %s as '%s',
which does not "
(gdb) p c
$1 = (BdrvChild *) 0x8585858585858585
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20210205111021.715240-1-vsementsov@virtuozzo.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[PMM: trimmed the part of the commit message referring to
as-yet-unapplied patchseries]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
This patch completes the series with the COR-filter applied to
block-stream operations.
Adding the filter makes it possible in future implement discarding
copied regions in backing files during the block-stream job, to reduce
the disk overuse (we need control on permissions).
Also, the filter now is smart enough to do copy-on-read with specified
base, so we have benefit on guest reads even when doing block-stream of
the part of the backing chain.
Several iotests are slightly modified due to filter insertion.
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201216061703.70908-14-vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
test_stream_parallel run parallel stream jobs, intersecting so that top
of one is base of another. It's OK now, but it would be a problem if
insert the filter, as one job will want to use another job's filter as
above_base node.
Correct thing to do is move to new interface: "bottom" argument instead
of base. This guarantees that jobs don't intersect by their actions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20201216061703.70908-12-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
We are going to drop group file. Define group in tests as a preparatory
step.
The patch is generated by
cd tests/qemu-iotests
grep '^[0-9]\{3\} ' group | while read line; do
file=$(awk '{print $1}' <<< "$line");
groups=$(sed -e 's/^... //' <<< "$line");
awk "NR==2{print \"# group: $groups\"}1" $file > tmp;
cat tmp > $file;
done
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210116134424.82867-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
The test_stream_parallel test still occasionally fails in the CI.
Thus let's disable it during "make check" for now so that it does
not cause trouble during merge tests. We can enable it again once
the problem has been resolved.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200907113824.134788-1-thuth@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
It can happen that the throttling of the stream job doesn't make it slow
enough that we can be sure that it still exists when it is referenced
again. Just use a much smaller speed to make this very unlikely to
happen again.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200716132829.20127-1-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
There are many existing qcow2 images that specify a backing file but
no format. This has been the source of CVEs in the past, but has
become more prominent of a problem now that libvirt has switched to
-blockdev. With older -drive, at least the probing was always done by
qemu (so the only risk of a changed format between successive boots of
a guest was if qemu was upgraded and probed differently). But with
newer -blockdev, libvirt must specify a format; if libvirt guesses raw
where the image was formatted, this results in data corruption visible
to the guest; conversely, if libvirt guesses qcow2 where qemu was
using raw, this can result in potential security holes, so modern
libvirt instead refuses to use images without explicit backing format.
The change in libvirt to reject images without explicit backing format
has pointed out that a number of tools have been far too reliant on
probing in the past. It's time to set a better example in our own
iotests of properly setting this parameter.
iotest calls to create, rebase, and convert are all impacted to some
degree. It's a bit annoying that we are inconsistent on command line
- while all of those accept -o backing_file=...,backing_fmt=..., the
shortcuts are different: create and rebase have -b and -F, while
convert has -B but no -F. (amend has no shortcuts, but the previous
patch just deprecated the use of amend to change backing chains).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200706203954.341758-9-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
test_overlapping_3() throttles its active commit job so it can be sure
the job is still busy when it checks that you can't start a conflicting
streaming job.
However, it only sets the commit job back to full speed when it is
ready, which takes a few seconds while it's throttled. We can already
reset the limit after having checked that block-stream returns an error
and save these seconds.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200513100025.33543-1-kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
We can turn logging on/off globally instead of per-function.
Remove use_log from run_job, and use python logging to turn on
diffable output when we run through a script entry point.
iotest 245 changes output order due to buffering reasons.
An extended note on python logging:
A NullHandler is added to `qemu.iotests` to stop output from being
generated if this code is used as a library without configuring logging.
A NullHandler is only needed at the root, so a duplicate handler is not
needed for `qemu.iotests.diff_io`.
When logging is not configured, messages at the 'WARNING' levels or
above are printed with default settings. The NullHandler stops this from
occurring, which is considered good hygiene for code used as a library.
See https://docs.python.org/3/howto/logging.html#library-config
When logging is actually enabled (always at the behest of an explicit
call by a client script), a root logger is implicitly created at the
root, which allows messages to propagate upwards and be handled/emitted
from the root logger with default settings.
When we want iotest logging, we attach a handler to the
qemu.iotests.diff_io logger and disable propagation to avoid possible
double-printing.
For more information on python logging infrastructure, I highly
recommend downloading the pip package `logging_tree`, which provides
convenient visualizations of the hierarchical logging configuration
under different circumstances.
See https://pypi.org/project/logging_tree/ for more information.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200331000014.11581-15-jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Use the program search path to find the Python 3 interpreter.
Patch created mechanically by running:
$ sed -i "s,^#\!/usr/bin/\(env\ \)\?python$,#\!/usr/bin/env python3," \
$(git grep -l 'if __name__.*__main__')
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200130163232.10446-4-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
|
|
The python code already contains a possibility to skip tests if the
corresponding driver is not available in the qemu binary - use it
in more spots to avoid that the tests are failing if the driver has
been disabled.
While we're at it, we can now also remove some of the old checks that
were using iotests.supports_quorum() - and which were apparently not
working as expected since the tests aborted instead of being skipped
when "quorum" was missing in the QEMU binary.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The error message for a negative speed uses QERR_INVALID_PARAMETER,
which implies that the 'speed' option doesn't even exist:
{"error": {"class": "GenericError", "desc": "Invalid parameter 'speed'"}}
Make it use QERR_INVALID_PARAMETER_VALUE instead:
{"error": {"class": "GenericError", "desc": "Parameter 'speed' expects a non-negative value"}}
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
|
|
Most of our Python unittest-style tests only support the file protocol.
You can run them with any other protocol, but the test will simply
ignore your choice and use file anyway.
We should let them signal that they require the file protocol so they
are skipped when you want to test some other protocol.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This tests that the stream job exits cleanly (without abort) when the
top node is read-only and cannot be reopened read/write.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190703172813.6868-12-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
We recently removed the dependency of the stream job on its base node.
That makes it OK to use a commit filter node there. Test that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Tested-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190703172813.6868-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Currently, 030 just compares the error class, which does not say
anything.
Before HEAD^ added throttling to test_overlapping_4, that test actually
usually failed because node2 was already gone, not because it was the
commit and stream job were not allowed to overlap.
Prevent such problems in the future by comparing the error description
instead.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Tested-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190703172813.6868-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Currently, TestParallelOps in 030 creates images that are too small for
job throttling to be effective. This is reflected by the fact that it
never undoes the throttling.
Increase the image size and undo the throttling when the job should be
completed. Also, add throttling in test_overlapping_4, or the jobs may
not be so overlapping after all. In fact, the error usually emitted
here is that node2 simply does not exist, not that overlapping jobs are
not allowed -- the fact that this job ignores the exact error messages
and just checks the error class is something that should be fixed in a
follow-up patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Tested-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190703172813.6868-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
The base node of a block-stream operation indicates the first image
from the backing chain starting from which no data is copied to the
top node.
The block-stream job allows others to use that base image, so a second
block-stream job could be writing to it at the same time. An important
restriction is that the base image must not disappear while the stream
job is ongoing. stream_start() freezes the backing chain from top to
base with that purpose but it does it too late in the code so there is
a race condition there.
This bug was fixed in the previous commit, and this patch contains an
iotest for this scenario.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
In Python 3, / is always a floating-point division. We usually do not
want this, and as Python 2.7 understands // as well, change all integer
divisions to use that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Message-Id: <20181022135307.14398-5-mreitz@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
|
|
This adds a QMP event that is emitted whenever a job transitions from
one status to another.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Split out the pause command into the actual pause and the wait.
Not every usage presently needs to resubmit a pause request.
The intent with the next commit will be to explicitly disallow
redundant or meaningless pause/resume requests, so the tests
need to become more judicious to reflect that.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This patch tweaks TestParallelOps in iotest 030 so it allocates data
in smaller regions (256KB/512KB instead of 512KB/1MB) and the
block-stream job in test_stream_commit() only needs to copy data that
is at the very end of the image.
This way when the block-stream job is awakened it will finish right
away without any chance of being stopped by block_job_sleep_ns(). This
triggers the bug that was fixed by 3d5d319e1221082974711af1d09d82f and
1a63a907507fbbcfaee3f622907ec24 and is therefore a more useful test
case for parallel block jobs.
After this patch the aforementiond bug can also be reproduced with the
test_stream_parallel() test case.
Since with this change the stream job in test_stream_commit() finishes
early, this patch introduces a similar test case where both jobs are
slowed down so they can actually run in parallel.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: John Snow <jsnow@redhat.com>
Message-id: 20180306130121.30243-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
This patch fixes two race conditions in 030:
1. The first is in TestENOSPC.test_enospc(). After resuming the job,
querying it to confirm it is no longer paused may fail because in the
meantime it might have completed already. The same was fixed in
TestEIO.test_ignore() already (in commit
2c3b44da07d341557a8203cc509ea07fe3605e11).
2. The second is in TestSetSpeed.test_set_speed_invalid(): Here, a
stream job is started on a drive without any break points, with a
block-job-set-speed invoked subsequently. However, without any break
points, the job might have completed in the meantime (on tmpfs at
least); or it might complete before cancel_and_wait() which expects
the job to still exist. This can be fixed like everywhere else by
pausing the drive (installing break points) before starting the job
and letting cancel_and_wait() resume it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171109203025.27493-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
We don't need to make any assumptions about the graph layout above the
top node of the commit operation any more. Remove the use of
bdrv_find_overlay() and related variables from the commit job code.
bdrv_drop_intermediate() doesn't use the 'active' parameter any more, so
we can just drop it.
The overlay node was previously added to the block job to get a
BLK_PERM_GRAPH_MOD. We really need to respect those permissions in
bdrv_drop_intermediate() now, but as long as we haven't figured out yet
how BLK_PERM_GRAPH_MOD is actually supposed to work, just leave a TODO
comment there.
With this change, it is now possible to perform another block job on an
overlay node without conflicts. qemu-iotests 030 is changed accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
Test cases 030, 041 and 055 used to sleep for a second after calling
block-job-pause to make sure that the block job had time to actually
get into paused state. We can instead poll its status and use that one
second only as a timeout.
The tests also slept a second for checking that the block jobs don't
make progress while being paused. Half a second is more than enough for
this.
These changes reduce the total time for the three tests by 25 seconds on
my laptop (from 155 seconds to 130).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
|
This adds a small test for the image streaming error path for failing
block_job_create(), which would have found the null pointer dereference
in commit a170a91f.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
|
|
qemu-img and qemu-io commands when guest is running need "-U" option,
add it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
We can't rely on a non-paused job to be present and running for us.
Assume that if the job is not present that it completed already.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The block-stream command has traditionally used the 'base' parameter
to indicate the image to copy the data from. This test checks that the
'base-node' parameter can also be used for the same purpose.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Quorum children are special in the sense that they're not directly
attached to a block backend but they're not used as backing images
either. However the intermediate block streaming code supports
streaming to them. This is a test case for that scenario.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
As with test_stream_parallel(), we allow mixing block-stream and
block-commit operations in the same backing chain as long as there's
no overlap among the involved nodes.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
These test cases check that it's not possible to perform two
block-stream or block-commit operations if there are nodes involved in
both.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This test case checks that it's possible to launch several stream
operations in parallel in the same snapshot chain, each one involving
a different set of nodes.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This adds test_stream_intermediate(), similar to test_stream() but
streams to the intermediate image instead.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
block-stream to accept a node-name without lifting the restriction that
we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
|
|
This patch tests that in a partial block-stream operation, no data is
ever copied from the base image.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 5272a2aa57bc0b3f981f8b3e0c813e58a88c974b.1458566441.git.berto@igalia.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
|
|
This test is streaming to the top layer using the intermediate image
as the base. This is a mistake since block-stream never copies data
from the base image and its backing chain, so this is effectively a
no-op.
In addition to fixing the base parameter, this patch also writes some
data to the intermediate image before the test, so there's something
to copy and the test is meaningful.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 2efa304da38b32d47c120ce728568a589c5a3afc.1458566441.git.berto@igalia.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
|
|
the stop_test case tests that we can resume a block-stream
command after it has stopped/paused due to error. We cannot
always reliably query it before it finishes after resume, though,
so make this a conditional.
The important thing is that we are still testing that it has stopped,
and that it finishes successfully after we send a resume command.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-4-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The test test_stream_pause in this class uses vm.pause_drive, which
requires a blkdebug driver on top of image, otherwise it's no-op and the
test running is undeterministic.
So add it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This eliminates code duplication.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Pause the drive and start the block job, so we won't miss the block job.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Case 030 occasionally fails because of block job compltes too fast to be
captured by script, and 'unexpected qmp event' of job completion causes
the test failure.
Simply fill in some data to the test image to make this false alarm less
likely to happen.
(For other benefits to prefill data to test image, see also commit
ab68cdfaa).
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Both 030 and 041 use create_image(). Move it to iotests.py.
Also drop ImageStreamingTestCase since the class now has no methods.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The cancel_and_wait() function has been duplicated in 030 and 041. Move
it into iotests.py and let it return the event so tests can perform
additional asserts.
Note that 041's cancel_and_wait(wait_ready=True) is replaced by
wait_ready_and_cancel(), which uses the new wait_ready() and
cancel_and_wait() underneath.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|