aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-04-01nbd/server: Advertise actual minimum block sizeEric Blake
Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their reply according to bdrv_block_status() boundaries. If the block device has a request_alignment smaller than 512, but we advertise a block alignment of 512 to the client, then this can result in the server reply violating client expectations by reporting a smaller region of the export than what the client is permitted to address (although this is less of an issue for qemu 4.0 clients, given recent client patches to overlook our non-compliance at EOF). Since it's always better to be strict in what we send, it is worth advertising the actual minimum block limit rather than blindly rounding it up to 512. Note that this patch is not foolproof - it is still possible to provoke non-compliant server behavior using: $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file That is arguably a bug in the blkdebug driver (it should never pass back block status smaller than its alignment, even if it has to make multiple bdrv_get_status calls and determine the least-common-denominator status among the group to return). It may also be possible to observe issues with a backing layer with smaller alignment than the active layer, although so far I have been unable to write a reliable iotest for that scenario (but again, an issue like that could be argued to be a bug in the block layer, or something where we need a flag to bdrv_block_status() to state whether the result must be aligned to the current layer's limits or can be subdivided for accuracy when chasing backing files). Anyways, as blkdebug is not normally used, and as this patch makes our server more interoperable with qemu 3.1 clients, it is worth applying now, even while we still work on a larger patch series for the 4.1 timeframe to have byte-accurate file lengths. Note that the iotests output changes - for 223 and 233, we can see the server's better granularity advertisement; and for 241, the three test cases have the following effects: - natural alignment: the server's smaller alignment is now advertised, and the hole reported at EOF is now the right result; we've gotten rid of the server's non-compliance - forced server alignment: the server still advertises 512 bytes, but still sends a mid-sector hole. This is still a server compliance bug, which needs to be fixed in the block layer in a later patch; output does not change because the client is already being tolerant of the non-compliance - forced client alignment: the server's smaller alignment means that the client now sees the server's status change mid-sector without any protocol violations, but the fact that the map shows an unaligned mid-sector hole is evidence of the block layer problems with aligned block status, to be fixed in a later patch Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rebase to enhanced iotest 241 coverage]
2019-04-01block: Add bdrv_get_request_alignment()Eric Blake
The next patch needs access to a device's minimum permitted alignment, since NBD wants to advertise this to clients. Add an accessor function, borrowing from blk_get_max_transfer() for accessing a backend's block limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190329042750.14704-6-eblake@redhat.com>
2019-04-01nbd/client: Support qemu-img convert from unaligned sizeEric Blake
If an NBD server advertises a size that is not a multiple of a sector, the block layer rounds up that size, even though we set info.size to the exact byte value sent by the server. The block layer then proceeds to let us read or query block status on the hole that it added past EOF, which the NBD server is unlikely to be happy with. Fortunately, qemu as a server never advertizes an unaligned size, so we generally don't run into this problem; but the nbdkit server makes it easy to test: $ printf %1000d 1 > f1 $ ~/nbdkit/nbdkit -fv file f1 & pid=$! $ qemu-img convert -f raw nbd://localhost:10809 f2 $ kill $pid $ qemu-img compare f1 f2 Pre-patch, the server attempts a 1024-byte read, which nbdkit rightfully rejects as going beyond its advertised 1000 byte size; the conversion fails and the output files differ (not even the first sector is copied, because qemu-img does not follow ddrescue's habit of trying smaller reads to get as much information as possible in spite of errors). Post-patch, the client's attempts to read (and query block status, for new enough nbdkit) are properly truncated to the server's length, with sane handling of the hole the block layer forced on us. Although f2 ends up as a larger file (1024 bytes instead of 1000), qemu-img compare shows the two images to have identical contents for display to the guest. I didn't add iotests coverage since I didn't want to add a dependency on nbdkit in iotests. I also did NOT patch write, trim, or write zeroes - these commands continue to fail (usually with ENOSPC, but whatever the server chose), because we really can't write to the end of the file, and because 'qemu-img convert' is the most common case where we care about being tolerant (which is read-only). Perhaps we could truncate the request if the client is writing zeros to the tail, but that seems like more work, especially if the block layer is fixed in 4.1 to track byte-accurate sizing (in which case this patch would be reverted as unnecessary). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-5-eblake@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2019-04-01nbd/client: Reject inaccessible tail of inconsistent serverEric Blake
The NBD spec suggests that a server should never advertise a size inconsistent with its minimum block alignment, as that tail is effectively inaccessible to a compliant client obeying those block constraints. Since we have a habit of rounding up rather than truncating, to avoid losing the last few bytes of user input, and we cannot access the tail when the server advertises bogus block sizing, abort the connection to alert the server to fix their bug. And rejecting such servers matches what we already did for a min_block that was not a power of 2 or which was larger than max_block. Does not impact either qemu (which always sends properly aligned sizes) or nbdkit (which does not send minimum block requirements yet); so this is mostly aimed at new NBD server implementations, and ensures that the rest of our code can assume the size is aligned. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190330155704.24191-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30nbd/client: Report offsets in bdrv_block_statusEric Blake
It is desirable for 'qemu-img map' to have the same output for a file whether it is served over file or nbd protocols. However, ever since we implemented block status for NBD (2.12), the NBD protocol forgot to inform the block layer that as the final layer in the chain, the offset is valid; without an offset, the human-readable form of qemu-img map gives up with the unhelpful: $ nbdkit -U - data data="1" size=512 --run 'qemu-img map $nbd' Offset Length Mapped to File qemu-img: File contains external, encrypted or compressed clusters. The --output=json form always works, because it is reporting the lower-level bdrv_block_status results directly rather than trying to filter out sparse ranges for human consumption - but now it also shows the offset member. With this patch, the human output changes to: Offset Length Mapped to File 0 0x200 0 nbd+unix://?socket=/tmp/nbdkitOxeoLa/socket This change is observable to several iotests. Fixes: 78a33ab5 Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30nbd/client: Lower min_block for block-status, unaligned sizeEric Blake
We have a latent bug in our NBD client code, tickled by the brand new nbdkit 1.11.10 block status support: $ nbdkit --filter=log --filter=truncate -U - \ data data="1" size=511 truncate=64K logfile=/dev/stdout \ --run 'qemu-img convert $nbd /var/tmp/out' ... qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed. The culprit? Our implementation of .bdrv_co_block_status can return unaligned block status for any server that operates with a lower actual alignment than what we tell the block layer in request_alignment, in violation of the block layer's constraints. To date, we've been unable to trip the bug, because qemu as NBD server always advertises block sizing (at which point it is a server bug if the server sends unaligned status - although qemu 3.1 is such a server and I've sent separate patches for 4.0 both to get the server to obey the spec, and to let the client to tolerate server oddities at EOF). But nbdkit does not (yet) advertise block sizing, and therefore is not in violation of the spec for returning block status at whatever boundaries it wants, and those unaligned results can occur anywhere rather than just at EOF. While we are still wise to avoid sending sub-sector read/write requests to a server of unknown origin, we MUST consider that a server telling us block status without an advertised block size is correct. So, we either have to munge unaligned answers from the server into aligned ones that we hand back to the block layer, or we have to tell the block layer about a smaller alignment. Similarly, if the server advertises an image size that is not sector-aligned, we might as well assume that the server intends to let us access those tail bytes, and therefore supports a minimum block size of 1, regardless of whether the server supports block status (although we still need more patches to fix the problem that with an unaligned image, we can send read or block status requests that exceed EOF to the server). Again, qemu as server cannot trip this problem (because it rounds images to sector alignment), but nbdkit advertised unaligned size even before it gained block status support. Solve both alignment problems at once by using better heuristics on what alignment to report to the block layer when the server did not give us something to work with. Note that very few NBD servers implement block status (to date, only qemu and nbdkit are known to do so); and as the NBD spec mentioned block sizing constraints prior to documenting block status, it can be assumed that any future implementations of block status are aware that they must advertise block size if they want a minimum size other than 1. We've had a long history of struggles with picking the right alignment to use in the block layer, as evidenced by the commit message of fd8d372d (v2.12) that introduced the current choice of forced 512-byte alignment. There is no iotest coverage for this fix, because qemu can't provoke it, and I didn't want to make test 241 dependent on nbdkit. Fixes: fd8d372d Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2019-03-30iotests: Add 241 to test NBD on unaligned imagesEric Blake
Add a test for the NBD client workaround in the previous patch. It's not really feasible for an iotest to assume a specific tracing engine, so we can't really probe trace_nbd_parse_blockstatus_compliance to see if the server was fixed vs. whether the client just worked around the server (other than by rearranging order between code patches and this test). But having a successful exchange sure beats the previous state of an error message. Since format probing can change alignment, we can use that as an easy way to test several configurations. Not tested yet, but worth adding to this test in future patches: an NBD server that can advertise a non-sector-aligned size (such as nbdkit) causes qemu as the NBD client to misbehave when it rounds the size up and accesses beyond the advertised size. Qemu as NBD server never advertises a non-sector-aligned size (since bdrv_getlength() currently rounds up to sector boundaries); until qemu can act as such a server, testing that flaw will have to rely on external binaries. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-2-eblake@redhat.com> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: add forced-512 alignment, and nbdkit reproducer comment]
2019-03-30nbd-client: Work around server BLOCK_STATUS misalignment at EOFEric Blake
The NBD spec is clear that a server that advertises a minimum block size should reply to NBD_CMD_BLOCK_STATUS with extents aligned accordingly. However, we know that the qemu NBD server implementation has had a corner-case bug where it is not compliant with the spec, present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12 (and unlikely to be patched in time for 4.0). Namely, when qemu is serving a file that is not a multiple of 512 bytes, it rounds the size advertised over NBD up to the next sector boundary (someday, I'd like to fix that to be byte-accurate, but it's a much bigger audit not appropriate for this release); yet if the final sector contains data prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole mid-sector which qemu then reported over NBD. We are well within our rights to hang up on a server that can't follow the spec, but it is more useful to try and keep the connection alive in spite of the problem. Do so by tracing a message about the problem, and then either truncating the request back to an aligned boundary (if it covered more than the final sector) or widening it out to the full boundary with a forced status of data (since truncating would result in 0 bytes, but we have to make progress, and valid since data is a default-safe answer). And in practice, since the problem only happens on a sector that starts with data and ends with a hole, we are going to want to read that full sector anyway (where qemu as the server fills in the tail beyond EOF with appropriate NUL bytes). Easy reproduction: $ printf %1000d 1 > file $ qemu-nbd -f raw -t file & pid=$! $ qemu-img map --output=json -f raw nbd://localhost:10809 qemu-img: Could not read file metadata: Invalid argument $ kill $pid where the patched version instead succeeds with: [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}] Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190326171317.4036-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30qemu-img: Gracefully shutdown when map can't finishEric Blake
Trying 'qemu-img map -f raw nbd://localhost:10809' causes the NBD server to output a scary message: qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read This is because the NBD client, being remote, has no way to expose a human-readable map (the --output=json data is fine, however). But because we exit(1) right after the message, causing the client to bypass all block cleanup, the server sees the abrupt exit and warns, whereas it would be silent had the client had a chance to send NBD_CMD_DISC. Other protocols may have similar cleanup issues, where failure to blk_unref() could cause unintended effects. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190326184043.7544-1-eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2019-03-30nbd: Permit simple error to NBD_CMD_BLOCK_STATUSEric Blake
The NBD spec is clear that when structured replies are active, a simple error reply is acceptable to any command except for NBD_CMD_READ. However, we were mistakenly requiring structured errors for NBD_CMD_BLOCK_STATUS, and hanging up on a server that gave a simple error (since qemu does not behave as such a server, we didn't notice the problem until now). Broken since its introduction in commit 78a33ab5 (v2.12). Noticed while debugging a separate failure reported by nbdkit while working out its initial implementation of BLOCK_STATUS, although it turns out that nbdkit also chose to send structured error replies for BLOCK_STATUS, so I had to manually provoke the situation by hacking qemu's server to send a simple error reply: | diff --git i/nbd/server.c w/nbd/server.c | index fd013a2817a..833288d7c45 100644 | 00--- i/nbd/server.c | +++ w/nbd/server.c | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, | "discard failed", errp); | | case NBD_CMD_BLOCK_STATUS: | + return nbd_co_send_simple_reply(client, request->handle, ENOMEM, | + NULL, 0, errp); | if (!request->len) { | return nbd_send_generic_reply(client, request->handle, -EINVAL, | "need non-zero length", errp); | Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20190325190104.30213-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUSEric Blake
When the server replies with a (structured [*]) error to NBD_CMD_BLOCK_STATUS, without any extent information sent first, the client code was blindly throwing away the server's error code and instead telling the caller that EIO occurred. This has been broken since its introduction in 78a33ab5 (v2.12, where we should have called: error_setg(&local_err, "Server did not reply with any status extents"); nbd_iter_error(&iter, false, -EIO, &local_err); to declare the situation as a non-fatal error if no earlier error had already been flagged, rather than just blindly slamming iter.err and iter.ret), although it is more noticeable since commit 7f86068d, which actually tries hard to preserve the server's code thanks to a separate iter.request_ret. [*] The spec is clear that the server is also permitted to reply with a simple error, but that's a separate fix. I was able to provoke this scenario with a hack to the server, then seeing whether ENOMEM makes it back to the caller: | diff --git a/nbd/server.c b/nbd/server.c | index fd013a2817a..29c7995de02 100644 | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, | "discard failed", errp); | | case NBD_CMD_BLOCK_STATUS: | + return nbd_send_generic_reply(client, request->handle, -ENOMEM, | + "no status for you today", errp); | if (!request->len) { | return nbd_send_generic_reply(client, request->handle, -EINVAL, | "need non-zero length", errp); | -- Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190325190104.30213-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUSEric Blake
The NBD spec states that NBD_CMD_FLAG_REQ_ONE (which we currently always use) should not reply with an extent larger than our request, and that the server's response should be exactly one extent. Right now, that means that if a server sends more than one extent, we treat the server as broken, fail the block status request, and disconnect, which prevents all further use of the block device. But while good software should be strict in what it sends, it should be tolerant in what it receives. While trying to implement NBD_CMD_BLOCK_STATUS in nbdkit, we temporarily had a non-compliant server sending too many extents in spite of REQ_ONE. Oddly enough, 'qemu-img convert' with qemu 3.1 failed with a somewhat useful message: qemu-img: Protocol error: invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS which then disappeared with commit d8b4bad8, on the grounds that an error message flagged only at the time of coroutine teardown is pointless, and instead we should rely on the actual failed API to report an error - in other words, the 3.1 behavior was masking the fact that qemu-img was not reporting an error. That has since been fixed in the previous patch, where qemu-img convert now fails with: qemu-img: error while reading block status of sector 0: Invalid argument But even that is harsh. Since we already partially relaxed things in commit acfd8f7a to tolerate a server that exceeds the cap (although that change was made prior to the NBD spec actually putting a cap on the extent length during REQ_ONE - in fact, the NBD spec change was BECAUSE of the qemu behavior prior to that commit), it's not that much harder to argue that we should also tolerate a server that sends too many extents. But at the same time, it's nice to trace when we are being tolerant of server non-compliance, in order to help server writers fix their implementations to be more portable (if they refer to our traces, rather than just stderr). Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190323212639.579-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-30qemu-img: Report bdrv_block_status failuresEric Blake
If bdrv_block_status_above() fails, we are aborting the convert process but failing to print an error message. Broken in commit 690c7301 (v2.4) when rewriting convert's logic. Discovered when teaching nbdkit to support NBD_CMD_BLOCK_STATUS, and accidentally violating the protocol by returning more than one extent in spite of qemu asking for NBD_CMD_FLAG_REQ_ONE. The qemu NBD code should probably handle the server's non-compliance more gracefully than failing with EINVAL, but qemu-img shouldn't be silently squelching any block status failures. It doesn't help that qemu 3.1 masks the qemu-img bug with extra noise that the nbd code is dumping to stderr (that noise was cleaned up in d8b4bad8). Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190323212639.579-2-eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-03-29Merge remote-tracking branch 'remotes/rth/tags/pull-axp-20190325' into stagingPeter Maydell
Update palcode for machine checks. # gpg: Signature made Mon 25 Mar 2019 23:09:24 GMT # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-axp-20190325: pc-bios: Update palcode-clipper Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-29Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Fri 29 Mar 2019 07:30:26 GMT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: net: tap: use qemu_set_nonblock MAINTAINERS: Update the latest email address e1000: Delay flush queue when receive RCTL net/socket: learn to talk with a unix dgram socket Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-29Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190329' ↵Peter Maydell
into staging ppc patch queue 2019-03-29 Here's a set of bugfixes for ppc, aimed at qemu-4.0 during hard freeze. We have one cleanup that's not strictly a bugfix, but will avoid an ugly external interface making it to a released version. We have one change to generic code to tweak the semantics of qemu_getrampagesize() which fixes a bug for ppc. This does have a possible impact on s390x which uses this function for a different purpose. I've discussed with David Hildenbrand and Igor Mammedov, however and we think it won't immediately break anything due to some existing bugs in the s390 usage. David H will be following up with some s390 fixes in that area. # gpg: Signature made Fri 29 Mar 2019 03:27:49 GMT # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-4.0-20190329: exec: Only count mapped memory backends for qemu_getrampagesize() spapr/irq: Add XIVE sanity checks on non-P9 machines spapr: Simplify handling of host-serial and host-model values target/ppc: Fix QEMU crash with stxsdx target/ppc: Improve comment of bcctr used for spectre v2 mitigation target/ppc: Consolidate 64-bit server processor detection in a helper target/ppc: Enable "decrement and test CTR" version of bcctr target/ppc: Fix TCG temporary leaks in gen_bcond() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-29net: tap: use qemu_set_nonblockLi Qiang
The fcntl will change the flags directly, use qemu_set_nonblock() instead. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-29MAINTAINERS: Update the latest email addressZhang Chen
Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-29e1000: Delay flush queue when receive RCTLyuchenlin
Due to too early RCT0 interrput, win10x32 may hang on booting. This problem can be reproduced by doing power cycle on win10x32 guest. In our environment, we have 10 win10x32 and stress power cycle. The problem will happen about 20 rounds. Below shows some log with comment: The normal case: 22831@1551928392.984687:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928392.985655:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928392.985801:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.056710:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.077548:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 2, ICR 0, IMR 0 e1000: set_ics 2, ICR 2, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 22831@1551928393.102974:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 22831@1551928393.103267:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle RX now e1000: set_ics 0, ICR 2, IMR 9d <- unmask interrupt e1000: RCTL: 255, mac_reg[RCTL] = 0x48002 e1000: set_ics 80, ICR 2, IMR 9d <- interrupt and work! ... The bad case: 27744@1551930483.117766:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 27744@1551930483.118398:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.198063:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.218675:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: set_ics 0, ICR 0, IMR 0 e1000: ICR read: 0 e1000: set_ics 2, ICR 0, IMR 0 e1000: set_ics 2, ICR 2, IMR 0 e1000: RCTL: 0, mac_reg[RCTL] = 0x0 27744@1551930483.241768:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 27744@1551930483.241979:e1000x_rx_disabled Received packet dropped because receive is disabled RCTL = 0 e1000: RCTL: 255, mac_reg[RCTL] = 0x40002 <- win10x32 says it can handle RX now e1000: set_ics 80, ICR 2, IMR 0 <- flush queue (caused by setting RCTL) e1000: set_ics 0, ICR 82, IMR 9d <- unmask interrupt and because 0x82&0x9d != 0 generate interrupt, hang on here... To workaround this problem, simply delay flush queue. Also stop receiving when timer is going to run. Tested on CentOS, Win7SP1x64 and Win10x32. Signed-off-by: yuchenlin <yuchenlin@synology.com> Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-29net/socket: learn to talk with a unix dgram socketMarc-André Lureau
-net socket has a fd argument, and may be passed pre-opened sockets. TCP sockets use framing. UDP sockets have datagram boundaries. When given a unix dgram socket, it will be able to read from it, but will attempt to send on the dgram_dst, which is unset. The other end will not receive the data. Let's teach -net socket to recognize a UNIX DGRAM socket, and use the regular send() command (without dgram_dst). This makes running slirp out-of-process possible that way (python pseudo-code): a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM) subprocess.Popen('qemu -net socket,fd=%d -net user' % a.fileno(), shell=True) subprocess.Popen('qemu ... -net nic -net socket,fd=%d' % b.fileno(), shell=True) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-03-29exec: Only count mapped memory backends for qemu_getrampagesize()David Gibson
qemu_getrampagesize() works out the minimum host page size backing any of guest RAM. This is required in a few places, such as for POWER8 PAPR KVM guests, because limitations of the hardware virtualization mean the guest can't use pagesizes larger than the host pages backing its memory. However, it currently checks against *every* memory backend, whether or not it is actually mapped into guest memory at the moment. This is incorrect. This can cause a problem attempting to add memory to a POWER8 pseries KVM guest which is configured to allow hugepages in the guest (e.g. -machine cap-hpt-max-page-size=16m). If you attempt to add non-hugepage, you can (correctly) create a memory backend, however it (correctly) will throw an error when you attempt to map that memory into the guest by 'device_add'ing a pc-dimm. What's not correct is that if you then reset the guest a startup check against qemu_getrampagesize() will cause a fatal error because of the new memory object, even though it's not mapped into the guest. This patch corrects the problem by adjusting find_max_supported_pagesize() (called from qemu_getrampagesize() via object_child_foreach) to exclude non-mapped memory backends. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
2019-03-29spapr/irq: Add XIVE sanity checks on non-P9 machinesCédric Le Goater
On non-P9 machines, the XIVE interrupt mode is not advertised, see spapr_dt_ov5_platform_support(). Add a couple of checks on the machine configuration to filter bogus setups and prevent OS failures : Interrupt modes CPU/Compat XICS XIVE dual P8/P8 OK QEMU failure (1) OK (3) P9/P8 OK QEMU failure (2) OK (3) P9/P9 OK OK OK (1) CPU exception model is incompatible with XIVE and the presenters will fail to realize. (2) CPU exception model is compatible with XIVE, but the XIVE CAS advertisement is dropped when in POWER8 mode. So we could ended up booting with the XIVE DT properties but without the HCALLs. Avoid confusing Linux with such settings and fail under QEMU. (3) force XICS in machine init Remove the check on XIVE-only machines in spapr_machine_init(), which has now become redundant. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190328100044.11408-1-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29spapr: Simplify handling of host-serial and host-model valuesDavid Gibson
27461d69a0f "ppc: add host-serial and host-model machine attributes (CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine properties for spapr to explicitly control the values advertised to the guest in device tree properties with the same names. The previous behaviour on KVM was to unconditionally populate the device tree with the real host serial number and model, which leaks possibly sensitive information about the host to the guest. To maintain compatibility for old machine types, we allowed those props to be set to "passthrough" to take the value from the host as before. Or they could be set to "none" to explicitly omit the device tree items. Special casing specific values on what's otherwise a user supplied string is very ugly. So, this patch simplifies things by implementing the backwards compatibility in a different way: we have a machine class flag set for the older machines, and we only load the host values into the device tree if A) they're not set by the user and B) we have that flag set. This does mean that the "passthrough" functionality is no longer available with the current machine type. That's ok though: if a user or management layer really wants the information passed through they can read it themselves (OpenStack Nova already does something similar for x86). It also means the user can't explicitly ask for the values to be omitted on the old machine types. I think that's an acceptable trade-off: if you care enough about not leaking the host information you can either move to the new machine type, or use a dummy value for the properties. For the new machine type, this also removes an odd inconsistency between running on a POWER and non-POWER (or non-Linux) hosts: if the host information couldn't be read from where we expect (in the host's device tree as exposed by Linux), we'd fallback to omitting the guest device tree items. While we're there, improve some poorly worded comments, and the help text for the properties. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org>
2019-03-29target/ppc: Fix QEMU crash with stxsdxGreg Kurz
I've been hitting several QEMU crashes while running a fedora29 ppc64le guest under TCG. Each time, this would occur several minutes after the guest reached login: Fedora 29 (Twenty Nine) Kernel 4.20.6-200.fc29.ppc64le on an ppc64le (hvc0) Web console: https://localhost:9090/ localhost login: tcg/tcg.c:3211: tcg fatal error This happens because a bug crept up in the gen_stxsdx() helper when it was converted to use VSR register accessors by commit 8b3b2d75c7c04 "target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access". The code creates a temporary, passes it directly to gen_qemu_st64_i64() and then to set_cpu_vrsh()... which looks like this was mistakenly coded as a load instead of a store. Reverse the logic: read the VSR to the temporary first and then store it to memory. Fixes: 8b3b2d75c7c0481544e277dad226223245e058eb Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155371035249.2038502.12364252604337688538.stgit@bahia.lan> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29target/ppc: Improve comment of bcctr used for spectre v2 mitigationGreg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155359567174.1794128.3183997593369465355.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29target/ppc: Consolidate 64-bit server processor detection in a helperGreg Kurz
We use PPC_SEGMENT_64B in various places to guard code that is specific to 64-bit server processors compliant with arch 2.x. Consolidate the logic in a helper macro with an explicit name. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155327783157.1283071.3747129891004927299.stgit@bahia.lan> Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29target/ppc: Enable "decrement and test CTR" version of bcctrGreg Kurz
Even if all ISAs up to v3 indeed mention: If the "decrement and test CTR" option is specified (BO2=0), the instruction form is invalid. The UMs of all existing 64-bit server class processors say: If BO[2] = 0, the contents of CTR (before any update) are used as the target address and for the test of the contents of CTR to resolve the branch. The contents of the CTR are then decremented and written back to the CTR. The linux kernel has spectre v2 mitigation code that relies on a BO[2] = 0 variant of bcctr, which is now activated by default on spapr, even with TCG. This causes linux guests to panic with the default machine type under TCG. Since any CPU model can provide its own behaviour for invalid forms, we could possibly introduce a new instruction flag to handle this. In practice, since the behaviour is shared by all 64-bit server processors starting with 970 up to POWER9, let's reuse the PPC_SEGMENT_64B flag. Caveat: this may have to be fixed later if POWER10 introduces a different behaviour. The existing behaviour of throwing a program interrupt is kept for all other CPU models. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155327782604.1283071.10640596307206921951.stgit@bahia.lan> Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-29target/ppc: Fix TCG temporary leaks in gen_bcond()Greg Kurz
Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155327782047.1283071.10234727692461848972.stgit@bahia.lan> Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-28Merge remote-tracking branch ↵Peter Maydell
'remotes/alistair/tags/pull-device-tree-20190327' into staging Device Tree Pull Request for 4.0 A single patch updating the MAINTAINERS file for 4.0. # gpg: Signature made Wed 27 Mar 2019 17:02:00 GMT # gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054 # gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full] # Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054 * remotes/alistair/tags/pull-device-tree-20190327: MAINTAINERS: Update the device tree maintainers Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-28Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20190327' into ↵Peter Maydell
staging pull-seccomp-20190327 # gpg: Signature made Wed 27 Mar 2019 12:12:39 GMT # gpg: using RSA key DF32E7C0F0FFF9A2 # gpg: Good signature from "Eduardo Otubo (Senior Software Engineer) <otubo@redhat.com>" [full] # Primary key fingerprint: D67E 1B50 9374 86B4 0723 DBAB DF32 E7C0 F0FF F9A2 * remotes/otubo/tags/pull-seccomp-20190327: seccomp: report more useful errors from seccomp seccomp: don't kill process for resource control syscalls Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-28Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* Kconfig improvements (msi_nonbroken, imply for default PCI devices) * intel-iommu: sharing passthrough FlatViews (Peter) * Fix for SEV with VFIO (Brijesh) * Allow compilation without CONFIG_PARALLEL (Thomas) # gpg: Signature made Thu 21 Mar 2019 16:42:24 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (23 commits) virtio-vga: only enable for specific boards config-all-devices.mak: rebuild on reconfigure minikconf: fix parser typo intel-iommu: optimize nodmar memory regions test-announce-self: convert to qgraph hw/alpha/Kconfig: DP264 hardware requires e1000 network card hw/hppa/Kconfig: Dino board requires e1000 network card hw/sh4/Kconfig: r2d machine requires the rtl8139 network card hw/ppc/Kconfig: e500 based machines require virtio-net-pci device hw/ppc/Kconfig: Bamboo machine requires e1000 network card hw/mips/Kconfig: Fulong 2e board requires ati-vga/rtl8139 PCI devices hw/mips/Kconfig: Malta machine requires the pcnet network card hw/i386/Kconfig: enable devices that can be created by default hw/isa/Kconfig: PIIX4 southbridge requires USB UHCI hw/isa/Kconfig: i82378 SuperIO requires PC speaker device prep: do not select I82374 hw/i386/Kconfig: PC uses I8257, not I82374 hw/char/parallel: Make it possible to compile also without CONFIG_PARALLEL target/i386: sev: Do not pin the ram device memory region memory: Fix the memory region type assignment order ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/rdma/Makefile.objs # hw/riscv/sifive_plic.c
2019-03-28Merge remote-tracking branch 'remotes/xtensa/tags/20190326-xtensa' into stagingPeter Maydell
target/xtensa fixes for v4.0: - fix translation of FLIX bundles with multiple references to the same register; - don't announce exit simcall; - clean up tests/tcg/xtensa. # gpg: Signature made Tue 26 Mar 2019 17:58:59 GMT # gpg: using RSA key 2B67854B98E5327DCDEB17D851F9CC91F83FA044 # gpg: issuer "jcmvbkbc@gmail.com" # gpg: Good signature from "Max Filippov <filippov@cadence.com>" [unknown] # gpg: aka "Max Filippov <max.filippov@cogentembedded.com>" [full] # gpg: aka "Max Filippov <jcmvbkbc@gmail.com>" [full] # Primary key fingerprint: 2B67 854B 98E5 327D CDEB 17D8 51F9 CC91 F83F A044 * remotes/xtensa/tags/20190326-xtensa: tests/tcg/xtensa: clean up test set target/xtensa: don't announce exit simcall target/xtensa: fix break_dependency for repeated resources Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-27MAINTAINERS: Update the device tree maintainersAlistair Francis
Remove Alex as a Device Tree maintainer as requested by him. Add myself as a maintainer to avoid it being orphaned. Also add David as a Reviewer (R) as he is the libfdt and DTC maintainer. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alexander Graf <agraf@csgraf.de> Acked-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-27seccomp: report more useful errors from seccompDaniel P. Berrangé
Most of the seccomp functions return errnos as a negative return value. The code is currently ignoring these and reporting a generic error message for all seccomp failure scenarios making debugging painful. Report a more precise error from each failed call and include errno if it is available. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
2019-03-27seccomp: don't kill process for resource control syscallsDaniel P. Berrangé
The Mesa library tries to set process affinity on some of its threads in order to optimize its performance. Currently this results in QEMU being immediately terminated when seccomp is enabled. Mesa doesn't consider failure of the process affinity settings to be fatal to its operation, but our seccomp policy gives it no choice in gracefully handling this denial. It is reasonable to consider that malicious code using the resource control syscalls to be a less serious attack than if they were trying to spawn processes or change UIDs and other such things. Generally speaking changing the resource control setting will "merely" affect quality of service of processes on the host. With this in mind, rather than kill the process, we can relax the policy for these syscalls to return the EPERM errno value. This allows callers to detect that QEMU does not want them to change resource allocations, and apply some reasonable fallback logic. The main downside to this is for code which uses these syscalls but does not check the return value, blindly assuming they will always succeeed. Returning an errno could result in sub-optimal behaviour. Arguably though such code is already broken & needs fixing regardless. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
2019-03-26Update version for v4.0.0-rc1 releasev4.0.0-rc1Peter Maydell
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches: - Fix slow pre-zeroing in qemu-img convert - Test case for block job pausing on I/O errors # gpg: Signature made Tue 26 Mar 2019 15:28:00 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: qemu-io: Add write -n for BDRV_REQ_NO_FALLBACK qemu-img: Use BDRV_REQ_NO_FALLBACK for pre-zeroing file-posix: Support BDRV_REQ_NO_FALLBACK for zero writes block: Advertise BDRV_REQ_NO_FALLBACK in filter drivers block: Add BDRV_REQ_NO_FALLBACK block: Remove error messages in bdrv_make_zero() iotests: add 248: test resume mirror after auto pause on ENOSPC Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26Merge remote-tracking branch ↵Peter Maydell
'remotes/kraxel/tags/fixes-20190326-pull-request' into staging fixes for 4.0: ohci and ati-vga # gpg: Signature made Tue 26 Mar 2019 14:05:40 GMT # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/fixes-20190326-pull-request: ati-vga: Fix indexed access to video memory ohci: don't die on ED_LINK_LIMIT overflow Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20190326' into staging target-arm queue: * Set SIMDMISC and FPMISC for 32-bit -cpu max (fixes regression from 3.1) * fix vCont packet handling when no thread is specified # gpg: Signature made Tue 26 Mar 2019 13:09:48 GMT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20190326: gdbstub: fix vCont packet handling when no thread is specified target/arm: Set SIMDMISC and FPMISC for 32-bit -cpu max Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26gdbstub: fix vCont packet handling when no thread is specifiedLuc Michel
The vCont packet accepts a series of actions, each being applied on a given thread ID. Giving no thread ID for an action is valid and means "all threads". This commit fixes vCont packets being incorrectly rejected when no thread ID was given for an action. In multiprocess mode, the GDB Remote Protocol specification is unclear on what "all threads" means. We choose to apply the action on all threads of all attached processes. This commit is based on the initial fix by Lucien Murray-Pitts. Fixes: e40e5204af8388 Reported-by: Lucien Murray-Pitts <lucienmp_antispam@yahoo.com> Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190325110452.6756-1-luc.michel@greensocs.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26target/arm: Set SIMDMISC and FPMISC for 32-bit -cpu maxRichard Henderson
Fixes: https://bugs.launchpad.net/bugs/1821430 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20190325161338.6536-1-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-26ati-vga: Fix indexed access to video memoryBALATON Zoltan
Coverity (CID 1399700) found that this was wrong so instead of trying to do it by hand use existing access functions that should work better. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-id: 20190318223842.427CB7456B2@zero.eik.bme.hu Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-03-26ohci: don't die on ED_LINK_LIMIT overflowLaurent Vivier
Stop processing the descriptor list instead. The next frame timer tick will resume the work Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1686705 Suggested-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-id: 20190321085212.10796-1-lvivier@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-03-26qemu-io: Add write -n for BDRV_REQ_NO_FALLBACKKevin Wolf
This makes the new BDRV_REQ_NO_FALLBACK flag available in the qemu-io write command. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26qemu-img: Use BDRV_REQ_NO_FALLBACK for pre-zeroingKevin Wolf
If qemu-img convert sees that the target image isn't zero-initialised yet, it tries to do an efficient zero write for the whole image first to save the overhead of repeated explicit zero writes during the conversion. Obviously, this provides only an advantage if the pre-zeroing is actually efficient. Otherwise, we can end up writing zeroes slowly while zeroing out the whole image, and then overwrite the same blocks again with real data, potentially doubling the written data. Pass BDRV_REQ_NO_FALLBACK to blk_make_zero() to avoid this case. If we can't efficiently zero out, we'll instead write explicit zeroes only if there is no data to be written to a block. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26file-posix: Support BDRV_REQ_NO_FALLBACK for zero writesKevin Wolf
We know that the kernel implements a slow fallback code path for BLKZEROOUT, so if BDRV_REQ_NO_FALLBACK is given, we shouldn't call it. The other operations we call in the context of .bdrv_co_pwrite_zeroes should usually be quick, so no modification should be needed for them. If we ever notice that there are additional problematic cases, we can still make these conditional as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26block: Advertise BDRV_REQ_NO_FALLBACK in filter driversKevin Wolf
Filter drivers that support .bdrv_co_pwrite_zeroes can safely advertise BDRV_REQ_NO_FALLBACK because they just forward the request flags to their child node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26block: Add BDRV_REQ_NO_FALLBACKKevin Wolf
For qemu-img convert, we want an operation that zeroes out the whole image if this can be done efficiently, but that returns an error otherwise so we don't write explicit zeroes and immediately overwrite them with the real data, potentially doubling the amount of data to be written. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26block: Remove error messages in bdrv_make_zero()Kevin Wolf
There is only a single caller of bdrv_make_zero(), which is qemu-img convert. If the function fails, we just fall back to a different method of zeroing out blocks on the target image. There is no good reason to print error messages on stderr when the higher level operation will actually succeed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26iotests: add 248: test resume mirror after auto pause on ENOSPCVladimir Sementsov-Ogievskiy
Test that mirror job actually resume on resume command after being automatically paused on ENOSPC error. It's a follow-up test for 8d9648cbf3e "blockjob: fix user pause in block_job_error_action" Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: John Snow <jsnow@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>