aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-20block: Convert BB interface to byte-based discardsEric Blake
Change sector-based blk_discard(), blk_co_discard(), and blk_aio_discard() to instead be byte-based blk_pdiscard(), blk_co_pdiscard(), and blk_aio_pdiscard(). NBD gets a lot simpler now that ignoring the unaligned portion of a byte-based discard request is handled under the hood by the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-6-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Convert bdrv_aio_discard() to byte-basedEric Blake
Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_aio_discard() with a new byte-based bdrv_aio_pdiscard(), which silently ignores any unaligned head or tail. Driver callbacks will be converted in followup patches. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1468624988-423-5-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Switch BlockRequest to byte-basedEric Blake
BlockRequest is the internal struct used by bdrv_aio_*. At the moment, all such calls were sector-based, but we will eventually convert to byte-based; start by changing the internal variables to be byte-based. No change to behavior, although the read and write code can now go byte-based through more of the stack. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1468624988-423-4-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Convert bdrv_discard() to byte-basedEric Blake
Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_discard() with a new byte-based bdrv_pdiscard(), which silently ignores any unaligned head or tail. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-3-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Convert bdrv_co_discard() to byte-basedEric Blake
Another step towards byte-based interfaces everywhere. Replace the sector-based bdrv_co_discard() with a new byte-based bdrv_co_pdiscard(), which silently ignores any unaligned head or tail. Driver callbacks will be converted in followup patches. By calculating the alignment outside of the loop, and clamping the max discard to an aligned value, we can simplify the actions done within the loop. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-2-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20iscsi: Rely on block layer to break up large requestsEric Blake
Now that the block layer honors max_request, we don't need to bother with an EINVAL on overlarge requests, but can instead assert that requests are well-behaved. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-7-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20nbd: Drop unused offset parameterEric Blake
Now that NBD relies on the block layer to fragment things, we no longer need to track an offset argument for which fragment of a request we are actually servicing. While at it, use true and false instead of 0 and 1 for a bool parameter. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-6-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20nbd: Rely on block layer to break up large requestsEric Blake
Now that the block layer will honor max_transfer, we can simplify our code to rely on that guarantee. The readv code can call directly into nbd-client, just as the writev code has done since commit 52a4650. Interestingly enough, while qemu-io 'w 0 40m' splits into a 32M and 8M transaction, 'w -z 0 40m' splits into two 16M and an 8M, because the block layer caps the bounce buffer for writing zeroes at 16M. When we later introduce support for NBD_CMD_WRITE_ZEROES, we can get a full 32M zero write (or larger, if the client and server negotiate that write zeroes can use a larger size than ordinary writes). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-5-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Fragment writes to max transfer lengthEric Blake
Drivers should be able to rely on the block layer honoring the max transfer length, rather than needing to return -EINVAL (iscsi) or manually fragment things (nbd). We already fragment write zeroes at the block layer; this patch adds the fragmentation for normal writes, after requests have been aligned (fragmenting before alignment would lead to multiple unaligned requests, rather than just the head and tail). When fragmenting a large request where FUA was requested, but where we know that FUA is implemented by flushing all requests rather than the given request, then we can still get by with only one flush. Note, however, that we need a followup patch to the raw format driver to avoid a regression in the number of flushes actually issued. The return value was previously nebulous on success (sometimes zero, sometimes the length written); since we never have a short write, and since fragmenting may store yet another positive value in 'ret', change the function to always return 0 on success, matching what we do in bdrv_aligned_preadv(). Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1468607524-19021-4-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20raw_bsd: Don't advertise flags not supported by protocol layerEric Blake
The raw format layer supports all flags via passthrough - but it only makes sense to pass through flags that the lower layer actually supports. The next patch gives stronger reasoning for why this is correct. At the moment, the raw format layer ignores the max_transfer limit of its protocol layer, and an attempt to do the qemu-io 'w -f 0 40m' to an NBD server that lacks FUA will pass the entire 40m request to the NBD driver, which then fragments the request itself into a 32m write, 8m write, and flush. But once the block layer starts honoring limits and fragmenting packets, the raw driver will hand the NBD driver two separate requests; if both requests have BDRV_REQ_FUA set, then this would result in a 32m write, flush, 8m write, and second flush. By having the raw layer no longer advertise FUA support when the protocol layer lacks it, we are back to a single flush at the block layer for the overall 40m request. Note that 'w -f -z 0 40m' does not currently exhibit the same problem, because there, the fragmentation does not occur until at the NBD layer (the raw layer has .bdrv_co_pwrite_zeroes, and the NBD layer doesn't advertise max_pwrite_zeroes to constrain things at the raw layer) - but the problem is latent and we would again have too many flushes without this patch once the NBD layer implements support for the new NBD_CMD_WRITE_ZEROES command, if it sets max_pwrite_zeroes to the same 32m limit as recommended by the NBD protocol. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468607524-19021-3-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20block: Fragment reads to max transfer lengthEric Blake
Drivers should be able to rely on the block layer honoring the max transfer length, rather than needing to return -EINVAL (iscsi) or manually fragment things (nbd). This patch adds the fragmentation in the block layer, after requests have been aligned (fragmenting before alignment would lead to multiple unaligned requests, rather than just the head and tail). The return value was previously nebulous on success on whether it was zero or the length read; and fragmenting may introduce yet other non-zero values if we use the last length read. But as at least some callers are sloppy and expect only zero on success, it is easiest to just guarantee 0. [Fix uninitialized ret local variable in bdrv_aligned_preadv(). --Stefan] Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1468607524-19021-2-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20160719' into staging target-arm queue: * fix two minor Coverity complaints # gpg: Signature made Tue 19 Jul 2016 18:02:34 BST # gpg: using RSA key 0x3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20160719: arm_gicv3: Add assert()s to tell Coverity that offsets are aligned target-arm: Fix unreachable code in gicv3_class_name() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-20Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160719-2' ↵Peter Maydell
into staging linux-user fixes before 2.7 freeze, fix commit message # gpg: Signature made Tue 19 Jul 2016 14:18:54 BST # gpg: using RSA key 0xB44890DEDE3C9BC0 # gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>" # gpg: aka "Riku Voipio <riku.voipio@linaro.org>" # Primary key fingerprint: FF82 03C8 C391 98AE 0581 41EF B448 90DE DE3C 9BC0 * remotes/riku/tags/pull-linux-user-20160719-2: linux-user: AArch64 has sync_file_range, not sync_file_range2 linux-user: Fix type for SIOCATMARK ioctl linux-user: define missing sparc syscalls linux-user: Fix terminal control ioctls linux-user: Add some new blk ioctls linux-user: Handle short lengths in host_to_target_sockaddr() linux-user: Forget about synchronous signal once it is delivered linux-user: Correct type for LOOP_GET_STATUS{,64} ioctls linux-user: Correct type for BLKSSZGET linux-user: Add loop control ioctls linux-user: Check sigsetsize argument to syscalls linux-user: add nested netlink types linux-user: convert sockaddr_ll from host to target linux-user: add fd_trans helper in do_recvfrom() linux-user: fix netlink memory corruption linux-user: fd_trans_*_data() returns the length Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19arm_gicv3: Add assert()s to tell Coverity that offsets are alignedPeter Maydell
Coverity complains that the GICR_IPRIORITYR case in gicv3_readl() can overflow an array, because it doesn't know that the offsets passed to that function must be word aligned. Add some assert()s which hopefully tell Coverity that this isn't possible. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1468261372-17508-1-git-send-email-peter.maydell@linaro.org
2016-07-19target-arm: Fix unreachable code in gicv3_class_name()Peter Maydell
Coverity complains that the exit() in gicv3_class_name() can be unreachable, because if TARGET_AARCH64 is defined then all code paths return before reaching it. Move the exit() up to the error_report() that it belongs with. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org> Message-id: 1468260552-8400-1-git-send-email-peter.maydell@linaro.org
2016-07-19disas: Fix ATTRIBUTE_UNUSED define clash with ALSA headersPeter Maydell
disas/bfd.h defines ATTRIBUTE_UNUSED, but unfortunately the ALSA system headers also define this macro, which means that you can get a compilation failure if building with ALSA and any files happen to include the alsa headers before bfd.h rather than the other way around. This is unfortunate namespace pollution by the ALSA headers but we can work around it. Add an #ifndef guard to bfd.h and remove the unnecessary extra definition in disas/arm.c to fix this. Reported-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468937076-21503-1-git-send-email-peter.maydell@linaro.org
2016-07-19Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* two old patches from prospective GSoC students * i386 -kernel device tree support * Coverity fix * memory usage improvement from Peter * checkpatch fix * g_path_get_dirname cleanup * caching of block status for iSCSI # gpg: Signature made Tue 19 Jul 2016 07:43:41 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: target-i386: Remove redundant HF_SOFTMMU_MASK block/iscsi: allow caching of the allocation map block/iscsi: fix rounding in iscsi_allocationmap_set Move README to markdown cpu-exec: Move down some declarations in cpu_exec() exec: avoid realloc in phys_map_node_reserve checkpatch: consider git extended headers valid patches megasas: remove useless check for cmd->frame compiler: never omit assertions if using a static analysis tool hw/i386: add device tree support Changed malloc to g_malloc, free to g_free in bsd-user/qemu.h use g_path_get_dirname instead of dirname Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/stsquad/tags/pull-travis-20160718-1' ↵Peter Maydell
into staging Make IRC a little less noisy # gpg: Signature made Mon 18 Jul 2016 16:42:57 BST # gpg: using RSA key 0xFBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-travis-20160718-1: .travis.yml: Disable IRC build status updates from forks Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19linux-user: AArch64 has sync_file_range, not sync_file_range2Peter Maydell
The AArch64 Linux ABI syscall 84 is sync_file_range, not sync_file_range2 (in the kernel it uses the asm-generic headers and does not define __ARCH_WANT_SYNC_FILE_RANGE2). Update our TARGET_NR_* definitions accordingly. This fixes the sync_file_range syscall which otherwise gets its arguments in the wrong order. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Fix type for SIOCATMARK ioctlPeter Maydell
The SIOCATMARK ioctl takes an argument which should be a pointer to an integer where the kernel will write the result. We were incorrectly declaring it as TYPE_NULL which would mean it would always fail (with EFAULT) when it should succeed. Correct the type. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: define missing sparc syscallsLaurent Vivier
NR_lookup_dcookie, NR_fadvise64, NR_fadvise64_64 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Fix terminal control ioctlsTimothy Pearson
TIOCGPTN and related terminal control ioctls were not converted to the guest ioctl format on x86_64 targets. Convert these ioctls to enable terminal functionality on x86_64 guests. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-20160719-1' into ↵Peter Maydell
staging vnc: bugfixes for -rc0 # gpg: Signature made Tue 19 Jul 2016 08:27:05 BST # gpg: using RSA key 0x4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/pull-vnc-20160719-1: vnc-tight: fix regression with libxenstore vnc-enc-tight: fix off-by-one bug vnc: make sure we finish disconnect Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' ↵Peter Maydell
into staging Update OpenBIOS images # gpg: Signature made Tue 19 Jul 2016 07:42:30 BST # gpg: using RSA key 0x5BC2C56FAE0F321F # gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" # Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F * remotes/mcayland/tags/qemu-openbios-signed: Update OpenBIOS images to e79bca6 built from submodule. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19linux-user: Add some new blk ioctlsPeter Maydell
Add some new blk ioctls (these are 0x12,119 through to 0x12,127). Several of these are used by mke2fs; this silences the warnings: mke2fs 1.42.12 (29-Aug-2014) Unsupported ioctl: cmd=0x127b Unsupported ioctl: cmd=0x127a warning: Unable to get device geometry for /dev/loop5 Unsupported ioctl: cmd=0x127c Unsupported ioctl: cmd=0x127c Unsupported ioctl: cmd=0x1277 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Handle short lengths in host_to_target_sockaddr()Peter Maydell
If userspace specifies a short buffer for a target sockaddr, the kernel will only copy in as much as it has space for (or none at all if the length is zero) -- see the kernel move_addr_to_user() function. Mimic this in QEMU's host_to_target_sockaddr() routine. In particular, this fixes a segfault running the LTP recvfrom01 test, where the guest makes a recvfrom() call with a bad buffer pointer and other parameters which cause the kernel to set the addrlen to zero; because we did not skip the attempt to swap the sa_family field we segfaulted on the bad address. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Forget about synchronous signal once it is deliveredPeter Maydell
Commit 655ed67c2a248cf which switched synchronous signals to benig recorded in ts->sync_signal rather than in a queue with every other signal had a bug: we failed to clear the flag indicating that a synchronous signal was pending when we delivered it. This meant that we would take the signal again and again every time the guest made a syscall. (This is a bug introduced in my refactoring of Timothy Baldwin's original code.) Fix this by passing in the struct emulated_sigtable* to handle_pending_signal(), so that we clear the pending flag in the ts->sync_signal struct when handling a synchronous signal. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Correct type for LOOP_GET_STATUS{,64} ioctlsPeter Maydell
The LOOP_GET_STATUS and LOOP_GET_STATUS64 ioctls were incorrectly defined as IOC_W rather than IOC_R, which meant we weren't correctly copying the information back from the kernel to the guest. The loop_info64 structure definition was also missing a member and using the wrong type for several 32-bit fields. In particular, this meant that "kpartx -d image.img" didn't work and "losetup -a" behaved strangely. Correct the ioctl type definitions. Reported-by: Chanho Park <chanho61.park@samsung.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Correct type for BLKSSZGETPeter Maydell
The BLKSSZGET ioctl takes an argument which is a pointer to an int. We were incorrectly declaring it to take a pointer to a long, which meant that we would incorrectly write to memory which we should not if the guest is a 64-bit architecture. In particular, kpartx uses this ioctl to write to an int on the stack, which tends to result in it crashing immediately. Reported-by: Chanho Park <chanho61.park@samsung.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Add loop control ioctlsPeter Maydell
Add support for the /dev/loop-control ioctls: LOOP_CTL_ADD LOOP_CTL_REMOVE LOOP_CTL_GET_FREE [RV: fixed to apply to new header guards] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: Check sigsetsize argument to syscallsPeter Maydell
Many syscalls which take a sigset_t argument also take an argument giving the size of the sigset_t. The kernel insists that this matches its idea of the type size and fails EINVAL if it is not. Implement this logic in QEMU. (This mostly just means some LTP test cases which check error cases now pass.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-07-19linux-user: add nested netlink typesLaurent Vivier
Nested types are used by the kernel to send link information and protocol properties. We can see following errors with "ip link show": Unimplemented nested type 26 Unimplemented nested type 26 Unimplemented nested type 18 Unimplemented nested type 26 Unimplemented nested type 18 Unimplemented nested type 26 This patch implements nested types 18 (IFLA_LINKINFO) and 26 (IFLA_AF_SPEC). Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: convert sockaddr_ll from host to targetLaurent Vivier
As we convert sockaddr for AF_PACKET family for sendto() (target to host) we need also to convert this for getsockname() (host to target). arping uses getsockname() to get the the interface address and uses this address with sendto(). Tested with: /sbin/arping -D -q -c2 -I eno1 192.168.122.88 ... getsockname(3, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(6)={1, 10c37b6b9a76}, [18]) = 0 ... sendto(3, "..." 28, 0, {sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST, addr(6)={1, ffffffffffff}, 20) = 28 ... Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: add fd_trans helper in do_recvfrom()Laurent Vivier
Fix passwd using netlink audit. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: fix netlink memory corruptionLaurent Vivier
Netlink is byte-swapping data in the guest memory (it's bad). It's ok when the data come from the host as they are generated by the host. But it doesn't work when data come from the guest: the guest can try to reuse these data whereas they have been byte-swapped. This is what happens in glibc: glibc generates a sequence number in nlh.nlmsg_seq and calls sendto() with this nlh. In sendto(), we byte-swap nlmsg.seq. Later, after the recvmsg(), glibc compares nlh.nlmsg_seq with sequence number given in return, and of course it fails (hangs), because nlh.nlmsg_seq is not valid anymore. The involved code in glibc is: sysdeps/unix/sysv/linux/check_pf.c:make_request() ... req.nlh.nlmsg_seq = time (NULL); ... if (TEMP_FAILURE_RETRY (__sendto (fd, (void *) &req, sizeof (req), 0, (struct sockaddr *) &nladdr, sizeof (nladdr))) < 0) <here req.nlh.nlmsg_seq has been byte-swapped> ... do { ... ssize_t read_len = TEMP_FAILURE_RETRY (__recvmsg (fd, &msg, 0)); ... struct nlmsghdr *nlmh; for (nlmh = (struct nlmsghdr *) buf; NLMSG_OK (nlmh, (size_t) read_len); nlmh = (struct nlmsghdr *) NLMSG_NEXT (nlmh, read_len)) { <we compare nlmh->nlmsg_seq with corrupted req.nlh.nlmsg_seq> if (nladdr.nl_pid != 0 || (pid_t) nlmh->nlmsg_pid != pid || nlmh->nlmsg_seq != req.nlh.nlmsg_seq) continue; ... else if (nlmh->nlmsg_type == NLMSG_DONE) /* We found the end, leave the loop. */ done = true; } } while (! done); As we have a continue on "nlmh->nlmsg_seq != req.nlh.nlmsg_seq", "done" cannot be set to "true" and we have an infinite loop. It's why commands like "apt-get update" or "dnf update hangs". Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19linux-user: fd_trans_*_data() returns the lengthLaurent Vivier
fd_trans_target_to_host_data() and fd_trans_host_to_target_data() must return the length of processed data. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Tue 19 Jul 2016 03:33:40 BST # gpg: using RSA key 0xEF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: e1000e: fix building without CONFIG_VMXNET3_PCI MAINTAINERS: release Scott from being a rocker maintainer tap: fix memory leak on failure to create a multiqueue tap device net: fix incorrect argument to iov_to_buf net: fix incorrect access to pointer e1000e: fix incorrect access to pointer Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into stagingPeter Maydell
# gpg: Signature made Mon 18 Jul 2016 23:53:15 BST # gpg: using RSA key 0x7DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/ide-pull-request: block: ignore flush requests when storage is clean tests: in IDE and AHCI tests perform DMA write before flushing ide: set retry_unit for PIO and FLUSH requests ide: refactor retry_unit set and clear into separate function Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' ↵Peter Maydell
into staging # gpg: Signature made Mon 18 Jul 2016 22:59:55 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/tracing-pull-request: trace: Add QAPI/QMP interfaces to query and control per-vCPU tracing state trace: Allow event name pattern in "info trace-events" trace: Conditionally trace events based on their per-vCPU state trace: Add per-vCPU tracing states for events with the 'vcpu' property trace: Cosmetic changes on fast-path tracing disas: Remove unused macro '_' trace: Identify events with the 'vcpu' property trace: [bsd-user] Commandline arguments to control tracing trace: [linux-user] Commandline arguments to control tracing Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160718.0' ↵Peter Maydell
into staging VFIO update 2016-07-18 One fix for 2.7-rc0 which hides the ARI extended capability, fixing multifunction support in PCIe configurations where the assigned device function topology does not match the host (Alex Williamson) # gpg: Signature made Mon 18 Jul 2016 18:02:27 BST # gpg: using RSA key 0x239B9B6E3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" # gpg: aka "Alex Williamson <alex@shazbot.org>" # gpg: aka "Alex Williamson <alwillia@redhat.com>" # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" # Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22 * remotes/awilliam/tags/vfio-update-20160718.0: vfio/pci: Hide ARI capability Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-19Update OpenBIOS images to e79bca6 built from submodule.Mark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-07-19target-i386: Remove redundant HF_SOFTMMU_MASKSergey Fedorov
'HF_SOFTMMU_MASK' is only set when 'CONFIG_SOFTMMU' is defined. So there's no need in this flag: test 'CONFIG_SOFTMMU' instead. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160715175852.30749-6-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-19block/iscsi: allow caching of the allocation mapPeter Lieven
until now the allocation map was used only as a hint if a cluster is allocated or not. If a block was not allocated (or Qemu had no info about the allocation status) a get_block_status call was issued to check the allocation status and possibly avoid a subsequent read of unallocated sectors. If a block known to be allocated the get_block_status call was omitted. In the other case a get_block_status call was issued before every read to avoid the necessity for a consistent allocation map. To avoid the potential overhead of calling get_block_status for each and every read request this took only place for the bigger requests. This patch enhances this mechanism to cache the allocation status and avoid calling get_block_status for blocks where the allocation status has been queried before. This allows for bypassing the read request even for smaller requests and additionally omits calling get_block_status for known to be unallocated blocks. Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-3-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-19block/iscsi: fix rounding in iscsi_allocationmap_setPeter Lieven
when setting clusters as alloacted the boundaries have to be expanded. As Paolo pointed out the calculation of the number of clusters is wrong: Suppose cluster_sectors is 2, sector_num = 1, nb_sectors = 6: In the "mark allocated" case, you want to set 0..8, i.e. cluster_num=0, nb_clusters=4. 0--.--2--.--4--.--6--.--8 <--|_________________|--> (<--> = expanded) Instead you are setting nb_clusters=3, so that 6..8 is not marked. 0--.--2--.--4--.--6--.--8 <--|______________|!!! (! = wrong) Cc: qemu-stable@nongnu.org Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-2-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-19Move README to markdownPranith Kumar
Move the README file to markdown so that it makes the github page look prettier. I know that github repo is a mirror and not the official repo, but I think it doesn't hurt to have it in markdown format. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160715043111.29007-1-bobby.prani@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-18block: ignore flush requests when storage is cleanEvgeny Yakovlev
Some guests (win2008 server for example) do a lot of unnecessary flushing when underlying media has not changed. This adds additional overhead on host when calling fsync/fdatasync. This change introduces a write generation scheme in BlockDriverState. Current write generation is checked against last flushed generation to avoid unnessesary flushes. The problem with excessive flushing was found by a performance test which does parallel directory tree creation (from 2 processes). Results improved from 0.424 loops/sec to 0.432 loops/sec. Each loop creates 10^3 directories with 10 files in each. This affected some blkdebug testcases that were expecting error logs from failure-injected flushes which are now skipped entirely (tests 026 071 089). This also affects the performance of block jobs and thus BLOCK_JOB_READY events for driver-mirror and active block-commit commands now arrives faster, before QMP send successfully returns to caller (tests 141 144). Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-5-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
2016-07-18tests: in IDE and AHCI tests perform DMA write before flushingEvgeny Yakovlev
Due to changes in flush behaviour clean disks stopped generating flush_to_disk events and IDE and AHCI tests that test flush commands started to fail. This change adds additional DMA writes to affected tests before sending flush commands so that bdrv_flush actually generates flush_to_disk event. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-4-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
2016-07-18ide: set retry_unit for PIO and FLUSH requestsEvgeny Yakovlev
The following sequence of tests discovered a problem in IDE emulation: 1. Send DMA write to IDE device 0 2. Send CMD_FLUSH_CACHE to same IDE device which will be failed by block layer using blkdebug script in tests/ide-test:test_retry_flush When doing DMA request ide/core.c will set s->retry_unit to s->unit in ide_start_dma. When dma completes ide_set_inactive sets retry_unit to -1. After that ide_flush_cache runs and fails thanks to blkdebug. ide_flush_cb calls ide_handle_rw_error which asserts that s->retry_unit == s->unit. But s->retry_unit is still -1 after previous DMA completion and flush does not use anything related to retry. This patch restricts retry unit assertion only to ops that actually use retry logic. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-3-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
2016-07-18ide: refactor retry_unit set and clear into separate functionEvgeny Yakovlev
Code to set and clear state associated with retry in moved into ide_set_retry and ide_clear_retry to make adding retry setups easier. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-2-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
2016-07-18trace: Add QAPI/QMP interfaces to query and control per-vCPU tracing stateLluís Vilanova
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>