slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2019-06-14	iotests: Filter 175's allocation information	Max Reitz
	It is possible for an empty file to take up blocks on a filesystem, for example: $ qemu-img create -f raw test.img 1G Formatting 'test.img', fmt=raw size=1073741824 $ mkfs.ext4 -I 128 -q test.img $ mkdir test-mount $ sudo mount -o loop test.img test-mount $ sudo touch test-mount/test-file $ stat -c 'blocks=%b' test-mount/test-file blocks=8 These extra blocks (one cluster) are apparently used for metadata, because they are always there, on top of blocks used for data: $ sudo dd if=/dev/zero of=test-mount/test-file bs=1M count=1 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00135339 s, 775 MB/s $ stat -c 'blocks=%b' test-mount/test-file blocks=2056 Make iotest 175 take this into account. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Nir Soffer <nsoffer@redhat.com> Message-id: 20190516144319.12570-1-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	event_match: always match on None value	John Snow
	Before, event_match didn't always recurse if the event value was not a dictionary, and would instead check for equality immediately. By delaying equality checking to post-recursion, we can allow leaf values like "5" to match "None" and take advantage of the generic None-returns-True clause. This makes the matching a little more obviously consistent at the expense of being able to check for explicit None values, which is probably not that important given what this function is used for. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190528183857.26167-1-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	iotests: add iotest 256 for testing blockdev-backup across iothread contexts	John Snow
	Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190523170643.20794-6-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> [mreitz: Moved from 250 to 256] Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	iotests.py: rewrite run_job to be pickier	John Snow
	Don't pull events out of the queue that don't belong to us; be choosier so that we can use this method to drive jobs that were launched by transactions that may have more jobs. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190523170643.20794-5-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	QEMUMachine: add events_wait method	John Snow
	Instead of event_wait which looks for a single event, add an events_wait which can look for any number of events simultaneously. However, it will still only return one at a time, whichever happens first. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190523170643.20794-4-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	iotests.py: do not use infinite waits	John Snow
	Cap waits to 60 seconds so that iotests can fail gracefully if something goes wrong. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190523170643.20794-3-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	blockdev-backup: don't check aio_context too early	John Snow
	in blockdev_backup_prepare, we check to make sure that the target is associated with a compatible aio context. However, do_blockdev_backup is called later and has some logic to move the target to a compatible aio_context. The transaction version will fail certain commands needlessly early as a result. Allow blockdev_backup_prepare to simply call do_blockdev_backup, which will ultimately decide if the contexts are compatible or not. Note: the transaction version has always disallowed this operation since its initial commit bd8baecd (2014), whereas the version of qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to enforce the aio_context switch instead. It's not clear, and I can't see from the mailing list archives at the time, why the two functions take a different approach. It wasn't until later in efd7556708b (2016) that the standalone version tried to determine if it could set the context or not. Reported-by: aihua liang <aliang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498 Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190523170643.20794-2-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-14	Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20190613.0' ↵	Peter Maydell
	into staging VFIO updates 2019-06-13 - Hide resizable BAR capability to prevent false guest resizing (Alex Williamson) - Allow relocation to fix bogus MSI-X hardware (Alex Williamson) - Condense IRQ setup into a common helper (Eric Auger) # gpg: Signature made Thu 13 Jun 2019 18:24:43 BST # gpg: using RSA key 239B9B6E3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full] # gpg: aka "Alex Williamson <alex@shazbot.org>" [full] # gpg: aka "Alex Williamson <alwillia@redhat.com>" [full] # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" [full] # Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22 * remotes/awilliam/tags/vfio-updates-20190613.0: vfio/common: Introduce vfio_set_irq_signaling helper vfio/pci: Allow MSI-X relocation to fixup bogus PBA vfio/pci: Hide Resizable BAR capability Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-13	Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-06-13' into ↵	Peter Maydell
	staging nbd patches for 2019-06-13 - add 'qemu-nbd --pid-file' - NBD-related iotest improvements - NBD code refactoring in preparation for reconnect # gpg: Signature made Thu 13 Jun 2019 16:37:58 BST # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2019-06-13: block/nbd: merge NBDClientSession struct back to BDRVNBDState block/nbd: merge nbd-client.* to nbd.c block/nbd-client: drop stale logout nbd/server: Nicer spelling of max BLOCK_STATUS reply length iotests: Let 233 run concurrently iotests: Use qemu-nbd's --pid-file qemu-nbd: Do not close stderr iotests.py: Add qemu_nbd_early_pipe() qemu-nbd: Add --pid-file option Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-13	vfio/common: Introduce vfio_set_irq_signaling helper	Eric Auger
	The code used to assign an interrupt index/subindex to an eventfd is duplicated many times. Let's introduce an helper that allows to set/unset the signaling for an ACTION_TRIGGER, ACTION_MASK or ACTION_UNMASK action. In the error message, we now use errno in case of any VFIO_DEVICE_SET_IRQS ioctl failure. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-13	vfio/pci: Allow MSI-X relocation to fixup bogus PBA	Alex Williamson
	The MSI-X relocation code can sometimes be used to work around bogus MSI-X capabilities, but this test for whether the PBA is outside of the specified BAR causes the device to error before we can apply a relocation. Let it proceed if we intend to relocate MSI-X anyway. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-13	vfio/pci: Hide Resizable BAR capability	Alex Williamson
	The resizable BAR capability is currently exposed read-only from the kernel and we don't yet implement a protocol for virtualizing it to the VM. Exposing it to the guest read-only introduces poor behavior as the guest has no reason to test that a control register write is accepted by the hardware. This can lead to cases where the guest OS assumes the BAR has been resized, but it hasn't. This has been observed when assigning AMD Vega GPUs. Note, this does not preclude future enablement of resizable BARs, but it's currently incorrect to expose this capability as read-only, so better to not expose it at all. Reported-by: James Courtier-Dutton <james.dutton@gmail.com> Tested-by: James Courtier-Dutton <james.dutton@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-13	block/nbd: merge NBDClientSession struct back to BDRVNBDState	Vladimir Sementsov-Ogievskiy
	No reason to keep it separate, it differs from others block driver behavior and therefore confuses. Instead of generic 'state = (State*)bs->opaque' we have to use special helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190611102720.86114-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2019-06-13	block/nbd: merge nbd-client.* to nbd.c	Vladimir Sementsov-Ogievskiy
	No reason for keeping driver handlers realization separate from driver structure. We can get rid of extra header file. While being here, fix comments style, restore forgotten comments for NBD_FOREACH_REPLY_CHUNK and nbd_reply_chunk_iter_receive, remove extra includes. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190611102720.86114-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2019-06-13	block/nbd-client: drop stale logout	Vladimir Sementsov-Ogievskiy
	Drop one on failure path (we have errp) and turn two others into trace points. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190611102720.86114-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2019-06-13	Merge remote-tracking branch ↵	Peter Maydell
	'remotes/pmaydell/tags/pull-target-arm-20190613-1' into staging target-arm queue: * convert aarch32 VFP decoder to decodetree (includes tightening up decode in a few places) * fix minor bugs in VFP short-vector handling * hw/core/bus.c: Only the main system bus can have no parent * smmuv3: Fix decoding of ID register range * Implement NSACR gating of floating point * Use tcg_gen_gvec_bitsel # gpg: Signature made Thu 13 Jun 2019 15:15:39 BST # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20190613-1: (47 commits) target/arm: Fix short-vector increment behaviour target/arm: Convert float-to-integer VCVT insns to decodetree target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree target/arm: Convert VJCVT to decodetree target/arm: Convert integer-to-float insns to decodetree target/arm: Convert double-single precision conversion insns to decodetree target/arm: Convert VFP round insns to decodetree target/arm: Convert the VCVT-to-f16 insns to decodetree target/arm: Convert the VCVT-from-f16 insns to decodetree target/arm: Convert VFP comparison insns to decodetree target/arm: Convert VMOV (register) to decodetree target/arm: Convert VSQRT to decodetree target/arm: Convert VNEG to decodetree target/arm: Convert VABS to decodetree target/arm: Convert VMOV (imm) to decodetree target/arm: Convert VFP fused multiply-add insns to decodetree target/arm: Convert VDIV to decodetree target/arm: Convert VSUB to decodetree target/arm: Convert VADD to decodetree target/arm: Convert VNMUL to decodetree ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-13	target/arm: Fix short-vector increment behaviour	Peter Maydell
	For VFP short vectors, the VFP registers are divided into a series of banks: for single-precision these are s0-s7, s8-s15, s16-s23 and s24-s31; for double-precision they are d0-d3, d4-d7, ... d28-d31. Some banks are "scalar" meaning that use of a register within them triggers a pure-scalar or mixed vector-scalar operation rather than a full vector operation. The scalar banks are s0-s7, d0-d3 and d16-d19. When using a bank as part of a vector operation, we iterate through it, increasing the register number by the specified stride each time, and wrapping around to the beginning of the bank. Unfortunately our calculation of the "increment" part of this was incorrect: vd = ((vd + delta_d) & (bank_mask - 1)) \| (vd & bank_mask) will only do the intended thing if bank_mask has exactly one set high bit. For instance for doubles (bank_mask = 0xc), if we start with vd = 6 and delta_d = 2 then vd is updated to 12 rather than the intended 4. This only causes problems in the unlikely case that the starting register is not the first in its bank: if the register number doesn't have to wrap around then the expression happens to give the right answer. Fix this bug by abstracting out the "check whether register is in a scalar bank" and "advance register within bank" operations to utility functions which use the right bit masking operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert float-to-integer VCVT insns to decodetree	Peter Maydell
	Convert the float-to-integer VCVT instructions to decodetree. Since these are the last unconverted instructions, we can delete the old decoder structure entirely now. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree	Peter Maydell
	Convert the VCVT (between floating-point and fixed-point) instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VJCVT to decodetree	Peter Maydell
	Convert the VJCVT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert integer-to-float insns to decodetree	Peter Maydell
	Convert the VCVT integer-to-float instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert double-single precision conversion insns to decodetree	Peter Maydell
	Convert the VCVT double/single precision conversion insns to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP round insns to decodetree	Peter Maydell
	Convert the VFP round-to-integer instructions VRINTR, VRINTZ and VRINTX to decodetree. These instructions were only introduced as part of the "VFP misc" additions in v8A, so we check this. The old decoder's implementation was incorrectly providing them even for v7A CPUs. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert the VCVT-to-f16 insns to decodetree	Peter Maydell
	Convert the VCVTT and VCVTB instructions which convert from f32 and f64 to f16 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit store of the right half of the input single-precision register rather than doing a load/modify/store sequence on the full 32 bits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert the VCVT-from-f16 insns to decodetree	Peter Maydell
	Convert the VCVTT, VCVTB instructions that deal with conversion from half-precision floats to f32 or 64 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit load of the right half of the input single-precision register rather than loading the full 32 bits and then doing a separate shift or sign-extension. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP comparison insns to decodetree	Peter Maydell
	Convert the VFP comparison instructions to decodetree. Note that comparison instructions should not honour the VFP short-vector length and stride information: they are scalar-only operations. This applies to all the 2-operand instructions except for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VMOV (register) to decodetree	Peter Maydell
	Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VSQRT to decodetree	Peter Maydell
	Convert the VSQRT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VNEG to decodetree	Peter Maydell
	Convert the VNEG instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VABS to decodetree	Peter Maydell
	Convert the VFP VABS instruction to decodetree. Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or VFPGen2OpDPFn because none of the operations which use this format and support short vectors will need it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VMOV (imm) to decodetree	Peter Maydell
	Convert the VFP VMOV (immediate) instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP fused multiply-add insns to decodetree	Peter Maydell
	Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VDIV to decodetree	Peter Maydell
	Convert the VDIV instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VSUB to decodetree	Peter Maydell
	Convert the VSUB instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VADD to decodetree	Peter Maydell
	Convert the VADD instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VNMUL to decodetree	Peter Maydell
	Convert the VNMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VMUL to decodetree	Peter Maydell
	Convert the VMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP VNMLA to decodetree	Peter Maydell
	Convert the VFP VNMLA instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP VNMLS to decodetree	Peter Maydell
	Convert the VFP VNMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP VMLS to decodetree	Peter Maydell
	Convert the VFP VMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP VMLA to decodetree	Peter Maydell
	Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but but the second operand is in a scalar bank. For example vmla.f64 d10, d1, d16 with length 2 stride 2 is equivalent to the pair of scalar operations vmla.f64 d10, d1, d16 vmla.f64 d8, d3, d16 where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations. In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS. Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d	Peter Maydell
	Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert the VFP load/store multiple insns to decodetree	Peter Maydell
	Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15 : they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP VLDR and VSTR to decodetree	Peter Maydell
	Convert the VFP single load/store insns VLDR and VSTR to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VFP two-register transfer insns to decodetree	Peter Maydell
	Convert the VFP two-register transfer instructions to decodetree (in the v8 Arm ARM these are the "Advanced SIMD and floating-point 64-bit move" encoding group). Again, we expand out the sequences involving gen_vfp_msr() and gen_msr_vfp(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert "single-precision" register moves to decodetree	Peter Maydell
	Convert the "single-precision" register moves to decodetree: * VMSR * VMRS * VMOV between general purpose register and single precision Note that the VMSR/VMRS conversions make our handling of the "should this UNDEF?" checks consistent between the two instructions: * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0 (previously was a nop) * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better (previously was a nop) * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better (previously would write to the register, which had no guest-visible effect because we always UNDEF reads) We also tighten up the decode: we were previously underdecoding some SBZ or SBO bits. The conversion of VMOV_single includes the expansion out of the gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr() sequences into the simpler direct load/store of the TCG temp via neon_{load,store}_reg32(): we know in the new function that we're always single-precision, we don't need to use the old-and-deprecated cpu_F0* TCG globals, and we don't happen to have the declaration of gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the new function is. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert "double-precision" register moves to decodetree	Peter Maydell
	Convert the "double-precision" register moves to decodetree: this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP. Note that the conversion process has tightened up a few of the UNDEF encoding checks: we now correctly forbid: * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10 * VMOV-from-gpr with opc1:opc2 == 0x10 * VDUP with B:E == 11 * VDUP with Q == 1 and Vn<0> == 1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- The accesses of elements < 32 bits could be improved by doing direct ld/st of the right size rather than 32-bit read-and-shift or read-modify-write, but we leave this for later cleanup, since this series is generally trying to stick to fixing the decode. Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Add helpers for VFP register loads and stores	Peter Maydell
	The current VFP code has two different idioms for loading and storing from the VFP register file: 1 using the gen_mov_F0_vreg() and similar functions, which load and store to a fixed set of TCG globals cpu_F0s, CPU_F0d, etc 2 by direct calls to tcg_gen_ld_f64() and friends We want to phase out idiom 1 (because the use of the fixed globals is a relic of a much older version of TCG), but idiom 2 is quite longwinded: tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg)) requires us to specify the 64-bitness twice, once in the function name and once by passing 'true' to vfp_reg_offset(). There's no guard against accidentally passing the wrong flag. Instead, let's move to a convention of accessing 64-bit registers via the existing neon_load_reg64() and neon_store_reg64(), and provide new neon_load_reg32() and neon_store_reg32() for the 32-bit equivalents. Implement the new functions and use them in the code in translate-vfp.inc.c. We will convert the rest of the VFP code as we do the decodetree conversion in subsequent commits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Move the VFP trans_* functions to translate-vfp.inc.c	Peter Maydell
	Move the trans_*() functions we've just created from translate.c to translate-vfp.inc.c. This is pure code motion with no textual changes (this can be checked with 'git show --color-moved'). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13	target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree	Peter Maydell
	Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree. trans_VCVT() is temporarily left in translate.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>