aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-07-05ioapic: use irq number instead of vector in ioapic_eoi_broadcastLi Qiang
When emulating irqchip in qemu, such as following command: x86_64-softmmu/qemu-system-x86_64 -m 1024 -smp 4 -hda /home/test/test.img -machine kernel-irqchip=off --enable-kvm -vnc :0 -device edu -monitor stdio We will get a crash with following asan output: (qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]' ================================================================= ==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00 WRITE of size 4 at 0x61b000003114 thread T4 #0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266 #1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428 #2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802 #3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503 #4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569 #5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497 #6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323 #7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362 #8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452 #9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463 #10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045 #11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287 #12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502 #13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da) #14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e This is because in ioapic_eoi_broadcast function, we uses 'vector' to index the 's->irq_eoi'. To fix this, we should uses the irq number. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190622002119.126834-1-liq3ea@163.com>
2019-07-05hw/i386: Fix linker error when ISAPC is disabledJulio Montes
v2: include config-devices.h to use CONFIG_IDE_ISA Message-Id: <20190705143554.10295-2-julio.montes@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-07-05Makefile: generate header file with the list of devices enabledJulio Montes
v2: generate config-devices.h which contains the list of devices enabled Message-Id: <20190705143554.10295-1-julio.montes@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-07-05target/i386: kvm: Fix when nested state is needed for migrationLiran Alon
When vCPU is in VMX operation and enters SMM mode, it temporarily exits VMX operation but KVM maintained nested-state still stores the VMXON region physical address, i.e. even when the vCPU is in SMM mode then (nested_state->hdr.vmx.vmxon_pa != -1ull). Therefore, there is no need to explicitly check for KVM_STATE_NESTED_SMM_VMXON to determine if it is necessary to save nested-state as part of migration stream. Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190624230514.53326-1-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05minikconf: do not include variables from MINIKCONF_ARGS in ↵Paolo Bonzini
config-all-devices.mak When minikconf writes config-devices.mak, it includes all variables including those from MINIKCONF_ARGS. This causes values from config-host.mak to "stick" to the ones used in generating config-devices.mak, because config-devices.mak is included after config-host.mak. Avoid this by omitting assignments coming from the command line in the output of minikconf. Reported-by: Christophe de Dinechin <dinechin@redhat.com> Reviewed-by: Christophe de Dinechin <dinechin@redhat.com> Tested-by: Christophe de Dinechin <dinechin@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05target/i386: fix feature check in hyperv-stub.cAlex Bennée
Commit 2d384d7c8 broken the build when built with: configure --without-default-devices --disable-user The reason was the conversion of cpu->hyperv_synic to cpu->hyperv_synic_kvm_only although the rest of the patch introduces a feature checking mechanism. So I've fixed the KVM_EXIT_HYPERV_SYNIC in hyperv-stub to do the same feature check as in the real hyperv.c Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Message-Id: <20190624123835.28869-1-alex.bennee@linaro.org> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05ioapic: clear irq_eoi when updating the ioapic redirect table entryLi Qiang
irq_eoi is used to count the number of irq injected during eoi broadcast. It should be set to 0 when updating the ioapic's redirect table entry. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190624151635.22494-1-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05intel_iommu: Fix unexpected unmaps during global unmapPeter Xu
This is an replacement work of Yan Zhao's patch: https://www.mail-archive.com/qemu-devel@nongnu.org/msg625340.html vtd_address_space_unmap() will do proper page mask alignment to make sure each IOTLB message will have correct masks for notification messages (2^N-1), but sometimes it can be expanded to even supercede the registered range. That could lead to unexpected UNMAP of already mapped regions in some other notifiers. Instead of doing mindless expension of the start address and address mask, we split the range into smaller ones and guarantee that each small range will have correct masks (2^N-1) and at the same time we should also try our best to generate as less IOTLB messages as possible. Reported-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Message-Id: <20190624091811.30412-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05intel_iommu: Fix incorrect "end" for vtd_address_space_unmapYan Zhao
IOMMUNotifier is with inclusive ranges, so we should check against (VTD_ADDRESS_SIZE(s->aw_bits) - 1). Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> [peterx: split from another bigger patch] Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190624091811.30412-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05i386/kvm: Fix build with -m32Max Reitz
find_next_bit() takes a pointer of type "const unsigned long *", but the first argument passed here is a "uint64_t *". These types are incompatible when compiling qemu with -m32. Just use ctz64() instead. Fixes: c686193072a47032d83cb4e131dc49ae30f9e5d Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190624193913.28343-1-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05checkpatch: do not warn for multiline parenthesized returned valuePaolo Bonzini
While indeed we do not want to have return (a); it is less clear that this applies to return (a && b); Some editors indent more nicely if you have parentheses, and some people's eyes may appreciate that as well. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1561116534-21814-1-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05pc: fix possible NULL pointer dereference in ↵Igor Mammedov
pc_machine_get_device_memory_region_size() QEMU will crash when device-memory-region-size property is read if ms->device_memory wasn't initialized yet. Crash can be reproduced with: $QEMU -preconfig -qmp unix:qmp_socket,server,nowait & ./scripts/qmp/qom-get -s qmp_socket /machine.device-memory-region-size Instead of crashing return 0 if ms->device_memory hasn't been initialized. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1560174635-22602-1-git-send-email-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-01Merge remote-tracking branch ↵Peter Maydell
'remotes/bkoppelmann2/tags/pull-tricore-20190625' into staging * Add FTOIZ/UTOF/QSEED insns * Fix sync of hflags and swapped args of RRPW_INSERT # gpg: Signature made Tue 25 Jun 2019 14:05:03 BST # gpg: using RSA key 6E636A7E83F2DD0CFA6E6E370AD2C6396B69CA14 # gpg: issuer "kbastian@mail.uni-paderborn.de" # gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>" [full] # Primary key fingerprint: 6E63 6A7E 83F2 DD0C FA6E 6E37 0AD2 C639 6B69 CA14 * remotes/bkoppelmann2/tags/pull-tricore-20190625: tricore: add QSEED instruction tricore: sync ctx.hflags with tb->flags tricore: fix RRPW_INSERT instruction tricore: add UTOF instruction tricore: add FTOIZ instruction Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-01Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190624' into ↵Peter Maydell
staging Xen queue * Fix build * xen-block: support feature-large-sector-size * xen-block: Support IOThread polling for PV shared rings * Avoid usage of a VLA * Cleanup Xen headers usage # gpg: Signature made Mon 24 Jun 2019 16:30:32 BST # gpg: using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF # gpg: issuer "anthony.perard@citrix.com" # gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal] # gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8 # Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF * remotes/aperard/tags/pull-xen-20190624: xen: Import other xen/io/*.h Revert xen/io/ring.h of "Clean up a few header guard symbols" xen: Drop includes of xen/hvm/params.h xen: Avoid VLA xen-bus / xen-block: add support for event channel polling xen-bus: allow AioContext to be specified for each event channel xen-bus: use a separate fd for each event channel xen-block: support feature-large-sector-size Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-01Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-06-24' ↵Peter Maydell
into staging Block patches: - The SSH block driver now uses libssh instead of libssh2 - The VMDK block driver gets read-only support for the seSparse subformat - Various fixes # gpg: Signature made Mon 24 Jun 2019 15:42:56 BST # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2019-06-24: iotests: Fix 205 for concurrent runs ssh: switch from libssh2 to libssh vmdk: Add read-only support for seSparse snapshots vmdk: Reduce the max bound for L1 table size vmdk: Fix comment regarding max l1_size coverage iotest 134: test cluster-misaligned encrypted write blockdev: enable non-root nodes for transaction drive-backup source nvme: do not advertise support for unsupported arbitration mechanism Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-25tricore: add QSEED instructionAndreas Konopik
Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de> Signed-off-by: David Brenken <david.brenken@efs-auto.de> Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de> Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de> Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20190624070339.4408-6-david.brenken@efs-auto.org> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [BK: Added fp_status arg to float32_is_signaling_nan()]
2019-06-25tricore: sync ctx.hflags with tb->flagsGeorg Hofstetter
Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de> Signed-off-by: David Brenken <david.brenken@efs-auto.de> Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de> Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de> Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20190624070339.4408-5-david.brenken@efs-auto.org> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2019-06-25tricore: fix RRPW_INSERT instructionDavid Brenken
Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de> Signed-off-by: David Brenken <david.brenken@efs-auto.de> Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de> Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de> Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20190624070339.4408-4-david.brenken@efs-auto.org> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2019-06-25tricore: add UTOF instructionDavid Brenken
Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de> Signed-off-by: David Brenken <david.brenken@efs-auto.de> Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de> Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de> Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20190624070339.4408-3-david.brenken@efs-auto.org> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2019-06-25tricore: add FTOIZ instructionDavid Brenken
Signed-off-by: Andreas Konopik <andreas.konopik@efs-auto.de> Signed-off-by: David Brenken <david.brenken@efs-auto.de> Signed-off-by: Georg Hofstetter <georg.hofstetter@efs-auto.de> Signed-off-by: Robert Rasche <robert.rasche@efs-auto.de> Signed-off-by: Lars Biermanski <lars.biermanski@efs-auto.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20190624070339.4408-2-david.brenken@efs-auto.org> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2019-06-24iotests: Fix 205 for concurrent runsMax Reitz
Tests should place their files into the test directory. This includes Unix sockets. 205 currently fails to do so, which prevents it from being run concurrently. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190618210238.9524-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24ssh: switch from libssh2 to libsshPino Toscano
Rewrite the implementation of the ssh block driver to use libssh instead of libssh2. The libssh library has various advantages over libssh2: - easier API for authentication (for example for using ssh-agent) - easier API for known_hosts handling - supports newer types of keys in known_hosts Use APIs/features available in libssh 0.8 conditionally, to support older versions (which are not recommended though). Adjust the iotest 207 according to the different error message, and to find the default key type for localhost (to properly compare the fingerprint with). Contributed-by: Max Reitz <mreitz@redhat.com> Adjust the various Docker/Travis scripts to use libssh when available instead of libssh2. The mingw/mxe testing is dropped for now, as there are no packages for it. Signed-off-by: Pino Toscano <ptoscano@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20190620200840.17655-1-ptoscano@redhat.com Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 5873173.t2JhDm7DL7@lindworm.usersys.redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24vmdk: Add read-only support for seSparse snapshotsSam Eiderman
Until ESXi 6.5 VMware used the vmfsSparse format for snapshots (VMDK3 in QEMU). This format was lacking in the following: * Grain directory (L1) and grain table (L2) entries were 32-bit, allowing access to only 2TB (slightly less) of data. * The grain size (default) was 512 bytes - leading to data fragmentation and many grain tables. * For space reclamation purposes, it was necessary to find all the grains which are not pointed to by any grain table - so a reverse mapping of "offset of grain in vmdk" to "grain table" must be constructed - which takes large amounts of CPU/RAM. The format specification can be found in VMware's documentation: https://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf In ESXi 6.5, to support snapshot files larger than 2TB, a new format was introduced: SESparse (Space Efficient). This format fixes the above issues: * All entries are now 64-bit. * The grain size (default) is 4KB. * Grain directory and grain tables are now located at the beginning of the file. + seSparse format reserves space for all grain tables. + Grain tables can be addressed using an index. + Grains are located in the end of the file and can also be addressed with an index. - seSparse vmdks of large disks (64TB) have huge preallocated headers - mainly due to L2 tables, even for empty snapshots. * The header contains a reverse mapping ("backmap") of "offset of grain in vmdk" to "grain table" and a bitmap ("free bitmap") which specifies for each grain - whether it is allocated or not. Using these data structures we can implement space reclamation efficiently. * Due to the fact that the header now maintains two mappings: * The regular one (grain directory & grain tables) * A reverse one (backmap and free bitmap) These data structures can lose consistency upon crash and result in a corrupted VMDK. Therefore, a journal is also added to the VMDK and is replayed when the VMware reopens the file after a crash. Since ESXi 6.7 - SESparse is the only snapshot format available. Unfortunately, VMware does not provide documentation regarding the new seSparse format. This commit is based on black-box research of the seSparse format. Various in-guest block operations and their effect on the snapshot file were tested. The only VMware provided source of information (regarding the underlying implementation) was a log file on the ESXi: /var/log/hostd.log Whenever an seSparse snapshot is created - the log is being populated with seSparse records. Relevant log records are of the form: [...] Const Header: [...] constMagic = 0xcafebabe [...] version = 2.1 [...] capacity = 204800 [...] grainSize = 8 [...] grainTableSize = 64 [...] flags = 0 [...] Extents: [...] Header : <1 : 1> [...] JournalHdr : <2 : 2> [...] Journal : <2048 : 2048> [...] GrainDirectory : <4096 : 2048> [...] GrainTables : <6144 : 2048> [...] FreeBitmap : <8192 : 2048> [...] BackMap : <10240 : 2048> [...] Grain : <12288 : 204800> [...] Volatile Header: [...] volatileMagic = 0xcafecafe [...] FreeGTNumber = 0 [...] nextTxnSeqNumber = 0 [...] replayJournal = 0 The sizes that are seen in the log file are in sectors. Extents are of the following format: <offset : size> This commit is a strict implementation which enforces: * magics * version number 2.1 * grain size of 8 sectors (4KB) * grain table size of 64 sectors * zero flags * extent locations Additionally, this commit proivdes only a subset of the functionality offered by seSparse's format: * Read-only * No journal replay * No space reclamation * No unmap support Hence, journal header, journal, free bitmap and backmap extents are unused, only the "classic" (L1 -> L2 -> data) grain access is implemented. However there are several differences in the grain access itself. Grain directory (L1): * Grain directory entries are indexes (not offsets) to grain tables. * Valid grain directory entries have their highest nibble set to 0x1. * Since grain tables are always located in the beginning of the file - the index can fit into 32 bits - so we can use its low part if it's valid. Grain table (L2): * Grain table entries are indexes (not offsets) to grains. * If the highest nibble of the entry is: 0x0: The grain in not allocated. The rest of the bytes are 0. 0x1: The grain is unmapped - guest sees a zero grain. The rest of the bits point to the previously mapped grain, see 0x3 case. 0x2: The grain is zero. 0x3: The grain is allocated - to get the index calculate: ((entry & 0x0fff000000000000) >> 48) | ((entry & 0x0000ffffffffffff) << 12) * The difference between 0x1 and 0x2 is that 0x1 is an unallocated grain which results from the guest using sg_unmap to unmap the grain - but the grain itself still exists in the grain extent - a space reclamation procedure should delete it. Unmapping a zero grain has no effect (0x2 will not change to 0x1) but unmapping an unallocated grain will (0x0 to 0x1) - naturally. In order to implement seSparse some fields had to be changed to support both 32-bit and 64-bit entry sizes. Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com> Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com> Message-id: 20190620091057.47441-4-shmuel.eiderman@oracle.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24vmdk: Reduce the max bound for L1 table sizeSam Eiderman
512M of L1 entries is a very loose bound, only 32M are required to store the maximal supported VMDK file size of 2TB. Fixed qemu-iotest 59# - now failure occures before on impossible L1 table size. Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com> Message-id: 20190620091057.47441-3-shmuel.eiderman@oracle.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24vmdk: Fix comment regarding max l1_size coverageSam Eiderman
Commit b0651b8c246d ("vmdk: Move l1_size check into vmdk_add_extent") extended the l1_size check from VMDK4 to VMDK3 but did not update the default coverage in the moved comment. The previous vmdk4 calculation: (512 * 1024 * 1024) * 512(l2 entries) * 65536(grain) = 16PB The added vmdk3 calculation: (512 * 1024 * 1024) * 4096(l2 entries) * 512(grain) = 1PB Adding the calculation of vmdk3 to the comment. In any case, VMware does not offer virtual disks more than 2TB for vmdk4/vmdk3 or 64TB for the new undocumented seSparse format which is not implemented yet in qemu. Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com> Message-id: 20190620091057.47441-2-shmuel.eiderman@oracle.com Reviewed-by: yuchenlin <yuchenlin@synology.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24iotest 134: test cluster-misaligned encrypted writeAnton Nefedov
COW (even empty/zero) areas require encryption too Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190516143028.81155-1-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24blockdev: enable non-root nodes for transaction drive-backup sourceVladimir Sementsov-Ogievskiy
We forget to enable it for transaction .prepare, while it is already enabled in do_drive_backup since commit a2d665c1bc362 "blockdev: loosen restrictions on drive-backup source node" Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190618140804.59214-1-vsementsov@virtuozzo.com Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24nvme: do not advertise support for unsupported arbitration mechanismKlaus Birkelund Jensen
The device mistakenly reports that the Weighted Round Robin with Urgent Priority Class arbitration mechanism is supported. It is not. Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Message-id: 20190606092530.14206-1-klaus@birkelund.eu Acked-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-06-24xen: Import other xen/io/*.hAnthony PERARD
A Xen public header have been imported into QEMU (by f65eadb639 "xen: import ring.h from xen"), but there are other header that depends on ring.h which come from the system when building QEMU. This patch resolves the issue of having headers from the system importing a different copie of ring.h. This patch is prompt by the build issue described in the previous patch: 'Revert xen/io/ring.h of "Clean up a few header guard symbols"' ring.h and the new imported headers are moved to "include/hw/xen/interface" as those describe interfaces with a guest. The imported headers are cleaned up a bit while importing them: some part of the file that QEMU doesn't use are removed (description of how to make hypercall in grant_table.h have been removed). Other cleanup: - xen-mapcache.c and xen-legacy-backend.c don't need grant_table.h. - xenfb.c doesn't need event_channel.h. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190621105441.3025-3-anthony.perard@citrix.com>
2019-06-24Revert xen/io/ring.h of "Clean up a few header guard symbols"Anthony PERARD
This reverts changes to include/hw/xen/io/ring.h from commit 37677d7db39a3c250ad661d00fb7c3b59d047b1f. Following 37677d7db3 "Clean up a few header guard symbols", QEMU start to fail to build: In file included from ~/xen/tools/../tools/include/xen/io/blkif.h:31:0, from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:5, from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22: ~/xen/tools/../tools/include/xen/io/ring.h:68:0: error: "__CONST_RING_SIZE" redefined [-Werror] #define __CONST_RING_SIZE(_s, _sz) \ In file included from ~/xen/tools/qemu-xen-dir/hw/block/xen_blkif.h:4:0, from ~/xen/tools/qemu-xen-dir/hw/block/xen-block.c:22: ~/xen/tools/qemu-xen-dir/include/hw/xen/io/ring.h:66:0: note: this is the location of the previous definition #define __CONST_RING_SIZE(_s, _sz) \ The issue is that some public xen headers have been imported (by f65eadb639 "xen: import ring.h from xen") but not all. With the change in the guards symbole, the ring.h header start to be imported twice. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190621105441.3025-2-anthony.perard@citrix.com>
2019-06-24xen: Drop includes of xen/hvm/params.hAnthony PERARD
xen-mapcache.c doesn't needs params.h. xen-hvm.c uses defines available in params.h but so is xen_common.h which is included before. HVM_PARAM_* flags are only needed to make xc_hvm_param_{get,set} calls so including only xenctrl.h, which is where the definition the function is, should be enough. (xenctrl.h does include params.h) Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190618112341.513-4-anthony.perard@citrix.com>
2019-06-24xen: Avoid VLAAnthony PERARD
Avoid using a variable length array. We allocate the `dirty_bitmap' buffer only once when we start tracking for dirty bits. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190618112341.513-5-anthony.perard@citrix.com>
2019-06-24xen-bus / xen-block: add support for event channel pollingPaul Durrant
This patch introduces a poll callback for event channel fd-s and uses this to invoke the channel callback function. To properly support polling, it is necessary for the event channel callback function to return a boolean saying whether it has done any useful work or not. Thus xen_block_dataplane_event() is modified to directly invoke xen_block_handle_requests() and the latter only returns true if it actually processes any requests. This also means that the call to qemu_bh_schedule() is moved into xen_block_complete_aio(), which is more intuitive since the only reason for doing a deferred poll of the shared ring should be because there were previously insufficient resources to fully complete a previous poll. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-4-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-bus: allow AioContext to be specified for each event channelPaul Durrant
This patch adds an AioContext parameter to xen_device_bind_event_channel() and then uses aio_set_fd_handler() to set the callback rather than qemu_set_fd_handler(). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-3-paul.durrant@citrix.com> [Call aio_set_fd_handler() with is_external=true] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-bus: use a separate fd for each event channelPaul Durrant
To better support use of IOThread-s it will be necessary to be able to set the AioContext for each XenEventChannel and hence it is necessary to open a separate handle to libxenevtchan for each channel. This patch stops using NotifierList for event channel callbacks, replacing that construct by a list of complete XenEventChannel structures. Each of these now has a xenevtchn_handle pointer in place of the single pointer previously held in the XenDevice structure. The individual handles are opened/closed in xen_device_bind/unbind_event_channel(), replacing the single open/close in xen_device_realize/unrealize(). NOTE: This patch does not add an AioContext parameter to xen_device_bind_event_channel(). That will be done in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-2-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-block: support feature-large-sector-sizePaul Durrant
A recent Xen commit [1] clarified the semantics of sector based quantities used in the blkif protocol such that it is now safe to create a xen-block device with a logical_block_size != 512, as long as the device only connects to a frontend advertizing 'feature-large-block-size'. This patch modifies xen-block accordingly. It also uses a stack variable for the BlockBackend in xen_block_realize() to avoid repeated dereferencing of the BlockConf pointer, and changes the parameters of xen_block_dataplane_create() so that the BlockBackend pointer and sector size are passed expicitly rather than implicitly via the BlockConf. These modifications have been tested against a recent Windows PV XENVBD driver [2] using a xen-disk device with a 4kB logical block size. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=67e1c050e36b2c9900cca83618e56189effbad98 [2] https://winpvdrvbuild.xenproject.org:8080/job/XENVBD-master/126 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190409164038.25484-1-paul.durrant@citrix.com> [Edited error message] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-21Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-21-2019' ↵Peter Maydell
into staging MIPS queue for June 21st, 2019 # gpg: Signature made Fri 21 Jun 2019 10:46:57 BST # gpg: using RSA key D4972A8967F75A65 # gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65 * remotes/amarkovic/tags/mips-queue-jun-21-2019: target/mips: Fix emulation of ILVR.<B|H|W> on big endian host target/mips: Fix emulation of ILVL.<B|H|W> on big endian host target/mips: Fix emulation of ILVOD.<B|H|W> on big endian host target/mips: Fix emulation of ILVEV.<B|H|W> on big endian host tests/tcg: target/mips: Amend tests for MSA pack instructions tests/tcg: target/mips: Include isa/ase and group name in test output target/mips: Fix if-else-switch-case arms checkpatch errors in translate.c target/mips: Fix some space checkpatch errors in translate.c MAINTAINERS: Consolidate MIPS disassembler-related items MAINTAINERS: Update file items for MIPS Malta board Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-21Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* Nuke hw_compat_4_0_1 and pc_compat_4_0_1 (Greg) * Static analysis fixes (Igor, Lidong) * X86 Hyper-V CPUID improvements (Vitaly) * X86 nested virt migration (Liran) * New MSR-based features (Xiaoyao) # gpg: Signature made Fri 21 Jun 2019 12:25:42 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (25 commits) hw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1 util/main-loop: Fix incorrect assertion sd: Fix out-of-bounds assertions target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD target/i386: kvm: Add support for save and restore nested state vmstate: Add support for kernel integer types linux-headers: sync with latest KVM headers from Linux 5.2 target/i386: kvm: Block migration for vCPUs exposed with nested virtualization target/i386: kvm: Re-inject #DB to guest with updated DR6 target/i386: kvm: Use symbolic constant for #DB/#BP exception constants KVM: Introduce kvm_arch_destroy_vcpu() target/i386: kvm: Delete VMX migration blocker on vCPU init failure target/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY i386/kvm: add support for Direct Mode for Hyper-V synthetic timers i386/kvm: hv-evmcs requires hv-vapic i386/kvm: hv-tlbflush/ipi require hv-vpindex i386/kvm: hv-stimer requires hv-time and hv-synic i386/kvm: implement 'hv-passthrough' mode i386/kvm: document existing Hyper-V enlightenments ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-21hw: Nuke hw_compat_4_0_1 and pc_compat_4_0_1Greg Kurz
Commit c87759ce876a fixed a regression affecting pc-q35 machines by introducing a new pc-q35-4.0.1 machine version to be used instead of pc-q35-4.0. The only purpose was to revert the default behaviour of not using split irqchip, but the change also introduced the usual hw_compat and pc_compat bits, and wired them for pc-q35 only. This raises questions when it comes to add new compat properties for 4.0* machine versions of any architecture. Where to add them ? In 4.0, 4.0.1 or both ? Error prone. Another possibility would be to teach all other architectures about 4.0.1. This solution isn't satisfying, especially since this is a pc-q35 specific issue. It turns out that the split irqchip default is handled in the machine option function and doesn't involve compat lists at all. Drop all the 4.0.1 compat lists and use the 4.0 ones instead in the 4.0.1 machine option function. Move the compat props that were added to the 4.0.1 since c87759ce876a to 4.0. Even if only hw_compat_4_0_1 had an impact on other architectures, drop pc_compat_4_0_1 as well for consistency. Fixes: c87759ce876a "q35: Revert to kernel irqchip" Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <156051774276.244890.8660277280145466396.stgit@bahia.lan> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21util/main-loop: Fix incorrect assertionLidong Chen
The check for poll_fds in g_assert() was incorrect. The correct assertion should check "n_poll_fds + w->num <= ARRAY_SIZE(poll_fds)" because the subsequent for-loop is doing access to poll_fds[n_poll_fds + i] where i is in [0, w->num). This could happen with a very high number of file descriptors and/or wait objects. Signed-off-by: Lidong Chen <lidong.chen@oracle.com> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Suggested-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <ded30967982811617ce7f0222d11228130c198b7.1560806687.git.lidong.chen@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21sd: Fix out-of-bounds assertionsLidong Chen
Due to an off-by-one error, the assert statements allow an out-of-bound array access. This doesn't happen in practice, but the static analyzer notices. Signed-off-by: Lidong Chen <lidong.chen@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Message-Id: <6b19cb7359a10a6bedc3ea0fce22fed3ef93c102.1560806687.git.lidong.chen@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21target/i386: kvm: Add nested migration blocker only when kernel lacks ↵Liran Alon
required capabilities Previous commits have added support for migration of nested virtualization workloads. This was done by utilising two new KVM capabilities: KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD. Both which are required in order to correctly migrate such workloads. Therefore, change code to add a migration blocker for vCPUs exposed with Intel VMX or AMD SVM in case one of these kernel capabilities is missing. Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Message-Id: <20190619162140.133674-11-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOADLiran Alon
Kernel commit c4f55198c7c2 ("kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD") introduced a new KVM capability which allows userspace to correctly distinguish between pending and injected exceptions. This distinguish is important in case of nested virtualization scenarios because a L2 pending exception can still be intercepted by the L1 hypervisor while a L2 injected exception cannot. Furthermore, when an exception is attempted to be injected by QEMU, QEMU should specify the exception payload (CR2 in case of #PF or DR6 in case of #DB) instead of having the payload already delivered in the respective vCPU register. Because in case exception is injected to L2 guest and is intercepted by L1 hypervisor, then payload needs to be reported to L1 intercept (VMExit handler) while still preserving respective vCPU register unchanged. This commit adds support for QEMU to properly utilise this new KVM capability (KVM_CAP_EXCEPTION_PAYLOAD). Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190619162140.133674-10-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21target/i386: kvm: Add support for save and restore nested stateLiran Alon
Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") introduced new IOCTLs to extract and restore vCPU state related to Intel VMX & AMD SVM. Utilize these IOCTLs to add support for migration of VMs which are running nested hypervisors. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Tested-by: Maran Wilson <maran.wilson@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190619162140.133674-9-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21vmstate: Add support for kernel integer typesLiran Alon
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190619162140.133674-8-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21linux-headers: sync with latest KVM headers from Linux 5.2Liran Alon
Improve the KVM_{GET,SET}_NESTED_STATE structs by detailing the format of VMX nested state data in a struct. In order to avoid changing the ioctl values of KVM_{GET,SET}_NESTED_STATE, there is a need to preserve sizeof(struct kvm_nested_state). This is done by defining the data struct as "data.vmx[0]". It was the most elegant way I found to preserve struct size while still keeping struct readable and easy to maintain. It does have a misfortunate side-effect that now it has to be accessed as "data.vmx[0]" rather than just "data.vmx". Because we are already modifying these structs, I also modified the following: * Define the "format" field values as macros. * Rename vmcs_pa to vmcs12_pa for better readability. Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Message-Id: <20190619162140.133674-7-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21target/i386: kvm: Block migration for vCPUs exposed with nested virtualizationLiran Alon
Commit d98f26073beb ("target/i386: kvm: add VMX migration blocker") added a migration blocker for vCPU exposed with Intel VMX. However, migration should also be blocked for vCPU exposed with AMD SVM. Both cases should be blocked because QEMU should extract additional vCPU state from KVM that should be migrated as part of vCPU VMState. E.g. Whether vCPU is running in guest-mode or host-mode. Fixes: d98f26073beb ("target/i386: kvm: add VMX migration blocker") Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20190619162140.133674-6-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21target/mips: Fix emulation of ILVR.<B|H|W> on big endian hostAleksandar Markovic
Fix emulation of ILVR.<B|H|W> on big endian host by applying mapping of data element indexes from one endian to another. Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com> Message-Id: <1561038349-17105-5-git-send-email-aleksandar.markovic@rt-rk.com>
2019-06-21target/mips: Fix emulation of ILVL.<B|H|W> on big endian hostAleksandar Markovic
Fix emulation of ILVL.<B|H|W> on big endian host by applying mapping of data element indexes from one endian to another. Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com> Message-Id: <1561038349-17105-4-git-send-email-aleksandar.markovic@rt-rk.com>
2019-06-21target/mips: Fix emulation of ILVOD.<B|H|W> on big endian hostAleksandar Markovic
Fix emulation of ILVOD.<B|H|W> on big endian host by applying mapping of data element indexes from one endian to another. Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com> Message-Id: <1561038349-17105-3-git-send-email-aleksandar.markovic@rt-rk.com>