aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-02-05target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICSSuraj Jitindar Singh
The new H-Call H_GET_CPU_CHARACTERISTICS is used by the guest to query behaviours and available characteristics of the cpu. Implement the handler for this new H-Call which formulates its response based on the setting of the spapr_caps cap-cfpc, cap-sbbc and cap-ibs. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit c59704b254734182c3202e0c261589ea2ccf485e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/spapr_caps: Add new tristate cap safe_indirect_branchSuraj Jitindar Singh
Add new tristate cap cap-ibs to represent the indirect branch serialisation capability. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 4be8d4e7d935fc8919d61f53a0f0fb7230052bb3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/spapr_caps: Add new tristate cap safe_bounds_checkSuraj Jitindar Singh
Add new tristate cap cap-sbbc to represent the speculation barrier bounds checking capability. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 09114fd8179977e4157b36aab2e3d68eaf08adca) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/spapr_caps: Add new tristate cap safe_cacheSuraj Jitindar Singh
Add new tristate cap cap-cfpc to represent the cache flush on privilege change capability. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 8f38eaf8f9dd194c9961cf76c675724930ce4570) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/spapr_caps: Add support for tristate spapr_capabilitiesSuraj Jitindar Singh
spapr_caps are used to represent the level of support for various capabilities related to the spapr machine type. Currently there is only support for boolean capabilities. Add support for tristate capabilities by implementing their get/set functions. These capabilities can have the values 0, 1 or 2 corresponding to broken, workaround and fixed. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 6898aed77f4636c3e77af9c12631f583f22cb5db) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/kvm: Add cap_ppc_safe_[cache/bounds_check/indirect_branch]Suraj Jitindar Singh
Add three new kvm capabilities used to represent the level of host support for three corresponding workarounds. Host support for each of the capabilities is queried through the new ioctl KVM_PPC_GET_CPU_CHAR which returns four uint64 quantities. The first two, character and behaviour, represent the available characteristics of the cpu and the behaviour of the cpu respectively. The second two, c_mask and b_mask, represent the mask of known bits for the character and beheviour dwords respectively. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Correct some compile errors due to name change in final kernel patch version] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 8acc2ae5e91681ceda3ff4cf946ebf163f6012e9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc/spapr_caps: Add macro to generate spapr_caps migration vmstateSuraj Jitindar Singh
The vmstate description and the contained needed function for migration of spapr_caps is the same for each cap, with the name of the cap substituted. As such introduce a macro to allow for easier generation of these. Convert the three existing spapr_caps (htm, vsx, and dfp) to use this macro. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 1f63ebaa91f73f469c8f107dbd266cabdbea3a40) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc: introduce the PPC_BIT() macroCédric Le Goater
and use them in a couple of obvious places. Other macros will be used in the model of the XIVE interrupt controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 2a83f9976efa9a85e8ceb9d1035a68f25c321334) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: fix device tree properties when using compatibility modeGreg Kurz
Commit 51f84465dd98 changed the compatility mode setting logic: - machine reset only sets compatibility mode for the boot CPU - compatibility mode is set for other CPUs when they are put online by the guest with the "start-cpu" RTAS call This causes a regression for machines started with max-compat-cpu: the device tree nodes related to secondary CPU cores contain wrong "cpu-version" and "ibm,pa-features" values, as shown below. Guest started on a POWER8 host with: -smp cores=2 -machine pseries,max-cpu-compat=compat7 ibm,pa-features = [18 00 f6 3f c7 c0 80 f0 80 00 00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00]; cpu-version = <0x4d0200>; ^^^ second CPU core ibm,pa-features = <0x600f63f 0xc70080c0>; cpu-version = <0xf000003>; ^^^ boot CPU core The second core is advertised in raw POWER8 mode. This happens because CAS assumes all CPUs to have the same compatibility mode. Since the boot CPU already has the requested compatibility mode, the CAS code does not set it for the secondary one, and exposes the bogus device tree properties in in the CAS response to the guest. A similar situation is observed when hot-plugging a CPU core. The related device tree properties are generated and exposed to guest with the "ibm,configure-connector" RTAS before "start-cpu" is called. The CPU core is advertised to the guest in raw mode as well. It both cases, it boils down to the fact that "start-cpu" happens too late. This can be fixed globally by propagating the compatibility mode of the boot CPU to the other CPUs during reset. For this to work, the compatibility mode of the boot CPU must be set before the machine code actually resets all CPUs. It is not needed to set the compatibility mode in "start-cpu" anymore, so the code is dropped. Fixes: 51f84465dd98 Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 9012a53f067a78022947e18050b145c34a3dc599) Conflicts: hw/ppc/spapr_cpu_core.c hw/ppc/spapr_rtas.c * drop context dep on d6322252b32 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05ppc: Change Power9 compat table to support at most 8 threads/coreJose Ricardo Ziviani
Increases the max smt mode to 8 for Power9. That's because KVM supports smt emulation in this platform so QEMU should allow users to use it as well. Today if we try to pass -smp ...,threads=8, QEMU will silently truncate it to smt4 mode and may cause a crash if we try to perform a cpu hotplug. Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> [dwg: Added an explanatory comment] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 03ee51d3548f5f553a3089f466483c1c6d5c666b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representationSuraj Jitindar Singh
Currently spapr_caps are tied to boolean values (on or off). This patch reworks the caps so that they can have any uint8 value. This allows more capabilities with various values to be represented in the same way internally. Capabilities are numbered in ascending order. The internal representation of capability values is an array of uint8s in the sPAPRMachineState, indexed by capability number. Capabilities can have their own name, description, options, getter and setter functions, type and allow functions. They also each have their own section in the migration stream. Capabilities are only migrated if they were explictly set on the command line, with the assumption that otherwise the default will match. On migration we ensure that the capability value on the destination is greater than or equal to the capability value from the source. So long at this remains the case then the migration is considered compatible and allowed to continue. This patch implements generic getter and setter functions for boolean capabilities. It also converts the existings cap-htm, cap-vsx and cap-dfp capabilities to this new format. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 4e5fe3688e23d61b45cc549ff1322aff8f50ef45) Conflicts: include/hw/ppc/spapr.h *drop context dep on 60c6823b9bc Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Handle Decimal Floating Point (DFP) as an optional capabilityDavid Gibson
Decimal Floating Point has been available on POWER7 and later (server) cpus. However, it can be disabled on the hypervisor, meaning that it's not available to guests. We currently handle this by conditionally advertising DFP support in the device tree depending on whether the guest CPU model supports it - which can also depend on what's allowed in the host for -cpu host. That can lead to confusion on migration, since host properties are silently affecting guest visible properties. This patch handles it by treating it as an optional capability for the pseries machine type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 2d1fb9bc8e6e78931d8e1bfeb0ed7a4d223b0480) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Handle VMX/VSX presence as an spapr capability flagDavid Gibson
We currently have some conditionals in the spapr device tree code to decide whether or not to advertise the availability of the VMX (aka Altivec) and VSX vector extensions to the guest, based on whether the guest cpu has those features. This can lead to confusion and subtle failures on migration, since it makes a guest visible change based only on host capabilities. We now have a better mechanism for this, in spapr capabilities flags, which explicitly depend on user options rather than host capabilities. Rework the advertisement of VSX and VMX based on a new VSX capability. We no longer bother with a conditional for VMX support, because every CPU that's ever been supported by the pseries machine type supports VMX. NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on availability of VSX in libc, so using cap-vsx=off may lead to a fatal SIGILL in init. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 2938664286499c0c30d6e455a7e2e5d3e6c3f63d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05target/ppc: Clean up probing of VMX, VSX and DFP availability on KVMDavid Gibson
When constructing the "host" cpu class we modify whether the VMX and VSX vector extensions and DFP (Decimal Floating Point) are available based on whether KVM can support those instructions. This can depend on policy in the host kernel as well as on the actual host cpu capabilities. However, the way we probe for this is not very nice: we explicitly check the host's device tree. That works in practice, but it's not really correct, since the device tree is a property of the host kernel's platform which we don't really know about. We get away with it because the only modern POWER platforms happen to encode VMX, VSX and DFP availability in the device tree in the same way. Arguably we should have an explicit KVM capability for this, but we haven't needed one so far. Barring specific KVM policies which don't yet exist, each of these instruction classes will be available in the guest if and only if they're available in the qemu userspace process. We can determine that from the ELF AUX vector we're supplied with. Once reworked like this, there are no more callers for kvmppc_get_vmx() and kvmppc_get_dfp() so remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 3f2ca480eb872b4946baf77f756236b637a5b15a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Validate capabilities on migrationDavid Gibson
Now that the "pseries" machine type implements optional capabilities (well, one so far) there's the possibility of having different capabilities available at either end of a migration. Although arguably a user error, it would be nice to catch this situation and fail as gracefully as we can. This adds code to migrate the capabilities flags. These aren't pulled directly into the destination's configuration since what the user has specified on the destination command line should take precedence. However, they are checked against the destination capabilities. If the source was using a capability which is absent on the destination, we fail the migration, since that could easily cause a guest crash or other bad behaviour. If the source lacked a capability which is present on the destination we warn, but allow the migration to proceed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit be85537d654565e35e359a74b46fc08b7956525c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Treat Hardware Transactional Memory (HTM) as an optional capabilityDavid Gibson
This adds an spapr capability bit for Hardware Transactional Memory. It is enabled by default for pseries-2.11 and earlier machine types. with POWER8 or later CPUs (as it must be, since earlier qemu versions would implicitly allow it). However it is disabled by default for the latest pseries-2.12 machine type. This means that with the latest machine type, HTM will not be available, regardless of CPU, unless it is explicitly enabled on the command line. That change is made on the basis that: * This way running with -M pseries,accel=tcg will start with whatever cpu and will provide the same guest visible model as with accel=kvm. - More specifically, this means existing make check tests don't have to be modified to use cap-htm=off in order to run with TCG * We hope to add a new "HTM without suspend" feature in the not too distant future which could work on both POWER8 and POWER9 cpus, and could be enabled by default. * Best guesses suggest that future POWER cpus may well only support the HTM-without-suspend model, not the (frankly, horribly overcomplicated) POWER8 style HTM with suspend. * Anecdotal evidence suggests problems with HTM being enabled when it wasn't wanted are more common than being missing when it was. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit ee76a09fc72cfbfab2bb5529320ef7e460adffd8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Capabilities infrastructureDavid Gibson
Because PAPR is a paravirtual environment access to certain CPU (or other) facilities can be blocked by the hypervisor. PAPR provides ways to advertise in the device tree whether or not those features are available to the guest. In some places we automatically determine whether to make a feature available based on whether our host can support it, in most cases this is based on limitations in the available KVM implementation. Although we correctly advertise this to the guest, it means that host factors might make changes to the guest visible environment which is bad: as well as generaly reducing reproducibility, it means that a migration between different host environments can easily go bad. We've mostly gotten away with it because the environments considered mature enough to be well supported (basically, KVM on POWER8) have had consistent feature availability. But, it's still not right and some limitations on POWER9 is going to make it more of an issue in future. This introduces an infrastructure for defining "sPAPR capabilities". These are set by default based on the machine version, masked by the capabilities of the chosen cpu, but can be overriden with machine properties. The intention is at reset time we verify that the requested capabilities can be supported on the host (considering TCG, KVM and/or host cpu limitations). If not we simply fail, rather than silently modifying the advertised featureset to the guest. This does mean that certain configurations that "worked" may now fail, but such configurations were already more subtly broken. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> (cherry picked from commit 33face6b8981add8eba1f7cdaf4cf6cede415d2e) Conflicts: include/hw/ppc/spapr.h *drop context dep on 60c6823b9bc Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-05spapr: Add pseries-2.12 machine typeDavid Gibson
While we're at it fix a couple of small errors in the 2.11 and 2.10 models (they didn't have any real effect, but don't quite match the template). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 2b6154120cbd7f5514cefd3c6084d39922d26d88) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-04spapr: don't initialize PATB entry if max-cpu-compat < power9Laurent Vivier
if KVM is enabled and KVM capabilities MMU radix is available, the partition table entry (patb_entry) for the radix mode is initialized by default in ppc_spapr_reset(). It's a problem if we want to migrate the guest to a POWER8 host while the kernel is not started to set the value to the one expected for a POWER8 CPU. The "-machine max-cpu-compat=power8" should allow to migrate a POWER9 KVM host to a POWER8 KVM host, but because patb_entry is set, the destination QEMU tries to enable radix mode on the POWER8 host. This fails and cancels the migration: Process table config unsupported by the host error while loading state for instance 0x0 of device 'spapr' load of migration failed: Invalid argument This patch doesn't set the PATB entry if the user provides a CPU compatibility mode that doesn't support radix mode. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 1481fe5fcfeb7fcf3c1ebb9d8c0432e3e0188ccf) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-01linux-user/signal.c: Rename MC_* definesPeter Maydell
The SPARC code in linux-user/signal.c defines a set of MC_* constants. On some SPARC hosts these are also defined by sys/ucontext.h, resulting in build failures: linux-user/signal.c:2786:0: error: "MC_NGREG" redefined [-Werror] #define MC_NGREG 19 In file included from /usr/include/signal.h:302:0, from include/qemu/osdep.h:86, from linux-user/signal.c:19: /usr/include/sparc64-linux-gnu/sys/ucontext.h:59:0: note: this is the location of the previous definition # define MC_NGREG __MC_NGREG Rename all these constants to SPARC_MC_* to avoid the clash. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1517318239-15764-1-git-send-email-peter.maydell@linaro.org (cherry picked from commit 8ebb314b957403c1c9a3f1cf995f73c6ae9d5d10) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-02-01spapr_pci: fix MSI/MSIX selectionGreg Kurz
In various place we don't correctly check if the device supports MSI or MSI-X. This can cause devices to be advertised with MSI support, even if they only support MSI-X (like virtio-pci-* devices for example): ethernet@0 { ibm,req#msi = <0x1>; <--- wrong! . ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0"; . ibm,req#msi-x = <0x3>; }; Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the PCI status and cause migration to fail: qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0 ^^ PCI_STATUS_CAP_LIST bit which is assumed to be constant This patch changes spapr_populate_pci_child_dt() to properly check for MSI support using msi_present(): this ensures that PCIDevice::msi_cap was set by msi_init() and that msi_nr_vectors_allocated() will look at the right place in the config space. Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add a call to msix_present() there as well for consistency. It also changes rtas_ibm_change_msi() to select the appropriate MSI type in Function 1 instead of always selecting plain MSI. This new behaviour is compliant with LoPAPR 1.1, as described in "Table 71. ibm,change-msi Argument Call Buffer": Function 1: If Number Outputs is equal to 3, request to set to a new number of MSIs (including set to 0). If the “ibm,change-msix-capable” property exists and Number Outputs is equal to 4, request is to set to a new number of MSI or MSI-X (platform choice) interrupts (including set to 0). Since MSI is the the platform default (LoPAPR 6.2.3 MSI Option), let's check for MSI support first. And finally, it checks the input parameters are valid, as described in LoPAPR 1.1 "R1–7.3.10.5.1–3": For the MSI option: The platform must return a Status of -3 (Parameter error) from ibm,change-msi, with no change in interrupt assignments if the PCI configuration address does not support MSI and Function 3 was requested (that is, the “ibm,req#msi” property must exist for the PCI configuration address in order to use Function 3), or does not support MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property must exist for the PCI configuration address in order to use Function 4), or if neither MSIs nor MSI-Xs are supported and Function 1 is requested. This ensures that the ret_intr_type variable contains a valid MSI type for this device, and that spapr_msi_setmsg() won't corrupt the PCI status. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit 9cbe305b60cc49cfcd134765b85c28be95b1b57d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29usb-storage: Fix share-rw option parsingFam Zheng
Because usb-storage creates an internal scsi device, we should propagate options. We already do so for bootindex etc, but failed to take care of share-rw. Fix it in an apparent way: add a new parameter to scsi_bus_legacy_add_drive and pass in s->conf.share_rw. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Message-id: 20180117005222.4781-1-famz@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 395b95395934785ca86baafd314d0c31b307d16d) Conflicts: hw/usb/dev-storage.c * dropped context dep on ceff3e1f01e Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29osdep: Retry SETLK upon EINTRFam Zheng
We could hit lock failure if there is a signal that makes fcntl return -1 and errno set to EINTR. In this case we should retry. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit f86428a1f4f91a460ed585682af70d3e8c31dc06) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29s390x/kvm: provide stfle.81Christian Borntraeger
stfle.81 (ppa15) is a transparent facility that can be passed to the guest without the need to implement hypervisor support. As this feature can be provided by firmware we add it to all full models. Cc: qemu-stable@nongnu.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20180118085628.40798-4-borntraeger@de.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 9f0d13f4f1de3cf9b70435cc4e87a301ee12471f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29s390x/kvm: Handle bpb featureChristian Borntraeger
We need to handle the bpb control on reset and migration. Normally stfle.82 is transparent (and the normal guest part works without hypervisor activity). To prevent any issues we require full host kernel support for this feature. Cc: qemu-stable@nongnu.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20180118085628.40798-3-borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [CH: 'Branch Prediction Blocking' -> 'Branch prediction blocking'] Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit b073c87517d4d348c7bac0f0b35e8e83e6354d82) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29linux-headers: updateCornelia Huck
Update headers against 4.15-rc9. Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 9cbb636270b4df6f0a548e5c34b895330db5df8b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29linux-headers: update to 4.15-rc1Eric Auger
Update headers against v4.15-rc1. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-id: 1511883692-11511-4-git-send-email-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit dd8739669f95b30653a3a05cb2e21da3f52894fa) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29s390x: fix storage attributes migration for non-small guestsClaudio Imbrenda
Fix storage attribute migration so that it does not fail for guests with more than a few GB of RAM. With such guests, the index in the buffer would go out of bounds, usually by large amounts, thus receiving -EFAULT from the kernel. Migration itself would be successful, but storage attributes would then not be migrated completely. This patch fixes the out of bounds access, and thus migration of all storage attributes when the guest have large amounts of memory. Cc: qemu-stable@nongnu.org Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Fixes: 903fd80b03243476 ("s390x/migration: Storage attributes device") Message-Id: <1516297904-18188-1-git-send-email-imbrenda@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit 46fa893355e0bd88f3c59b886f0d75cbd5f0bbbe) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-29linux-user: Fix locking order in fork_start()Peter Maydell
Our locking order is that the tb lock should be taken inside the mmap_lock, but fork_start() grabs locks the other way around. This means that if a heavily multithreaded guest process (such as Java) calls fork() it can deadlock, with the thread that called fork() stuck in fork_start() with the tb lock and waiting for the mmap lock, but some other thread in tb_find() with the mmap lock and waiting for the tb lock. The cpu_list_lock() should also always be taken last, not first. Fix this by making fork_start() grab the locks in the right order. The order in which we drop locks doesn't matter, so we leave fork_end() the way it is. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1512397331-15238-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu> (cherry picked from commit 024949caf32805f4cc3e7d363a80084b47aac1f6) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Add EPYC-IBPB CPU modelEduardo Habkost
EPYC-IBPB is a copy of the EPYC CPU model with just CPUID_8000_0008_EBX_IBPB added. Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-7-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 6cfbc54e8903a9bcc0346119949162d040c144c1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Add new -IBRS versions of Intel CPU modelsEduardo Habkost
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel microcode updated and can be used by OSes to mitigate CVE-2017-5715. Unfortunately we can't change the existing CPU models without breaking existing setups, so users need to explicitly update their VM configuration to use the new *-IBRS CPU model if they want to expose IBRS to guests. The new CPU models are simple copies of the existing CPU models, with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated. Cc: Jiri Denemark <jdenemar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit ac96c41354b7e4c70b756342d9b686e31ab87458) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Add FEAT_8000_0008_EBX CPUID feature wordEduardo Habkost
Add the new feature word and the "ibpb" feature flag. Based on a patch by Paolo Bonzini. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-5-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 1b3420e1c4d523c49866cca4e7544753201cd43d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Add spec-ctrl CPUID bitEduardo Habkost
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit a2381f0934432ef2cd47a335348ba8839632164c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Add support for SPEC_CTRL MSRPaolo Bonzini
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-3-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit a33a2cfe2f771b360b3422f6cdf566a560860bfc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23i386: Change X86CPUDefinition::model_id to const char*Eduardo Habkost
It is valid to have a 48-character model ID on CPUID, however the definition of X86CPUDefinition::model_id is char[48], which can make the compiler drop the null terminator from the string. If a CPU model happens to have 48 bytes on model_id, "-cpu help" will print garbage and the object_property_set_str() call at x86_cpu_load_def() will read data outside the model_id array. We could increase the array size to 49, but this would mean the compiler would not issue a warning if a 49-char string is used by mistake for model_id. To make things simpler, simply change model_id to be const char*, and validate the string length using an assert() on x86_register_cpudef_type(). Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> (cherry picked from commit 807e9869b8c4119b81df902625af818519e01759) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23hw/pci-bridge: fix QEMU crash because of pcie-root-portMarcel Apfelbaum
If we try to use more pcie_root_ports then available slots and an IO hint is passed to the port, QEMU crashes because we try to init the "IO hint" capability even if the device is not created. Fix it by checking for error before adding the capability, so QEMU can fail gracefully. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit fced4d00e68e7559c73746d963265f7fd0b6abf9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23scsi-disk: release AioContext in unaligned WRITE SAME caseStefan Hajnoczi
scsi_write_same_complete() can retry the write if the request was unaligned. Make sure to release the AioContext when that code path is taken! This patch fixes a hang when QEMU terminates after an unaligned WRITE SAME request has been processed with dataplane. The hang occurs because iothread_stop_all() cannot acquire the AioContext lock that was leaked by the IOThread in scsi_write_same_complete(). Fixes: b9e413dd37 ("block: explicitly acquire aiocontext in aio callbacks that need it"). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reported-by: Cong Li <coli@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20180104142502.15175-1-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 24355b79bdaf6ab12f7c610b032fc35ec045cd55) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23hw/sd/ssi-sd: Reset SD card on controller resetPeter Maydell
Since ssi-sd is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. In the case of sd-ssi, we have to implement an entire reset function since there wasn't one previously, and that requires a QOM cast macro that got omitted when this device was QOMified. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-4-git-send-email-peter.maydell@linaro.org (cherry picked from commit 8046d44f3c9f67828d3368797d4d314433ee75e9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23hw/sd/milkymist-memcard: Reset SD card on controller resetPeter Maydell
Since milkymist-memcard is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-3-git-send-email-peter.maydell@linaro.org (cherry picked from commit 16bf0e0e7aaa8efc0b8ee7e2aecb2fa235f82d38) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23hw/sd/pl181: Reset SD card on controller resetPeter Maydell
Since pl181 is still using the legacy SD card API, the SD card created by sd_init() is not plugged into any bus. This means that the controller has to reset it manually. Failing to do this mostly didn't affect the guest since the guest typically does a programmed SD card reset as part of its SD controller driver initialization, but meant that migration failed because it's only in sd_reset() that we set up the wpgrps_size field. Cc: qemu-stable@nongnu.org Fixes: https://bugs.launchpad.net/qemu/+bug/1739378 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1515506513-31961-2-git-send-email-peter.maydell@linaro.org (cherry picked from commit 0cb57cc701839e7358918d5f2922ccbc04d28d17) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-23vhost: remove assertion to prevent crashJay Zhou
QEMU will assert on vhost-user backed virtio device hotplug if QEMU is using more RAM regions than VHOST_MEMORY_MAX_NREGIONS (for example if it were started with a lot of DIMM devices). Fix it by returning error instead of asserting and let callers of vhost_set_mem_table() handle error condition gracefully. Cc: qemu-stable@nongnu.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit f4bf56fb78ed0e9f60fa1ed656c14ff4c494da5a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-15virtio_error: don't invoke status callbacksMichael S. Tsirkin
Backends don't need to know what frontend requested a reset, and notifying then from virtio_error is messy because virtio_error itself might be invoked from backend. Let's just set the status directly. Cc: qemu-stable@nongnu.org Reported-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 8fc47c876de638353bb635872f2c25bb7f4a3d6e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-11hw/intc/arm_gic: reserved register addresses are RAZ/WIPeter Maydell
The GICv2 specification says that reserved register addresses must RAZ/WI; now that we implement external abort handling for Arm CPUs this means we must return MEMTX_OK rather than MEMTX_ERROR, to avoid generating a spurious guest data abort. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1513183941-24300-3-git-send-email-peter.maydell@linaro.org Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> (cherry picked from commit 0cf09852015e47a5fbb974ff7ac320366afd21ee) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-11hw/intc/arm_gicv3: Make reserved register addresses RAZ/WIPeter Maydell
The GICv3 specification says that reserved register addresses should RAZ/WI. This means we need to return MEMTX_OK, not MEMTX_ERROR, because now that we support generating external aborts the latter will cause an abort on new board models. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1513183941-24300-2-git-send-email-peter.maydell@linaro.org Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> (cherry picked from commit f1945632b43e36bd9f3e0c2feb0e5b152be7ed91) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-09vfio: Fix vfio-kvm group registrationAlex Williamson
Commit 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching") moved registration of groups with the vfio-kvm device from vfio_get_group() to vfio_connect_container(), but it missed the case where a group is attached to an existing container and takes an early exit. Perhaps this is a less common case on ppc64/spapr, but on x86 (without viommu) all groups are connected to the same container and thus only the first group gets registered with the vfio-kvm device. This becomes a problem if we then hot-unplug the devices associated with that first group and we end up with KVM being misinformed about any vfio connections that might remain. Fix by including the call to vfio_kvm_device_add_group() in this early exit path. Fixes: 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching") Cc: qemu-stable@nongnu.org # qemu-2.10+ Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> (cherry picked from commit 2016986aedb6ea2839662eb5f60630f3e231bd1a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-09block: Open backing image in force share mode for size probeFam Zheng
Management tools create overlays of running guests with qemu-img: $ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2 but this doesn't work anymore due to image locking: qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock Is another process using the image? Could not open backing image to determine size. Use the force share option to allow this use case again. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit cc954f01e3c004aad081aa36736a17e842b80211) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-09block: Call .drain_begin only once in bdrv_drain_all_begin()Kevin Wolf
bdrv_drain_all_begin() used to call the .bdrv_co_drain_begin() driver callback inside its polling loop. This means that how many times it got called for each node depended on long it had to poll the event loop. This is obviously not right and results in nodes that stay drained even after bdrv_drain_all_end(), which calls .bdrv_co_drain_begin() once per node. Fix bdrv_drain_all_begin() to call the callback only once, too. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 2da9b7d456278bccc6ce889ae350f2867155d7e8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-09block: Make bdrv_drain_invoke() recursiveKevin Wolf
This change separates bdrv_drain_invoke(), which calls the BlockDriver drain callbacks, from bdrv_drain_recurse(). Instead, the function performs its own recursion now. One reason for this is that bdrv_drain_recurse() can be called multiple times by bdrv_drain_all_begin(), but the callbacks may only be called once. The separation is necessary to fix this bug. The other reason is that we intend to go to a model where we call all driver callbacks first, and only then start polling. This is not fully achieved yet with this patch, as bdrv_drain_invoke() contains a BDRV_POLL_WHILE() loop for the block driver callbacks, which can still call callbacks for any unrelated event. It's a step in this direction anyway. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit db0289b9b26cb653d5662f5d6a2a52d70243cd56) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-09block/nbd: fix segmentation fault when .desc is not null-terminatedMurilo Opsfelder Araujo
The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in segmentation fault when strcmp() tries to access an invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 >From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice to most of other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when .desc list reached its end. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com> CC: qemu-stable@nongnu.org Fixes: 7ccc44fd7d1dfa62c4d6f3a680df809d6e7068ce Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c4365735a7d38f4355c6f77e6670d3972315f7c2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2018-01-08qemu-pr-helper: miscellaneous fixesPaolo Bonzini
1) Return a generic sense if TEST UNIT READY does not provide one; 2) Fix two mistakes in copying from the spec. Cc: qemu-stable@nongnu.org Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit a4a9b6eaf35dbe4bf0e069854945bf5e45fc7eab) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>