aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2019-06-04block: Add BlockBackend.ctxKevin Wolf
This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04block: Add Error to blk_set_aio_context()Kevin Wolf
Add an Error parameter to blk_set_aio_context() and use bdrv_child_try_set_aio_context() internally to check whether all involved nodes can actually support the AioContext switch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04nvme: add Get/Set Feature Timestamp supportKenneth Heitke
Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com> Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-03q35: Revert to kernel irqchipAlex Williamson
Commit b2fc91db8447 ("q35: set split kernel irqchip as default") changed the default for the pc-q35-4.0 machine type to use split irqchip, which turned out to have disasterous effects on vfio-pci INTx support. KVM resampling irqfds are registered for handling these interrupts, but these are non-functional in split irqchip mode. We can't simply test for split irqchip in QEMU as userspace handling of this interrupt is a significant performance regression versus KVM handling (GeForce GPUs assigned to Windows VMs are non-functional without forcing MSI mode or re-enabling kernel irqchip). The resolution is to revert the change in default irqchip mode in the pc-q35-4.1 machine and create a pc-q35-4.0.1 machine for the 4.0-stable branch. The qemu-q35-4.0 machine type should not be used in vfio-pci configurations for devices requiring legacy INTx support without explicitly modifying the VM configuration to use kernel irqchip. Link: https://bugs.launchpad.net/qemu/+bug/1826422 Fixes: b2fc91db8447 ("q35: set split kernel irqchip as default") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <155786484688.13873.6037015630912983760.stgit@gimli.home> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-03imx25-pdk: create ds1338 for qtest inside the testPaolo Bonzini
There is no need to have a test device created by the board. Instead, create it in the qtest so that we will be able to run it on other boards too. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-03edu: uses uint64_t in dma operationLi Qiang
The dma related variable dma.dst/src/cnt is dma_addr_t, it is uint64_t in x64 platform. Change these usage from uint32_to uint64_t to avoid trancation in edu_dma_timer. Signed-off-by: Li Qiang <liq3ea@163.com> Message-Id: <20190510164349.81507-4-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-03edu: mmio: allow 64-bit access in read dispatchLi Qiang
The edu spec says when address >= 0x80, the MMIO area can be accessed by 64-bit. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com> Message-Id: <20190510164349.81507-3-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-03edu: mmio: allow 64-bit accessLi Qiang
The edu spec says the MMIO area can be accessed by 64-bit. However currently the 'max_access_size' is not so the MMIO access dispatch can only access 32-bit one time. This patch fixes this to respect the spec. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com> Message-Id: <20190510164349.81507-2-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-30Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190529' ↵Peter Maydell
into staging ppc patch queue 2019-05-29 Next pull request against qemu-4.1. Highlights: * KVM accelerated support for the XIVE interrupt controller in PAPR guests * A number of TCG vector fixes * Fixes for the PReP / 40p machine * Improvements to make check-tcg test coverage Other than that it's just a bunch of assorted fixes, cleanups and minor improvements. This supersedes both the pull request dated 2019-05-21 and the one dated 2019-05-22. I've dropped one hunk which I think may have caused the check-tcg failure that Peter saw (by enabling the ppc64abi32 build, which I think has been broken for ages). I'm not entirely certain, since I haven't reproduced exactly the same failure. # gpg: Signature made Wed 29 May 2019 07:49:04 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-4.1-20190529: (44 commits) ppc/pnv: add dummy XSCOM registers for PRD initialization ppc/pnv: introduce new skiboot platform properties spapr: Don't migrate the hpt_maxpagesize cap to older machine types spapr: change default interrupt mode to 'dual' spapr/xive: fix multiple resets when using the 'dual' interrupt mode docs: provide documentation on the POWER9 XIVE interrupt controller spapr/irq: add KVM support to the 'dual' machine ppc/xics: fix irq priority in ics_set_irq_type() spapr/irq: initialize the IRQ device only once spapr/irq: introduce a spapr_irq_init_device() helper spapr: check for the activation of the KVM IRQ device spapr: introduce routines to delete the KVM IRQ device sysbus: add a sysbus_mmio_unmap() helper spapr/xive: activate KVM support spapr/xive: add migration support for KVM spapr/xive: introduce a VM state change handler spapr/xive: add state synchronization with KVM spapr/xive: add hcall support when under KVM spapr/xive: add KVM support spapr: Print out extra hints when CAS negotiation of interrupt mode fails ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-30Merge remote-tracking branch 'remotes/kraxel/tags/usb-20190529-pull-request' ↵Peter Maydell
into staging usb-hub: port count config option, emulate power switching, cleanups. usb-tablet, usb-host: bugfixes. # gpg: Signature made Wed 29 May 2019 07:28:18 BST # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/usb-20190529-pull-request: usb-tablet: fix serial compat property usb-hub: emulate per port power switching usb-hub: add usb_hub_port_update() usb-hub: add helpers to update port state usb-hub: make number of ports runtime-configurable usb-hub: tweak feature names usb-host: avoid libusb_set_configuration calls usb-host: skip reset for untouched devices usb: call reset handler before updating state Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-30Merge remote-tracking branch 'remotes/kraxel/tags/vga-20190529-pull-request' ↵Peter Maydell
into staging vga: add vhost-user-gpu. # gpg: Signature made Wed 29 May 2019 05:40:02 BST # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/vga-20190529-pull-request: hw/display: add vhost-user-vga & gpu-pci virtio-gpu: split virtio-gpu-pci & virtio-vga virtio-gpu: split virtio-gpu, introduce virtio-gpu-base spice-app: fix running when !CONFIG_OPENGL contrib: add vhost-user-gpu util: compile drm.o on posix virtio-gpu: add a pixman helper header virtio-gpu: add bswap helpers header vhost-user: add vhost_user_gpu_set_socket() virtio-gpu: add sanity check Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-29usb-tablet: fix serial compat propertyGerd Hoffmann
s/kbd/tablet/, fixes cut+paste bug. Cc: qemu-stable@nongnu.org Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190520081805.15019-1-kraxel@redhat.com
2019-05-29usb-hub: emulate per port power switchingGerd Hoffmann
Add support for per port power switching. Virtual power of course ;) Use port-power=on property to enable this. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190524070310.4952-6-kraxel@redhat.com
2019-05-29usb-hub: add usb_hub_port_update()Gerd Hoffmann
Helper function to update port status bits which depends on the connected device. We need the same logic for device attach and port reset, so factor it out. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190524070310.4952-5-kraxel@redhat.com
2019-05-29usb-hub: add helpers to update port stateGerd Hoffmann
Add usb_hub_port_set() and usb_hub_port_clear() helpers which care about updating the change bits (port->wPortChange) properly, so we don't need to have that logic sprinkled all over the place ;) Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190524070310.4952-4-kraxel@redhat.com
2019-05-29usb-hub: make number of ports runtime-configurableGerd Hoffmann
Add num_ports property which allows configure the number of downstream ports. Valid range is 1-8, default is 8. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190524070310.4952-3-kraxel@redhat.com
2019-05-29usb-hub: tweak feature namesGerd Hoffmann
Add dashes, so they don't look like two separate things when printed. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190524070310.4952-2-kraxel@redhat.com
2019-05-29usb-host: avoid libusb_set_configuration callsGerd Hoffmann
Seems some devices become confused when we call libusb_set_configuration(). So before calling the function check whenever the device has multiple configurations in the first place, and in case it hasn't (which is the case for the majority of devices) simply skip the call as it will have no effect anyway. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190522094702.17619-4-kraxel@redhat.com
2019-05-29usb-host: skip reset for untouched devicesGerd Hoffmann
If the guest didn't talk to the device yet, skip the reset. Without this usb-host devices get resetted a number of times at boot time for no good reason. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190522094702.17619-3-kraxel@redhat.com
2019-05-29usb: call reset handler before updating stateGerd Hoffmann
That way the device reset handler can see what the before-reset state of the device is. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190522094702.17619-2-kraxel@redhat.com
2019-05-29hw/display: add vhost-user-vga & gpu-pciMarc-André Lureau
Add new virtio-gpu devices with a "vhost-user" property. The associated vhost-user backend is used to handle the virtio rings and provide rendering results thanks to the vhost-user-gpu protocol. Example usage: -object vhost-user-backend,id=vug,cmd="./vhost-user-gpu" -device vhost-user-vga,vhost-user=vug Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-10-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29virtio-gpu: split virtio-gpu-pci & virtio-vgaMarc-André Lureau
Add base classes that are common to vhost-user-gpu-pci and vhost-user-vga. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-9-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29virtio-gpu: split virtio-gpu, introduce virtio-gpu-baseMarc-André Lureau
Add a base class that is common to virtio-gpu and vhost-user-gpu devices. The VirtIOGPUBase base class provides common functionalities necessary for both virtio-gpu and vhost-user-gpu: - common configuration (max-outputs, initial resolution, flags) - virtio device initialization, including queue setup - device pre-conditions checks (iommu) - migration blocker - virtio device callbacks - hooking up to qemu display subsystem - a few common helper functions to reset the device, retrieve display informations - a class callback to unblock the rendering (for GL updates) What is left to the virtio-gpu subdevice to take care of, in short, are all the virtio queues handling, command processing and migration. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-8-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29virtio-gpu: add a pixman helper headerMarc-André Lureau
This will allow to share the format conversion function with vhost-user-gpu. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-4-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29virtio-gpu: add bswap helpers headerMarc-André Lureau
The helper functions are useful to build the vhost-user-gpu backend. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-3-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29vhost-user: add vhost_user_gpu_set_socket()Marc-André Lureau
Add a new vhost-user message to give a unix socket to a vhost-user backend for GPU display updates. Back when I started that work, I added a new GPU channel because the vhost-user protocol wasn't bidirectional. Since then, there is a vhost-user-slave channel for the slave to send requests to the master. We could extend it with GPU messages. However, the GPU protocol is quite orthogonal to vhost-user, thus I chose to have a new dedicated channel. See vhost-user-gpu.rst for the protocol details. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-29ppc/pnv: add dummy XSCOM registers for PRD initializationCédric Le Goater
PRD (Processor recovery diagnostics) is a service available on OpenPower systems. The opal-prd daemon initializes the PowerPC Processor through the XSCOM bus and then waits for hardware diagnostic events. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190527071722.31424-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29ppc/pnv: introduce new skiboot platform propertiesCédric Le Goater
Newer skiboots (after 6.3) support QEMU platforms that have characteristics closer to real OpenPOWER systems. The CPU type is used to define the BMC drivers: Aspeed AST2400 for POWER8 processors and AST2500 for POWER9s. Advertise the new platform property names, "qemu,powernv8" and "qemu,powernv9", using the CPU type chosen for the QEMU PowerNV machine. Also, advertise the original platform name "qemu,powernv" in case of POWER8 processors for compatibility with older skiboots. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190527071749.31499-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: Don't migrate the hpt_maxpagesize cap to older machine typesGreg Kurz
Commit 0b8c89be7f7b added the hpt_maxpagesize capability to the migration stream. This is okay for new machine types but it breaks backward migration to older QEMUs, which don't expect the extra subsection. Add a compatibility boolean flag to the sPAPR machine class and use it to skip migration of the capability for machine types 4.0 and older. This fixes migration to an older QEMU. Note that the destination will emit a warning: qemu-system-ppc64: warning: cap-hpt-max-page-size lower level (16) in incoming stream than on destination (24) This is expected and harmless though. It is okay to migrate from a lower HPT maximum page size (64k) to a greater one (16M). Fixes: 0b8c89be7f7b "spapr: Add forgotten capability to migration stream" Based-on: <20190522074016.10521-3-clg@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155853262675.1158324.17301777846476373459.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: change default interrupt mode to 'dual'Cédric Le Goater
Now that XIVE support is complete (QEMU emulated and KVM devices), change the pseries machine to advertise both interrupt modes: XICS (P7/P8) and XIVE (P9). The machine default interrupt modes depends on the version. Current settings are: pseries default interrupt mode 4.1 dual 4.0 xics 3.1 xics 3.0 legacy xics (different IRQ number space layout) Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190522074016.10521-3-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: fix multiple resets when using the 'dual' interrupt modeCédric Le Goater
Today, when a reset occurs on a pseries machine using the 'dual' interrupt mode, the KVM devices are released and recreated depending on the interrupt mode selected by CAS. If XIVE is selected, the SysBus memory regions of the SpaprXive model are initialized by the KVM backend initialization routine each time a reset occurs. This leads to a crash after a couple of resets because the machine reaches the QDEV_MAX_MMIO limit of SysBusDevice : qemu-system-ppc64: hw/core/sysbus.c:193: sysbus_init_mmio: Assertion `dev->num_mmio < QDEV_MAX_MMIO' failed. To fix, initialize the SysBus memory regions in spapr_xive_realize() called only once and remove the same inits from the QEMU and KVM backend initialization routines which are called at each reset. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190522074016.10521-2-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/irq: add KVM support to the 'dual' machineCédric Le Goater
The interrupt mode is chosen by the CAS negotiation process and activated after a reset to take into account the required changes in the machine. This brings new constraints on how the associated KVM IRQ device is initialized. Currently, each model takes care of the initialization of the KVM device in their realize method but this is not possible anymore as the initialization needs to be done globaly when the interrupt mode is known, i.e. when machine is reseted. It also means that we need a way to delete a KVM device when another mode is chosen. Also, to support migration, the QEMU objects holding the state to transfer should always be available but not necessarily activated. The overall approach of this proposal is to initialize both interrupt mode at the QEMU level to keep the IRQ number space in sync and to allow switching from one mode to another. For the KVM side of things, the whole initialization of the KVM device, sources and presenters, is grouped in a single routine. The XICS and XIVE sPAPR IRQ reset handlers are modified accordingly to handle the init and the delete sequences of the KVM device. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-15-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29ppc/xics: fix irq priority in ics_set_irq_type()Cédric Le Goater
Recent commits changed the behavior of ics_set_irq_type() to initialize correctly LSIs at the KVM level. ics_set_irq_type() is also called by the realize routine of the different devices of the machine when initial interrupts are claimed, before the ICSState device is reseted. In the case, the ICSIRQState priority is 0x0 and the call to ics_set_irq_type() results in configuring the target of the interrupt. On P9, when using the KVM XICS-on-XIVE device, the target is configured to be server 0, priority 0 and the event queue 0 is created automatically by KVM. With the dual interrupt mode creating the KVM device at reset, it leads to unexpected effects on the guest, mostly blocking IPIs. This is wrong, fix it by reseting the ICSIRQState structure when ics_set_irq_type() is called. Fixes: commit 6cead90c5c9c ("xics: Write source state to KVM at claim time") Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-14-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/irq: initialize the IRQ device only onceCédric Le Goater
Add a check to make sure that the routine initializing the emulated IRQ device is called once. We don't have much to test on the XICS side, so we introduce a 'init' boolean under ICSState. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-13-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/irq: introduce a spapr_irq_init_device() helperCédric Le Goater
The way the XICS and the XIVE devices are initialized follows the same pattern. First, try to connect to the KVM device and if not possible fallback on the emulated device, unless a kernel_irqchip is required. The spapr_irq_init_device() routine implements this sequence in generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm(). The XIVE init sequence is moved under the associated sPAPR IRQ ->init() handler. This will change again when KVM support is added for the dual interrupt mode. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-12-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: check for the activation of the KVM IRQ deviceCédric Le Goater
The activation of the KVM IRQ device depends on the interrupt mode chosen at CAS time by the machine and some methods used at reset or by the migration need to be protected. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-11-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: introduce routines to delete the KVM IRQ deviceCédric Le Goater
If a new interrupt mode is chosen by CAS, the machine generates a reset to reconfigure. At this point, the connection with the previous KVM device needs to be closed and a new connection needs to opened with the KVM device operating the chosen interrupt mode. New routines are introduced to destroy the XICS and the XIVE KVM devices. They make use of a new KVM device ioctl which destroys the device and also disconnects the IRQ presenters from the vCPUs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-10-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29sysbus: add a sysbus_mmio_unmap() helperCédric Le Goater
This will be used to remove the MMIO regions of the POWER9 XIVE interrupt controller when the sPAPR machine is reseted. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: activate KVM supportCédric Le Goater
All is in place for KVM now. State synchronization and migration will come next. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: add migration support for KVMCédric Le Goater
When the VM is stopped, the VM state handler stabilizes the XIVE IC and marks the EQ pages dirty. These are then transferred to destination before the transfer of the device vmstates starts. The SpaprXive interrupt controller model captures the XIVE internal tables, EAT and ENDT and the XiveTCTX model does the same for the thread interrupt context registers. At restart, the SpaprXive 'post_load' method restores all the XIVE states. It is called by the sPAPR machine 'post_load' method, when all XIVE states have been transferred and loaded. Finally, the source states are restored in the VM change state handler when the machine reaches the running state. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: introduce a VM state change handlerCédric Le Goater
This handler is in charge of stabilizing the flow of event notifications in the XIVE controller before migrating a guest. This is a requirement before transferring the guest EQ pages to a destination. When the VM is stopped, the handler sets the source PQs to PENDING to stop the flow of events and to possibly catch a triggered interrupt occuring while the VM is stopped. Their previous state is saved. The XIVE controller is then synced through KVM to flush any in-flight event notification and to stabilize the EQs. At this stage, the EQ pages are marked dirty to make sure the EQ pages are transferred if a migration sequence is in progress. The previous configuration of the sources is restored when the VM resumes, after a migration or a stop. If an interrupt was queued while the VM was stopped, the handler simply generates the missing trigger. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: add state synchronization with KVMCédric Le Goater
This extends the KVM XIVE device backend with 'synchronize_state' methods used to retrieve the state from KVM. The HW state of the sources, the KVM device and the thread interrupt contexts are collected for the monitor usage and also migration. These get operations rely on their KVM counterpart in the host kernel which acts as a proxy for OPAL, the host firmware. The set operations will be added for migration support later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: add hcall support when under KVMCédric Le Goater
XIVE hcalls are all redirected to QEMU as none are on a fast path. When necessary, QEMU invokes KVM through specific ioctls to perform host operations. QEMU should have done the necessary checks before calling KVM and, in case of failure, H_HARDWARE is simply returned. H_INT_ESB is a special case that could have been handled under KVM but the impact on performance was low when under QEMU. Here are some figures : kernel irqchip OFF ON H_INT_ESB KVM QEMU rtl8139 (LSI ) 1.19 1.24 1.23 Gbits/sec virtio 31.80 42.30 -- Gbits/sec Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: add KVM supportCédric Le Goater
This introduces a set of helpers when KVM is in use, which create the KVM XIVE device, initialize the interrupt sources at a KVM level and connect the interrupt presenters to the vCPU. They also handle the initialization of the TIMA and the source ESB memory regions of the controller. These have a different type under KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed to the guest and the associated VMAs on the host are populated dynamically with the appropriate pages using a fault handler. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: Print out extra hints when CAS negotiation of interrupt mode failsGreg Kurz
Let's suggest to the user how the machine should be configured to allow the guest to boot successfully. Suggested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155799221739.527449.14907564571096243745.stgit@bahia.lan> Reviewed-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> [dwg: Adjusted for style error] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr: Fix phb_placement backwards compatibilityDavid Gibson
When we added support for NVLink2 passthrough devices, we changed the phb_placement hook to handle the placement of NVLink2 bridges' specific resources. For compatibility we use a version that doesn't do this allocation for old machine types. However, because of the delay between when the patch was posted and when it was merged, we ended up with that compatibility hook applying for machine versions 3.1 and earlier whereas it should apply for 4.0 and earlier (since the patch was applied early in the 4.1 tree). Fixes: ec132efaa81 "spapr: Support NVIDIA V100 GPU with NVLink2" Reported-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2019-05-29spapr: Add forgotten capability to migration streamDavid Gibson
spapr machine capabilities are supposed to be sent in the migration stream so that we can sanity check the source and destination have compatible configuration. Unfortunately, when we added the hpt-max-page-size capability, we forgot to add it to the migration state. This means that we can generate spurious warnings when both ends are configured for large pages, or potentially fail to warn if the source is configured for huge pages, but the destination is not. Fixes: 2309832afda "spapr: Maximum (HPT) pagesize property" Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-05-29target/ppc: Set PSSCR_EC on cpu halt to prevent spurious wakeupSuraj Jitindar Singh
The processor stop status and control register (PSSCR) is used to control the power saving facilities of the thread. The exit criterion bit (EC) is used to specify whether the thread should be woken by any interrupt (EC == 0) or only an interrupt enabled in the LPCR to wake the thread (EC == 1). The rtas facilities start-cpu and self-stop are used to transition a vcpu between the stopped and running states. When a vcpu is stopped it may only be started again by the start-cpu rtas call. Currently a vcpu in the stopped state will start again whenever an interrupt comes along due to PSSCR_EC being cleared, and while this is architecturally correct for a hardware thread, a vcpu is expected to only be woken by calling start-cpu. This means when performing a reboot on a tcg machine that the secondary threads will restart while the primary is still in slof, this is unsupported and causes call traces like: SLOF ********************************************************************** QEMU Starting Build Date = Jan 14 2019 18:00:39 FW Version = git-a5b428e1c1eae703 Press "s" to enter Open Firmware. qemu: fatal: Trying to deliver HV exception (MSR) 70 with no HV support NIP 6d61676963313230 LR 000000003dbe0308 CTR 6d61676963313233 XER 0000000000000000 CPU#1 MSR 0000000000000000 HID0 0000000000000000 HF 0000000000000000 iidx 3 didx 3 TB 00000026 115746031956 DECR 18446744073326238463 GPR00 000000003dbe0308 000000003e669fe0 000000003dc10700 0000000000000003 GPR04 000000003dc62198 000000003dc62178 000000003dc0ea48 0000000000000030 GPR08 000000003dc621a8 0000000000000018 000000003e466008 000000003dc50700 GPR12 c00000000093a4e0 c00000003ffff300 c00000003e533f90 0000000000000000 GPR16 0000000000000000 0000000000000000 000000003e466010 000000003dc0b040 GPR20 0000000000008000 000000000000f003 0000000000000006 000000003e66a050 GPR24 000000003dc06400 000000003dc0ae70 0000000000000003 000000000000f001 GPR28 000000003e66a060 ffffffffffffffff 6d61676963313233 0000000000000028 CR 28000222 [ E L - - - E E E ] RES ffffffffffffffff FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR08 0000000000000000 0000000000000000 0000000000000000 00000000311825e0 FPR12 00000000311825e0 0000000000000000 0000000000000000 0000000000000000 FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPSCR 0000000000000000 SRR0 000000003dbe06b0 SRR1 0000000000080000 PVR 00000000004e1200 VRSAVE 0000000000000000 SPRG0 000000003dbe0308 SPRG1 000000003e669fe0 SPRG2 00000000000000d8 SPRG3 000000003dbe0308 SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 HSRR0 6d61676963313230 HSRR1 0000000000000000 CFAR 000000003dbe3e64 LPCR 0000000004020008 PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000 Aborted (core dumped) To fix this, set the PSSCR_EC bit when a vcpu is stopped to disable it from coming back online until the start-cpu rtas call is made. Fixes: 21c0d66a9c99 ("target/ppc: Fix support for "STOP light" states on POWER9") Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Message-Id: <20190516005744.24366-1-sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29spapr/xive: Sanity checks of OV5 during CASGreg Kurz
If a machine is started with ic-mode=xive but the guest only knows about XICS, eg. an RHEL 7.6 guest, the kernel panics. This is expected but a bit unfortunate since the crash doesn't provide much information for the end user to guess what's happening. Detect that during CAS and exit QEMU with a proper error message instead, like it is already done for the MMU. Even if this is less likely to happen, the opposite case of a guest that only knows about XIVE would certainly fail all the same if the machine is started with ic-mode=xics. Also, the only valid values a guest can pass in byte 23 of OV5 during CAS are 0b00 (XIVE legacy mode) and 0b01 (XIVE exploitation mode). Any other value is a bug, at least with the current spec. Again, it does not seem right to let the guest go on without a precise idea of the interrupt mode it asked for. Handle these cases as well. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155793986451.464434.12887933000007255549.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29Fix typo on "info pic" monitor cmd output for xiveSatheesh Rajendran
Instead of LISN i.e "Logical Interrupt Source Number" as per Xive PAPR document "info pic" prints as LSIN, let's fix it. Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Message-Id: <20190509080750.21999-1-sathnaga@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>