aboutsummaryrefslogtreecommitdiff
path: root/include/hw
AgeCommit message (Collapse)Author
2019-04-09pci: Allow PCI bus subtypes to support extended config space accessesGreg Kurz
Some PHB implementations, eg. PAPR used on pseries machine, act like a regular PCI bus rather than a PCIe bus, but allow access to the PCIe extended config space anyway. Introduce a new PCI bus class method to modelize this behaviour and use it when adjusting the config space size limit during accesses. No behaviour change for existing PCI bus types. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155414130271.574858.4253514266378127489.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-04-07Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pci, pc, virtio: fixes intel-iommu fixes virtio typo fixes linker: a couple of asserts for consistency/security Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 02 Apr 2019 16:51:19 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: intel_iommu: Drop extended root field intel_iommu: Fix root_scalable migration breakage virtio-net: Fix typo in comment intel_iommu: Correct caching-mode error message acpi: verify file entries in bios_linker_loader_add_pointer() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-04-04riscv: plic: Fix incorrect irq calculationAlistair Francis
This patch fixes four different things, to maintain bisectability they have been merged into a single patch. The following fixes are below: sifive_plic: Fix incorrect irq calculation The irq is incorrectly calculated to be off by one. It has worked in the past as the priority_base offset has also been set incorrectly. We are about to fix the priority_base offset so first first the irq calculation. sifive_u: Fix PLIC priority base offset and numbering According to the FU540 manual the PLIC source priority address starts at an offset of 0x04 and not 0x00. The same manual also specifies that the PLIC only has 53 source priorities. Fix these two incorrect header files. We also need to over extend the plic_gpios[] array as the PLIC sources count from 1 and not 0. riscv: sifive_e: Fix PLIC priority base offset According to the FE31 manual the PLIC source priority address starts at an offset of 0x04 and not 0x00. riscv: virt: Fix PLIC priority base offset Update the virt offsets based on the newly updated SiFive U and SiFive E offsets. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-04-02intel_iommu: Drop extended root fieldPeter Xu
VTD_RTADDR_RTT is dropped even by the VT-d spec, so QEMU should probably do the same thing (after all we never really implemented it). Since we've had a field for that in the migration stream, to keep compatibility we need to fill the hole up. Please refer to VT-d spec 10.4.6. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190329061422.7926-3-peterx@redhat.com> Reviewed-by: Liu, Yi L <yi.l.liu@intel.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-29spapr: Simplify handling of host-serial and host-model valuesDavid Gibson
27461d69a0f "ppc: add host-serial and host-model machine attributes (CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine properties for spapr to explicitly control the values advertised to the guest in device tree properties with the same names. The previous behaviour on KVM was to unconditionally populate the device tree with the real host serial number and model, which leaks possibly sensitive information about the host to the guest. To maintain compatibility for old machine types, we allowed those props to be set to "passthrough" to take the value from the host as before. Or they could be set to "none" to explicitly omit the device tree items. Special casing specific values on what's otherwise a user supplied string is very ugly. So, this patch simplifies things by implementing the backwards compatibility in a different way: we have a machine class flag set for the older machines, and we only load the host values into the device tree if A) they're not set by the user and B) we have that flag set. This does mean that the "passthrough" functionality is no longer available with the current machine type. That's ok though: if a user or management layer really wants the information passed through they can read it themselves (OpenStack Nova already does something similar for x86). It also means the user can't explicitly ask for the values to be omitted on the old machine types. I think that's an acceptable trade-off: if you care enough about not leaking the host information you can either move to the new machine type, or use a dummy value for the properties. For the new machine type, this also removes an odd inconsistency between running on a POWER and non-POWER (or non-Linux) hosts: if the host information couldn't be read from where we expect (in the host's device tree as exposed by Linux), we'd fallback to omitting the guest device tree items. While we're there, improve some poorly worded comments, and the help text for the properties. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org>
2019-03-28Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* Kconfig improvements (msi_nonbroken, imply for default PCI devices) * intel-iommu: sharing passthrough FlatViews (Peter) * Fix for SEV with VFIO (Brijesh) * Allow compilation without CONFIG_PARALLEL (Thomas) # gpg: Signature made Thu 21 Mar 2019 16:42:24 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (23 commits) virtio-vga: only enable for specific boards config-all-devices.mak: rebuild on reconfigure minikconf: fix parser typo intel-iommu: optimize nodmar memory regions test-announce-self: convert to qgraph hw/alpha/Kconfig: DP264 hardware requires e1000 network card hw/hppa/Kconfig: Dino board requires e1000 network card hw/sh4/Kconfig: r2d machine requires the rtl8139 network card hw/ppc/Kconfig: e500 based machines require virtio-net-pci device hw/ppc/Kconfig: Bamboo machine requires e1000 network card hw/mips/Kconfig: Fulong 2e board requires ati-vga/rtl8139 PCI devices hw/mips/Kconfig: Malta machine requires the pcnet network card hw/i386/Kconfig: enable devices that can be created by default hw/isa/Kconfig: PIIX4 southbridge requires USB UHCI hw/isa/Kconfig: i82378 SuperIO requires PC speaker device prep: do not select I82374 hw/i386/Kconfig: PC uses I8257, not I82374 hw/char/parallel: Make it possible to compile also without CONFIG_PARALLEL target/i386: sev: Do not pin the ram device memory region memory: Fix the memory region type assignment order ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/rdma/Makefile.objs # hw/riscv/sifive_plic.c
2019-03-26pflash: Require backend size to match device, improve errorsMarkus Armbruster
We reject undersized backends with a rather enigmatic "failed to read the initial flash content" error. For instance: $ qemu-system-ppc64 -S -display none -M sam460ex -drive if=pflash,format=raw,file=eins.img qemu-system-ppc64: Initialization of device cfi.pflash02 failed: failed to read the initial flash content We happily accept oversized images, ignoring their tail. Throwing away parts of firmware that way is pretty much certain to end in an even more enigmatic failure to boot. Require the backend's size to match the device's size exactly. Report mismatch like this: qemu-system-ppc64: Initialization of device cfi.pflash01 failed: device requires 1048576 bytes, block backend provides 512 bytes Improve the error for actual read failures to "can't read block backend". To avoid duplicating even more code between the two pflash device models, do all that in new helper blk_check_size_and_read_all(). The error reporting can still be confusing. For instance: qemu-system-ppc64 -S -display none -M taihu -drive if=pflash,format=raw,file=eins.img -drive if=pflash,unit=1,format=raw,file=zwei.img qemu-system-ppc64: Initialization of device cfi.pflash02 failed: device requires 2097152 bytes, block backend provides 512 bytes Leaves the user guessing which of the two -drive is wrong. Mention the issue in a TODO comment. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190319163551.32499-2-armbru@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-03-20intel-iommu: optimize nodmar memory regionsPeter Xu
Previously we have per-device system memory aliases when DMAR is disabled by the system. It will slow the system down if there are lots of devices especially when DMAR is disabled, because each of the aliased system address space will contain O(N) slots, and rendering such N address spaces will be O(N^2) complexity. This patch introduces a shared nodmar memory region and for each device we only create an alias to the shared memory region. With the aliasing, QEMU memory core API will be able to detect when devices are sharing the same address space (which is the nodmar address space) when rendering the FlatViews and the total number of FlatViews can be dramatically reduced when there are a lot of devices. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190313094323.18263-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-18virtio-gpu: delay virglrenderer reset when blocked.Gerd Hoffmann
If renderer_blocked is set do not call virtio_gpu_virgl_reset(). Instead set a flag indicating that virglrenderer needs a reset. When renderer_blocked gets cleared do the actual reset call. Without this we can trigger an assert in spice due to calling spice_qxl_gl_scanout() while another operation is still running: spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190314115358.26678-2-kraxel@redhat.com
2019-03-16{hmp, hw/pvrdma}: Expose device internals via monitor interfaceYuval Shaia
Allow interrogating device internals through HMP interface. The exposed indicators can be used for troubleshooting by developers or sysadmin. There is no need to expose these attributes to a management system (e.x. libvirt) because (1) most of them are not "device-management' related info and (2) there is no guarantee the interface is stable. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1552300155-25216-6-git-send-email-yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2019-03-15hw/intc/bcm2836_control: Implement local timerZoltán Baldaszti
The BCM2836 control logic module includes a simple "local timer" which is a programmable down-counter that can generates an interrupt. Implement this functionality. Signed-off-by: Zoltán Baldaszti <bztemail@gmail.com> [PMM: wrote commit message; wrapped long line; tweaked some comments to match the final version of the code] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12i386, acpi: check acpi_memory_hotplug capacity in pre_plugWei Yang
Currently we do device realization like below: hotplug_handler_pre_plug() dc->realize() hotplug_handler_plug() Before we do device realization and plug, we should allocate necessary resources and check if memory-hotplug-support property is enabled. At the piix4 and ich9, the memory-hotplug-support property is checked at plug stage. This means that device has been realized and mapped into guest address space 'pc_dimm_plug()' by the time acpi plug handler is called, where it might fail and crash QEMU due to reaching g_assert_not_reached() (piix4) or error_abort (ich9). Fix it by checking if memory hotplug is enabled at pre_plug stage where we can gracefully abort hotplug request. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> CC: Igor Mammedov <imammedo@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190301033548.6691-1-richardw.yang@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12gen_pcie_root_port: Add ACS (Access Control Services) capabilityKnut Omang
Claim ACS support in the generic PCIe root port to allow passthrough of individual functions of a device to different guests (in a nested virt.setting) with VFIO. Without this patch, all functions of a device, such as all VFs of an SR/IOV device, will end up in the same IOMMU group. A similar situation occurs on Windows with Hyper-V. In the single function device case, it also has a small cosmetic benefit in that the root port itself is not grouped with the device. VFIO handles that situation in that binding rules only apply to endpoints, so it does not limit passthrough in those cases. Signed-off-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <319460b483f566dd57487eb3dd340ed4c10aa53c.1550768238.git-series.knut.omang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2019-03-12pcie: Add a simple PCIe ACS (Access Control Services) helper functionKnut Omang
Implementing an ACS capability on downstream ports and multifunction endpoints indicates isolation and IOMMU visibility to a finer granularity. This creates smaller IOMMU groups in the guest and thus more flexibility in assigning endpoints to guest userspace or an L2 guest. Signed-off-by: Knut Omang <knut.omang@oracle.com> Message-Id: <07489975121696f5573b0a92baaf3486ef51e35d.1550768238.git-series.knut.omang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2019-03-12vhost-user-blk: Add support to get/set inflight bufferXie Yongji
This patch adds support for vhost-user-blk device to get/set inflight buffer from/to backend. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190228085355.9614-6-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12vhost-user: Support transferring inflight buffer between qemu and backendXie Yongji
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared buffer between qemu and backend. Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the shared buffer from backend. Then qemu should send it back through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user. This shared buffer is used to track inflight I/O by backend. Qemu should retrieve a new one when vm reset. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190228085355.9614-2-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12intel_iommu: add scalable-mode option to make scalable mode workYi Sun
This patch adds an option to provide flexibility for user to expose Scalable Mode to guest. User could expose Scalable Mode to guest by the config as below: "-device intel-iommu,caching-mode=on,scalable-mode=on" The Linux iommu driver has supported scalable mode. Please refer below patch set: https://www.spinics.net/lists/kernel/msg2985279.html Signed-off-by: Liu, Yi L <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Message-Id: <1551753295-30167-4-git-send-email-yi.y.sun@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12intel_iommu: add 256 bits qi_desc supportLiu, Yi L
Per Intel(R) VT-d 3.0, the qi_desc is 256 bits in Scalable Mode. This patch adds emulation of 256bits qi_desc. Signed-off-by: Liu, Yi L <yi.l.liu@intel.com> [Yi Sun is co-developer to rebase and refine the patch.] Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1551753295-30167-3-git-send-email-yi.y.sun@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12intel_iommu: scalable mode emulationLiu, Yi L
Intel(R) VT-d 3.0 spec introduces scalable mode address translation to replace extended context mode. This patch extends current emulator to support Scalable Mode which includes root table, context table and new pasid table format change. Now intel_iommu emulates both legacy mode and scalable mode (with legacy-equivalent capability set). The key points are below: 1. Extend root table operations to support both legacy mode and scalable mode. 2. Extend context table operations to support both legacy mode and scalable mode. 3. Add pasid tabled operations to support scalable mode. Signed-off-by: Liu, Yi L <yi.l.liu@intel.com> [Yi Sun is co-developer to contribute much to refine the whole commit.] Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Message-Id: <1551753295-30167-2-git-send-email-yi.y.sun@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2019-03-12vhost-user: simplify vhost_user_init/vhost_user_cleanupMarc-André Lureau
Take a VhostUserState* that can be pre-allocated, and initialize it with the associated chardev. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12Merge remote-tracking branch ↵Peter Maydell
'remotes/ehabkost/tags/machine-next-pull-request' into staging Machine queue, 2019-03-11 * memfd fixes (Ilya Maximets) * Move nvdimms state into struct MachineState (Eric Auger) * hostmem-file: reject invalid pmem file sizes (Stefan Hajnoczi) # gpg: Signature made Tue 12 Mar 2019 00:57:41 GMT # gpg: using RSA key 2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full] # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: memfd: improve error messages memfd: set up correct errno if not supported memfd: always check for MFD_CLOEXEC hostmem-memfd: disable for systems without sealing support machine: Move nvdimms state into struct MachineState nvdimm: Rename AcpiNVDIMMState into NVDIMMState hostmem-file: reject invalid pmem file sizes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20190311.0' ↵Peter Maydell
into staging VFIO updates 2019-03-11 - Resolution support for mdev displays supporting EDID interface (Gerd Hoffmann) # gpg: Signature made Mon 11 Mar 2019 19:17:39 GMT # gpg: using RSA key 239B9B6E3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full] # gpg: aka "Alex Williamson <alex@shazbot.org>" [full] # gpg: aka "Alex Williamson <alwillia@redhat.com>" [full] # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" [full] # Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22 * remotes/awilliam/tags/vfio-updates-20190311.0: vfio/display: delay link up event vfio/display: add xres + yres properties vfio/display: add edid support. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12Merge remote-tracking branch 'remotes/armbru/tags/pull-pflash-2019-03-11' ↵Peter Maydell
into staging Pflash and firmware configuration patches for 2019-03-11 # gpg: Signature made Mon 11 Mar 2019 21:59:12 GMT # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-pflash-2019-03-11: (27 commits) docs/interop/firmware.json: Prefer -machine to if=pflash pc: Support firmware configuration with -blockdev pc_sysfw: Pass PCMachineState to pc_system_firmware_init() pc_sysfw: Remove unused PcSysFwDevice pflash_cfi01: Add pflash_cfi01_get_blk() helper vl: Create block backends before setting machine properties vl: Factor configure_blockdev() out of main() vl: Improve legibility of BlockdevOptions queue sysbus: Fix latent bug with onboard devices vl: Fix latent bug with -global and onboard devices qom: Move compat_props machinery from qdev to QOM qdev: Fix latent bug with compat_props and onboard devices pflash: Clean up after commit 368a354f02b, part 2 pflash: Clean up after commit 368a354f02b, part 1 mips_malta: Clean up definition of flash memory size somewhat hw/mips/malta: Restrict 'bios_size' variable scope hw/mips/malta: Remove fl_sectors variable mips_malta: Delete disabled, broken DEBUG_BOARD_INIT code r2d: Fix flash memory size, sector size, width, device ID ppc405_boards: Don't size flash memory to match backing image ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12vfio: Make vfio_get_region_info_cap publicAlexey Kardashevskiy
This makes vfio_get_region_info_cap() to be used in quirks. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <20190307050518.64968-3-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12spapr: Use CamelCase properlyDavid Gibson
The qemu coding standard is to use CamelCase for type and structure names, and the pseries code follows that... sort of. There are quite a lot of places where we bend the rules in order to preserve the capitalization of internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR". That was a bad idea - it frequently leads to names ending up with hard to read clusters of capital letters, and means they don't catch the eye as type identifiers, which is kind of the point of the CamelCase convention in the first place. In short, keeping type identifiers look like CamelCase is more important than preserving standard capitalization of internal "words". So, this patch renames a heap of spapr internal type names to a more standard CamelCase. In addition to case changes, we also make some other identifier renames: VIOsPAPR* -> SpaprVio* The reverse word ordering was only ever used to mitigate the capital cluster, so revert to the natural ordering. VIOsPAPRVTYDevice -> SpaprVioVty VIOsPAPRVLANDevice -> SpaprVioVlan Brevity, since the "Device" didn't add useful information sPAPRDRConnector -> SpaprDrc sPAPRDRConnectorClass -> SpaprDrcClass Brevity, and makes it clearer this is the same thing as a "DRC" mentioned in many other places in the code This is 100% a mechanical search-and-replace patch. It will, however, conflict with essentially any and all outstanding patches touching the spapr code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: POWER9 XSCOM quad supportCédric Le Goater
The POWER9 processor does not support per-core frequency control. The cores are arranged in groups of four, along with their respective L2 and L3 caches, into a structure known as a Quad. The frequency must be managed at the Quad level. Provide a basic Quad model to fake the settings done by the firmware on the Non-Cacheable Unit (NCU). Each core pair (EX) needs a special BAR setting for the TIMA area of XIVE because it resides on the same address on all chips. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-12-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: extend XSCOM core support for POWER9Cédric Le Goater
Provide a new class attribute to define XSCOM operations per CPU family and add a couple of XSCOM addresses controlling the power management states of the core on POWER9. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-11-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a OCC model for POWER9Cédric Le Goater
The OCC on POWER9 is very similar to the one found on POWER8. Provide the same routines with P9 values for the registers and IRQ number. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-10-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a OCC model classCédric Le Goater
To ease the introduction of the OCC model for POWER9, provide a new class attributes to define XSCOM operations per CPU family and a PSI IRQ number. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190307223548.20516-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add SerIRQ routing registersCédric Le Goater
This is just a simple reminder that SerIRQ routing should be addressed. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a LPC Controller model for POWER9Cédric Le Goater
The LPC Controller on POWER9 is very similar to the one found on POWER8 but accesses are now done via on MMIOs, without the XSCOM and ECCB logic. The device tree is populated differently so we add a specific POWER9 routine for the purpose. SerIRQ routing is yet to be done. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a 'dt_isa_nodename' to the chipCédric Le Goater
The ISA bus has a different DT nodename on POWER9. Compute the name when the PnvChip is realized, that is before it is used by the machine to populate the device tree with the ISA devices. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a LPC Controller class modelCédric Le Goater
It will ease the introduction of the LPC Controller model for POWER9. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190307223548.20516-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a PSI bridge model for POWER9Cédric Le Goater
The PSI bridge on POWER9 is very similar to POWER8. The BAR is still set through XSCOM but the controls are now entirely done with MMIOs. More interrupts are defined and the interrupt controller interface has changed to XIVE. The POWER9 model is a first example of the usage of the notify() handler of the XiveNotifier interface, linking the PSI XiveSource to its owning device model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a PSI bridge class modelCédric Le Goater
To ease the introduction of the PSI bridge model for POWER9, abstract the POWER chip differences in a PnvPsi class model and introduce a specific Pnv8Psi type for POWER8. POWER8 interface to the interrupt controller is still XICS whereas POWER9 uses the new XIVE model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190307223548.20516-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12spapr_iommu: Do not replay mappings from just created DMA windowAlexey Kardashevskiy
On sPAPR vfio_listener_region_add() is called in 2 situations: 1. a new listener is registered from vfio_connect_container(); 2. a new IOMMU Memory Region is added from rtas_ibm_create_pe_dma_window(). In both cases vfio_listener_region_add() calls memory_region_iommu_replay() to notify newly registered IOMMU notifiers about existing mappings which is totally desirable for case 1. However for case 2 it is nothing but noop as the window has just been created and has no valid mappings so replaying those does not do anything. It is barely noticeable with usual guests but if the window happens to be really big, such no-op replay might take minutes and trigger RCU stall warnings in the guest. For example, a upcoming GPU RAM memory region mapped at 64TiB (right after SPAPR_PCI_LIMIT) causes a 64bit DMA window to be at least 128TiB which is (128<<40)/0x10000=2.147.483.648 TCEs to replay. This mitigates the problem by adding an "skipping_replay" flag to sPAPRTCETable and defining sPAPR own IOMMU MR replay() hook which does exactly the same thing as the generic one except it returns early if @skipping_replay==true. Another way of fixing this would be delaying replay till the very first H_PUT_TCE but this does not work if in-kernel H_PUT_TCE handler is enabled (a likely case). When "ibm,create-pe-dma-window" is complete, the guest will map only required regions of the huge DMA window. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20190307050518.64968-2-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: introduce a new pic_print_info() operation to the chip modelCédric Le Goater
The POWER9 and POWER8 processors have different interrupt controllers, and reporting their state requires calling different helper routines. However, the interrupt presenters are still handled in the higher level pic_print_info() routine because they are not related to the chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: introduce a new dt_populate() operation to the chip modelCédric Le Goater
The POWER9 and POWER8 processors have a different set of devices and a different device tree layout. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: add a XIVE interrupt controller model for POWER9Cédric Le Goater
This is a simple model of the POWER9 XIVE interrupt controller for the PowerNV machine which only addresses the needs of the skiboot firmware. The PowerNV model reuses the common XIVE framework developed for sPAPR as the fundamentals aspects are quite the same. The difference are outlined below. The controller initial BAR configuration is performed using the XSCOM bus from there, MMIO are used for further configuration. The MMIO regions exposed are : - Interrupt controller registers - ESB pages for IPIs and ENDs - Presenter MMIO (Not used) - Thread Interrupt Management Area MMIO, direct and indirect The virtualization controller MMIO region containing the IPI ESB pages and END ESB pages is sub-divided into "sets" which map portions of the VC region to the different ESB pages. These are modeled with custom address spaces and the XiveSource and XiveENDSource objects are sized to the maximum allowed by HW. The memory regions are resized at run-time using the configuration of EDT set translation table provided by the firmware. The XIVE virtualization structure tables (EAT, ENDT, NVTT) are now in the machine RAM and not in the hypervisor anymore. The firmware (skiboot) configures these tables using Virtual Structure Descriptor defining the characteristics of each table : SBE, EAS, END and NVT. These are later used to access the virtual interrupt entries. The internal cache of these tables in the interrupt controller is updated and invalidated using a set of registers. Still to address to complete the model but not fully required is the support for block grouping. Escalation support will be necessary for KVM guests. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: change the CPU machine_data presenter type to Object *Cédric Le Goater
The POWER9 PowerNV machine will use a XIVE interrupt presenter type. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/pnv: export the xive_router_notify() routineCédric Le Goater
The PowerNV machine with need to encode the block id in the source interrupt number before forwarding the source event notification to the Router. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc/xive: export the TIMA memory accessorsCédric Le Goater
The PowerNV machine can perform indirect loads and stores on the TIMA on behalf of another CPU. Give the controller the possibility to call the TIMA memory accessors with a XiveTCTX of its choice. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12ppc: externalize ppc_get_vcpu_by_pir()Cédric Le Goater
We will use it to get the CPU interrupt presenter in XIVE when the TIMA is accessed from the indirect page. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190306085032.15744-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12spapr: Force SPAPR_MEMORY_BLOCK_SIZE to be a hwaddr (64-bit)David Gibson
SPAPR_MEMORY_BLOCK_SIZE is logically a difference in memory addresses, and hence of type hwaddr which is 64-bit. Previously it wasn't marked as such which means that it could be treated as 32-bit. That will work in some circumstances but if multiplied by another 32-bit value it could lead to a 32-bit overflow and an incorrect result. One specific instance of this in spapr_lmb_dt_populate() was spotted by Coverity (CID 1399145). Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12target/ppc/spapr: Add SPAPR_CAP_CCF_ASSISTSuraj Jitindar Singh
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate the requirement for a hw-assisted version of the count cache flush workaround. The count cache flush workaround is a software workaround which can be used to flush the count cache on context switch. Some revisions of hardware may have a hardware accelerated flush, in which case the software flush can be shortened. This cap is used to set the availability of such hardware acceleration for the count cache flush routine. The availability of such hardware acceleration is indicated by the H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics returned from the KVM_PPC_GET_CPU_CHAR ioctl. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com> [dwg: Small style fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12target/ppc/spapr: Add workaround option to SPAPR_CAP_IBSSuraj Jitindar Singh
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability for mitigations for indirect branch speculation. Currently the available values are broken (default), fixed-ibs (fixed by serialising indirect branches) and fixed-ccd (fixed by diabling the count cache). Introduce a new value for this capability denoted workaround, meaning that software can work around the issue by flushing the count cache on context switch. This option is available if the hypervisor sets the H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from the KVM_PPC_GET_CPU_CHAR ioctl. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-12target/ppc/spapr: Add SPAPR_CAP_LARGE_DECREMENTERSuraj Jitindar Singh
Add spapr_cap SPAPR_CAP_LARGE_DECREMENTER to be used to control the availability of the large decrementer for a guest. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Message-Id: <20190301024317.22137-1-sjitindarsingh@gmail.com> [dwg: Trivial style fix] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-03-11pc: Support firmware configuration with -blockdevMarkus Armbruster
The PC machines put firmware in ROM by default. To get it put into flash memory (required by OVMF), you have to use -drive if=pflash,unit=0,... and optionally -drive if=pflash,unit=1,... Why two -drive? This permits setting up one part of the flash memory read-only, and the other part read/write. It also makes upgrading firmware on the host easier. Below the hood, it creates two separate flash devices, because we were too lazy to improve our flash device models to support sector protection. The problem at hand is to do the same with -blockdev somehow, as one more step towards deprecating -drive. Mapping -drive if=none,... to -blockdev is a solved problem. With if=T other than if=none, -drive additionally configures a block device frontend. For non-onboard devices, that part maps to -device. Also a solved problem. For onboard devices such as PC flash memory, we have an unsolved problem. This is actually an instance of a wider problem: our general device configuration interface doesn't cover onboard devices. Instead, we have a zoo of ad hoc interfaces that are much more limited. One of them is -drive, which we'd rather deprecate, but can't until we have suitable replacements for all its uses. Sadly, I can't attack the wider problem today. So back to the narrow problem. My first idea was to reduce it to its solved buddy by using pluggable instead of onboard devices for the flash memory. Workable, but it requires some extra smarts in firmware descriptors and libvirt. Paolo had an idea that is simpler for libvirt: keep the devices onboard, and add machine properties for their block backends. The implementation is less than straightforward, I'm afraid. First, block backend properties are *qdev* properties. Machines can't have those, as they're not devices. I could duplicate these qdev properties as QOM properties, but I hate that. More seriously, the properties do not belong to the machine, they belong to the onboard flash devices. Adding them to the machine would then require bad magic to somehow transfer them to the flash devices. Fortunately, QOM provides the means to handle exactly this case: add alias properties to the machine that forward to the onboard devices' properties. Properties need to be created in .instance_init() methods. For PC machines, that's pc_machine_initfn(). To make alias properties work, we need to create the onboard flash devices there, too. Requires several bug fixes, in the previous commits. We also have to realize the devices. More on that below. If the user sets pflash0, firmware resides in flash memory. pc_system_firmware_init() maps and realizes the flash devices. Else, firmware resides in ROM. The onboard flash devices aren't used then. pc_system_firmware_init() destroys them unrealized, along with the alias properties. The existing code to pick up drives defined with -drive if=pflash is replaced by code to desugar into the machine properties. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <87ftrtux81.fsf@dusky.pond.sub.org>
2019-03-11pc_sysfw: Pass PCMachineState to pc_system_firmware_init()Philippe Mathieu-Daudé
pc_system_firmware_init() parameter @isapc_ram_fw is PCMachineState member pci_enabled negated. The next commit will need more of PCMachineState. To prepare for that, pass a PCMachineState *, and drop the now redundant parameter @isapc_ram_fw. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20190308131445.17502-11-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-11pflash_cfi01: Add pflash_cfi01_get_blk() helperPhilippe Mathieu-Daudé
Add an helper to access the opaque struct PFlashCFI01. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20190308131445.17502-9-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>