aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2018-08-21hw/ppc: deprecate the machine type 'prep', replaced by '40p'Hervé Poussineau
- prep machine is a fictional machine, so has no specifications. Which devices can be changed/added/removed without impact? Are interrupts correctly mapped? - prep firmware (OHW) has support only for IDE drives (no SCSI). Booting from IDE has been broken approximatively 3 years ago, and nobody complained. - OHW is limited on IDE boot to a specific set of OS loaders. These operating systems are of the 2004 time frame. - OHW can use -kernel. Linux kernel freezes a long time after PS/2 mouse detection, and then screen becomes garbage. This was already broken in QEMU v2.7, 2 years ago, and nobody complained. On the other side: - 40p is a real machine, so emulation can be checked against hardware specifications - OpenBIOS has support for SCSI block devices, including 40p LSI adapter - OpenBIOS can start mostly all Linux kernels (including recent ones) and recent operating system (like NetBSD 7.1.2) Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> [dwg: Drop prep from boot-serial test to avoid deprecation warnings] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21spapr: introduce a IRQ controller backend to the machineCédric Le Goater
This proposal moves all the related IRQ routines of the sPAPR machine behind a sPAPR IRQ backend interface 'spapr_irq' to prepare for future changes. First of which will be to increase the size of the IRQ number space, then, will follow a new backend for the POWER9 XIVE IRQ controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21hw/ppc/ppc405_uc: Convert away from old_mmioPeter Maydell
Convert the devices in ppc405_uc away from using the old_mmio MemoryRegion accessors: * opba's 32-bit and 16-bit accessors were just calling the 8-bit accessors and assembling a big-endian order number, which we can do by setting the .impl.max_access_size to 1 and the endianness to DEVICE_BIG_ENDIAN, and letting the core memory code do the assembly * ppc405_gpio's accessors were all just stubs * ppc4xx_gpt's 8-bit and 16-bit accessors were treating the access as invalid, which we can do by setting the .valid.min_access_size and .valid.max_access_size fields Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21hw/ppc/ppc_boards: Don't use old_mmio for ref405ep_fpgaPeter Maydell
Switch the ref405ep_fpga device away from using the old_mmio MemoryRegion accessors. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21hw/ppc/prep: Remove ifdeffed-out stub of XCSR codePeter Maydell
The prep machine has some code which is stubs of accessors for XCSR registers. This has been disabled via #if 0 since commit b6b8bd1819ff in 2004, and doesn't have any actual interesting content. It also uses the deprecated old_mmio accessor functions. Remove it entirely. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21spapr: introduce a fixed IRQ number spaceCédric Le Goater
This proposal introduces a new IRQ number space layout using static numbers for all devices, depending on a device index, and a bitmap allocator for the MSI IRQ numbers which are negotiated by the guest at runtime. As the VIO device model does not have a device index but a "reg" property, we introduce a formula to compute an IRQ number from a "reg" value. It should minimize most of the collisions. The previous layout is kept in pre-3.1 machines raising the 'legacy_irq_allocation' machine class flag. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21spapr: Add a pseries-3.1 machine typeCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21vfio/spapr: Allow backing bigger guest IOMMU pages with smaller physical pagesAlexey Kardashevskiy
At the moment the PPC64/pseries guest only supports 4K/64K/16M IOMMU pages and POWER8 CPU supports the exact same set of page size so so far things worked fine. However POWER9 supports different set of sizes - 4K/64K/2M/1G and the last two - 2M and 1G - are not even allowed in the paravirt interface (RTAS DDW) so we always end up using 64K IOMMU pages, although we could back guest's 16MB IOMMU pages with 2MB pages on the host. This stores the supported host IOMMU page sizes in VFIOContainer and uses this later when creating a new DMA window. This uses the system page size (64k normally, 2M/16M/1G if hugepages used) as the upper limit of the IOMMU pagesize. This changes the type of @pagesize to uint64_t as this is what memory_region_iommu_get_min_page_size() returns and clz64() takes. There should be no behavioral changes on platforms other than pseries. The guest will keep using the IOMMU page size selected by the PHB pagesize property as this only changes the underlying hardware TCE table granularity. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21spapr_cpu_core: vmstate_[un]register per-CPU data from (un)realizefnBharata B Rao
VMStateDescription vmstate_spapr_cpu_state was added by commit b94020268e0b6 (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU data with the required vmstate registration and unregistration calls. However the unregistration is being done only from vcpu creation error path and not from CPU delete path. This causes migration to fail with the following error if migration is attempted after a CPU unplug like this: Unknown savevm section or instance 'spapr_cpu' 16 Additionally this leaves the source VM unresponsive after migration failure. Fix this by ensuring the vmstate_unregister happens during CPU removal. Fixing this becomes easier when vmstate (un)registration calls are moved to vcpu (un)realize functions which is what this patch does. Fixes: https://bugs.launchpad.net/qemu/+bug/1785972 Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-20sun4m: don't use legacy fw_cfg_init_mem() functionMark Cave-Ayland
Instead initialise the device via qdev to allow us to set device properties directly as required. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2018-08-20sun4u: ensure kernel_top is always initialisedMark Cave-Ayland
Valgrind reports that when loading a non-ELF kernel, kernel_top may be used uninitialised when checking for an initrd. Since there are no known non-ELF kernels for SPARC64 then we can simply initialise kernel_top to 0 and then skip the initrd load process if it hasn't been set by load_elf(). Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2018-08-20Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180820' into stagingPeter Maydell
First round of s390x patches for 3.1: - add compat machine for 3.1 - remove deprecated 's390-squash-mcss' option - cpu models: add "max" cpu model, enhance feature group code - kvm: add support for etoken facility and huge page backing # gpg: Signature made Mon 20 Aug 2018 13:47:38 BST # gpg: using RSA key DECF6B93C6F02FAF # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" # gpg: aka "Cornelia Huck <cohuck@kernel.org>" # gpg: aka "Cornelia Huck <cohuck@redhat.com>" # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF * remotes/cohuck/tags/s390x-20180820: s390x: Enable KVM huge page backing support s390x/kvm: add etoken facility linux-headers: update s390x/cpumodel: Add "-cpu max" support s390x: remove 's390-squash-mcss' option s390x/cpumodel: enum type S390FeatGroup now gets generated s390x: introduce 3.1 compat machine Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into ↵Peter Maydell
staging RDMA queue # gpg: Signature made Sat 18 Aug 2018 16:01:46 BST # gpg: using RSA key 36D4C0F0CF2FE46D # gpg: Good signature from "Marcel Apfelbaum <marcel.apfelbaum@zoho.com>" # gpg: aka "Marcel Apfelbaum <marcel@redhat.com>" # gpg: aka "Marcel Apfelbaum <marcel.apfelbaum@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: B1C6 3A57 F92E 08F2 640F 31F5 36D4 C0F0 CF2F E46D * remotes/marcel/tags/rdma-pull-request: config: split PVRDMA from RDMA hw/pvrdma: remove not needed include hw/rdma: Add reference to pci_dev in backend_dev hw/rdma: Bugfix - Support non-aligned buffers hw/rdma: Print backend QP number in hex format hw/rdma: Cosmetic change - move to generic function hw/pvrdma: Cosmetic change - indent right hw/rdma: Reorder resource cleanup hw/rdma: Do not allocate memory for non-dma MR hw/rdma: Delete useless structure RdmaRmUserMR hw/pvrdma: Make default pkey 0xFFFF hw/pvrdma: Clean CQE before use hw/rdma: Modify debug macros hw/pvrdma: Bugfix - provide the correct attr_mask to query_qp hw/rdma: Make distinction between device init and start modes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20s390x: remove 's390-squash-mcss' optionCornelia Huck
This option has been deprecated for two releases; remove it. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-08-20s390x: introduce 3.1 compat machineCornelia Huck
Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-08-20hw/dma/pl080: Remove hw_error() if DMA is enabledPeter Maydell
The PL08x model currently will unconditionally call hw_error() if the DMA engine is enabled by the guest. This has been present since the PL080 model was edded in 2006, and is presumably either unintentional debug code left enabled, or a guard against untested DMA engine code being used. Remove the hw_error(), since we now have a guest which will actually try to use the DMA engine (the self-test binary for the AN505 MPS2 FPGA image). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/dma/pl080: Correct bug in register address decode logicPeter Maydell
A bug in the handling of the register address decode logic for the PL08x meant that we were incorrectly treating accesses to the DMA channel registers (DMACCxSrcAddr, DMACCxDestaddr, DMACCxLLI, DMACCxControl, DMACCxConfiguration) as bad offsets. Fix this long-standing bug. Fixes: https://bugs.launchpad.net/qemu/+bug/1637974 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/dma/pl080: Provide device reset functionPeter Maydell
The PL080/PL081 model is missing a reset function; implement it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/dma/pl080: Don't use CPU address space for DMA accessesPeter Maydell
Currently our PL080/PL081 model uses a combination of the CPU's address space (via cpu_physical_memory_{read,write}()) and the system address space for performing DMA accesses. For the PL081s in the MPS FPGA images, their DMA accesses must go via Master Security Controllers. Switch the PL080/PL081 model to take a MemoryRegion property which defines its downstream for making DMA accesses. Since the PL08x are only used in two board models, we make provision of the 'downstream' link mandatory and convert both users at once, rather than having it be optional with a default to the system address space. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/dma/pl080: Support all three interrupt linesPeter Maydell
The PL080 and PL081 have three outgoing interrupt lines: * DMACINTERR signals DMA errors * DMACINTTC is the DMA count interrupt * DMACINTR is a combined interrupt, the logical OR of the other two We currently only implement DMACINTR, because that's all the realview and versatile boards needed, but the instances of the PL081 in the MPS2 firmware images use all three interrupt lines. Implement the missing DMACINTERR and DMACINTTC. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/dma/pl080: Allow use as embedded-struct devicePeter Maydell
Create a new include file for the pl081's device struct, type macros, etc, so that it can be instantiated using the "embedded struct" coding style. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20nvic: Expose NMI linePeter Maydell
On real v7M hardware, the NMI line is an externally visible signal that an SoC or board can toggle to assert an NMI. Expose it in our QEMU NVIC and armv7m container objects so that a board model can wire it up if it needs to. In particular, the MPS2 watchdog is wired to NMI. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-08-20hw/watchdog/cmsdk_apb_watchdog: Implement CMSDK APB watchdog modulePeter Maydell
The Arm Cortex-M System Design Kit includes a simple watchdog module based on a 32-bit down-counter. Implement this. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20hw/timer/m48t59: Move away from old_mmio accessorsPeter Maydell
Move the m48t59 device away from using old_mmio MemoryRegionOps accessors. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-id: 20180802180602.22047-1-peter.maydell@linaro.org
2018-08-20hw/misc: Remove mmio_interface devicePeter Maydell
The mmio_interface device was a purely internal artifact of the implementation of the memory subsystem's request_ptr APIs. Now that we have removed those APIs, we can remove the mmio_interface device too. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com> Message-id: 20180817114619.22354-4-peter.maydell@linaro.org
2018-08-20hw/ssi/xilinx_spips: Remove unneeded MMIO request_ptr codePeter Maydell
We now support direct execution from MMIO regions in the core memory subsystem. This means that we don't need to have device-specific support for it, and we can remove the request_ptr handling from the Xilinx SPIPS device. (It was broken anyway due to race conditions, and disabled by default.) This device is the only in-tree user of this API. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com> Message-id: 20180817114619.22354-2-peter.maydell@linaro.org
2018-08-20sdhci: add i.MX SD Stable Clock bitHans-Erik Floryd
Add the ESDHC PRSSTAT_SDSTB bit, using the value of SDHC_CLOCK_INT_STABLE. Freescale recommends checking this bit when changing clock frequency. Signed-off-by: Hans-Erik Floryd <hans-erik.floryd@rt-labs.com> Message-id: 1534507843-4251-1-git-send-email-hans-erik.floryd@rt-labs.com [PMM: fixed indentation] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20hw/arm/virt: Add virt-3.1 machine typeAndrew Jones
Signed-off-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20imx_serial: Generate interrupt on receive data ready if enabledHans-Erik Floryd
Generate an interrupt if USR2_RDR and UCR4_DREN are both set. Signed-off-by: Hans-Erik Floryd <hans-erik.floryd@rt-labs.com> Message-id: 1534341354-11956-1-git-send-email-hans-erik.floryd@rt-labs.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20hw/intc/arm_gicv3_its: downgrade error_report to warn_report in ↵Jia He
kvm_arm_its_reset In scripts/arch-run.bash of kvm-unit-tests, it will check the qemu output log with: if [ -z "$(echo "$errors" | grep -vi warning)" ]; then Thus without the warning prefix, all of the test fail. Since it is not unrecoverable error in kvm_arm_its_reset for current implementation, downgrading the report from error to warn makes sense. Signed-off-by: Jia He <jia.he@hxt-semitech.com> Message-id: 1531969910-32843-1-git-send-email-jia.he@hxt-semitech.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-20Merge remote-tracking branch ↵Peter Maydell
'remotes/ehabkost/tags/machine-next-pull-request' into staging Machine queue, 2018-08-17 * Allow machine classes to specify if boot device suffixes should be ignored by get_boot_devices_list() * Tiny coding style fixup # gpg: Signature made Fri 17 Aug 2018 19:29:22 BST # gpg: using RSA key 2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: fw_cfg: ignore suffixes in the bootdevice list dependent on machine class sysbus: always allow explicit_ofw_unit_address() to override address generation machine: Fix coding style at machine_run_board_init() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-18config: split PVRDMA from RDMAMarcel Apfelbaum
In some BSD systems RDMA migration is possible while the pvrdma device can't be used because the mremap system call is missing. Reported-by: Rebecca Cran <rebecca@bluestop.org> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180816151637.24553-1-marcel.apfelbaum@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
2018-08-18hw/pvrdma: remove not needed includeMarcel Apfelbaum
No need to include linux/types.h, is empty anyway. Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180811171534.11917-1-marcel.apfelbaum@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
2018-08-18hw/rdma: Add reference to pci_dev in backend_devYuval Shaia
The field backend_dev->dev is not initialized, fix it. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20180805153518.2983-14-yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Bugfix - Support non-aligned buffersYuval Shaia
RDMA application can provide non-aligned buffers to be registered. In such case the DMA address passed by driver is pointing to the beginning of the physical address of the mapped page so we can't distinguish between two addresses from the same page. Fix it by keeping the offset of the virtual address in mr->virt. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-13-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Print backend QP number in hex formatYuval Shaia
To be consistent with other prints throughout the code fix places that print it as decimal number. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-12-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Cosmetic change - move to generic functionYuval Shaia
To ease maintenance of struct comp_thread move all related code to dedicated function. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-11-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/pvrdma: Cosmetic change - indent rightYuval Shaia
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-10-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Reorder resource cleanupYuval Shaia
To be consistence with allocation do the reverse order in deallocation Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-9-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Do not allocate memory for non-dma MRYuval Shaia
There is no use in the memory allocated for non-dma MR. Delete the code that allocates it. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-8-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Delete useless structure RdmaRmUserMRYuval Shaia
The structure RdmaRmUserMR has no benefits, remove it an move all its fields to struct RdmaRmMR. Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20180805153518.2983-7-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/pvrdma: Make default pkey 0xFFFFYuval Shaia
0x7FFF is not the default pkey - fix it. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-6-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/pvrdma: Clean CQE before useYuval Shaia
Next CQE is fetched from CQ ring, clean it before usage as it still carries old CQE values. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-5-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Modify debug macrosYuval Shaia
- Add line counter to ease navigation in log - Print rdma instead of pvrdma Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-4-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/pvrdma: Bugfix - provide the correct attr_mask to query_qpYuval Shaia
Calling rdma_rm_query_qp with attr_mask equals to -1 leads to error where backend query_qp fails to retrieve the needed QP attributes. Fix it by providing the attr_mask we got from driver. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20180805153518.2983-3-yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-18hw/rdma: Make distinction between device init and start modesYuval Shaia
There are certain operations that are well considered as part of device configuration while others are needed only when "start" command is triggered by the guest driver. An example of device initialization step is msix_init and example of "device start" stage is the creation of a CQ completion handler thread. Driver expects such distinction - implement it. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20180805153518.2983-2-yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2018-08-17vfio/ccw/pci: Allow devices to opt-in for ballooningAlex Williamson
If a vfio assigned device makes use of a physical IOMMU, then memory ballooning is necessarily inhibited due to the page pinning, lack of page level granularity at the IOMMU, and sufficient notifiers to both remove the page on balloon inflation and add it back on deflation. However, not all devices are backed by a physical IOMMU. In the case of mediated devices, if a vendor driver is well synchronized with the guest driver, such that only pages actively used by the guest driver are pinned by the host mdev vendor driver, then there should be no overlap between pages available for the balloon driver and pages actively in use by the device. Under these conditions, ballooning should be safe. vfio-ccw devices are always mediated devices and always operate under the constraints above. Therefore we can consider all vfio-ccw devices as balloon compatible. The situation is far from straightforward with vfio-pci. These devices can be physical devices with physical IOMMU backing or mediated devices where it is unknown whether a physical IOMMU is in use or whether the vendor driver is well synchronized to the working set of the guest driver. The safest approach is therefore to assume all vfio-pci devices are incompatible with ballooning, but allow user opt-in should they have further insight into mediated devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-17vfio: Inhibit ballooning based on group attachment to a containerAlex Williamson
We use a VFIOContainer to associate an AddressSpace to one or more VFIOGroups. The VFIOContainer represents the DMA context for that AdressSpace for those VFIOGroups and is synchronized to changes in that AddressSpace via a MemoryListener. For IOMMU backed devices, maintaining the DMA context for a VFIOGroup generally involves pinning a host virtual address in order to create a stable host physical address and then mapping a translation from the associated guest physical address to that host physical address into the IOMMU. While the above maintains the VFIOContainer synchronized to the QEMU memory API of the VM, memory ballooning occurs outside of that API. Inflating the memory balloon (ie. cooperatively capturing pages from the guest for use by the host) simply uses MADV_DONTNEED to "zap" pages from QEMU's host virtual address space. The page pinning and IOMMU mapping above remains in place, negating the host's ability to reuse the page, but the host virtual to host physical mapping of the page is invalidated outside of QEMU's memory API. When the balloon is later deflated, attempting to cooperatively return pages to the guest, the page is simply freed by the guest balloon driver, allowing it to be used in the guest and incurring a page fault when that occurs. The page fault maps a new host physical page backing the existing host virtual address, meanwhile the VFIOContainer still maintains the translation to the original host physical address. At this point the guest vCPU and any assigned devices will map different host physical addresses to the same guest physical address. Badness. The IOMMU typically does not have page level granularity with which it can track this mapping without also incurring inefficiencies in using page size mappings throughout. MMU notifiers in the host kernel also provide indicators for invalidating the mapping on balloon inflation, not for updating the mapping when the balloon is deflated. For these reasons we assume a default behavior that the mapping of each VFIOGroup into the VFIOContainer is incompatible with memory ballooning and increment the balloon inhibitor to match the attached VFIOGroups. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-17kvm: Use inhibit to prevent ballooning without synchronous mmuAlex Williamson
Remove KVM specific tests in balloon_page(), instead marking ballooning as inhibited without KVM_CAP_SYNC_MMU support. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-08-16fw_cfg: ignore suffixes in the bootdevice list dependent on machine classMark Cave-Ayland
For the older machines (such as Mac and SPARC) the DT nodes representing bootdevices for disk nodes are irregular for mainly historical reasons. Since the majority of bootdevice nodes for these machines either do not have a separate disk node or require different (custom) names then it is much easier for processing to just disable all suffixes for a particular machine. Introduce a new ignore_boot_device_suffixes MachineClass property to control bootdevice suffix generation, defaulting to false in order to preserve compatibility. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20180810124027.10698-1-mark.cave-ayland@ilande.co.uk> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>