aboutsummaryrefslogtreecommitdiff
path: root/include/hw/pci-host
AgeCommit message (Collapse)Author
2019-01-09spapr_pci: Define SPAPR_MAX_PHBS in hw/pci-host/spapr.hGreg Kurz
PHB hotplug will bring more users for it. Let's define it along with the PHB defines from which it is derived for simplicity. While here fix a misleading comment about manual placement, which was abandoned with 30b3bc5aa9f4. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-01-09spapr: move spapr_create_phb() to core machine codeGreg Kurz
This function is only used when creating the default PHB. Let's rename it and move it to the core machine code for clarity. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-09-25spapr_pci: add an extra 'nr_msis' argument to spapr_populate_pci_dtCédric Le Goater
So that we don't have to call qdev_get_machine() to get the machine class and the sPAPRIrq backend holding the number of MSIs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-30uninorth: add ofw-addr property to allow correct fw path generationMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12uninorth: remove token register from uninorth deviceMark Cave-Ayland
>From observation of various OS sources it can be seen that the token register introduced in 4e46dcdbd3 "PPC: Newworld: Add uninorth token register" is not required, since the only register currently implemented is the uninorth hardware version which is read-only. Remove the token register implementation and instead return the uninorth version corresponding to the hardware. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04uninorth: create new uninorth deviceMark Cave-Ayland
Commit 4e46dcdbd3 "PPC: Newworld: Add uninorth token register" added a TODO which was to convert the uninorth registers hack to a proper device. Move these registers to a new uninorth device, removing the old hacks from mac_newworld.c. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27uninorth: rename UNINState to UNINHostStateMark Cave-Ayland
The existing UNINState actually represents the PCI/AGP host bridge stage so rename it accordingly. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27uninorth: move PCI IO (ISA) memory region into the uninorth deviceMark Cave-Ayland
Do this for both the uninorth main and uninorth u3 AGP buses, using the main PCI bus for each machine (this ensures the IO addresses still match those used by OpenBIOS). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27uninorth: use object link to pass OpenPIC object to uninorthMark Cave-Ayland
Now that the OpenPIC is wired up via the board, we can now remove our temporary PIC qdev pointer property and replace it with an object link instead. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27uninorth: introduce temporary pic_irqs device propertyMark Cave-Ayland
This is in preparation for moving the PCI bus wiring inside the uninorth host bridge devices. In the future it will be possible to remove this once the PICs have been switched to use qdev GPIOs. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27uninorth: move uninorth definitions into uninorth.hMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [dwg: Added hw/hw.h #include as suggested by Philippe Mathieu-Daudé] Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-09pci: Add support for Designware IP blockAndrey Smirnov
Add code needed to get a functional PCI subsytem when using in conjunction with upstream Linux guest (4.13+). Tested to work against "e1000e" (network adapter, using MSI interrupts) as well as "usb-ehci" (USB controller, using legacy PCI interrupts). Based on "i.MX6 Applications Processor Reference Manual" (Document Number: IMX6DQRM Rev. 4) as well as corresponding dirver in Linux kernel (circa 4.13 - 4.16 found in drivers/pci/dwc/*) Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-24apb: rename apb.c to sabre.cMark Cave-Ayland
This is the final stage in correcting the naming convention with respect to sabre, APB and PBM. It is effectively a file rename from apb.c to sabre.c along with touching up a few constants to remove the remaining references to APB. Note that as part of the rename process the configuration variable CONFIG_PCI_APB is changed to CONFIG_PCI_SABRE. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-24apb: rename QOM type from TYPE_APB to TYPE_SABREMark Cave-Ayland
Similarly rename the corresponding APBState typedef to SabreState. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-24apb: QOMify sabre PCI host bridgeMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-24apb: rename APB functions to use sabre prefixMark Cave-Ayland
As hinted in the comment at the top of the file, the naming convention for the APB types/QOM functions isn't correct. As a starting point we can at least rename the APB type and related functions to improve the readability of apb.c. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-24apb: split simba PCI bridge into hw/pci-bridge/simba.cMark Cave-Ayland
Move the QOM type and macros into a new include/hw/pci-bridge/simba.h file, and add a new CONFIG_SIMBA Makefile.objs variable which is enabled for sparc64-softmmu builds only. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> CC: Michael S. Tsirkin <mst@redhat.com> CC: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-11Merge remote-tracking branch 'origin/master' into HEADMichael S. Tsirkin
Resolve conflicts around apb. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-09sun4u: split IOMMU device out from apb.c to sun4u_iommu.cMark Cave-Ayland
By separating the sun4u IOMMU device into new sun4u_iommu.c and sun4m_iommu.h files we noticeably simplify apb.c whilst bringing sun4u in line with all the other IOMMU-supporting architectures. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-09apb: QOMify IOMMUMark Cave-Ayland
This is in preparation to split the IOMMU device out of the APB. As part of this commit we also enforce separation of the IOMMU and APB devices by using a QOM object link to pass the IOMMU reference and accessing the IOMMU registers via a separate memory region mapped into the APB config space rather than directly. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-09apb: replace OBIO interrupt numbers in pci_pbmA_map_irq() with constantsMark Cave-Ayland
Following on from the previous commit, we can also do the same with with legacy OBIO interrupts in pci_pbmA_map_irq(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-01-09ebus: wire up OBIO interrupts to APB pbm via qdev GPIOsMark Cave-Ayland
This enables us to remove the static array mapping in the ISA IRQ handler (and the embedded reference to the APB device) by formalising the interrupt wiring via the qdev GPIO API. For more clarity we replace the APB OBIO interrupt numbers with constants designating the interrupt source, and rename isa_irq_handler() to ebus_isa_irq_handler(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-01-09apb: remove busA property from PBMPCIBridge stateMark Cave-Ayland
Since the previous commit the only remaining use of the qdev busA property is to configure the PCI bridge in front of the onboard ebus devices differently to allow early OpenBIOS serial console access. Instead we can now manually update the PCI configuration for bridge A in pci_pbm_reset() and thus completely remove the busA property from the PBMPCIBridge state. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-09apb: remove pci_apb_init() and instantiate APB device using qdevMark Cave-Ayland
By making the special_base and mem_base values qdev properties, we can move the remaining parts of pci_apb_init() into the pbm init() and realize() functions. This finally allows us to instantiate the APB directly using standard qdev create/init functions in sun4u.c. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-01-09apb: move the two secondary PCI bridges objects into APBStateMark Cave-Ayland
This enables us to remove these parameters from pci_apb_init(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-01-09apb: use gpios to wire up the apb device to the SPARC CPU IRQsMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-09apb: return APBState from pci_apb_init() rather than PCIBusMark Cave-Ayland
This is a first step towards removing pci_apb_init() completely. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
2018-01-09sun4u: remove pci_ebus_init() functionMark Cave-Ayland
This is initialisation that should really take place in the ebus realize function. As part of this we also rework the ebus IRQ mapping so that instead of having to pass in the array of pbm_irqs, we obtain a reference to them by looking up the APB device during ebus realize. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-01-09apb: move QOM macros and typedefs from apb.c to apb.hMark Cave-Ayland
This also includes the related IOMMUState typedef and defines. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-12-15spapr: introduce a spapr_qirq() helperCédric Le Goater
xics_get_qirq() is only used by the sPAPR machine. Let's move it there and change its name to reflect its scope. It will be useful for XIVE support which will use its own set of qirqs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-05pci: Move bridge data structures from pci_bus.h to pci_bridge.hDavid Gibson
include/hw/pci/pci_bus.h contains several data structures related to PCI bridges that aren't needed by most users of pci_bus.h. We already have a pci_bridge.h, so move them there. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2017-11-16hw/pci-host: Fix x86 Host Bridges 64bit PCI holeMarcel Apfelbaum
Currently there is no MMIO range over 4G reserved for PCI hotplug. Since the 32bit PCI hole depends on the number of cold-plugged PCI devices and other factors, it is very possible is too small to hotplug PCI devices with large BARs. Fix it by reserving 2G for I4400FX chipset in order to comply with older Win32 Guest OSes and 32G for Q35 chipset. Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. Note this is a regression since prev QEMU versions had some range reserved for 64bit PCI hotplug. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-09-14hw/pci-host/gpex: Set INTx index/gsi mappingPranavkumar Sawargaonkar
To implement INTx to gsi routing we need to pass the gpex host bridge the gsi associated to each INTx index. Let's introduce irq_num array and gpex_set_irq_num setter function. Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Tushar Jagad <tushar.jagad@linaro.org> Signed-off-by: Eric Auger <eric.auger@redhat.com> Tested-by: Feng Kan <fkan@apm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-id: 1505296004-6798-2-git-send-email-eric.auger@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-16q35/mch: implement extended TSEG sizesLaszlo Ersek
The q35 machine type currently lets the guest firmware select a 1MB, 2MB or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even that is not enough when a lot of VCPUs (more than approx. 224) are configured -- SMRAM footprint scales largely proportionally with VCPU count. Introduce a new property for "mch" called "extended-tseg-mbytes", which expresses (in megabytes) the user's choice of TSEG (SMRAM) size. Invent a new, QEMU-specific register in the config space of the DRAM Controller, at offset 0x50, in order to allow guest firmware to query the TSEG (SMRAM) size. According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller Register Address Map (D0:F0)": Warning: Address locations that are not listed are considered Intel Reserved registers locations. Reads to Reserved registers may return non-zero values. Writes to reserved locations may cause system failures. All registers that are defined in the PCI 2.3 specification, but are not necessary or implemented in this component are simply not included in this document. The reserved/unimplemented space in the PCI configuration header space is not documented as such in this summary. Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part of the standard PCI config space header. And they precede the capability list as well, which starts at 0xe0 for this device. When the guest writes value 0xffff to this register, the value that can be read back is that of "mch.extended-tseg-mbytes" -- unless it remains 0xffff. The guest is required to write 0xffff first (as opposed to a read-only register) because PCI config space is generally not cleared on QEMU reset, and after S3 resume or reboot, new guest firmware running on old QEMU could read a guest OS-injected value from this register. After reading the available "extended" TSEG size, the guest firmware may actually request that TSEG size by writing pattern 11b to the ESMRAMC register's TSEG_SZ bit-field. (The Intel spec referenced above defines only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.) On the QEMU command line, the value can be set with -global mch.extended-tseg-mbytes=N The default value for 2.10+ q35 machine types is 16. The value is limited to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users are responsible for choosing sensible TSEG sizes. On 2.9 and earlier q35 machine types, the default value is 0. This lets the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50, keep their original behavior. When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is set to that value on reset, for completeness. PCI config space is migrated automatically, so no VMSD changes are necessary. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027 Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-25hw/ppc: removing drc->detach_cb and drc->detach_cb_opaqueDaniel Henrique Barboza
The pointer drc->detach_cb is being used as a way of informing the detach() function inside spapr_drc.c which cb to execute. This information can also be retrieved simply by checking drc->type and choosing the right callback based on it. In this context, detach_cb is redundant information that must be managed. After the previous spapr_lmb_release change, no detach_cb_opaques are being used by any of the three callbacks functions. This is yet another information that is now unused and, on top of that, can't be migrated either. This patch makes the following changes: - removal of detach_cb_opaque. the 'opaque' argument was removed from the callbacks and from the detach() function of sPAPRConnectorClass. The attribute detach_cb_opaque of sPAPRConnector was removed. - removal of detach_cb from the detach() call. The function pointer detach_cb of sPAPRConnector was removed. detach() now uses a switch(drc->type) to execute the apropriate callback. To achieve this, spapr_core_release, spapr_lmb_release and spapr_phb_remove_pci_device_cb callbacks were made public to be visible inside detach(). Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-14pseries: Don't expose PCIe extended config space on older machine typesDavid Gibson
bb9986452 "spapr_pci: Advertise access to PCIe extended config space" allowed guests to access the extended config space of PCI Express devices via the PAPR interfaces, even though the paravirtualized bus mostly acts like plain PCI. However, that patch enabled access unconditionally, including for existing machine types, which is an unwise change in behaviour. This patch limits the change to pseries-2.9 (and later) machine types. Suggested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01ppc/xics: use the QOM interface to get irqsCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-21hw: xilinx-pcie: Add support for Xilinx AXI PCIe ControllerPaul Burton
Add support for emulating the Xilinx AXI Root Port Bridge for PCI Express as described by Xilinx' PG055 document. This is a PCIe controller that can be used with certain series of Xilinx FPGAs, and is used on the MIPS Boston board which will make use of this code. Signed-off-by: Paul Burton <paul.burton@imgtec.com> [yongbok.kim@imgtec.com: removed returning on !level, updated IRQ connection with GPIO logic, moved xilinx_pcie_init() to boston.c replaced stw_le_p() with pci_set_word() and other cosmetic changes] Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-01-24include: Fix typos found by codespellStefan Weil
Add also a missing parenthesis in a comment. Signed-off-by: Stefan Weil <sw@weilnetz.de> Acked-by: Alistair Francis <alistair.francis@xilinx.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-11-23spapr: Fix 2.7<->2.8 migration of PCI host bridgeDavid Gibson
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from qemu-2.7 to the current version. It split the device's MMIO window into two pieces for 32-bit and 64-bit MMIO. The patch included backwards compatibility code to convert the old property into the new format. However, the property value was also transferred in the migration stream and compared with a (probably unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to the new style converted value from (pre-)2.8 giving a mismatch and migration failure. Along with the actual field that caused the breakage, there are several other ill-advised VMSTATE_EQUAL()s. To fix forwards migration, we read the values in the stream into scratch variables and ignore them, instead of comparing for equality. To fix backwards migration, we populate those scratch variables in pre_save() with adjusted values to match the old behaviour. To permit the eventual possibility of removing this cruft from the stream, we only include these compatibility fields if a new 'pre-2.8-migration' property is set. We clear it on the pseries-2.8 machine type, which obviously can't be migrated backwards, but set it on earlier machine type versions. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-10-16spapr: Improved placement of PCI host bridges in guest memory mapDavid Gibson
Currently, the MMIO space for accessing PCI on pseries guests begins at 1 TiB in guest address space. Each PCI host bridge (PHB) has a 64 GiB chunk of address space in which it places its outbound PIO and 32-bit and 64-bit MMIO windows. This scheme as several problems: - It limits guest RAM to 1 TiB (though we have a limited fix for this now) - It limits the total MMIO window to 64 GiB. This is not always enough for some of the large nVidia GPGPU cards - Putting all the windows into a single 64 GiB area means that naturally aligning things within there will waste more address space. In addition there was a miscalculation in some of the defaults, which meant that the MMIO windows for each PHB actually slightly overran the 64 GiB region for that PHB. We got away without nasty consequences because the overrun fit within an unused area at the beginning of the next PHB's region, but it's not pretty. This patch implements a new scheme which addresses those problems, and is also closer to what bare metal hardware and pHyp guests generally use. Because some guest versions (including most current distro kernels) can't access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and 64 TiB. This is broken into 1 TiB chunks. The first 1 TiB contains the PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs. Each subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for one PHB each. This reduces the number of allowed PHBs (without full manual configuration of all the windows) from 256 to 31, but this should still be plenty in practice. We also change some of the default window sizes for manually configured PHBs to saner values. Finally we adjust some tests and libqos so that it correctly uses the new default locations. Ideally it would parse the device tree given to the guest, but that's a more complex problem for another time. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16spapr_pci: Add a 64-bit MMIO windowDavid Gibson
On real hardware, and under pHyp, the PCI host bridges on Power machines typically advertise two outbound MMIO windows from the guest's physical memory space to PCI memory space: - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space - A 64-bit window which maps onto a large region somewhere high in PCI address space (traditionally this used an identity mapping from guest physical address to PCI address, but that's not always the case) The qemu implementation in spapr-pci-host-bridge, however, only supports a single outbound MMIO window, however. At least some Linux versions expect the two windows however, so we arranged this window to map onto the PCI memory space from 2 GiB..~64 GiB, then advertised it as two contiguous windows, the "32-bit" window from 2G..4G and the "64-bit" window from 4G..~64G. This approach means, however, that the 64G window is not naturally aligned. In turn this limits the size of the largest BAR we can map (which does have to be naturally aligned) to roughly half of the total window. With some large nVidia GPGPU cards which have huge memory BARs, this is starting to be a problem. This patch adds true support for separate 32-bit and 64-bit outbound MMIO windows to the spapr-pci-host-bridge implementation, each of which can be independently configured. The 32-bit window always maps to 2G.. in PCI space, but the PCI address of the 64-bit window can be configured (it defaults to the same as the guest physical address). So as not to break possible existing configurations, as long as a 64-bit window is not specified, a large single window can be specified. This will appear the same way to the guest as the old approach, although it's now implemented by two contiguous memory regions rather than a single one. For now, this only adds the possibility of 64-bit windows. The default configuration still uses the legacy mode. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16spapr_pci: Delegate placement of PCI host bridges to machine typeDavid Gibson
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB) for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal and PAPR guests) to have numerous independent PHBs, each controlling a separate PCI domain. There are two ways of configuring the spapr-pci-host-bridge device: first it can be done fully manually, specifying the locations and sizes of all the IO windows. This gives the most control, but is very awkward with 6 mandatory parameters. Alternatively just an "index" can be specified which essentially selects from an array of predefined PHB locations. The PHB at index 0 is automatically created as the default PHB. The current set of default locations causes some problems for guests with large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia GPGPU cards via VFIO). Obviously, for migration we can only change the locations on a new machine type, however. This is awkward, because the placement is currently decided within the spapr-pci-host-bridge code, so it breaks abstraction to look inside the machine type version. So, this patch delegates the "default mode" PHB placement from the spapr-pci-host-bridge device back to the machine type via a public method in sPAPRMachineClass. It's still a bit ugly, but it's about the best we can do. For now, this just changes where the calculation is done. It doesn't change the actual location of the host bridges, or any other behaviour. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-09-23spapr_pci: Add numa node idAlexey Kardashevskiy
This adds a numa id property to a PHB to allow linking passed PCI device to CPU/memory. It is up to the management stack to do CPU/memory pinning to the node with the actual PCI device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [dwg: Renamed property from "node" to "numa_node" to match the similar one in the pxb device] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-15Remove unused function declarationsLadi Prosek
Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-07-20acpi: add DMAR scope definition for root IOAPICPeter Xu
To enable interrupt remapping for intel IOMMU device, each IOAPIC device in the system reported via ACPI MADT must be explicitly enumerated under one specific remapping hardware unit. This patch adds the root-complex IOAPIC into the default DMAR device. Please refer to VT-d spec 8.3.1.1 for more information. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-12Clean up header guards that don't match their file nameMarkus Armbruster
Header guard symbols should match their file name to make guard collisions less likely. Offenders found with scripts/clean-header-guards.pl -vn. Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12spapr_pci: Include spapr.h instead of playing games with #errorMarkus Armbruster
include/hw/pci-host/spapr.h needs hw/ppc/spapr.h. It checks whether its header guard is defined, and errors out if it isn't. Playing games with some other header's guard symbol is not a good idea. Just include the frackin' header already. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-05Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, pci, virtio: new features, cleanups, fixes iommus can not be added with -device. cleanups and fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 05 Jul 2016 11:18:32 BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (30 commits) vmw_pvscsi: remove unnecessary internal msi state flag e1000e: remove unnecessary internal msi state flag vmxnet3: remove unnecessary internal msi state flag mptsas: remove unnecessary internal msi state flag megasas: remove unnecessary megasas_use_msi() pci: Convert msi_init() to Error and fix callers to check it pci bridge dev: change msi property type megasas: change msi/msix property type mptsas: change msi property type intel-hda: change msi property type usb xhci: change msi/msix property type change pvscsi_init_msi() type to void tests: add APIC.cphp and DSDT.cphp blobs tests: acpi: add CPU hotplug testcase log: Permit -dfilter 0..0xffffffffffffffff range: Replace internal representation of Range range: Eliminate direct Range member access log: Clean up misuse of Range for -dfilter pci_register_bar: cleanup Revert "virtio-net: unbreak self announcement and guest offloads after migration" ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)Alexey Kardashevskiy
This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.6 machine and older disable it. This also creates a single DMA window for the older machines to maintain backward migration. This implements DDW for PHB with emulated and VFIO devices. The host kernel support is required. The advertised IOMMU page sizes are 4K and 64K; 16M pages are supported but not advertised by default, in order to enable them, the user has to specify "pgsz" property for PHB and enable huge pages for RAM. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs and updates all references to dma_liobn. However this does not add 64bit LIOBN to the migration stream as in fact even 32bit LIOBN is rather pointless there (as it is a PHB property and the management software can/should pass LIOBNs via CLI) but we keep it for the backward migration support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>