aboutsummaryrefslogtreecommitdiff
path: root/include/hw/ppc
AgeCommit message (Collapse)Author
2022-03-02xive2: Add a get_config() handler for the router configurationCédric Le Goater
Add GEN1 config even if we don't use it yet in the core framework. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: add XIVE Gen2 TIMA supportCédric Le Goater
Only the CAM line updates done by the hypervisor are specific to POWER10. Instead of duplicating the TM ops table, we handle these commands locally under the PowerNV XIVE2 model. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add support for PQ offload on PHB5Cédric Le Goater
The PQ_disable configuration bit disables the check done on the PQ state bits when processing new MSI interrupts. When bit 9 is enabled, the PHB forwards any MSI trigger to the XIVE interrupt controller without checking the PQ state bits. The XIVE IC knows from the trigger message that the PQ bits have not been checked and performs the check locally. This configuration bit only applies to MSIs and LSIs are still checked on the PHB to handle the assertion level. PQ_disable enablement is a requirement for StoreEOI. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/xive: Add support for PQ state bits offloadCédric Le Goater
The trigger message coming from a HW source contains a special bit informing the XIVE interrupt controller that the PQ bits have been checked at the source or not. Depending on the value, the IC can perform the check and the state transition locally using its own PQ state bits. The following changes add new accessors to the XiveRouter required to query and update the PQ state bits. This only applies to the PowerNV machine. sPAPR accessors are provided but the pSeries machine should not be concerned by such complex configuration for the moment. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/xive2: Add support for notification injection on ESB pagesCédric Le Goater
This is an internal offset used to inject triggers when the PQ state bits are not controlled locally. Such as for LSIs when the PHB5 are using the Address-Based Interrupt Trigger mode and on the END. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add a HOMER model to POWER10Cédric Le Goater
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add model for POWER10 PHB5 PCIe Host bridgeCédric Le Goater
PHB4 and PHB5 are very similar. Use the PHB4 models with some minor adjustements in a subclass for P10. Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add POWER10 quadsCédric Le Goater
and use a pnv_chip_power10_quad_realize() helper to avoid code duplication with P9. This still needs some refinements on the XSCOM registers handling in PnvQuad. Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add a OCC model for POWER10Cédric Le Goater
Our OCC model is very mininal and POWER10 can simply reuse the OCC model we introduced for POWER9. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/pnv: Add a XIVE2 controller to the POWER10 chipCédric Le Goater
The XIVE2 interrupt controller of the POWER10 processor follows the same logic than on POWER9 but the HW interface has been largely reviewed. It has a new register interface, different BARs, extra VSDs, new layout for the XIVE2 structures, and a set of new features which are described below. This is a model of the POWER10 XIVE2 interrupt controller for the PowerNV machine. It focuses primarily on the needs of the skiboot firmware but some initial hypervisor support is implemented for KVM use (escalation). Support for new features will be implemented in time and will require new support from the OS. * XIVE2 BARS The interrupt controller BARs have a different layout outlined below. Each sub-engine has now own its range and the indirect TIMA access was replaced with a set of pages, one per CPU, under the IC BAR: - IC BAR (Interrupt Controller) . 4 pages, one per sub-engine . 128 indirect TIMA pages - TM BAR (Thread Interrupt Management Area) . 4 pages - ESB BAR (ESB pages for IPIs) . up to 1TB - END BAR (ESB pages for ENDs) . up to 2TB - NVC BAR (Notification Virtual Crowd) . up to 128 - NVPG BAR (Notification Virtual Process and Group) . up to 1TB - Direct mapped Thread Context Area (reads & writes) OPAL does not use the grouping and crowd capability. * Virtual Structure Tables XIVE2 adds new tables types and also changes the field layout of the END and NVP Virtualization Structure Descriptors. - EAS - END new layout - NVT was splitted in : . NVP (Processor), 32B . NVG (Group), 32B . NVC (Crowd == P9 block group) 32B - IC for remote configuration - SYNC for cache injection - ERQ for event input queue The setup is slighly different on XIVE2 because the indexing has changed for some of the tables, block ID or the chip topology ID can be used. * XIVE2 features SCOM and MMIO registers have a new layout and XIVE2 adds a new global capability and configuration registers. The lowlevel hardware offers a set of new features among which : - a configurable number of priorities : 1 - 8 - StoreEOI with load-after-store ordering is activated by default - Gen2 TIMA layout - A P9-compat mode, or Gen1, TIMA toggle bit for SW compatibility - increase to 24bit for VP number Other features will have some impact on the Hypervisor and guest OS when activated, but this is not required for initial support of the controller. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/xive2: Introduce a presenter matching routineCédric Le Goater
The VP space is larger in XIVE2 (P10), 24 bits instead of 19bits on XIVE (P9), and the CAM line can use a 7bits or 8bits thread id. For now, we only use 7bits thread ids, same as P9, but because of the change of the size of the VP space, the CAM matching routine is different between P9 and P10. It is easier to duplicate the whole routine than to add extra handlers in xive_presenter_tctx_match() used for P9. We might come with a better solution later on, after we have added some more support for the XIVE2 controller. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02ppc/xive2: Introduce a XIVE2 core frameworkCédric Le Goater
The XIVE2 interrupt controller of the POWER10 processor as the same logic as on POWER9 but its SW interface has been largely reworked. The interrupt controller has a new register interface, different BARs, extra VSDs. These will be described when we add the device model for the baremetal machine. The XIVE internal structures for the EAS, END, NVT have different layouts which is a problem for the current core XIVE framework. To avoid adding too much complexity in the XIVE models, a new XIVE2 core framework is introduced. It duplicates the models which are closely linked to the XIVE internal structures : Xive2Router and Xive2ENDSource and reuses the XiveSource, XivePresenter, XiveTCTX models, as they are more generic. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18spapr: implement nested-hv capability for the virtual hypervisorNicholas Piggin
This implements the Nested KVM HV hcall API for spapr under TCG. The L2 is switched in when the H_ENTER_NESTED hcall is made, and the L1 is switched back in returned from the hcall when a HV exception is sent to the vhyp. Register state is copied in and out according to the nested KVM HV hcall API specification. The hdecr timer is started when the L2 is switched in, and it provides the HDEC / 0x980 return to L1. The MMU re-uses the bare metal radix 2-level page table walker by using the get_pate method to point the MMU to the nested partition table entry. MMU faults due to partition scope errors raise HV exceptions and accordingly are routed back to the L1. The MMU does not tag translations for the L1 (direct) vs L2 (nested) guests, so the TLB is flushed on any L1<->L2 transition (hcall entry and exit). Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-10-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18ppc: allow the hdecr timer to be created/destroyedNicholas Piggin
Machines which don't emulate the HDEC facility are able to use the timer for something else. Provide functions to start and stop the hdecr timer. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-4-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18spapr: nvdimm: Implement H_SCM_FLUSH hcallShivaprasad G Bhat
The patch adds support for the SCM flush hcall for the nvdimm devices. To be available for exploitation by guest through the next patch. The hcall is applicable only for new SPAPR specific device class which is also introduced in this patch. The hcall expects the semantics such that the flush to return with H_LONG_BUSY_ORDER_10_MSEC when the operation is expected to take longer time along with a continue_token. The hcall to be called again by providing the continue_token to get the status. So, all fresh requests are put into a 'pending' list and flush worker is submitted to the thread pool. The thread pool completion callbacks move the requests to 'completed' list, which are cleaned up after collecting the return status for the guest in subsequent hcall from the guest. The semantics makes it necessary to preserve the continue_tokens and their return status across migrations. So, the completed flush states are forwarded to the destination and the pending ones are restarted at the destination in post_load. The necessary nvdimm flush specific vmstate structures are also introduced in this patch which are to be saved in the new SPAPR specific nvdimm device to be introduced in the following patch. Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <164396254862.109112.16675611182159105748.stgit@ltczzess4.aus.stglabs.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-28hw/ppc/vof: Add missing includesPhilippe Mathieu-Daudé
vof.h requires "qom/object.h" for DECLARE_CLASS_CHECKERS(), "exec/memory.h" for address_space_read/write(), "exec/address-spaces.h" for address_space_memory and more importantly "cpu.h" for target_ulong. vof.c doesn't need "exec/ram_addr.h". Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220122003104.84391-1-f4bug@amsat.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12ppc/pnv: Move num_phbs under Pnv8ChipCédric Le Goater
It is not used elsewhere so that's where it belongs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220105212338.49899-10-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12ppc/pnv: Reparent user created PHB3 devices to the PnvChipCédric Le Goater
The powernv machine uses the object hierarchy to populate the device tree and each device should be parented to the chip it belongs to. This is not the case for user created devices which are parented to the container "/unattached". Make sure a PHB3 device is parented to its chip by reparenting the object if necessary. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220105212338.49899-8-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12ppc/pnv: Introduce support for user created PHB3 devicesCédric Le Goater
PHB3 devices and PCI devices can now be added to the powernv8 machine using : -device pnv-phb3,chip-id=0,index=1 \ -device nec-usb-xhci,bus=pci.1,addr=0x0 The 'index' property identifies the PHB3 in the chip. In case of user created devices, a lookup on 'chip-id' is required to assign the owning chip. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220105212338.49899-7-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12ppc/pnv: Attach PHB3 root port device when defaults are enabledCédric Le Goater
This cleanups the PHB3 model a bit more since the root port is an independent device and it will ease our task when adding user created PHB3s. pnv_phb_attach_root_port() is made public in pnv.c so it can be reused with the pnv_phb4 root port later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220105212338.49899-4-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-31dma: Let ld*_dma() propagate MemTxResultPhilippe Mathieu-Daudé
dma_memory_read() returns a MemTxResult type. Do not discard it, return it to the caller. Update the few callers. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-19-philmd@redhat.com>
2021-12-31dma: Let ld*_dma() take MemTxAttrs argumentPhilippe Mathieu-Daudé
Let devices specify transaction attributes when calling ld*_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-17-philmd@redhat.com>
2021-12-31dma: Let st*_dma() take MemTxAttrs argumentPhilippe Mathieu-Daudé
Let devices specify transaction attributes when calling st*_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211223115554.3155328-16-philmd@redhat.com>
2021-12-30dma: Let dma_memory_read/write() take MemTxAttrs argumentPhilippe Mathieu-Daudé
Let devices specify transaction attributes when calling dma_memory_read() or dma_memory_write(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ ( - dma_memory_read(E1, E2, E3, E4) + dma_memory_read(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) | - dma_memory_write(E1, E2, E3, E4) + dma_memory_write(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) ) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-6-philmd@redhat.com>
2021-12-30dma: Let dma_memory_set() take MemTxAttrs argumentPhilippe Mathieu-Daudé
Let devices specify transaction attributes when calling dma_memory_set(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-3-philmd@redhat.com>
2021-12-30dma: Let dma_memory_valid() take MemTxAttrs argumentPhilippe Mathieu-Daudé
Let devices specify transaction attributes when calling dma_memory_valid(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-2-philmd@redhat.com>
2021-12-17ppc/pnv: Introduce a num_pecs class attribute for PHB4 PEC devicesCédric Le Goater
POWER9 processor comes with 3 PHB4 PEC (PCI Express Controller) and each PEC can have several PHBs : * PEC0 provides 1 PHB (PHB0) * PEC1 provides 2 PHBs (PHB1 and PHB2) * PEC2 provides 3 PHBs (PHB3, PHB4 and PHB5) A num_pecs class attribute represents better the logic units of the POWER9 chip. Use that instead of num_phbs which fits POWER8 chips. This will ease adding support for user created devices. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211213132830.108372-8-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-10-21spapr/xive: Add source status helpersCédric Le Goater
and use them to set and test the ASSERTED bit of LSI sources. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211004212141.432954-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30hw/intc: openpic: Clean up the stylesBin Meng
Correct the multi-line comment format. No functional changes. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20210918032653.646370-3-bin.meng@windriver.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30hw/intc: openpic: Drop Raven related codesBin Meng
There is no machine that uses Motorola MCP750 (aka Raven) model. Drop the related codes. While we are here, drop the mentioning of Intel GW80314 I/O companion chip in the comments as it has been obsolete for years, and correct a typo too. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20210918032653.646370-2-bin.meng@windriver.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30spapr_numa.c: FORM2 NUMA affinity supportDaniel Henrique Barboza
The main feature of FORM2 affinity support is the separation of NUMA distances from ibm,associativity information. This allows for a more flexible and straightforward NUMA distance assignment without relying on complex associations between several levels of NUMA via ibm,associativity matches. Another feature is its extensibility. This base support contains the facilities for NUMA distance assignment, but in the future more facilities will be added for latency, performance, bandwidth and so on. This patch implements the base FORM2 affinity support as follows: - the use of FORM2 associativity is indicated by using bit 2 of byte 5 of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1 or FORM2 affinity. Setting both forms will default to FORM2. We're not advertising FORM2 for pseries-6.1 and older machine versions to prevent guest visible changes in those; - ibm,associativity-reference-points has a new semantic. Instead of being used to calculate distances via NUMA levels, it's now used to indicate the primary domain index in the ibm,associativity domain of each resource. In our case it's set to {0x4}, matching the position where we already place logical_domain_id; - two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table and ibm,numa-distance-table. The index table is used to list all the NUMA logical domains of the platform, in ascending order, and allows for spartial NUMA configurations (although QEMU ATM doesn't support that). ibm,numa-distance-table is an array that contains all the distances from the first NUMA node to all other nodes, then the second NUMA node distances to all other nodes and so on; - get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity() now checks for OV5_FORM2_AFFINITY and returns FORM2 values if the guest selected FORM2 affinity during CAS. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-7-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30spapr: move FORM1 verifications to post CASDaniel Henrique Barboza
FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less) NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM device that has a different latency than the original NUMA node from the regular memory. FORM2 is also able to deal with asymmetric NUMA distances gracefully, something that our FORM1 implementation doesn't do. Move these FORM1 verifications to a new function and wait until after CAS, when we're sure that we're sticking with FORM1, to enforce them. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30spapr_numa.c: rename numa_assoc_array to FORM1_assoc_arrayDaniel Henrique Barboza
Introducing a new NUMA affinity, FORM2, requires a new mechanism to switch between affinity modes after CAS. Also, we want FORM2 data structures and functions to be completely separated from the existing FORM1 code, allowing us to avoid adding new code that inherits the existing complexity of FORM1. The idea of switching values used by the write_dt() functions in spapr_numa.c was already introduced in the previous patch, and the same approach will be used when dealing with the FORM1 and FORM2 arrays. We can accomplish that by that by renaming the existing numa_assoc_array to FORM1_assoc_array, which now is used exclusively to handle FORM1 affinity data. A new helper get_associativity() is then introduced to be used by the write_dt() functions to retrieve the current ibm,associativity array of a given node, after considering affinity selection that might have been done during CAS. All code that was using numa_assoc_array now needs to retrieve the array by calling this function. This will allow for an easier plug of FORM2 data later on. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-5-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30spapr_numa.c: parametrize FORM1 macrosDaniel Henrique Barboza
The next preliminary step to introduce NUMA FORM2 affinity is to make the existing code independent of FORM1 macros and values, i.e. MAX_DISTANCE_REF_POINTS, NUMA_ASSOC_SIZE and VCPU_ASSOC_SIZE. This patch accomplishes that by doing the following: - move the NUMA related macros from spapr.h to spapr_numa.c where they are used. spapr.h gets instead a 'NUMA_NODES_MAX_NUM' macro that is used to refer to the maximum number of NUMA nodes, including GPU nodes, that the machine can support; - MAX_DISTANCE_REF_POINTS and NUMA_ASSOC_SIZE are renamed to FORM1_DIST_REF_POINTS and FORM1_NUMA_ASSOC_SIZE. These FORM1 specific macros are used in FORM1 init functions; - code that uses MAX_DISTANCE_REF_POINTS now retrieves the max_dist_ref_points value using get_max_dist_ref_points(). NUMA_ASSOC_SIZE is replaced by get_numa_assoc_size() and VCPU_ASSOC_SIZE is replaced by get_vcpu_assoc_size(). These functions are used by the generic device tree functions and h_home_node_associativity() and will allow them to switch between FORM1 and FORM2 without changing their core logic. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-4-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-29ppc/pnv: Rename "id" to "quad-id" in PnvQuadCédric Le Goater
This to avoid possible conflicts with the "id" property of QOM objects. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210901094153.227671-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-29ppc/xive: Export xive_tctx_word2() helperCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210901094153.227671-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-29ppc/xive: Export priority_to_ipb() helperCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210901094153.227671-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27ppc/xive: Export xive_presenter_notify()Cédric Le Goater
It's generic enough to be used from the XIVE2 router and avoid more duplication. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210809134547.689560-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27ppc/xive: Export PQ get/set routinesCédric Le Goater
These will be shared with the XIVE2 router. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210809134547.689560-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27ppc/pnv: Use a simple incrementing index for the chip-idCédric Le Goater
When the QEMU PowerNV machine was introduced, multi chip support modeled a two socket system with dual chip modules as found on some P8 Tuleta systems (8286-42A). But this is hardly used and not relevant for QEMU. Use a simple index instead. With this change, we can now increase the max socket number to 16 as found on high end systems. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210809134547.689560-5-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27ppc/pnv: Change the POWER10 machine to support DD2 onlyCédric Le Goater
There is no need to keep the DD1 chip model as it will never be publicly available. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20210809134547.689560-3-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-29ppc/vof: Fix Coverity issuesAlexey Kardashevskiy
Coverity reported issues which are caused by mixing of signed return codes from DTC and unsigned return codes of the client interface. This introduces PROM_ERROR and makes distinction between the error types. This fixes NEGATIVE_RETURNS, OVERRUN issues reported by Coverity. This adds a comment about the return parameters number in the VOF hcall. The reason for such counting is to keep the numbers look the same in vof_client_handle() and the Linux (an OF client). vmc->client_architecture_support() returns target_ulong and we want to propagate this to the client (for example H_MULTI_THREADS_ACTIVE). The VOF path to do_client_architecture_support() needs chopping off the top 32bit but SLOF's H_CAS does not; and either way the return values are either 0 or 32bit negative error code. For now this chops the top 32bits. This makes "claim" fail if the allocated address is above 4GB as the client interface is 32bit. This still allows claiming memory above 4GB as potentially initrd can be put there and the client can read the address from the FDT's "available" property. Fixes: CID 1458139, 1458138, 1458137, 1458133, 1458132 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210720050726.2737405-1-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09target/ppc: Support for H_RPT_INVALIDATE hcallBharata B Rao
If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then - indicate the availability of H_RPT_INVALIDATE hcall to the guest via ibm,hypertas-functions property. - Enable the hcall Both the above are done only if the new sPAPR machine capability cap-rpt-invalidate is set. Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09spapr: Fix implementation of Open Firmware client interfaceAlexey Kardashevskiy
This addresses the comments from v22. The functional changes are (the VOF ones need retesting with Pegasos2): (VOF) setprop will start failing if the machine class callback did not handle it; (VOF) unit addresses are lowered in path_offset(); (SPAPR) /chosen/bootargs is initialized from kernel_cmdline if the client did not change it. Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface") Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210708065625.548396-1-aik@ozlabs.ru> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bitsNicholas Piggin
There are several new L1D cache flush bits added to the hcall which reflect hardware security features for speculative cache access issues. These behaviours are now being specified as negative in order to simplify patched kernel compatibility with older firmware (a new problem found in existing systems would automatically be vulnerable). [dwg: Technically this changes behaviour for existing machine types. After discussion with Nick, we've determined this is safe, because the worst that will happen if a guest gets the wrong information due to a migration is that it will perform some unnecessary workarounds, but will remain correct and secure (well, as secure as it was going to be anyway). In addition the change only affects cap-cfpc=safe which is not enabled by default, and in fact is not possible to set on any current hardware (though it's expected it will be possible on POWER10)] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210615044107.1481608-1-npiggin@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09spapr: Implement Open Firmware client interfaceAlexey Kardashevskiy
The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor. Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features. This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual Open Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree. The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob. This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and simple memory allocator - "claim" (an OF CI memory allocator) and updates "/memory@0/available" to report the client about available memory. This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending. In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work. When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. This adds basic instances support which are managed by a hash map ihandle -> [phandle]. Before the guest started, the used memory is: 0..e60 - the initial firmware 8000..10000 - stack 400000.. - kernel 3ea0000.. - initramdisk This OF CI does not implement "interpret". Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram. With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initradmdisk to boot from any possible source. Note this requires reasonably recent guest kernel with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 The immediate benefit is much faster booting time which especially crucial with fully emulated early CPU bring up environments. Also this may come handy when/if GRUB-in-the-userspace sees light of the day. This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries. This assumes potential support for booting from QEMU backends such as blockdev or netdev without devices/drivers used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> [dwg: Adjusted some includes which broke compile in some more obscure compilation setups] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09spapr: tune rtas-sizeAlexey Kardashevskiy
QEMU reserves space for RTAS via /rtas/rtas-size which tells the client how much space the RTAS requires to work which includes the RTAS binary blob implementing RTAS runtime. Because pseries supports FWNMI which requires plenty of space, QEMU reserves more than 2KB which is enough for the RTAS blob as it is just 20 bytes (under QEMU). Since FWNMI reset delivery was added, RTAS_SIZE macro is not used anymore. This replaces RTAS_SIZE with RTAS_MIN_SIZE and uses it in the /rtas/rtas-size calculation to account for the RTAS blob. Fixes: 0e236d347790 ("ppc/spapr: Implement FWNMI System Reset delivery") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210622070336.1463250-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03spapr: nvdimm: Forward declare and move the definitionsShivaprasad G Bhat
The subsequent patches add definitions which tend to get the compilation to cyclic dependency. So, prepare with forward declarations, move the definitions and clean up. Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Message-Id: <162133925415.610.11584121797866216417.stgit@4f1e6f2bd33e> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03spapr: Don't hijack current_machine->boot_orderGreg Kurz
QEMU 6.0 moved all the -boot variables to the machine. Especially, the removal of the boot_order static changed the handling of '-boot once' from: if (boot_once) { qemu_boot_set(boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(boot_order)); } to if (current_machine->boot_once) { qemu_boot_set(current_machine->boot_once, &error_fatal); qemu_register_reset(restore_boot_order, g_strdup(current_machine->boot_order)); } This means that we now register as subsequent boot order a copy of current_machine->boot_once that was just set with the previous call to qemu_boot_set(), i.e. we never transition away from the once boot order. It is certainly fragile^Wwrong for the spapr code to hijack a field of the base machine type object like that. The boot order rework simply turned this software boundary violation into an actual bug. Have the spapr code to handle that with its own field in SpaprMachineState. Also kfree() the initial boot device string when "once" was used. Fixes: 4b7acd2ac821 ("vl: clean up -boot variables") Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1960119 Cc: pbonzini@redhat.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210521160735.1901914-1-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19hw/ppc: moved hcalls that depend on softmmuLucas Mateus Castro (alqotel)
The hypercalls h_enter, h_remove, h_bulk_remove, h_protect, and h_read, have been moved to spapr_softmmu.c with the functions they depend on. The functions is_ram_address and push_sregs_to_kvm_pr are not static anymore as functions on both spapr_hcall.c and spapr_softmmu.c depend on them. The hypercalls h_resize_hpt_prepare and h_resize_hpt_commit have been divided, the KVM part stayed in spapr_hcall.c while the softmmu part was moved to spapr_softmmu.c Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210506163941.106984-2-lucas.araujo@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>