aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc/spapr_hcall.c
AgeCommit message (Collapse)Author
2024-03-13spapr: nested: register nested-hv api hcalls only for cap-nested-hvHarsh Prateek Bora
Since cap-nested-hv is an optional capability, it makes sense to register api specfic hcalls only when respective capability is enabled. This requires to introduce a new API to unregister hypercalls to maintain sanity across guest reboot since caps are re-applied across reboots and re-registeration of hypercalls would hit assert otherwise. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-02-23hw/ppc/spapr_hcall: Rename {softmmu -> vhyp_mmu}_resize_hpt_prPhilippe Mathieu-Daudé
Since 'softmmu' is quite a loaded term in QEMU, rename the vhyp MMU facilities to use the vhyp_mmu_ prefix rather than softmmu_. vhyp_mmu_ is chosen because the code that manipulates the hash table via guest software hypercalls is QEMU's implementation of the PAPR hypervisor interface, called vhyp. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> [npiggin: Pick a different name, explain it in changelog.] Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-02-23hw/ppc/spapr_hcall: Allow elision of softmmu_resize_hpt_prepPhilippe Mathieu-Daudé
Check tcg_enabled() before calling softmmu_resize_hpt_prepare() and softmmu_resize_hpt_commit() to allow the compiler to elide their calls. The stubs are then unnecessary, remove them. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2023-12-20hw/ppc/spapr_hcall: Remove unused 'exec/exec-all.h' included headerPhilippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20231212113640.30287-2-philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-09-20ppc: spelling fixesMichael Tokarev
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org>
2023-09-06spapr: implement H_SET_MODE debug facilitiesNicholas Piggin
Wire up the H_SET_MODE debug resources to the CIABR and DAWR0 debug facilities in TCG. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-07-12hw/ppc/spapr: Use machine_memory_devices_init()David Hildenbrand
Let's use our new helper and stop always allocating ms->device_memory. There is no difference in common memory-device code anymore between ms->device_memory being NULL or the size being 0. So we only have to teach spapr code that ms->device_memory isn't always around. We can now modify two maxram_size checks to rely on ms->device_memory for detecting whether we have memory devices. Cc: Daniel Henrique Barboza <danielhb413@gmail.com> Cc: "Cédric Le Goater" <clg@kaod.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Greg Kurz <groug@kaod.org> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230623124553.400585-5-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-06-25ppc/spapr: Move spapr nested HV to a new fileNicholas Piggin
Create spapr_nested.c for most of the nested HV implementation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-06-25ppc/spapr: load and store l2 state with helper functionsNicholas Piggin
Arguably this is just shuffling around register accesses, but one nice thing it does is allow the exit to save away the L2 state then switch the environment to the L1 before copying L2 data back to the L1, which logically flows more naturally and simplifies the error paths. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-06-25ppc/spapr: Add a nested state structNicholas Piggin
Rather than use a copy of CPUPPCState to store the host state while the environment has been switched to the L2, use a new struct for this purpose. Have helper functions to save and load this host state. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-06-25ppc/spapr: H_ENTER_NESTED should restore host XER ca fieldNicholas Piggin
Fix missing env->ca restore when going from L2 back to the host. Fixes: 120f738a467 ("spapr: implement nested-hv capability for the virtual hypervisor") Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-05-28spapr: Add SPAPR_CAP_AIL_MODE_3 for AIL mode 3 support for H_SET_MODE hcallNicholas Piggin
The behaviour of the Address Translation Mode on Interrupt resource is not consistently supported by all CPU versions or all KVM versions: KVM HV does not support mode 2, and does not support mode 3 on POWER7 or early POWER9 processesors. KVM PR only supports mode 0. TCG supports all modes (0, 2, 3) on CPUs with support for the corresonding LPCR[AIL] mode. This leads to inconsistencies in guest behaviour and could cause problems migrating guests. This was not noticable for Linux guests for a long time because the kernel only uses modes 0 and 3, and it used to consider AIL-3 to be advisory in that it would always keep the AIL-0 vectors around, so it did not matter whether or not interrupts were delivered according to the AIL mode. Recent Linux guests depend on AIL mode 3 working as specified in order to support the SCV facility interrupt. If AIL-3 can not be provided, then H_SET_MODE must return an error to Linux so it can disable the SCV facility (failure to do so can lead to userspace being able to crash the guest kernel). Add the ail-mode-3 capability to specify that AIL-3 is supported. AIL-0 is implied as the baseline, and AIL-2 is no longer supported by spapr. AIL-2 is not known to be used by any software, but support in TCG could be restored with an ail-mode-2 capability quite easily if a regression is reported. Modify the H_SET_MODE Address Translation Mode on Interrupt resource handler to check capabilities and correctly return error if not supported. KVM has a cap to advertise support for AIL-3. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20230515160216.394612-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-05-05ppc: spapr: cleanup cr get/set with helpers.Harsh Prateek Bora
The bits in cr reg are grouped into eight 4-bit fields represented by env->crf[8] and the related calculations should be abstracted to keep the calling routines simpler to read. This is a step towards cleaning up the related/calling code for better readability. Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230503093619.2530487-2-harshpb@linux.ibm.com> [danielhb: add 'const' modifier to fix linux-user build] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-03-07includes: move tb_flush into its own headerAlex Bennée
This aids subsystems (like gdbstub) that want to trigger a flush without pulling target specific headers. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230302190846.2593720-8-alex.bennee@linaro.org> Message-Id: <20230303025805.625589-8-richard.henderson@linaro.org>
2022-10-28target/ppc: introduce ppc_maybe_interruptMatheus Ferst
This new method will check if any pending interrupt was unmasked and then call cpu_interrupt/cpu_reset_interrupt accordingly. Code that raises/lowers or masks/unmasks interrupts should call this method to keep CPU_INTERRUPT_HARD coherent with env->pending_interrupts. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221021142156.4134411-2-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-17hw/ppc: set machine->fdt in spapr machineDaniel Henrique Barboza
The pSeries machine never bothered with the common machine->fdt attribute. We do all the FDT related work using spapr->fdt_blob. We're going to introduce a QMP/HMP command to dump the FDT, which will rely on setting machine->fdt properly to work across all machine archs/types. Let's set machine->fdt in two places where we manipulate the FDT: spapr_machine_reset() and CAS. There are other places where the FDT is manipulated in the pSeries machines, most notably the hotplug/unplug path. For now we'll acknowledge that we won't have the most accurate representation of the FDT, depending on the current machine state, when using this QMP/HMP fdt command. Making the internal FDT representation always match the actual FDT representation that the guest is using is a problem for another day. spapr->fdt_blob is left untouched for now. To replace it with machine->fdt, since we're migrating spapr->fdt_blob, we would need to migrate machine->fdt as well. This is something that we would like to to do keep our code simpler but it's also a work we'll leave for later. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220926173855.1159396-14-danielhb413@gmail.com>
2022-07-18ppc: Check partition and process table alignmentLeandro Lupori
Check if partition and process tables are properly aligned, in their size, according to PowerISA 3.1B, Book III 6.7.6 programming note. Hardware and KVM also raise an exception in these cases. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220628133959.15131-2-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20spapr: Move nested KVM hypercalls under a TCG only config.Fabiano Rosas
These are the spapr virtual hypervisor implementation of the nested KVM API. They only make sense when running with TCG. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220325221113.255834-3-farosas@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20spapr: Move hypercall_register_softmmuFabiano Rosas
I'm moving this because next patch will add more code under the ifdef and it will be cleaner if we keep them together. Also switch the ifdef branches to make it more convenient to add code under CONFIG_TCG in the next patch. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220325221113.255834-2-farosas@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-03-21Use g_new() & friends where that makes obvious senseMarkus Armbruster
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
2022-02-18spapr: implement nested-hv capability for the virtual hypervisorNicholas Piggin
This implements the Nested KVM HV hcall API for spapr under TCG. The L2 is switched in when the H_ENTER_NESTED hcall is made, and the L1 is switched back in returned from the hcall when a HV exception is sent to the vhyp. Register state is copied in and out according to the nested KVM HV hcall API specification. The hdecr timer is started when the L2 is switched in, and it provides the HDEC / 0x980 return to L1. The MMU re-uses the bare metal radix 2-level page table walker by using the get_pate method to point the MMU to the nested partition table entry. MMU faults due to partition scope errors raise HV exceptions and accordingly are routed back to the L1. The MMU does not tag translations for the L1 (direct) vs L2 (nested) guests, so the TLB is flushed on any L1<->L2 transition (hcall entry and exit). Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [ clg: checkpatch fixes ] Message-Id: <20220216102545.1808018-10-npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-09-30spapr: move FORM1 verifications to post CASDaniel Henrique Barboza
FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less) NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM device that has a different latency than the original NUMA node from the regular memory. FORM2 is also able to deal with asymmetric NUMA distances gracefully, something that our FORM1 implementation doesn't do. Move these FORM1 verifications to a new function and wait until after CAS, when we're sure that we're sticking with FORM1, to enforce them. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-30spapr_numa.c: rename numa_assoc_array to FORM1_assoc_arrayDaniel Henrique Barboza
Introducing a new NUMA affinity, FORM2, requires a new mechanism to switch between affinity modes after CAS. Also, we want FORM2 data structures and functions to be completely separated from the existing FORM1 code, allowing us to avoid adding new code that inherits the existing complexity of FORM1. The idea of switching values used by the write_dt() functions in spapr_numa.c was already introduced in the previous patch, and the same approach will be used when dealing with the FORM1 and FORM2 arrays. We can accomplish that by that by renaming the existing numa_assoc_array to FORM1_assoc_array, which now is used exclusively to handle FORM1 affinity data. A new helper get_associativity() is then introduced to be used by the write_dt() functions to retrieve the current ibm,associativity array of a given node, after considering affinity selection that might have been done during CAS. All code that was using numa_assoc_array now needs to retrieve the array by calling this function. This will allow for an easier plug of FORM2 data later on. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210920174947.556324-5-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09spapr: Fix implementation of Open Firmware client interfaceAlexey Kardashevskiy
This addresses the comments from v22. The functional changes are (the VOF ones need retesting with Pegasos2): (VOF) setprop will start failing if the machine class callback did not handle it; (VOF) unit addresses are lowered in path_offset(); (SPAPR) /chosen/bootargs is initialized from kernel_cmdline if the client did not change it. Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface") Cc: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210708065625.548396-1-aik@ozlabs.ru> Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bitsNicholas Piggin
There are several new L1D cache flush bits added to the hcall which reflect hardware security features for speculative cache access issues. These behaviours are now being specified as negative in order to simplify patched kernel compatibility with older firmware (a new problem found in existing systems would automatically be vulnerable). [dwg: Technically this changes behaviour for existing machine types. After discussion with Nick, we've determined this is safe, because the worst that will happen if a guest gets the wrong information due to a migration is that it will perform some unnecessary workarounds, but will remain correct and secure (well, as secure as it was going to be anyway). In addition the change only affects cap-cfpc=safe which is not enabled by default, and in fact is not possible to set on any current hardware (though it's expected it will be possible on POWER10)] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210615044107.1481608-1-npiggin@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-09spapr: Implement Open Firmware client interfaceAlexey Kardashevskiy
The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor. Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features. This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual Open Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree. The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob. This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and simple memory allocator - "claim" (an OF CI memory allocator) and updates "/memory@0/available" to report the client about available memory. This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending. In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work. When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. This adds basic instances support which are managed by a hash map ihandle -> [phandle]. Before the guest started, the used memory is: 0..e60 - the initial firmware 8000..10000 - stack 400000.. - kernel 3ea0000.. - initramdisk This OF CI does not implement "interpret". Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram. With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initradmdisk to boot from any possible source. Note this requires reasonably recent guest kernel with: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 The immediate benefit is much faster booting time which especially crucial with fully emulated early CPU bring up environments. Also this may come handy when/if GRUB-in-the-userspace sees light of the day. This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries. This assumes potential support for booting from QEMU backends such as blockdev or netdev without devices/drivers used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu> [dwg: Adjusted some includes which broke compile in some more obscure compilation setups] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19hw/ppc: moved has_spr to cpu.hLucas Mateus Castro (alqotel)
Moved has_spr to cpu.h as ppc_has_spr and turned it into an inline function. Change spr verification in pnv.c and spapr.c to a version that can compile in a !TCG environment. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210507164146.67086-1-lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19hw/ppc: moved hcalls that depend on softmmuLucas Mateus Castro (alqotel)
The hypercalls h_enter, h_remove, h_bulk_remove, h_protect, and h_read, have been moved to spapr_softmmu.c with the functions they depend on. The functions is_ram_address and push_sregs_to_kvm_pr are not static anymore as functions on both spapr_hcall.c and spapr_softmmu.c depend on them. The hypercalls h_resize_hpt_prepare and h_resize_hpt_commit have been divided, the KVM part stayed in spapr_hcall.c while the softmmu part was moved to spapr_softmmu.c Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20210506163941.106984-2-lucas.araujo@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19hw/ppc/spapr.c: Extract MMU mode error reporting into a functionFabiano Rosas
A following patch will make use of it. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20210505001130.3999968-2-farosas@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-05Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210504' ↵Peter Maydell
into staging ppc patch queue 2021-05-04 Here's the first ppc pull request for qemu-6.1. It has a wide variety of stuff accumulated during the 6.0 freeze. Highlights are: * Multi-phase reset cleanups for PAPR * Preliminary cleanups towards allowing !CONFIG_TCG for the ppc target * Cleanup of AIL logic and extension to POWER10 * Further improvements to handling of hot unplug failures on PAPR * Allow much larger numbers of CPU on pseries * Support for the H_SCM_HEALTH hypercall * Add support for the Pegasos II board * Substantial cleanup to hflag handling * Assorted minor fixes and cleanups # gpg: Signature made Tue 04 May 2021 06:52:39 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dg-gitlab/tags/ppc-for-6.1-20210504: (46 commits) hw/ppc/pnv_psi: Use device_cold_reset() instead of device_legacy_reset() hw/ppc/spapr_vio: Reset TCE table object with device_cold_reset() hw/intc/spapr_xive: Use device_cold_reset() instead of device_legacy_reset() target/ppc: removed VSCR from SPR registration target/ppc: Reduce the size of ppc_spr_t target/ppc: Clean up _spr_register et al target/ppc: Add POWER10 exception model target/ppc: rework AIL logic in interrupt delivery target/ppc: move opcode table logic to translate.c target/ppc: code motion from translate_init.c.inc to gdbstub.c spapr_drc.c: handle hotunplug errors in drc_unisolate_logical() spapr.h: increase FDT_MAX_SIZE spapr.c: do not use MachineClass::max_cpus to limit CPUs ppc: Rename current DAWR macros and variables target/ppc: POWER10 supports scv target/ppc: Fix POWER9 radix guest HV interrupt AIL behaviour docs/system: ppc: Add documentation for ppce500 machine roms/u-boot: Bump ppce500 u-boot to v2021.04 to fix broken pci support roms/Makefile: Update ppce500 u-boot build directory name ppc/spapr: Add support for implement support for H_SCM_HEALTH ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-04target/ppc: Add POWER10 exception modelNicholas Piggin
POWER10 adds a new bit that modifies interrupt behaviour, LPCR[HAIL], and it removes support for the LPCR[AIL]=0b10 mode. Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210501072436.145444-3-npiggin@gmail.com> [dwg: Corrected tab indenting] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-04target/ppc: rework AIL logic in interrupt deliveryNicholas Piggin
The AIL logic is becoming unmanageable spread all over powerpc_excp(), and it is slated to get even worse with POWER10 support. Move it all to a new helper function. Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20210501072436.145444-2-npiggin@gmail.com> [dwg: Corrected tab indenting] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-02Do not include cpu.h if it's not really necessaryThomas Huth
Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-01-19spapr_hcall.c: make do_client_architecture_support staticDaniel Henrique Barboza
The function is called only inside spapr_hcall.c. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210114180628.1675603-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Introduce spapr_drc_reset_all()Greg Kurz
No need to expose the way DRCs are traversed outside of spapr_drc.c. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-4-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Fix reset of transient DR connectorsGreg Kurz
Documentation of object_property_iter_init() clearly stipulates that "it is forbidden to modify the property list while iterating". But this is exactly what we do when resetting transient DR connectors during CAS. The call to spapr_drc_reset() can finalize the hot-unplug sequence of a PHB or a PCI bridge, both of which will then in turn destroy their PCI DRCs. This could potentially invalidate the iterator. It is pure luck that this haven't caused any issues so far. Change spapr_drc_reset() to return true if it caused a device to be removed. Restart from scratch in this case. This can potentially increase the overall DRC reset time, especially with a high maxmem which generates a lot of LMB DRCs. But this kind of setup is rare, and so is the use case of rebooting a guest while doing hot-unplug. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-3-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-06spapr: Call spapr_drc_reset() for all DRCs at CASGreg Kurz
Non-transient DRCs are either in the empty or the ready state, which means spapr_drc_reset() doesn't change their state. It is thus not needed to do any checking. Call spapr_drc_reset() unconditionally and squash spapr_drc_transient() into its only user, spapr_drc_needed(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201218103400.689660-2-groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-14spapr: Pass sPAPR machine state down to spapr_pci_switch_vga()Greg Kurz
This allows to drop a user of qdev_get_machine(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20201209170052.1431440-4-groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-11-05spapr: Convert hpt_prepare_thread() to use qemu_try_memalign()Greg Kurz
HPT resizing is asynchronous: the guest first kicks off the creation of a new HPT, then it waits for that new HPT to be actually created and finally it asks the current HPT to be replaced by the new one. In the case of a userland allocated HPT, this currently relies on calling qemu_memalign() which aborts on OOM and never returns NULL. Since we seem to have path to report the failure to the guest with an H_NO_MEM return value, use qemu_try_memalign() instead of qemu_memalign(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160398563636.32380.1747166034877173994.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09spapr: Simplify error handling in do_client_architecture_support()Greg Kurz
Use the return value of ppc_set_compat_all() to check failures, which is preferred over hijacking local_err. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-7-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09spapr: Get rid of cas_check_pvr() error reportingGreg Kurz
The cas_check_pvr() function has two purposes: - finding the "best" logical PVR, ie. the most recent one supported by the guest for this CPU type - checking if the guest supports the real PVR of this CPU type, which is just an optional extra information to workaround the lack of support for "compat" mode in PR KVM This logic doesn't need error reporting, really. If we don't find a suitable logical PVR, we return the special value 0 which is definitely not a valid PVR. Let the caller decide on whether it should error out or not. This doesn't change the behavior. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20200914123505.612812-6-groug@kaod.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-09-08spapr: move h_home_node_associativity to spapr_numa.cDaniel Henrique Barboza
The implementation of this hypercall will be modified to use spapr->numa_assoc_arrays input. Moving it to spapr_numa.c makes make more sense. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200904172422.617460-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-07-10qom: Crash more nicely on object_property_get_link() failureMarkus Armbruster
Pass &error_abort instead of NULL where the returned value is dereferenced or asserted to be non-null. Drop a now redundant assertion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-24-armbru@redhat.com>
2020-05-07spapr: Drop CAS reboot flagGreg Kurz
The CAS reboot flag is false by default and all the locations that could set it to true have been dropped. This means that all code blocks depending on the flag being set is dead code and the other code blocks should be executed always. Just do that and drop the now uneeded CAS reboot flag. Fix a comment on the way to make checkpatch happy. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158514994893.478799.11772512888322840990.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-05-07spapr/cas: Separate CAS handling from rebuilding the FDTAlexey Kardashevskiy
At the moment "ibm,client-architecture-support" ("CAS") is implemented in SLOF and QEMU assists via the custom H_CAS hypercall which copies an updated flatten device tree (FDT) blob to the SLOF memory which it then uses to update its internal tree. When we enable the OpenFirmware client interface in QEMU, we won't need to copy the FDT to the guest as the client is expected to fetch the device tree using the client interface. This moves FDT rebuild out to a separate helper which is going to be called from the "ibm,client-architecture-support" handler and leaves writing FDT to the guest in the H_CAS handler. This should not cause any behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20200310050733.29805-3-aik@ozlabs.ru> Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158514994229.478799.2178881312094922324.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-05-07spapr: Simplify selection of radix/hash during CASGreg Kurz
The guest can select the MMU mode by setting bits 0-1 of byte 24 in OV5 to to 0b00 for hash or 0b01 for radix. As required by the architecture, we terminate the boot process if any other value is found there. The usual way to negotiate features in OV5 is basically ANDing the bitfield provided by the guest and the bitfield of features supported by QEMU, previously populated at machine init. For some not documented reason, MMU is treated differently : bit 1 of byte 24 (the radix/hash bit) is cleared from the guest OV5 and explicitely set in the final negotiated OV5 if radix was requested. Since the only expected input from the guest is the radix/hash bit being set or not, it seems more appropriate to handle this like we do for XIVE. Set the radix bit in spapr->ov5 at machine init if it has a chance to work (ie. power9, either TCG or a radix capable KVM) and rely exclusively on spapr_ovec_intersect() to set the radix bit in spapr->ov5_cas. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158514993621.478799.4204740354545734293.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-05-07spapr: Don't check capabilities removed between CAS callsGreg Kurz
We currently check if some capability in OV5 was removed by the guest since the previous CAS, and we trigger a CAS reboot in that case. This was required because it could call for a device-tree property or node removal, that we didn't support until recently (see commit 6787d27b04a7 "spapr: add option vector handling in CAS-generated resets" for details). Now that we render a full FDT at CAS and that SLOF is able to handle node removal, we don't need to do a CAS reset in this case anymore. Also, this check can only return true if the guest has already called CAS since the last full system reset (otherwise spapr->ov5_cas is empty). Linux doesn't do that so this can be considered as dead code for the vast majority of existing setups. Drop the check. Since the only use of the ov5_cas_old variable is precisely the check itself, drop the variable as well. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158514993021.478799.10928618293640651819.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-03-24spapr: Fix memory leak in h_client_architecture_support()Greg Kurz
This is the only error path that needs to free the previously allocated ov1. Reported-by: Coverity (CID 1421924) Fixes: cbd0d7f36322 "spapr: Fail CAS if option vector table cannot be parsed" Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158481206205.336182.16106097429336044843.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-03-17spapr: Don't attempt to clamp RMA to VRMA constraintDavid Gibson
The Real Mode Area (RMA) is the part of memory which a guest can access when in real (MMU off) mode. Of course, for a guest under KVM, the MMU isn't really turned off, it's just in a special translation mode - Virtual Real Mode Area (VRMA) - which looks like real mode in guest mode. The mechanics of how this works when using the hash MMU (HPT) put a constraint on the size of the RMA, which depends on the size of the HPT. So, the latter part of spapr_setup_hpt_and_vrma() clamps the RMA we advertise to the guest based on this VRMA limit. There are several things wrong with this: 1) spapr_setup_hpt_and_vrma() doesn't actually clamp, it takes the minimum of Node 0 memory size and the VRMA limit. That will *often* work the same as clamping, but there can be other constraints on RMA size which supersede Node 0 memory size. We have real bugs caused by this (currently worked around in the guest kernel) 2) Some callers of spapr_setup_hpt_and_vrma() are in a situation where we're past the point that we can actually advertise an RMA limit to the guest 3) But most fundamentally, the VRMA limit depends on host configuration (page size) which shouldn't be visible to the guest, but this partially exposes it. This can cause problems with migration in certain edge cases, although we will mostly get away with it. In practice, this clamping is almost never applied anyway. With 64kiB pages and the normal rules for sizing of the HPT, the theoretical VRMA limit will be 4x(guest memory size) and so never hit. It will hit with 4kiB pages, where it will be (guest memory size)/4. However all mainstream distro kernels for POWER have used a 64kiB page size for at least 10 years. So, simply replace this logic with a check that the RMA we've calculated based only on guest visible configuration will fit within the host implied VRMA limit. This can break if running HPT guests on a host kernel with 4kiB page size. As noted that's very rare. There also exist several possible workarounds: * Change the host kernel to use 64kiB pages * Use radix MMU (RPT) guests instead of HPT * Use 64kiB hugepages on the host to back guest memory * Increase the guest memory size so that the RMA hits one of the fixed limits before the RMA limit. This is relatively easy on POWER8 which has a 16GiB limit, harder on POWER9 which has a 1TiB limit. * Use a guest NUMA configuration which artificially constrains the RMA within the VRMA limit (the RMA must always fit within Node 0). Previously, on KVM, we also temporarily reduced the rma_size to 256M so that the we'd load the kernel and initrd safely, regardless of the VRMA limit. This was a) confusing, b) could significantly limit the size of images we could load and c) introduced a behavioural difference between KVM and TCG. So we remove that as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org>
2020-03-17spapr: Handle pending hot plug/unplug requests at CASGreg Kurz
If a hot plug or unplug request is pending at CAS, we currently trigger a CAS reboot, which severely increases the guest boot time. This is because SLOF doesn't handle hot plug events and we had no way to fix the FDT that gets presented to the guest. We can do better thanks to recent changes in QEMU and SLOF: - we now return a full FDT to SLOF during CAS - SLOF was fixed to correctly detect any device that was either added or removed since boot time and to update its internal DT accordingly. The right solution is to process all pending hot plug/unplug requests during CAS: convert hot plugged devices to cold plugged devices and remove the hot unplugged ones, which is exactly what spapr_drc_reset() does. Also clear all hot plug events that are currently queued since they're no longer relevant. Note that SLOF cannot currently populate hot plugged PCI bridges or PHBs at CAS. Until this limitation is lifted, SLOF will reset the machine when this scenario occurs : this will allow the FDT to be fully processed when SLOF is started again (ie. the same effect as the CAS reboot that would occur anyway without this patch). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <158257222352.4102917.8984214333937947307.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>