aboutsummaryrefslogtreecommitdiff
path: root/hw/ppc/spapr.c
AgeCommit message (Collapse)Author
2015-02-18PPC: Don't use legacy -usbdevice support for setting up boardMarkus Armbruster
It's tempting, because usbdevice_create() is so simple to use. But there's a lot of unwanted complexity behind the simple interface. Switch to usb_create_simple(). Cc: Alexander Graf <agraf@suse.de> Cc: qemu-ppc@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-02-10fw_cfg: fix typos in comments: patch -> pathGonglei
Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-01-07hw/ppc/spapr: simplify usb controller creation logicMarcel Apfelbaum
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07hw/usb: simplified usb_enabledMarcel Apfelbaum
The argument is not longer used and the implementation uses now QOM instead of QemuOpts. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07hw/ppc: modified the condition for usb controllers to be created for some ↵Marcel Apfelbaum
ppc machines Some ppc machines create a default usb controller based on a 'machine condition'. Until now the logic was: create the usb controller if: - the usb option was supplied in cli and value is true or - the usb option was absent and both set_defaults and the machine condition were true. Modified the logic to: Create the usb controller if: - the machine condition is true and defaults are enabled or - the usb option is supplied and true. The main for this is to simplify the usb_enabled method. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07target-ppc: Cast ssize_t to size_t before printing with %zxPeter Maydell
The mingw32 compiler complains about trying to print variables of type ssize_t with the %z format string specifier. Since we're printing it as unsigned hex anyway, cast to size_t to silence the warning. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07spapr: Fix stale HTAB during live migration (TCG)Samuel Mendoza-Jonas
If a TCG guest reboots during a running migration HTAB entries are not marked dirty, and the destination boots with an invalid HTAB. When a reboot occurs, explicitly mark the current HTAB dirty after clearing it. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07spapr: Fix integer overflow during migration (TCG)Samuel Mendoza-Jonas
The n_valid and n_invalid fields are unsigned short integers but it is possible to have more than 65535 entries in a contiguous hunk, overflowing the field. This results in an incorrect HTAB being sent to the destination during migration. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07spapr: Fix stale HTAB during live migration (KVM)Samuel Mendoza-Jonas
If a guest reboots during a running migration, changes to the hash page table are not necessarily updated on the destination. Opening a new file descriptor to the HTAB forces the migration handler to resend the entire table. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-22machine: remove qemu_machine_opts global listMarcel Apfelbaum
QEMU has support for options per machine, keeping a global list of options is no longer necessary. Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Reviewed-by: Greg Bellows <greg.bellows@linaro.org> Message-id: 1418217570-15517-2-git-send-email-marcel.a@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-11-04spapr: Allow dynamic creation of PHBAlexander Graf
Now that we finally check for presence of dangling sysbus devices, make check started complaining that the sPAPR PHB is one such device. However, it really isn't. The spapr PHB is not really a traditional sysbus device, but much more a special spapr pv device which is already able to get created dynamically. Move spapr to its own dynamic sysbus check handling and allow PHB devices to get allocated dynamically. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04spapr: Cleanup machine naming conventions, and prepare for 2.2 releaseDavid Gibson
As of qemu-2.1, spapr/pseries, has a set of versioned machine classes to represent the machine type as it appeared to the guest in different qemu versions. This allows for safe migration of guests between current and future qemu versions. However, these are organized a bit differently from those for PC: on PC, the default plain "pc" machine type is just an alias for the most recent versioned machine type. In sPAPR, it names the base machine class from which the versioned types are derived. The PC approach is preferable; it makes it clearer which explicit version is the current one. Additionally updating the "current" machine as the base class makes it even more likely than otherwise to incorrectly alter the versioned machines' behaviour when updating the current machine. Therefore this patch changes sPAPR to the PC approach - the base class becomes abstract, and plain "pseries" becomes an alias for the most recent versioned machine class. Since qemu-2.1 is now released, we also create a new pseries-2.2 machine type, to incorporate changes during this development cycle (for now it is identical to pseries-2.1). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-02virtio-pci: fix migration for pci bus masterMichael S. Tsirkin
Current support for bus master (clearing OK bit) together with the need to support guests which do not enable PCI bus mastering, leads to extra state in VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust in case of cross-version migration for the case when guests use the device before setting DRIVER_OK. Rip out this code, and replace it: - Modern QEMU doesn't need VIRTIO_PCI_FLAG_BUS_MASTER_BUG so just drop it for latest machine type. - For compat machine types, set PCI_COMMAND if DRIVER_OK is set. As this is needed for 2.1 for both pc and ppc, move PC_COMPAT macros from pc.h to a new common header. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de>
2014-10-20hw: Convert from BlockDriverState to BlockBackend, mostlyMarkus Armbruster
Device models should access their block backends only through the block-backend.h API. Convert them, and drop direct includes of inappropriate headers. Just four uses of BlockDriverState are left: * The Xen paravirtual block device backend (xen_disk.c) opens images itself when set up via xenbus, bypassing blockdev.c. I figure it should go through qmp_blockdev_add() instead. * Device model "usb-storage" prompts for keys. No other device model does, and this one probably shouldn't do it, either. * ide_issue_trim_cb() uses bdrv_aio_discard() instead of blk_aio_discard() because it fishes its backend out of a BlockAIOCB, which has only the BlockDriverState. * PC87312State has an unused BlockDriverState[] member. The next two commits take care of the latter two. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()Markus Armbruster
The patch is big, but all it really does is replacing dinfo->bdrv by blk_bs(blk_by_legacy_dinfo(dinfo)) The replacement is repetitive, but the conversion of device models to BlockBackend is imminent, and will shorten it to just blk_legacy_dinfo(dinfo). Line wrapping muddies the waters a bit. I also omit tests whether dinfo->bdrv is null, because it never is. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-20Fix typos and misspellings in commentszhanghailiang
formated -> formatted gaurantee -> guarantee shear -> sheer Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-08hypervisor property clashes with hypervisor nodeAnton Blanchard
dtc fails on a recent QEMU snapshot: ERROR (name_properties): "name" property in /hypervisor#1 is incorrect ("hypervisor" instead of base node name) Looking at the device tree we have a hypervisor property: # lsprop hypervisor hypervisor "kvm" But we also have a hypervisor node, with a name that doesn't match: # lsprop hypervisor#1/ name "hypervisor" compatible "linux,kvm" linux,phandle 7e5eb5d8 (2120136152) Commit c08ce91d309c (spapr: add uuid/host details to device tree) looks to have collided with an earlier patch. Remove the hypervisor property. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr_pci: map the MSI window in each PHBGreg Kurz
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X. Commit cc943c36faa192cd4b32af8fe5edb31894017d35 has modified MSI-X so that writes are made using the bus master address space and follow the IOMMU path. Unfortunately, the IOMMU address space address space does not have an MSI window: the notification is silently dropped in unassigned_mem_write instead of reaching the guest... The most visible effect is that all virtio devices are non-functional on sPAPR since then. :( This patch does the following: 1) map the MSI window into the IOMMU address space for each PHB - since each PHB instantiates its own IOMMU address space, we can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW) - no real need to keep the MSI window setup in a separate function, the spapr_pci_msi_init() code moves to spapr_phb_realize(). 2) kill the global MSI window as it is not needed in the end Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08ppc/spapr: Fix MAX_CPUS to 255Nikunj A Dadhania
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This MAX_CPUS number was percolated back to "virsh capabilities" with wrong max_cpus. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Locate RTAS and device-tree based on real RMABenjamin Herrenschmidt
We currently calculate the final RTAS and FDT location based on the early estimate of the RMA size, cropped to 256M on KVM since we only know the real RMA size at reset time which happens much later in the boot process. This means the FDT and RTAS end up right below 256M while they could be much higher, using precious RMA space and limiting what the OS bootloader can put there which has proved to be a problem with some OSes (such as when using very large initrd's) Fortunately, we do the actual copy of the device-tree into guest memory much later, during reset, late enough to be able to do it using the final RMA value, we just need to move the calculation to the right place. However, RTAS is still loaded too early, so we change the code to load the tiny blob into qemu memory early on, and then copy it into guest memory at reset time. It's small enough that the memory usage doesn't matter. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Fix ibm, associativity for memory nodesAlexey Kardashevskiy
We want the associtivity lists of memory and CPU nodes to match but memory nodes have incorrect domain#3 which is zero for CPU so they won't match. This clears domain#3 in the list to match CPUs associtivity lists. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Add a helper for node0_size calculationAlexey Kardashevskiy
In multiple places there is a node0_size variable calculation which assumes that NUMA node #0 and memory node #0 are the same things which they are not. Since we are going to change it and do not want to change it in multiple places, let's make a helper. This adds a spapr_node0_size() helper and makes use of it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Split memory nodes to power-of-two blocksAlexey Kardashevskiy
Linux kernel expects nodes to have power-of-two size and does WARN_ON if this is not the case: [ 0.041456] WARNING: at drivers/base/memory.c:115 which is: === /* Validate blk_sz is a power of 2 and not less than section size */ if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) { WARN_ON(1); block_sz = MIN_MEMORY_BLOCK_SIZE; } === This splits memory nodes into set of smaller blocks with a size which is a power of two. This makes sure the start address of every node is aligned to the node size. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: squash windows compile fix in] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Refactor spapr_populate_memory() to allow memoryless nodesAlexey Kardashevskiy
Current QEMU does not support memoryless NUMA nodes, however actual hardware may have them so it makes sense to have a way to emulate them in QEMU. This prepares SPAPR for that. This moves 2 calls of spapr_populate_memory_node() into the existing loop over numa nodes so first several nodes may have no memory and this still will work. If there is no numa configuration, the code assumes there is just a single node at 0 and it has all the guest memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Use DT memory node rendering helper for other nodesAlexey Kardashevskiy
This finishes refactoring by using the spapr_populate_memory_node helper for all nodes and removing leftovers from spapr_populate_memory(). This is not a part of the previous patch because the patches look nicer apart. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: Move DT memory node rendering to a helperAlexey Kardashevskiy
This moves recurring bits of code related to memory@xxx nodes creation to a helper. This makes use of the new helper for node@0. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: fix possible memory leakGonglei
get_boot_devices_list() will malloc memory, spapr_finalize_fdt doesn't free it. Signed-off-by: Chenliang <chenliang88@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08spapr: add uuid/host details to device treeNikunj A Dadhania
Useful for identifying the guest/host uniquely within the guest. Adding following properties to the guest root node. vm,uuid - uuid of the guest host-model - Host model number host-serial - Host machine serial number hypervisor type - Tells its "kvm" Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08ppc: spapr-rtas - implement os-term rtas callNikunj A Dadhania
PAPR compliant guest calls this in absence of kdump. This finally reaches the guest and can be handled according to the policies set by higher level tools(like taking dump) for further analysis by tools like crash. Linux kernel calls ibm,os-term when extended property of os-term is set. This makes sure that a return to the linux kernel is gauranteed. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [agraf: reduce RTAS_TOKEN_MAX] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-08-25spapr: Add support for new NMI interfaceAlexey Kardashevskiy
This implements an NMI interface POWERPC SPAPR machine. This enables an "nmi" HMP/QMP command supported on SPAPR. This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI to every CPU. The expected result is XMON (in-kernel debugger) invocation. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-15spapr: Enable use of huge pagesAlexey Kardashevskiy
0b183fc87 "memory: move mem_path handling to memory_region_allocate_system_memory" disabled -mempath use for all machines that do not use memory_region_allocate_system_memory() to register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages support was disabled for it. This replaces memory_region_init_ram()+vmstate_register_ram_global() with memory_region_allocate_system_memory() to get huge pages back. This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as the previous patch moved RMA memory region allocation after RAM allocation and therefore this change does not have immediate effect but simplifies the code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15spapr: Move RMA memory region registration codeAlexey Kardashevskiy
PPC970 does not support VRMA (virtual RMA) so real memory required for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl. Later this memory is used as a part of the guest RAM area. The RMA allocating code also registers a memory region for this piece of RAM. We are going to simplify memory regions layout: RMA memory region will be a subregion in the RAM memory region, both starting from zero. This way we will not have to take care of start address alignment for the piece of RAM next to the RMA. This moves memory region business closer to the RAM memory region creation/allocation code. As this is a mechanical patch, no change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on non-kvm systems] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08target-ppc: KVMPPC_H_CAS fix cpu-version endianessLaurent Dufour
During KVMPPC_H_CAS processing, the cpu-version updated value is stored without taking care of the current endianess. As a consequence, the guest may not switch to the right CPU model, leading to unexpected results. If needed, the value is now converted. Fixes: 6d9412ea8132 ("target-ppc: Implement "compat" CPU option") Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr: Remove @next_irqAlexey Kardashevskiy
This removes @next_irq from sPAPREnvironment which was used in old IRQ allocator as XICS is now responsible for IRQs and keeps track of allocated IRQs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr: Move interrupt allocator to xicsAlexey Kardashevskiy
The current allocator returns IRQ numbers from a pool and does not support IRQs reuse in any form as it did not keep track of what it previously returned, it only keeps the last returned IRQ. Some use cases such as PCI hot(un)plug may require IRQ release and reallocation. This moves an allocator from SPAPR to XICS. This switches IRQ users to use new API. This uses LSI/MSI flags to know if interrupt is allocated. The interrupt release function will be posted as a separate patch. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr: Define a 2.1 pseries machineAlexey Kardashevskiy
This adds a v2.1 machine to support backward compatibility for newer macines in the case if they ever be implemented. This adds a "pseries-2.1" machine as a child of the "pseries" machine and only changes visible machine name. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)Alexey Kardashevskiy
Every single sPAPR QOM object has small first "s". Most (not all yet) QOM objects have "State" suffix. This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU code style and removes redundant empty line. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27spapr: Add "qemu, boot-menu" property to /chosenAvik Sil
This is required to enable boot menu display during booting Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-19NUMA: Add numa_info structure to contain numa nodes infoWanlong Gao
Add the numa_info structure to contain the numa nodes memory, VCPUs information and the future added numa nodes host memory policies. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> [Fix hw/ppc/spapr.c - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-16spapr_pci: Advertise MSI quotaBadari Pulavarty
Hotplug of multiple disks fails due to MSI vector quota check. Number of MSI vectors default to 8 allowing only 4 devices. This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback to INTX. One way to workaround the issue is to increase total MSIs, so that MSI quota check allows us to hotplug multiple disks. This sets the quota to the maximum number of interupts XICS has which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c to xics.h for wider visibility. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> [aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Add kvm-type propertyEduardo Habkost
The kvm-type machine option was left out when MachineState was introduced, preventing the kvm-type option from being used. Add the missing property to the sPAPR machine class, so it can be used. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Create SPAPRMachine structEduardo Habkost
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16PPC: spapr: Expose /hypervisor node in device treeAlexander Graf
PR KVM supports an ePAPR compliant hypercall interface in parallel to the normal sPAPR one. Expose the ePAPR /hypervisor node and properties to the guest so it can use it. This enables magic page sharing on PR KVM with -M pseries. However we had a few nasty bugs in the magic page implementation on vcpus newer than 970 (p7, p8) that KVM now has workarounds for. It indicates that it does have these workarounds through the PPC_FIXUP_HCALL capability. To not expose broken guest kernels to issues on host kernels that don't have the fixups in place, we don't expose working hypercall instructions when the fixups are not available so that the guest can never active the magic page. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr_iommu: Enable multiple TCE requestsAlexey Kardashevskiy
Currently only single TCE entry per request is supported (H_PUT_TCE). However PAPR+ specification allows multiple entry requests such as H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host kernel via ioctls, support of these calls can accelerate IOMMU operations. This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT. This advertises "multi-tce" capability to the guest if the host kernel supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Enable dynamic change of the supported hypercalls listAlexey Kardashevskiy
At the moment the "ibm,hypertas-functions" list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. "qemu,hypertas-functions" is converted as well. The first user of this is going to be a "multi-tce" property. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Implement processor compatibility in ibm, client-architecture-supportAlexey Kardashevskiy
Modern Linux kernels support last POWERPC CPUs so when a kernel boots, in most cases it can find a matching cpu_spec in the kernel's cpu_specs list. However if the kernel is quite old, it may be missing a definition of the actual CPU. To provide an ability for old kernels to work on modern hardware, a Processor Compatibility Mode has been introduced by the PowerISA specification. >From the hardware prospective, it is supported by the Processor Compatibility Register (PCR) which is defined in PowerISA. The register enables one of the compatibility modes (2.05/2.06/2.07). Since PCR is a hypervisor privileged register and cannot be directly accessed from the guest, the mode selection is done via ibm,client-architecture-support (CAS) RTAS call using which the guest specifies what "raw" and "architected" CPU versions it supports. QEMU works out the best match, changes a "cpu-version" property of every CPU and notifies the guest about the change by setting these properties in the buffer passed as a response on a custom H_CAS hypercall. This implements ibm,client-architecture-support parameters parsing (now only for PVRs) and cooks the device tree diff with new values for "cpu-version", "ibm,ppc-interrupt-server#s" and "ibm,ppc-interrupt-server#s" properties. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Limit threads per core according to current compatibility modeAlexey Kardashevskiy
This puts a limit to the number of threads per core based on the current compatibility mode. Although PowerISA specs do not specify the maximum threads per core number, the linux guest still expects that PowerISA2.05-compatible CPU supports only 2 threads per core as this is what POWER6 (2.05 compliant CPU) implements, the same is for POWER7 (2.06, 4 threads) and POWER8 (2.07, 8 threads). This calls spapr_fixup_cpu_smt_dt() with the maximum allowed number of threads which affects ibm,ppc-interrupt-server#s and ibm,ppc-interrupt-gserver#s properties. The number of CPU nodesremains unchanged. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Rework spapr_fixup_cpu_dt()Alexey Kardashevskiy
In PPC code we usually use the "cs" name for a CPUState* variables and "cpu" for PowerPCCPU. So let's change spapr_fixup_cpu_dt() to use same rules as spapr_create_fdt_skel() does. This adds missing nodes creation if they do not already exist in the current device tree, this is going to be used from the client-architecture-support handler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16spapr: Add ibm, client-architecture-support callAlexey Kardashevskiy
The PAPR+ specification defines a ibm,client-architecture-support (CAS) RTAS call which purpose is to provide a negotiation mechanism for the guest and the hypervisor to work out the best compatibility parameters. During the negotiation process, the guest provides an array of various options and capabilities which it supports, the hypervisor adjusts the device tree and (optionally) reboots the guest. At the moment the Linux guest calls CAS method at early boot so SLOF gets called. SLOF allocates a memory buffer for the device tree changes and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options, composes a diff for the device tree, copies it to the buffer provided by SLOF and returns to SLOF. SLOF updates the device tree and returns control to the guest kernel. Only then the Linux guest parses the device tree so it is possible to avoid unnecessary reboot in most cases. The device tree diff is a header with an update format version (defined as 1 in this patch) followed by a device tree with the properties which require update. If QEMU detects that it has to reboot the guest, it silently does so as the guest expects reboot to happen because this is usual pHyp firmware behavior. This defines custom KVMPPC_H_CAS hypercall. The current SLOF already has support for it. This implements stub which returns very basic tree (root node, no properties) to the guest. As the return buffer does not contain any change, no change in behavior is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16target-ppc: Implement "compat" CPU optionAlexey Kardashevskiy
This adds basic support for the "compat" CPU option. By specifying the compat property, the user can manually switch guest CPU mode from "raw" to "architected". This defines feature disable bits which are not used yet as, for example, PowerISA 2.07 says if 2.06 mode is selected, the TM bit does not matter - transactional memory (TM) will be disabled because 2.06 does not define it at all. The same is true for VSX and 2.05 mode. So just setting a mode must be ok. This does not change the existing behavior as the actual compatibility mode support is coming in next patches. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>