slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2018-12-21	spapr: introduce an 'ic-mode' machine option	Cédric Le Goater
	This option is used to select the interrupt controller mode (XICS or XIVE) with which the machine will operate. XICS being the default mode for now. When running a machine with the XIVE interrupt mode backend, the guest OS is required to have support for the XIVE exploitation mode. In the case of legacy OS, the mode selected by CAS should be XICS and the OS should fail to boot. However, QEMU could possibly detect it, terminate the boot process and reset to stop in the SLOF firmware. This is not yet handled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: add an extra OV5 field to the sPAPR IRQ backend	Cédric Le Goater
	The interrupt modes supported by the hypervisor are advertised to the guest with new bits definitions of the option vector 5 of property "ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are defined as follow : 0b00 PAPR 2.7 and earlier (Legacy systems) 0b01 XIVE Exploitation mode only 0b10 Either available If the client/guest selects the XIVE interrupt mode, it informs the hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00 value indicates the use of the XICS interrupt mode (Legacy systems). The sPAPR IRQ backend is extended with these definitions and the values are directly used to populate the "ibm,arch-vec-5-platform-support" property. The interrupt mode is advertised under TCG and under KVM. Although a KVM XIVE device is not yet available, the machine can still operate with kernel_irqchip=off. However, we apply a restriction on the CPU which is required to be a POWER9 when a XIVE interrupt controller is in use. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: add a 'reset' method to the sPAPR IRQ backend	Cédric Le Goater
	For the time being, the XIVE reset handler updates the OS CAM line of the vCPU as it is done under a real hypervisor when a vCPU is scheduled to run on a HW thread. This will let the XIVE presenter engine find a match among the NVTs dispatched on the HW threads. This handler will become even more useful when we introduce the machine supporting both interrupt modes, XIVE and XICS. In this machine, the interrupt mode is chosen by the CAS negotiation process and activated after a reset. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fix style nits] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: extend the sPAPR IRQ backend for XICS migration	Cédric Le Goater
	Introduce a new sPAPR IRQ handler to handle resend after migration when the machine is using a KVM XICS interrupt controller model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: allocate the interrupt thread context under the CPU core	Cédric Le Goater
	Each interrupt mode has its own specific interrupt presenter object, that we store under the CPU object, one for XICS and one for XIVE. Extend the sPAPR IRQ backend with a new handler to support them both. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: add device tree support for the XIVE exploitation mode	Cédric Le Goater
	The XIVE interface for the guest is described in the device tree under the "interrupt-controller" node. A couple of new properties are specific to XIVE : - "reg" contains the base address and size of the thread interrupt managnement areas (TIMA), for the User level and for the Guest OS level. Only the Guest OS level is taken into account today. - "ibm,xive-eq-sizes" the size of the event queues. One cell per size supported, contains log2 of size, in ascending order. - "ibm,xive-lisn-ranges" the IRQ interrupt number ranges assigned to the guest for the IPIs. and also under the root node : - "ibm,plat-res-int-priorities" contains a list of priorities that the hypervisor has reserved for its own use. OPAL uses the priority 7 queue to automatically escalate interrupts for all other queues (DD2.X POWER9). So only priorities [0..6] are allowed for the guest. Extend the sPAPR IRQ backend with a new handler to populate the DT with the appropriate "interrupt-controller" node. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fix style nits] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: add hcalls support for the XIVE exploitation interrupt mode	Cédric Le Goater
	The different XIVE virtualization structures (sources and event queues) are configured with a set of Hypervisor calls : - H_INT_GET_SOURCE_INFO used to obtain the address of the MMIO page of the Event State Buffer (ESB) entry associated with the source. - H_INT_SET_SOURCE_CONFIG assigns a source to a "target". - H_INT_GET_SOURCE_CONFIG determines which "target" and "priority" is assigned to a source - H_INT_GET_QUEUE_INFO returns the address of the notification management page associated with the specified "target" and "priority". - H_INT_SET_QUEUE_CONFIG sets or resets the event queue for a given "target" and "priority". It is also used to set the notification configuration associated with the queue, only unconditional notification is supported for the moment. Reset is performed with a queue size of 0 and queueing is disabled in that case. - H_INT_GET_QUEUE_CONFIG returns the queue settings for a given "target" and "priority". - H_INT_RESET resets all of the guest's internal interrupt structures to their initial state, losing all configuration set via the hcalls H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG. - H_INT_SYNC issue a synchronisation on a source to make sure all notifications have reached their queue. Calls that still need to be addressed : H_INT_SET_OS_REPORTING_LINE H_INT_GET_OS_REPORTING_LINE See the code for more documentation on each hcall. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Folded in fix for field accessors] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: introduce a new machine IRQ backend for XIVE	Cédric Le Goater
	The XIVE IRQ backend uses the same layout as the new XICS backend but covers the full range of the IRQ number space. The IRQ numbers for the CPU IPIs are allocated at the bottom of this space, below 4K, to preserve compatibility with XICS which does not use that range. This should be enough given that the maximum number of CPUs is 1024 for the sPAPR machine under QEMU. For the record, the biggest POWER8 or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192 cores, SMT8). Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr/xive: introduce a XIVE interrupt controller	Cédric Le Goater
	sPAPRXive models the XIVE interrupt controller of the sPAPR machine. It inherits from the XiveRouter and provisions storage for the routing tables : - Event Assignment Structure (EAS) - Event Notification Descriptor (END) The sPAPRXive model incorporates an internal XiveSource for the IPIs and for the interrupts of the virtual devices of the guest. This model is consistent with XIVE architecture which also incorporates an internal IVSE for IPIs and accelerator interrupts in the IVRE sub-engine. The sPAPRXive model exports two memory regions, one for the ESB trigger and management pages used to control the sources and one for the TIMA pages. They are mapped by default at the addresses found on chip 0 of a baremetal system. This is also consistent with the XIVE architecture which defines a Virtualization Controller BAR for the internal IVSE ESB pages and a Thread Managment BAR for the TIMA. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Fold in field accessor fixes] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce a simplified XIVE presenter	Cédric Le Goater
	The last sub-engine of the XIVE architecture is the Interrupt Virtualization Presentation Engine (IVPE). On HW, the IVRE and the IVPE share elements, the Power Bus interface (CQ), the routing table descriptors, and they can be combined in the same HW logic. We do the same in QEMU and combine both engines in the XiveRouter for simplicity. When the IVRE has completed its job of matching an event source with a Notification Virtual Target (NVT) to notify, it forwards the event notification to the IVPE sub-engine. The IVPE scans the thread interrupt contexts of the Notification Virtual Targets (NVT) dispatched on the HW processor threads and if a match is found, it signals the thread. If not, the IVPE escalates the notification to some other targets and records the notification in a backlog queue. The IVPE maintains the thread interrupt context state for each of its NVTs not dispatched on HW processor threads in the Notification Virtual Target table (NVTT). The model currently only supports single NVT notifications. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Folded in fix for field accessors] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce the XIVE interrupt thread context	Cédric Le Goater
	Each POWER9 processor chip has a XIVE presenter that can generate four different exceptions to its threads: - hypervisor exception, - O/S exception - Event-Based Branch (EBB) - msgsnd (doorbell). Each exception has a state independent from the others called a Thread Interrupt Management context. This context is a set of registers which lets the thread handle priority management and interrupt acknowledgment among other things. The most important ones being : - Interrupt Priority Register (PIPR) - Interrupt Pending Buffer (IPB) - Current Processor Priority (CPPR) - Notification Source Register (NSR) These registers are accessible through a specific MMIO region, called the Thread Interrupt Management Area (TIMA), four aligned pages, each exposing a different view of the registers. First page (page address ending in 0b00) gives access to the entire context and is reserved for the ring 0 view for the physical thread context. The second (page address ending in 0b01) is for the hypervisor, ring 1 view. The third (page address ending in 0b10) is for the operating system, ring 2 view. The fourth (page address ending in 0b11) is for user level, ring 3 view. The thread interrupt context is modeled with a XiveTCTX object containing the values of the different exception registers. The TIMA region is mapped at the same address for each CPU. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: add support for the END Event State Buffers	Cédric Le Goater
	The Event Notification Descriptor (END) XIVE structure also contains two Event State Buffers providing further coalescing of interrupts, one for the notification event (ESn) and one for the escalation events (ESe). A MMIO page is assigned for each to control the EOI through loads only. Stores are not allowed. The END ESBs are modeled through an object resembling the 'XiveSource' It is stateless as the END state bits are backed into the XiveEND structure under the XiveRouter and the MMIO accesses follow the same rules as for the XiveSource ESBs. END ESBs are not supported by the Linux drivers neither on OPAL nor on sPAPR. Nevetherless, it provides a mean to study the question in the future and validates a bit more the XIVE model. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fold in a later fix for field access] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: export and rename the xics_max_server_number() routine	Cédric Le Goater
	The XIVE sPAPR IRQ backend will use it to define the number of ENDs of the IC controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	spapr: introduce a spapr_irq_init() routine	Cédric Le Goater
	Initialize the MSI bitmap from it as this will be necessary for the sPAPR IRQ backend for XIVE. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce the XIVE Event Notification Descriptors	Cédric Le Goater
	To complete the event routing, the IVRE sub-engine uses a second table containing Event Notification Descriptor (END) structures. An END specifies on which Event Queue (EQ) the event notification data, defined in the associated EAS, should be posted when an exception occurs. It also defines which Notification Virtual Target (NVT) should be notified. The Event Queue is a memory page provided by the O/S defining a circular buffer, one per server and priority couple, containing Event Queue entries. These are 4 bytes long, the first bit being a 'generation' bit and the 31 following bits the END Data field. They are pulled by the O/S when the exception occurs. The END Data field is a way to set an invariant logical event source number for an IRQ. On sPAPR machines, it is set with the H_INT_SET_SOURCE_CONFIG hcall when the EISN flag is used. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fold in a later fix from Cédric fixing field accessors] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce the XiveRouter model	Cédric Le Goater
	The XiveRouter models the second sub-engine of the XIVE architecture : the Interrupt Virtualization Routing Engine (IVRE). The IVRE handles event notifications of the IVSE and performs the interrupt routing process. For this purpose, it uses a set of tables stored in system memory, the first of which being the Event Assignment Structure (EAS) table. The EAT associates an interrupt source number with an Event Notification Descriptor (END) which will be used in a second phase of the routing process to identify a Notification Virtual Target. The XiveRouter is an abstract class which needs to be inherited from to define a storage for the EAT, and other upcoming tables. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Folded in parts of a later fix by Cédric fixing field access] [dwg: Fix style nits] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce the XiveNotifier interface	Cédric Le Goater
	The XiveNotifier offers a simple interface, between the XiveSource object and the main interrupt controller of the machine. It will forward event notifications to the XIVE Interrupt Virtualization Routing Engine (IVRE). Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Adjust type name string for XiveNotifier] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: add support for the LSI interrupt sources	Cédric Le Goater
	The 'sent' status of the LSI interrupt source is modeled with the 'P' bit of the ESB and the assertion status of the source is maintained with an extra bit under the main XiveSource object. The type of the source is stored in the same array for practical reasons. Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Fix style nit] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	ppc/xive: introduce a XIVE interrupt source model	Cédric Le Goater
	The first sub-engine of the overall XIVE architecture is the Interrupt Virtualization Source Engine (IVSE). An IVSE can be integrated into another logic, like in a PCI PHB or in the main interrupt controller to manage IPIs. Each IVSE instance is associated with an Event State Buffer (ESB) that contains a two bit state entry for each possible event source. When an event is signaled to the IVSE, by MMIO or some other means, the associated interrupt state bits are fetched from the ESB and modified. Depending on the resulting ESB state, the event is forwarded to the IVRE sub-engine of the controller doing the routing. Each supported ESB entry is associated with either a single or a even/odd pair of pages which provides commands to manage the source: to EOI, to turn off the source for instance. On a sPAPR machine, the O/S will obtain the page address of the ESB entry associated with a source and its characteristic using the H_INT_GET_SOURCE_INFO hcall. On PowerNV, a similar OPAL call is used. The xive_source_notify() routine is in charge forwarding the source event notification to the routing engine. It will be filled later on. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-21	mac_newworld: simplify IRQ wiring	Greg Kurz
	The OpenPIC have 5 outputs per connected CPU. The machine init code hence needs a bi-dimensional array (smp_cpu lines, 5 columns) to wire up the irqs between the PIC and the CPUs. The current code first allocates an array of smp_cpus pointers to qemu_irq type, then it allocates another array of smp_cpus * 5 qemu_irq and fills the first array with pointers to each line of the second array. This is rather convoluted. Simplify the logic by introducing a structured type that describes all the OpenPIC outputs for a single CPU, ie, fixed size of 5 qemu_irq, and only allocate a smp_cpu sized array of those. This also allows to use g_new(T, n) instead of g_malloc(sizeof(T) * n) as recommended in HACKING. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-12-20	Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20181220' into staging	Peter Maydell
	Two s390x bugfixes. # gpg: Signature made Thu 20 Dec 2018 16:36:42 GMT # gpg: using RSA key DECF6B93C6F02FAF # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" # gpg: aka "Cornelia Huck <cohuck@kernel.org>" # gpg: aka "Cornelia Huck <cohuck@redhat.com>" # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF * remotes/cohuck/tags/s390x-20181220: hw/s390x: Fix bad mask in time2tod() hw/s390/ccw.c: Don't take address of packed members Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-20	sifive_uart: Implement interrupt pending register	Nathaniel Graff
	The watermark bits are set in the interrupt pending register according to the configuration of txcnt and rxcnt in the txctrl and rxctrl registers. Since the UART TX does not implement a FIFO, the txwm bit is set as long as the TX watermark level is greater than zero. Signed-off-by: Nathaniel Graff <nathaniel.graff@sifive.com> Reviewed-by: Michael Clark <mjc@sifive.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-20	sifive_u: Add clock DT node for GEM ethernet	Anup Patel
	The GEM ethernet on SiFive unleashed has fixed input clock of 125MHz as-per SiFive FU540 manual. This patch updates FDT generation for QEMU sifive_u machine to provide fixed-rate clock for GEM ethernet. Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-20	hw/riscv/virt: Connect the gpex PCIe	Alistair Francis
	Connect the gpex PCIe device based on the device tree included in the HiFive Unleashed ROM. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-20	hw/riscv/virt: Increase the number of interrupts	Alistair Francis
	Increase the number of interrupts to match the HiFive Unleashed board. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-20	x86-iommu: switch intr_supported to OnOffAuto type	Peter Xu
	Switch the intr_supported variable from a boolean to OnOffAuto type so that we can know whether the user specified it or not. With that we'll have a chance to help the user to choose more wisely where possible. Introduce x86_iommu_ir_supported() to mask these changes. No functional change at all. Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	q35: set split kernel irqchip as default	Peter Xu
	Starting from QEMU 4.0, let's specify "split" as the default value for kernel-irqchip. So for QEMU>=4.0 we'll have: allowed=Y,required=N,split=Y for QEMU<=3.1 we'll have: allowed=Y,required=N,split=N (omitting all the "kernel_irqchip_" prefix) Note that this will let the default q35 machine type to depend on Linux version 4.4 or newer because that's where split irqchip is introduced in kernel. But it's fine since we're boosting supported Linux version for QEMU 4.0 to around Linux 4.5. For more information please refer to the discussion on AMD's RDTSCP: https://lore.kernel.org/lkml/20181210181328.GA762@zn.tnic/ Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/shpc: perform unplug via the hotplug handler	David Hildenbrand
	Introduce and use the "unplug" callback. This is a preparation for multi-stage hotplug handlers, whereby the bus hotplug handler is overwritten by the machine hotplug handler. This handler will then pass control to the bus hotplug handler. So to get this running cleanly, we also have to make sure to go via the hotplug handler chain when actually unplugging a device after an unplug request. Lookup the hotplug handler and call "unplug". Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci: Reuse pci-bridge hotplug handler handlers for pcie-pci-bridge	David Hildenbrand
	These functions are essentially the same, we only have to use object_get_typename() for reporting errors. So let's share the implementation of hotplug handler callbacks. Suggested-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/pcie: perform unplug via the hotplug handler	David Hildenbrand
	Introduce and use the "unplug" callback. This is a preparation for multi-stage hotplug handlers, whereby the bus hotplug handler is overwritten by the machine hotplug handler. This handler will then pass control to the bus hotplug handler. So to get this running cleanly, we also have to make sure to go via the hotplug handler chain when actually unplugging a device after an unplug request. Lookup the hotplug handler and call "unplug". Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/pcihp: perform unplug via the hotplug handler	David Hildenbrand
	Introduce and use the "unplug" callback. This is a preparation for multi-stage hotplug handlers, whereby the bus hotplug handler is overwritten by the machine hotplug handler. This handler will then pass control to the bus hotplug handler. So to get this running cleanly, we also have to make sure to go via the hotplug handler chain when actually unplugging a device after an unplug request. Lookup the hotplug handler and call "unplug". Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/pcihp: perform check for bus capability in pre_plug handler	David Hildenbrand
	Perform the check in the pre_plug handler. In addition, we need the capability only if the device is actually hotplugged (and not created during machine initialization). This is a preparation for coldplugging pci devices via that hotplug handler. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/shpc: rename hotplug handler callbacks	David Hildenbrand
	The callbacks are also called for cold plugged devices. Drop the "hot" to better match the actual callback names. While at it, also rename shpc_device_hotplug_common() to shpc_device_plug_common(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	pci/pcie: rename hotplug handler callbacks	David Hildenbrand
	The callbacks are also called for cold plugged devices. Drop the "hot" to better match the actual callback names. While at it, also rename pcie_cap_slot_hotplug_common() to pcie_cap_slot_plug_common(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	hw: acpi: Remove AcpiRsdpDescriptor and fix tests	Samuel Ortiz
	The only remaining AcpiRsdpDescriptor users are the ACPI utils for the BIOS table tests. We remove that dependency and can thus remove the structure itself. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-20	hw/s390x: Fix bad mask in time2tod()	Thomas Huth
	Since "s390x/tcg: avoid overflows in time2tod/tod2time", the time2tod() function tries to deal with the 9 uppermost bits in the time value, but uses the wrong mask for this: 0xff80000000000000 should be used instead of 0xff10000000000000 here. Fixes: 14055ce53c2d901d826ffad7fb7d6bb8ab46bdfd Cc: qemu-stable@nongnu.org Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1544792887-14575-1-git-send-email-thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [CH: tweaked commit message] Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-12-20	Clean up includes	Markus Armbruster
	Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes, with the changes to the following files manually reverted: contrib/libvhost-user/libvhost-user-glib.h contrib/libvhost-user/libvhost-user.c contrib/libvhost-user/libvhost-user.h linux-user/mips64/cpu_loop.c linux-user/mips64/signal.c linux-user/sparc64/cpu_loop.c linux-user/sparc64/signal.c linux-user/x86_64/cpu_loop.c linux-user/x86_64/signal.c target/s390x/gen-features.c tests/migration/s390x/a-b-bios.c tests/test-rcu-simpleq.c tests/test-rcu-tailq.c Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20181204172535.2799-1-armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Viktor Prutyanov <viktor.prutyanov@phystech.edu>
2018-12-19	hw: acpi: Export and share the ARM RSDP build	Samuel Ortiz
	Now that build_rsdp() supports building both legacy and current RSDP tables, we can move it to a generic folder (hw/acpi) and have the i386 ACPI code reuse it in order to reduce code duplication. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com>
2018-12-19	hw: arm: Carry RSDP specific data through AcpiRsdpData	Samuel Ortiz
	That will allow us to generalize the ARM build_rsdp() routine to support both legacy RSDP (The current i386 implementation) and extended RSDP (The ARM implementation). Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	intel_iommu: dma read/write draining support	Peter Xu
	Support DMA read/write draining should be easy for existing VT-d emulation since the emulation itself does not have any request queue there so we don't need to do anything to flush the un-commited queue. What we need to do is to declare the support. These capabilities are required to pass Windows SVVP test program. It is verified that when with parameters "x-aw-bits=48,caching-mode=off" we can pass the Windows SVVP test with this patch applied. Otherwise we'll fail with: IOMMU[0] - DWD (DMA write draining) not supported IOMMU[0] - DWD (DMA read draining) not supported Segment 0 has no DMA remapping capable IOMMU units However since these bits are not declared support for QEMU<=3.1, we'll need a compatibility bit for it and we turn this on by default only for QEMU>=4.0. Please refer to VT-d spec 6.5.4 for more information. CC: Yu Wang <wyu@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1654550 Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	pcie: Fast PCIe root ports for new machines	Alex Williamson
	Change the default speed and width for new machine types to the fastest and widest currently supported. This should be compatible to the PCIe 4.0 spec. Pre-QEMU-4.0 machine types remain at 2.5GT/s, x1 width. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	pcie: Add link speed and width fields to PCIESlot	Alex Williamson
	Add fields allowing the PCIe link speed and width of a PCIESlot to be configured, with an instance_post_init callback on the root port parent class to set defaults. This allows child classes to set these via properties or via their own instance_init callback, without requiring all implementions to support arbitrary user selected values. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	qapi: Define PCIe link speed and width properties	Alex Williamson
	Create properties to be able to define speeds and widths for PCIe links. The only tricky bit here is that our get and set callbacks translate from the fixed QAPI automagic enums to those we define in PCI code to represent the actual register segment value. Cc: Eric Blake <eblake@redhat.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	pci: Sync PCIe downstream port LNKSTA on read	Alex Williamson
	The PCIe link speed and width between a downstream device and its upstream port is negotiated on real hardware and susceptible to dynamic changes due to signal issues and power management. In the emulated device case there is no real hardware link, but we still might wish to have some consistency between endpoint and downstream port via a virtual negotiation. There is of course a real link for assigned devices and this same virtual negotiation allows the downstream port to match the endpoint, synchronizing on every read to support underlying physical hardware dynamically adjusting the link. This negotiation is intentionally unidirectional for compatibility. If the endpoint exceeds the capabilities of the downstream port or there is no endpoint device, the downstream port reports negotiation to its maximum speed and width, matching the previous case where negotiation was absent. De-tuning the endpoint to match a virtual link doesn't seem to benefit anyone and is a condition we've thus far reported without functional issues. Note that PCI_EXP_LNKSTA is already ignored for migration compatibility via pcie_cap_v1_fill(). Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	pcie: Create enums for link speed and width	Alex Williamson
	In preparation for reporting higher virtual link speeds and widths, create enums and macros to help us manage them. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	hw/smbios: Move to the hw/firmware/ subdirectory	Philippe Mathieu-Daudé
	SMBIOS is just another firmware interface used by some QEMU models. We will later introduce more firmware interfaces in this subdirectory. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	hw/smbios: Restrict access to "hw/smbios/ipmi.h"	Philippe Mathieu-Daudé
	All the consumers of "hw/smbios/ipmi.h" are located in hw/smbios/. There is no need to have this include publicly exposed, reduce the visibility by moving it in hw/smbios/. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19	Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-12-18' into ↵	Peter Maydell
	staging QAPI patches for 2018-12-18 # gpg: Signature made Tue 18 Dec 2018 07:20:11 GMT # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-qapi-2018-12-18: qapi: fix flat union on uncovered branches conditionals qmp hmp: Make system_wakeup check wake-up support and run state qga: update guest-suspend-ram and guest-suspend-hybrid descriptions qmp: query-current-machine with wakeup-suspend-support qmp: Split ShutdownCause host-qmp into quit and system-reset qmp: Add reason to SHUTDOWN and RESET events qapi: Turn ShutdownCause into QAPI enum Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-18	qmp hmp: Make system_wakeup check wake-up support and run state	Daniel Henrique Barboza
	The qmp/hmp command 'system_wakeup' is simply a direct call to 'qemu_system_wakeup_request' from vl.c. This function verifies if runstate is SUSPENDED and if the wake up reason is valid before proceeding. However, no error or warning is thrown if any of those pre-requirements isn't met. There is no way for the caller to differentiate between a successful wakeup or an error state caused when trying to wake up a guest that wasn't suspended. This means that system_wakeup is silently failing, which can be considered a bug. Adding error handling isn't an API break in this case - applications that didn't check the result will remain broken, the ones that check it will have a chance to deal with it. Adding to that, the commit before previous created a new QMP API called query-current-machine, with a new flag called wakeup-suspend-support, that indicates if the guest has the capability of waking up from suspended state. Although such guest will never reach SUSPENDED state and erroring it out in this scenario would suffice, it is more informative for the user to differentiate between a failure because the guest isn't suspended versus a failure because the guest does not have support for wake up at all. All this considered, this patch changes qmp_system_wakeup to check if the guest is capable of waking up from suspend, and if it is suspended. After this patch, this is the output of system_wakeup in a guest that does not have wake-up from suspend support (ppc64): (qemu) system_wakeup wake-up from suspend is not supported by this guest (qemu) And this is the output of system_wakeup in a x86 guest that has the support but isn't suspended: (qemu) system_wakeup Unable to wake up: guest is not in suspended state (qemu) Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20181205194701.17836-4-danielhb413@gmail.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-12-18	qmp: query-current-machine with wakeup-suspend-support	Daniel Henrique Barboza
	When issuing the qmp/hmp 'system_wakeup' command, what happens in a nutshell is: - qmp_system_wakeup_request set runstate to RUNNING, sets a wakeup_reason and notify the event - in the main_loop, all vcpus are paused, a system reset is issued, all subscribers of wakeup_notifiers receives a notification, vcpus are then resumed and the wake up QAPI event is fired Note that this procedure alone doesn't ensure that the guest will awake from SUSPENDED state - the subscribers of the wake up event must take action to resume the guest, otherwise the guest will simply reboot. At this moment, only the ACPI machines via acpi_pm1_cnt_init and xen_hvm_init have wake-up from suspend support. However, only the presence of 'system_wakeup' is required for QGA to support 'guest-suspend-ram' and 'guest-suspend-hybrid' at this moment. This means that the user/management will expect to suspend the guest using one of those suspend commands and then resume execution using system_wakeup, regardless of the support offered in system_wakeup in the first place. This patch creates a new API called query-current-machine [1], that holds a new flag called 'wakeup-suspend-support' that indicates if the guest supports wake up from suspend via system_wakeup. The machine is considered to implement wake-up support if a call to a new 'qemu_register_wakeup_support' is made during its init, as it is now being done inside acpi_pm1_cnt_init and xen_hvm_init. This allows for any other machine type to declare wake-up support regardless of ACPI state or wakeup_notifiers subscription, making easier for newer implementations that might have their own mechanisms in the future. This is the expected output of query-current-machine when running a x86 guest: {"execute" : "query-current-machine"} {"return": {"wakeup-suspend-support": true}} Running the same x86 guest, but with the --no-acpi option: {"execute" : "query-current-machine"} {"return": {"wakeup-suspend-support": false}} This is the output when running a pseries guest: {"execute" : "query-current-machine"} {"return": {"wakeup-suspend-support": false}} With this extra tool, management can avoid situations where a guest that does not have proper suspend/wake capabilities ends up in inconsistent state (e.g. https://github.com/open-power-host-os/qemu/issues/31). [1] the decision of creating the query-current-machine API is based on discussions in the QEMU mailing list where it was decided that query-target wasn't a proper place to store the wake-up flag, neither was query-machines because this isn't a static property of the machine object. This new API can then be used to store other dynamic machine properties that are scattered around the code ATM. More info at: https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg04235.html Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20181205194701.17836-2-danielhb413@gmail.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>