author     Peter Maydell <peter.maydell@linaro.org>  2016-06-24 11:00:15 +0100
committer  Peter Maydell <peter.maydell@linaro.org>  2016-06-24 11:00:15 +0100
commit     a01aef5d2f96c334d048f43f0d3573a1152b37ca (patch)
tree       fce1969574c37a87816c17f98d4fd5a9f818bf00
parent     c7288767523f6510cf557707d3eb5e78e519b90d (diff)
parent     21a4d96243e60a4c8eeb124a023b8a3bd9120e18 (diff)
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
nvdimm label support
cpu acpi hotplug rework
virtio rework
misc cleanups and fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 24 Jun 2016 06:50:32 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (34 commits)
virtio-bus: remove old set_host_notifier callback
virtio-mmio: convert to ioeventfd callbacks
virtio-pci: convert to ioeventfd callbacks
virtio-ccw: convert to ioeventfd callbacks
virtio-bus: have callers tolerate new host notifier api
virtio-bus: common ioeventfd infrastructure
pc: acpi: drop intermediate PCMachineState.node_cpu
acpi-test-data: update expected
pc: use new CPU hotplug interface since 2.7 machine type
acpi: cpuhp: add cpu._OST handling
acpi: cpuhp: implement hot-remove parts of CPU hotplug interface
acpi: cpuhp: implement hot-add parts of CPU hotplug interface
pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
acpi: cpuhp: add CPU devices AML with _STA method
pc: piix4/ich9: add 'cpu-hotplug-legacy' property
docs: update ACPI CPU hotplug spec with new protocol
i386: pci-assign: Fix MSI-X table size
docs: add NVDIMM ACPI documentation
nvdimm acpi: support Set Namespace Label Data function
nvdimm acpi: support Get Namespace Label Data function
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
53 files changed, 2505 insertions(+), 439 deletions(-)
diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
index 340b751a95..ee219c8358 100644
--- a/docs/specs/acpi_cpu_hotplug.txt
+++ b/docs/specs/acpi_cpu_hotplug.txt
@@ -4,21 +4,91 @@ QEMU<->ACPI BIOS CPU hotplug interface
 QEMU supports CPU hotplug via ACPI. This document
 describes the interface between QEMU and the ACPI BIOS.
 
-ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
------------------------------------------
-
-Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
-hot-add/remove event to ACPI BIOS, via SCI interrupt.
+ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
+and hot-remove events.
+
+============================================
+Legacy ACPI CPU hotplug interface registers:
+--------------------------------------------
 CPU present bitmap for:
   ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
   PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
+  One bit per CPU. Bit position reflects corresponding CPU APIC ID.
+  Read-only.
+  The first DWORD in bitmap is used in write mode to switch from legacy
+  to new CPU hotplug interface, write 0 into it to do switch.
 ---------------------------------------------------------------
-One bit per CPU. Bit position reflects corresponding CPU APIC ID.
-Read-only.
+QEMU sets corresponding CPU bit on hot-add event and issues SCI
+with GPE.2 event set. CPU present map is read by ACPI BIOS GPE.2 handler
+to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
+
+=====================================
+ACPI CPU hotplug interface registers:
+-------------------------------------
+Register block base address:
+    ICH9-LPC IO port 0x0cd8
+    PIIX-PM  IO port 0xaf00
+Register block size:
+    ACPI_CPU_HOTPLUG_REG_LEN = 12
+
+read access:
+    offset:
+    [0x0-0x3] reserved
+    [0x4] CPU device status fields: (1 byte access)
+        bits:
+           0: Device is enabled and may be used by guest
+           1: Device insert event, used to distinguish device for which
+              no device check event to OSPM was issued.
+              It's valid only when bit 0 is set.
+           2: Device remove event, used to distinguish device for which
+              no device eject request to OSPM was issued.
+           3-7: reserved and should be ignored by OSPM
+    [0x5-0x7] reserved
+    [0x8] Command data: (DWORD access)
+          in case of error or unsupported command reads is 0xFFFFFFFF
+          current 'Command field' value:
+              0: returns PXM value corresponding to device
+
+write access:
+    offset:
+    [0x0-0x3] CPU selector: (DWORD access)
+              selects active CPU device. All following accesses to other
+              registers will read/store data from/to selected CPU.
+    [0x4] CPU device control fields: (1 byte access)
+        bits:
+            0: reserved, OSPM must clear it before writing to register.
+            1: if set to 1 clears device insert event, set by OSPM
+               after it has emitted device check event for the
+               selected CPU device
+            2: if set to 1 clears device remove event, set by OSPM
+               after it has emitted device eject request for the
+               selected CPU device
+            3: if set to 1 initiates device eject, set by OSPM when it
+               triggers CPU device removal and calls _EJ0 method
+            4-7: reserved, OSPM must clear them before writing to register
+    [0x5] Command field: (1 byte access)
+          value:
+            0: selects a CPU device with inserting/removing events and
+               following reads from 'Command data' register return
+               selected CPU (CPU selector value). If no CPU with events
+               found, the current CPU selector doesn't change and
+               corresponding insert/remove event flags are not set.
+            1: following writes to 'Command data' register set OST event
+               register in QEMU
+            2: following writes to 'Command data' register set OST status
+               register in QEMU
+            other values: reserved
+    [0x6-0x7] reserved
+    [0x8] Command data: (DWORD access)
+          current 'Command field' value:
+              0: OSPM reads value of CPU selector
+              1: stores value into OST event register
+              2: stores value into OST status register, triggers
+                 ACPI_DEVICE_OST QMP event from QEMU to external applications
+                 with current values of OST event and status registers.
+              other values: reserved
 
-CPU hot-add/remove notification:
------------------------------------------------------
-QEMU sets/clears corresponding CPU bit on hot-add/remove event.
-CPU present map read by ACPI BIOS GPE.2 handler to notify OS of CPU
-hot-(un)plug events.
+Selecting CPU device beyond possible range has no effect on platform:
+   - write accesses to CPU hot-plug registers not documented above are
+     ignored
+   - read accesses to CPU hot-plug registers not documented above return
+     all bits set to 0.
diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
new file mode 100644
index 0000000000..0fdd251fc0
--- /dev/null
+++ b/docs/specs/acpi_nvdimm.txt
@@ -0,0 +1,132 @@
+QEMU<->ACPI BIOS NVDIMM interface
+---------------------------------
+
+QEMU supports NVDIMM via ACPI. This document describes the basic concepts of
+NVDIMM ACPI and the interface between QEMU and the ACPI BIOS.
+
+NVDIMM ACPI Background
+----------------------
+NVDIMM is introduced in ACPI 6.0 which defines an NVDIMM root device under
+_SB scope with a _HID of “ACPI0012”. For each NVDIMM present or intended
+to be supported by platform, platform firmware also exposes an ACPI
+Namespace Device under the root device.
+
+The NVDIMM child devices under the NVDIMM root device are defined with _ADR
+corresponding to the NFIT device handle. The NVDIMM root device and the
+NVDIMM devices can have device specific methods (_DSM) to provide additional
+functions specific to a particular NVDIMM implementation.
+
+This is an example from ACPI 6.0, a platform contains one NVDIMM:
+
+Scope (\_SB){
+   Device (NVDR) // Root device
+   {
+      Name (_HID, “ACPI0012”)
+      Method (_STA) {...}
+      Method (_FIT) {...}
+      Method (_DSM, ...) {...}
+      Device (NVD)
+      {
+         Name(_ADR, h) //where h is NFIT Device Handle for this NVDIMM
+         Method (_DSM, ...) {...}
+      }
+   }
+}
+
+Method supported on both NVDIMM root device and NVDIMM device
+_DSM (Device Specific Method)
+   It is a control method that enables devices to provide device specific
+   control functions that are consumed by the device driver.
+   The NVDIMM DSM specification can be found at:
+        http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
+
+   Arguments:
+   Arg0 – A Buffer containing a UUID (16 Bytes)
+   Arg1 – An Integer containing the Revision ID (4 Bytes)
+   Arg2 – An Integer containing the Function Index (4 Bytes)
+   Arg3 – A package containing parameters for the function specified by the
+          UUID, Revision ID, and Function Index
+
+   Return Value:
+   If Function Index = 0, a Buffer containing a function index bitfield.
+   Otherwise, the return value and type depends on the UUID, revision ID
+   and function index which are described in the DSM specification.
+
+Methods on NVDIMM ROOT Device
+_FIT(Firmware Interface Table)
+   It evaluates to a buffer returning data in the format of a series of NFIT
+   Type Structure.
+
+   Arguments: None
+
+   Return Value:
+   A Buffer containing a list of NFIT Type structure entries.
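To make the register protocol in the acpi_cpu_hotplug.txt hunk above concrete, the following is a minimal, illustrative C sketch of the scan loop that the generated \_SB.CPUS.CSCN method performs over this register block. It is not code from this series: the CPHP_* constants and the port-I/O helper prototypes (outb/outl/inb/inl as used here) are hypothetical stand-ins for whatever port access primitives the firmware or OSPM environment provides.

    /*
     * Illustrative sketch of the modern CPU hotplug register protocol.
     * The CPHP_* names and the port-I/O helper declarations are
     * hypothetical stand-ins; this is not part of QEMU or of this series.
     */
    #include <stdbool.h>
    #include <stdint.h>

    #define CPHP_BASE          0xaf00             /* PIIX-PM; ICH9-LPC uses 0x0cd8 */
    #define CPHP_REG_SELECTOR  (CPHP_BASE + 0x0)  /* DWORD, write-only */
    #define CPHP_REG_FLAGS     (CPHP_BASE + 0x4)  /* byte, read/write  */
    #define CPHP_REG_COMMAND   (CPHP_BASE + 0x5)  /* byte, write-only  */
    #define CPHP_REG_CMD_DATA  (CPHP_BASE + 0x8)  /* DWORD, read/write */

    #define CPHP_CMD_GET_NEXT_CPU_WITH_EVENT  0

    /* hypothetical port-I/O helpers (argument order chosen for this sketch) */
    extern void outl(uint16_t port, uint32_t val);
    extern void outb(uint16_t port, uint8_t val);
    extern uint8_t inb(uint16_t port);
    extern uint32_t inl(uint16_t port);

    /* Handle one GPE.2 notification: find and acknowledge pending events. */
    static void cphp_scan(void)
    {
        bool more = true;

        /* start from CPU 0; the command searches from the current selector
         * and wraps around, so any starting point works */
        outl(CPHP_REG_SELECTOR, 0);

        while (more) {
            more = false;
            /* ask QEMU to select the next CPU with a pending event */
            outb(CPHP_REG_COMMAND, CPHP_CMD_GET_NEXT_CPU_WITH_EVENT);
            uint32_t cpu = inl(CPHP_REG_CMD_DATA); /* index of selected CPU */
            uint8_t flags = inb(CPHP_REG_FLAGS);

            if (flags & 0x02) {             /* bit 1: insert event pending */
                /* OSPM would issue Notify(<CPU 'cpu'>, 1) (device check) */
                outb(CPHP_REG_FLAGS, 0x02); /* write bit 1 to clear event */
                more = true;
            } else if (flags & 0x04) {      /* bit 2: remove event pending */
                /* OSPM would issue Notify(<CPU 'cpu'>, 3) (eject request) */
                outb(CPHP_REG_FLAGS, 0x04); /* write bit 2 to clear event */
                more = true;
            }
            (void)cpu;
        }
    }

The same selector/command/data sequence is what hw/acpi/cpu.c in this series emulates on the QEMU side in cpu_hotplug_rd()/cpu_hotplug_wr().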
+ + The detailed definition of the structure can be found at ACPI 6.0: 5.2.25 + NVDIMM Firmware Interface Table (NFIT). + +QEMU NVDIMM Implemention +======================== +QEMU uses 4 bytes IO Port starting from 0x0a18 and a RAM-based memory page +for NVDIMM ACPI. + +Memory: + QEMU uses BIOS Linker/loader feature to ask BIOS to allocate a memory + page and dynamically patch its into a int32 object named "MEMA" in ACPI. + + This page is RAM-based and it is used to transfer data between _DSM + method and QEMU. If ACPI has control, this pages is owned by ACPI which + writes _DSM input data to it, otherwise, it is owned by QEMU which + emulates _DSM access and writes the output data to it. + + ACPI writes _DSM Input Data (based on the offset in the page): + [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM + Root device. + [0x4 - 0x7]: 4 bytes, Revision ID, that is the Arg1 of _DSM method. + [0x8 - 0xB]: 4 bytes. Function Index, that is the Arg2 of _DSM method. + [0xC - 0xFFF]: 4084 bytes, the Arg3 of _DSM method. + + QEMU Writes Output Data (based on the offset in the page): + [0x0 - 0x3]: 4 bytes, the length of result + [0x4 - 0xFFF]: 4092 bytes, the DSM result filled by QEMU + +IO Port 0x0a18 - 0xa1b: + ACPI writes the address of the memory page allocated by BIOS to this + port then QEMU gets the control and fills the result in the memory page. + + write Access: + [0x0a18 - 0xa1b]: 4 bytes, the address of the memory page allocated + by BIOS. + +_DSM process diagram: +--------------------- +"MEMA" indicates the address of memory page allocated by BIOS. + + +----------------------+ +-----------------------+ + | 1. OSPM | | 2. OSPM | + | save _DSM input data | | write "MEMA" to | Exit to QEMU + | to the page +----->| IO port 0x0a18 +------------+ + | indicated by "MEMA" | | | | + +----------------------+ +-----------------------+ | + | + v + +------------- ----+ +-----------+ +------------------+--------+ + | 5 QEMU | | 4 QEMU | | 3. 
QEMU | + | write _DSM result | | emulate | | get _DSM input data from | + | to the page +<------+ _DSM +<-----+ the page indicated by the | + | | | | | value from the IO port | + +--------+-----------+ +-----------+ +---------------------------+ + | + | Enter Guest + | + v + +--------------------------+ +--------------+ + | 6 OSPM | | 7 OSPM | + | result size is returned | | _DSM return | + | by reading DSM +----->+ | + | result from the page | | | + +--------------------------+ +--------------+ + + _FIT implementation + ------------------- + TODO (will fill it when nvdimm hotplug is introduced) diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs index 66bd72702b..4b7da6639f 100644 --- a/hw/acpi/Makefile.objs +++ b/hw/acpi/Makefile.objs @@ -2,7 +2,9 @@ common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o pcihp.o common-obj-$(CONFIG_ACPI_X86_ICH) += ich9.o tco.o common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o memory_hotplug_acpi_table.o +common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o common-obj-$(CONFIG_ACPI) += acpi_interface.o common-obj-$(CONFIG_ACPI) += bios-linker-loader.o common-obj-$(CONFIG_ACPI) += aml-build.o +common-obj-$(call land,$(CONFIG_ACPI),$(CONFIG_IPMI)) += ipmi.o diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 874e473cac..db3e914fb4 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -660,6 +660,20 @@ Aml *aml_call4(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4) return var; } +/* helper to call method with 5 arguments */ +Aml *aml_call5(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4, + Aml *arg5) +{ + Aml *var = aml_alloc(); + build_append_namestring(var->buf, "%s", method); + aml_append(var, arg1); + aml_append(var, arg2); + aml_append(var, arg3); + aml_append(var, arg4); + aml_append(var, arg5); + return var; +} + /* * ACPI 5.0: 6.4.3.8.1 GPIO Connection Descriptor * Type 1, Large Item Name 0xC @@ -1481,6 +1495,14 @@ Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target) target); } +/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefObjectType */ +Aml *aml_object_type(Aml *object) +{ + Aml *var = aml_opcode(0x8E /* ObjectTypeOp */); + aml_append(var, object); + return var; +} + void build_header(BIOSLinker *linker, GArray *table_data, AcpiTableHeader *h, const char *sig, int len, uint8_t rev, diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c new file mode 100644 index 0000000000..c13b65c2c9 --- /dev/null +++ b/hw/acpi/cpu.c @@ -0,0 +1,561 @@ +#include "qemu/osdep.h" +#include "hw/boards.h" +#include "hw/acpi/cpu.h" +#include "qapi/error.h" +#include "qapi-event.h" +#include "trace.h" + +#define ACPI_CPU_HOTPLUG_REG_LEN 12 +#define ACPI_CPU_SELECTOR_OFFSET_WR 0 +#define ACPI_CPU_FLAGS_OFFSET_RW 4 +#define ACPI_CPU_CMD_OFFSET_WR 5 +#define ACPI_CPU_CMD_DATA_OFFSET_RW 8 + +enum { + CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0, + CPHP_OST_EVENT_CMD = 1, + CPHP_OST_STATUS_CMD = 2, + CPHP_CMD_MAX +}; + +static ACPIOSTInfo *acpi_cpu_device_status(int idx, AcpiCpuStatus *cdev) +{ + ACPIOSTInfo *info = g_new0(ACPIOSTInfo, 1); + + info->slot_type = ACPI_SLOT_TYPE_CPU; + info->slot = g_strdup_printf("%d", idx); + info->source = cdev->ost_event; + info->status = cdev->ost_status; + if (cdev->cpu) { + DeviceState *dev = DEVICE(cdev->cpu); + if (dev->id) { + info->device = g_strdup(dev->id); + info->has_device = true; + } + } + return info; +} + +void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list) +{ + int 
i; + + for (i = 0; i < cpu_st->dev_count; i++) { + ACPIOSTInfoList *elem = g_new0(ACPIOSTInfoList, 1); + elem->value = acpi_cpu_device_status(i, &cpu_st->devs[i]); + elem->next = NULL; + **list = elem; + *list = &elem->next; + } +} + +static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size) +{ + uint64_t val = 0; + CPUHotplugState *cpu_st = opaque; + AcpiCpuStatus *cdev; + + if (cpu_st->selector >= cpu_st->dev_count) { + return val; + } + + cdev = &cpu_st->devs[cpu_st->selector]; + switch (addr) { + case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */ + val |= cdev->cpu ? 1 : 0; + val |= cdev->is_inserting ? 2 : 0; + val |= cdev->is_removing ? 4 : 0; + trace_cpuhp_acpi_read_flags(cpu_st->selector, val); + break; + case ACPI_CPU_CMD_DATA_OFFSET_RW: + switch (cpu_st->command) { + case CPHP_GET_NEXT_CPU_WITH_EVENT_CMD: + val = cpu_st->selector; + break; + default: + break; + } + trace_cpuhp_acpi_read_cmd_data(cpu_st->selector, val); + break; + default: + break; + } + return val; +} + +static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data, + unsigned int size) +{ + CPUHotplugState *cpu_st = opaque; + AcpiCpuStatus *cdev; + ACPIOSTInfo *info; + + assert(cpu_st->dev_count); + + if (addr) { + if (cpu_st->selector >= cpu_st->dev_count) { + trace_cpuhp_acpi_invalid_idx_selected(cpu_st->selector); + return; + } + } + + switch (addr) { + case ACPI_CPU_SELECTOR_OFFSET_WR: /* current CPU selector */ + cpu_st->selector = data; + trace_cpuhp_acpi_write_idx(cpu_st->selector); + break; + case ACPI_CPU_FLAGS_OFFSET_RW: /* set is_* fields */ + cdev = &cpu_st->devs[cpu_st->selector]; + if (data & 2) { /* clear insert event */ + cdev->is_inserting = false; + trace_cpuhp_acpi_clear_inserting_evt(cpu_st->selector); + } else if (data & 4) { /* clear remove event */ + cdev->is_removing = false; + trace_cpuhp_acpi_clear_remove_evt(cpu_st->selector); + } else if (data & 8) { + DeviceState *dev = NULL; + HotplugHandler *hotplug_ctrl = NULL; + + if (!cdev->cpu) { + trace_cpuhp_acpi_ejecting_invalid_cpu(cpu_st->selector); + break; + } + + trace_cpuhp_acpi_ejecting_cpu(cpu_st->selector); + dev = DEVICE(cdev->cpu); + hotplug_ctrl = qdev_get_hotplug_handler(dev); + hotplug_handler_unplug(hotplug_ctrl, dev, NULL); + } + break; + case ACPI_CPU_CMD_OFFSET_WR: + trace_cpuhp_acpi_write_cmd(cpu_st->selector, data); + if (data < CPHP_CMD_MAX) { + cpu_st->command = data; + if (cpu_st->command == CPHP_GET_NEXT_CPU_WITH_EVENT_CMD) { + uint32_t iter = cpu_st->selector; + + do { + cdev = &cpu_st->devs[iter]; + if (cdev->is_inserting || cdev->is_removing) { + cpu_st->selector = iter; + trace_cpuhp_acpi_cpu_has_events(cpu_st->selector, + cdev->is_inserting, cdev->is_removing); + break; + } + iter = iter + 1 < cpu_st->dev_count ? 
iter + 1 : 0; + } while (iter != cpu_st->selector); + } + } + break; + case ACPI_CPU_CMD_DATA_OFFSET_RW: + switch (cpu_st->command) { + case CPHP_OST_EVENT_CMD: { + cdev = &cpu_st->devs[cpu_st->selector]; + cdev->ost_event = data; + trace_cpuhp_acpi_write_ost_ev(cpu_st->selector, cdev->ost_event); + break; + } + case CPHP_OST_STATUS_CMD: { + cdev = &cpu_st->devs[cpu_st->selector]; + cdev->ost_status = data; + info = acpi_cpu_device_status(cpu_st->selector, cdev); + qapi_event_send_acpi_device_ost(info, &error_abort); + qapi_free_ACPIOSTInfo(info); + trace_cpuhp_acpi_write_ost_status(cpu_st->selector, + cdev->ost_status); + break; + } + default: + break; + } + break; + default: + break; + } +} + +static const MemoryRegionOps cpu_hotplug_ops = { + .read = cpu_hotplug_rd, + .write = cpu_hotplug_wr, + .endianness = DEVICE_LITTLE_ENDIAN, + .valid = { + .min_access_size = 1, + .max_access_size = 4, + }, +}; + +void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner, + CPUHotplugState *state, hwaddr base_addr) +{ + MachineState *machine = MACHINE(qdev_get_machine()); + MachineClass *mc = MACHINE_GET_CLASS(machine); + CPUArchIdList *id_list; + int i; + + assert(mc->possible_cpu_arch_ids); + id_list = mc->possible_cpu_arch_ids(machine); + state->dev_count = id_list->len; + state->devs = g_new0(typeof(*state->devs), state->dev_count); + for (i = 0; i < id_list->len; i++) { + state->devs[i].cpu = id_list->cpus[i].cpu; + state->devs[i].arch_id = id_list->cpus[i].arch_id; + } + g_free(id_list); + memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state, + "acpi-mem-hotplug", ACPI_CPU_HOTPLUG_REG_LEN); + memory_region_add_subregion(as, base_addr, &state->ctrl_reg); +} + +static AcpiCpuStatus *get_cpu_status(CPUHotplugState *cpu_st, DeviceState *dev) +{ + CPUClass *k = CPU_GET_CLASS(dev); + uint64_t cpu_arch_id = k->get_arch_id(CPU(dev)); + int i; + + for (i = 0; i < cpu_st->dev_count; i++) { + if (cpu_arch_id == cpu_st->devs[i].arch_id) { + return &cpu_st->devs[i]; + } + } + return NULL; +} + +void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev, + CPUHotplugState *cpu_st, DeviceState *dev, Error **errp) +{ + AcpiCpuStatus *cdev; + + cdev = get_cpu_status(cpu_st, dev); + if (!cdev) { + return; + } + + cdev->cpu = CPU(dev); + if (dev->hotplugged) { + cdev->is_inserting = true; + acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS); + } +} + +void acpi_cpu_unplug_request_cb(HotplugHandler *hotplug_dev, + CPUHotplugState *cpu_st, + DeviceState *dev, Error **errp) +{ + AcpiCpuStatus *cdev; + + cdev = get_cpu_status(cpu_st, dev); + if (!cdev) { + return; + } + + cdev->is_removing = true; + acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS); +} + +void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st, + DeviceState *dev, Error **errp) +{ + AcpiCpuStatus *cdev; + + cdev = get_cpu_status(cpu_st, dev); + if (!cdev) { + return; + } + + cdev->cpu = NULL; +} + +static const VMStateDescription vmstate_cpuhp_sts = { + .name = "CPU hotplug device state", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField[]) { + VMSTATE_BOOL(is_inserting, AcpiCpuStatus), + VMSTATE_BOOL(is_removing, AcpiCpuStatus), + VMSTATE_UINT32(ost_event, AcpiCpuStatus), + VMSTATE_UINT32(ost_status, AcpiCpuStatus), + VMSTATE_END_OF_LIST() + } +}; + +const VMStateDescription vmstate_cpu_hotplug = { + .name = "CPU hotplug state", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField[]) { + VMSTATE_UINT32(selector, 
CPUHotplugState), + VMSTATE_UINT8(command, CPUHotplugState), + VMSTATE_STRUCT_VARRAY_POINTER_UINT32(devs, CPUHotplugState, dev_count, + vmstate_cpuhp_sts, AcpiCpuStatus), + VMSTATE_END_OF_LIST() + } +}; + +#define CPU_NAME_FMT "C%.03X" +#define CPUHP_RES_DEVICE "PRES" +#define CPU_LOCK "CPLK" +#define CPU_STS_METHOD "CSTA" +#define CPU_SCAN_METHOD "CSCN" +#define CPU_NOTIFY_METHOD "CTFY" +#define CPU_EJECT_METHOD "CEJ0" +#define CPU_OST_METHOD "COST" + +#define CPU_ENABLED "CPEN" +#define CPU_SELECTOR "CSEL" +#define CPU_COMMAND "CCMD" +#define CPU_DATA "CDAT" +#define CPU_INSERT_EVENT "CINS" +#define CPU_REMOVE_EVENT "CRMV" +#define CPU_EJECT_EVENT "CEJ0" + +void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts, + hwaddr io_base, + const char *res_root, + const char *event_handler_method) +{ + Aml *ifctx; + Aml *field; + Aml *method; + Aml *cpu_ctrl_dev; + Aml *cpus_dev; + Aml *zero = aml_int(0); + Aml *one = aml_int(1); + Aml *sb_scope = aml_scope("_SB"); + MachineClass *mc = MACHINE_GET_CLASS(machine); + CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(machine); + char *cphp_res_path = g_strdup_printf("%s." CPUHP_RES_DEVICE, res_root); + Object *obj = object_resolve_path_type("", TYPE_ACPI_DEVICE_IF, NULL); + AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(obj); + AcpiDeviceIf *adev = ACPI_DEVICE_IF(obj); + + cpu_ctrl_dev = aml_device("%s", cphp_res_path); + { + Aml *crs; + + aml_append(cpu_ctrl_dev, + aml_name_decl("_HID", aml_eisaid("PNP0A06"))); + aml_append(cpu_ctrl_dev, + aml_name_decl("_UID", aml_string("CPU Hotplug resources"))); + aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0)); + + crs = aml_resource_template(); + aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1, + ACPI_CPU_HOTPLUG_REG_LEN)); + aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs)); + + /* declare CPU hotplug MMIO region with related access fields */ + aml_append(cpu_ctrl_dev, + aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base), + ACPI_CPU_HOTPLUG_REG_LEN)); + + field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK, + AML_WRITE_AS_ZEROS); + aml_append(field, aml_reserved_field(ACPI_CPU_FLAGS_OFFSET_RW * 8)); + /* 1 if enabled, read only */ + aml_append(field, aml_named_field(CPU_ENABLED, 1)); + /* (read) 1 if has a insert event. (write) 1 to clear event */ + aml_append(field, aml_named_field(CPU_INSERT_EVENT, 1)); + /* (read) 1 if has a remove event. 
(write) 1 to clear event */ + aml_append(field, aml_named_field(CPU_REMOVE_EVENT, 1)); + /* initiates device eject, write only */ + aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1)); + aml_append(field, aml_reserved_field(4)); + aml_append(field, aml_named_field(CPU_COMMAND, 8)); + aml_append(cpu_ctrl_dev, field); + + field = aml_field("PRST", AML_DWORD_ACC, AML_NOLOCK, AML_PRESERVE); + /* CPU selector, write only */ + aml_append(field, aml_named_field(CPU_SELECTOR, 32)); + /* flags + cmd + 2byte align */ + aml_append(field, aml_reserved_field(4 * 8)); + aml_append(field, aml_named_field(CPU_DATA, 32)); + aml_append(cpu_ctrl_dev, field); + + if (opts.has_legacy_cphp) { + method = aml_method("_INI", 0, AML_SERIALIZED); + /* switch off legacy CPU hotplug HW and use new one, + * on reboot system is in new mode and writing 0 + * in CPU_SELECTOR selects BSP, which is NOP at + * the time _INI is called */ + aml_append(method, aml_store(zero, aml_name(CPU_SELECTOR))); + aml_append(cpu_ctrl_dev, method); + } + } + aml_append(sb_scope, cpu_ctrl_dev); + + cpus_dev = aml_device("\\_SB.CPUS"); + { + int i; + Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK); + Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR); + Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED); + Aml *cpu_cmd = aml_name("%s.%s", cphp_res_path, CPU_COMMAND); + Aml *cpu_data = aml_name("%s.%s", cphp_res_path, CPU_DATA); + Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT); + Aml *rm_evt = aml_name("%s.%s", cphp_res_path, CPU_REMOVE_EVENT); + Aml *ej_evt = aml_name("%s.%s", cphp_res_path, CPU_EJECT_EVENT); + + aml_append(cpus_dev, aml_name_decl("_HID", aml_string("ACPI0010"))); + aml_append(cpus_dev, aml_name_decl("_CID", aml_eisaid("PNP0A05"))); + + method = aml_method(CPU_NOTIFY_METHOD, 2, AML_NOTSERIALIZED); + for (i = 0; i < arch_ids->len; i++) { + Aml *cpu = aml_name(CPU_NAME_FMT, i); + Aml *uid = aml_arg(0); + Aml *event = aml_arg(1); + + ifctx = aml_if(aml_equal(uid, aml_int(i))); + { + aml_append(ifctx, aml_notify(cpu, event)); + } + aml_append(method, ifctx); + } + aml_append(cpus_dev, method); + + method = aml_method(CPU_STS_METHOD, 1, AML_SERIALIZED); + { + Aml *idx = aml_arg(0); + Aml *sta = aml_local(0); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(idx, cpu_selector)); + aml_append(method, aml_store(zero, sta)); + ifctx = aml_if(aml_equal(is_enabled, one)); + { + aml_append(ifctx, aml_store(aml_int(0xF), sta)); + } + aml_append(method, ifctx); + aml_append(method, aml_release(ctrl_lock)); + aml_append(method, aml_return(sta)); + } + aml_append(cpus_dev, method); + + method = aml_method(CPU_EJECT_METHOD, 1, AML_SERIALIZED); + { + Aml *idx = aml_arg(0); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(idx, cpu_selector)); + aml_append(method, aml_store(one, ej_evt)); + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + method = aml_method(CPU_SCAN_METHOD, 0, AML_SERIALIZED); + { + Aml *else_ctx; + Aml *while_ctx; + Aml *has_event = aml_local(0); + Aml *dev_chk = aml_int(1); + Aml *eject_req = aml_int(3); + Aml *next_cpu_cmd = aml_int(CPHP_GET_NEXT_CPU_WITH_EVENT_CMD); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(one, has_event)); + while_ctx = aml_while(aml_equal(has_event, one)); + { + /* clear loop exit condition, ins_evt/rm_evt checks + * will set it to 1 while next_cpu_cmd returns a CPU + * with events */ + 
aml_append(while_ctx, aml_store(zero, has_event)); + aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd)); + ifctx = aml_if(aml_equal(ins_evt, one)); + { + aml_append(ifctx, + aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk)); + aml_append(ifctx, aml_store(one, ins_evt)); + aml_append(ifctx, aml_store(one, has_event)); + } + aml_append(while_ctx, ifctx); + else_ctx = aml_else(); + ifctx = aml_if(aml_equal(rm_evt, one)); + { + aml_append(ifctx, + aml_call2(CPU_NOTIFY_METHOD, cpu_data, eject_req)); + aml_append(ifctx, aml_store(one, rm_evt)); + aml_append(ifctx, aml_store(one, has_event)); + } + aml_append(else_ctx, ifctx); + aml_append(while_ctx, else_ctx); + } + aml_append(method, while_ctx); + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + method = aml_method(CPU_OST_METHOD, 4, AML_SERIALIZED); + { + Aml *uid = aml_arg(0); + Aml *ev_cmd = aml_int(CPHP_OST_EVENT_CMD); + Aml *st_cmd = aml_int(CPHP_OST_STATUS_CMD); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(uid, cpu_selector)); + aml_append(method, aml_store(ev_cmd, cpu_cmd)); + aml_append(method, aml_store(aml_arg(1), cpu_data)); + aml_append(method, aml_store(st_cmd, cpu_cmd)); + aml_append(method, aml_store(aml_arg(2), cpu_data)); + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + /* build Processor object for each processor */ + for (i = 0; i < arch_ids->len; i++) { + Aml *dev; + Aml *uid = aml_int(i); + GArray *madt_buf = g_array_new(0, 1, 1); + int arch_id = arch_ids->cpus[i].arch_id; + + if (opts.apci_1_compatible && arch_id < 255) { + dev = aml_processor(i, 0, 0, CPU_NAME_FMT, i); + } else { + dev = aml_device(CPU_NAME_FMT, i); + aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007"))); + aml_append(dev, aml_name_decl("_UID", uid)); + } + + method = aml_method("_STA", 0, AML_SERIALIZED); + aml_append(method, aml_return(aml_call1(CPU_STS_METHOD, uid))); + aml_append(dev, method); + + /* build _MAT object */ + assert(adevc && adevc->madt_cpu); + adevc->madt_cpu(adev, i, arch_ids, madt_buf); + switch (madt_buf->data[0]) { + case ACPI_APIC_PROCESSOR: { + AcpiMadtProcessorApic *apic = (void *)madt_buf->data; + apic->flags = cpu_to_le32(1); + break; + } + default: + assert(0); + } + aml_append(dev, aml_name_decl("_MAT", + aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data))); + g_array_free(madt_buf, true); + + method = aml_method("_EJ0", 1, AML_NOTSERIALIZED); + aml_append(method, aml_call1(CPU_EJECT_METHOD, uid)); + aml_append(dev, method); + + method = aml_method("_OST", 3, AML_SERIALIZED); + aml_append(method, + aml_call4(CPU_OST_METHOD, uid, aml_arg(0), + aml_arg(1), aml_arg(2)) + ); + aml_append(dev, method); + aml_append(cpus_dev, dev); + } + } + aml_append(sb_scope, cpus_dev); + aml_append(table, sb_scope); + + method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED); + aml_append(method, aml_call0("\\_SB.CPUS." 
CPU_SCAN_METHOD)); + aml_append(table, method); + + g_free(cphp_res_path); + g_free(arch_ids); +} diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c index fe75bd9ac9..e19d902063 100644 --- a/hw/acpi/cpu_hotplug.c +++ b/hw/acpi/cpu_hotplug.c @@ -34,7 +34,15 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size) static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data, unsigned int size) { - /* TODO: implement VCPU removal on guest signal that CPU can be removed */ + /* firmware never used to write in CPU present bitmap so use + this fact as means to switch QEMU into modern CPU hotplug + mode by writing 0 at the beginning of legacy CPU bitmap + */ + if (addr == 0 && data == 0) { + AcpiCpuHotplug *cpus = opaque; + object_property_set_bool(cpus->device, false, "cpu-hotplug-legacy", + &error_abort); + } } static const MemoryRegionOps AcpiCpuHotplug_ops = { @@ -83,6 +91,17 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner, memory_region_init_io(&gpe_cpu->io, owner, &AcpiCpuHotplug_ops, gpe_cpu, "acpi-cpu-hotplug", ACPI_GPE_PROC_LEN); memory_region_add_subregion(parent, base, &gpe_cpu->io); + gpe_cpu->device = owner; +} + +void acpi_switch_to_modern_cphp(AcpiCpuHotplug *gpe_cpu, + CPUHotplugState *cpuhp_state, + uint16_t io_port) +{ + MemoryRegion *parent = pci_address_space_io(PCI_DEVICE(gpe_cpu->device)); + + memory_region_del_subregion(parent, &gpe_cpu->io); + cpu_hotplug_hw_init(parent, gpe_cpu->device, cpuhp_state, io_port); } void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine, diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 853c9c4eb7..e5a3c18e52 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -189,6 +189,33 @@ static const VMStateDescription vmstate_tco_io_state = { } }; +static bool vmstate_test_use_cpuhp(void *opaque) +{ + ICH9LPCPMRegs *s = opaque; + return !s->cpu_hotplug_legacy; +} + +static int vmstate_cpuhp_pre_load(void *opaque) +{ + ICH9LPCPMRegs *s = opaque; + Object *obj = OBJECT(s->gpe_cpu.device); + object_property_set_bool(obj, false, "cpu-hotplug-legacy", &error_abort); + return 0; +} + +static const VMStateDescription vmstate_cpuhp_state = { + .name = "ich9_pm/cpuhp", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .needed = vmstate_test_use_cpuhp, + .pre_load = vmstate_cpuhp_pre_load, + .fields = (VMStateField[]) { + VMSTATE_CPU_HOTPLUG(cpuhp_state, ICH9LPCPMRegs), + VMSTATE_END_OF_LIST() + } +}; + const VMStateDescription vmstate_ich9_pm = { .name = "ich9_pm", .version_id = 1, @@ -209,6 +236,7 @@ const VMStateDescription vmstate_ich9_pm = { .subsections = (const VMStateDescription*[]) { &vmstate_memhp_state, &vmstate_tco_io_state, + &vmstate_cpuhp_state, NULL } }; @@ -306,6 +334,26 @@ static void ich9_pm_set_memory_hotplug_support(Object *obj, bool value, s->pm.acpi_memory_hotplug.is_enabled = value; } +static bool ich9_pm_get_cpu_hotplug_legacy(Object *obj, Error **errp) +{ + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + + return s->pm.cpu_hotplug_legacy; +} + +static void ich9_pm_set_cpu_hotplug_legacy(Object *obj, bool value, + Error **errp) +{ + ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + + assert(!value); + if (s->pm.cpu_hotplug_legacy && value == false) { + acpi_switch_to_modern_cphp(&s->pm.gpe_cpu, &s->pm.cpuhp_state, + ICH9_CPU_HOTPLUG_IO_BASE); + } + s->pm.cpu_hotplug_legacy = value; +} + static void ich9_pm_get_disable_s3(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { @@ -397,6 +445,7 @@ void ich9_pm_add_properties(Object 
*obj, ICH9LPCPMRegs *pm, Error **errp) { static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; pm->acpi_memory_hotplug.is_enabled = true; + pm->cpu_hotplug_legacy = true; pm->disable_s3 = 0; pm->disable_s4 = 0; pm->s4_val = 2; @@ -412,6 +461,10 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp) ich9_pm_get_memory_hotplug_support, ich9_pm_set_memory_hotplug_support, NULL); + object_property_add_bool(obj, "cpu-hotplug-legacy", + ich9_pm_get_cpu_hotplug_legacy, + ich9_pm_set_cpu_hotplug_legacy, + NULL); object_property_add(obj, ACPI_PM_PROP_S3_DISABLED, "uint8", ich9_pm_get_disable_s3, ich9_pm_set_disable_s3, @@ -440,7 +493,11 @@ void ich9_pm_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { - legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp); + if (lpc->pm.cpu_hotplug_legacy) { + legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp); + } else { + acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.cpuhp_state, dev, errp); + } } else { error_setg(errp, "acpi: device plug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -457,6 +514,10 @@ void ich9_pm_device_unplug_request_cb(HotplugHandler *hotplug_dev, acpi_memory_unplug_request_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) && + !lpc->pm.cpu_hotplug_legacy) { + acpi_cpu_unplug_request_cb(hotplug_dev, &lpc->pm.cpuhp_state, + dev, errp); } else { error_setg(errp, "acpi: device unplug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -471,6 +532,9 @@ void ich9_pm_device_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, if (lpc->pm.acpi_memory_hotplug.is_enabled && object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { acpi_memory_unplug_cb(&lpc->pm.acpi_memory_hotplug, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) && + !lpc->pm.cpu_hotplug_legacy) { + acpi_cpu_unplug_cb(&lpc->pm.cpuhp_state, dev, errp); } else { error_setg(errp, "acpi: device unplug for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -482,4 +546,7 @@ void ich9_pm_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list) ICH9LPCState *s = ICH9_LPC_DEVICE(adev); acpi_memory_ospm_status(&s->pm.acpi_memory_hotplug, list); + if (!s->pm.cpu_hotplug_legacy) { + acpi_cpu_ospm_status(&s->pm.cpuhp_state, list); + } } diff --git a/hw/acpi/ipmi.c b/hw/acpi/ipmi.c new file mode 100644 index 0000000000..7e74ce4460 --- /dev/null +++ b/hw/acpi/ipmi.c @@ -0,0 +1,105 @@ +/* + * IPMI ACPI firmware handling + * + * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "hw/ipmi/ipmi.h" +#include "hw/acpi/aml-build.h" +#include "hw/acpi/acpi.h" +#include "hw/acpi/ipmi.h" + +static Aml *aml_ipmi_crs(IPMIFwInfo *info) +{ + Aml *crs = aml_resource_template(); + + /* + * The base address is fixed and cannot change. That may be different + * if someone does PCI, but we aren't there yet. 
+ */ + switch (info->memspace) { + case IPMI_MEMSPACE_IO: + aml_append(crs, aml_io(AML_DECODE16, info->base_address, + info->base_address + info->register_length - 1, + info->register_spacing, info->register_length)); + break; + case IPMI_MEMSPACE_MEM32: + aml_append(crs, + aml_dword_memory(AML_POS_DECODE, + AML_MIN_FIXED, AML_MAX_FIXED, + AML_NON_CACHEABLE, AML_READ_WRITE, + 0xffffffff, + info->base_address, + info->base_address + info->register_length - 1, + info->register_spacing, info->register_length)); + break; + case IPMI_MEMSPACE_MEM64: + aml_append(crs, + aml_qword_memory(AML_POS_DECODE, + AML_MIN_FIXED, AML_MAX_FIXED, + AML_NON_CACHEABLE, AML_READ_WRITE, + 0xffffffffffffffffULL, + info->base_address, + info->base_address + info->register_length - 1, + info->register_spacing, info->register_length)); + break; + case IPMI_MEMSPACE_SMBUS: + aml_append(crs, aml_return(aml_int(info->base_address))); + break; + default: + abort(); + } + + if (info->interrupt_number) { + aml_append(crs, aml_irq_no_flags(info->interrupt_number)); + } + + return crs; +} + +static Aml *aml_ipmi_device(IPMIFwInfo *info) +{ + Aml *dev; + uint16_t version = ((info->ipmi_spec_major_revision << 8) + | (info->ipmi_spec_minor_revision << 4)); + + assert(info->ipmi_spec_minor_revision <= 15); + + dev = aml_device("MI%d", info->uuid); + aml_append(dev, aml_name_decl("_HID", aml_eisaid("IPI0001"))); + aml_append(dev, aml_name_decl("_STR", aml_string("ipmi_%s", + info->interface_name))); + aml_append(dev, aml_name_decl("_UID", aml_int(info->uuid))); + aml_append(dev, aml_name_decl("_CRS", aml_ipmi_crs(info))); + aml_append(dev, aml_name_decl("_IFT", aml_int(info->interface_type))); + aml_append(dev, aml_name_decl("_SRV", aml_int(version))); + + return dev; +} + +void build_acpi_ipmi_devices(Aml *scope, BusState *bus) +{ + + BusChild *kid; + + QTAILQ_FOREACH(kid, &bus->children, sibling) { + IPMIInterface *ii; + IPMIInterfaceClass *iic; + IPMIFwInfo info; + Object *obj = object_dynamic_cast(OBJECT(kid->child), + TYPE_IPMI_INTERFACE); + + if (!obj) { + continue; + } + + ii = IPMI_INTERFACE(obj); + iic = IPMI_INTERFACE_GET_CLASS(obj); + iic->get_fwinfo(ii, &info); + aml_append(scope, aml_ipmi_device(&info)); + } +} diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c index b4c22627df..e486128aa1 100644 --- a/hw/acpi/nvdimm.c +++ b/hw/acpi/nvdimm.c @@ -216,6 +216,26 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot) return nvdimm_slot_to_spa_index(slot) + 1; } +static NVDIMMDevice *nvdimm_get_device_by_handle(uint32_t handle) +{ + NVDIMMDevice *nvdimm = NULL; + GSList *list, *device_list = nvdimm_get_plugged_device_list(); + + for (list = device_list; list; list = list->next) { + NVDIMMDevice *nvd = list->data; + int slot = object_property_get_int(OBJECT(nvd), PC_DIMM_SLOT_PROP, + NULL); + + if (nvdimm_slot_to_handle(slot) == handle) { + nvdimm = nvd; + break; + } + } + + g_slist_free(device_list); + return nvdimm; +} + /* ACPI 6.0: 5.2.25.1 System Physical Address Range Structure */ static void nvdimm_build_structure_spa(GArray *structures, DeviceState *dev) @@ -406,6 +426,282 @@ struct NvdimmDsmFuncNoPayloadOut { } QEMU_PACKED; typedef struct NvdimmDsmFuncNoPayloadOut NvdimmDsmFuncNoPayloadOut; +struct NvdimmFuncGetLabelSizeOut { + /* the size of buffer filled by QEMU. */ + uint32_t len; + uint32_t func_ret_status; /* return status code. */ + uint32_t label_size; /* the size of label data area. 
*/ + /* + * Maximum size of the namespace label data length supported by + * the platform in Get/Set Namespace Label Data functions. + */ + uint32_t max_xfer; +} QEMU_PACKED; +typedef struct NvdimmFuncGetLabelSizeOut NvdimmFuncGetLabelSizeOut; +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelSizeOut) > 4096); + +struct NvdimmFuncGetLabelDataIn { + uint32_t offset; /* the offset in the namespace label data area. */ + uint32_t length; /* the size of data is to be read via the function. */ +} QEMU_PACKED; +typedef struct NvdimmFuncGetLabelDataIn NvdimmFuncGetLabelDataIn; +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataIn) + + offsetof(NvdimmDsmIn, arg3) > 4096); + +struct NvdimmFuncGetLabelDataOut { + /* the size of buffer filled by QEMU. */ + uint32_t len; + uint32_t func_ret_status; /* return status code. */ + uint8_t out_buf[0]; /* the data got via Get Namesapce Label function. */ +} QEMU_PACKED; +typedef struct NvdimmFuncGetLabelDataOut NvdimmFuncGetLabelDataOut; +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataOut) > 4096); + +struct NvdimmFuncSetLabelDataIn { + uint32_t offset; /* the offset in the namespace label data area. */ + uint32_t length; /* the size of data is to be written via the function. */ + uint8_t in_buf[0]; /* the data written to label data area. */ +} QEMU_PACKED; +typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn; +QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) + + offsetof(NvdimmDsmIn, arg3) > 4096); + +static void +nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr) +{ + NvdimmDsmFunc0Out func0 = { + .len = cpu_to_le32(sizeof(func0)), + .supported_func = cpu_to_le32(supported_func), + }; + cpu_physical_memory_write(dsm_mem_addr, &func0, sizeof(func0)); +} + +static void +nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr) +{ + NvdimmDsmFuncNoPayloadOut out = { + .len = cpu_to_le32(sizeof(out)), + .func_ret_status = cpu_to_le32(func_ret_status), + }; + cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out)); +} + +static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr) +{ + /* + * function 0 is called to inquire which functions are supported by + * OSPM + */ + if (!in->function) { + nvdimm_dsm_function0(0 /* No function supported other than + function 0 */, dsm_mem_addr); + return; + } + + /* No function except function 0 is supported yet. */ + nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr); +} + +/* + * the max transfer size is the max size transferred by both a + * 'Get Namespace Label Data' function and a 'Set Namespace Label Data' + * function. + */ +static uint32_t nvdimm_get_max_xfer_label_size(void) +{ + uint32_t max_get_size, max_set_size, dsm_memory_size = 4096; + + /* + * the max data ACPI can read one time which is transferred by + * the response of 'Get Namespace Label Data' function. + */ + max_get_size = dsm_memory_size - sizeof(NvdimmFuncGetLabelDataOut); + + /* + * the max data ACPI can write one time which is transferred by + * 'Set Namespace Label Data' function. + */ + max_set_size = dsm_memory_size - offsetof(NvdimmDsmIn, arg3) - + sizeof(NvdimmFuncSetLabelDataIn); + + return MIN(max_get_size, max_set_size); +} + +/* + * DSM Spec Rev1 4.4 Get Namespace Label Size (Function Index 4). + * + * It gets the size of Namespace Label data area and the max data size + * that Get/Set Namespace Label Data functions can transfer. 
+ */ +static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr) +{ + NvdimmFuncGetLabelSizeOut label_size_out = { + .len = cpu_to_le32(sizeof(label_size_out)), + }; + uint32_t label_size, mxfer; + + label_size = nvdimm->label_size; + mxfer = nvdimm_get_max_xfer_label_size(); + + nvdimm_debug("label_size %#x, max_xfer %#x.\n", label_size, mxfer); + + label_size_out.func_ret_status = cpu_to_le32(0 /* Success */); + label_size_out.label_size = cpu_to_le32(label_size); + label_size_out.max_xfer = cpu_to_le32(mxfer); + + cpu_physical_memory_write(dsm_mem_addr, &label_size_out, + sizeof(label_size_out)); +} + +static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm, + uint32_t offset, uint32_t length) +{ + uint32_t ret = 3 /* Invalid Input Parameters */; + + if (offset + length < offset) { + nvdimm_debug("offset %#x + length %#x is overflow.\n", offset, + length); + return ret; + } + + if (nvdimm->label_size < offset + length) { + nvdimm_debug("position %#x is beyond label data (len = %" PRIx64 ").\n", + offset + length, nvdimm->label_size); + return ret; + } + + if (length > nvdimm_get_max_xfer_label_size()) { + nvdimm_debug("length (%#x) is larger than max_xfer (%#x).\n", + length, nvdimm_get_max_xfer_label_size()); + return ret; + } + + return 0 /* Success */; +} + +/* + * DSM Spec Rev1 4.5 Get Namespace Label Data (Function Index 5). + */ +static void nvdimm_dsm_get_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in, + hwaddr dsm_mem_addr) +{ + NVDIMMClass *nvc = NVDIMM_GET_CLASS(nvdimm); + NvdimmFuncGetLabelDataIn *get_label_data; + NvdimmFuncGetLabelDataOut *get_label_data_out; + uint32_t status; + int size; + + get_label_data = (NvdimmFuncGetLabelDataIn *)in->arg3; + le32_to_cpus(&get_label_data->offset); + le32_to_cpus(&get_label_data->length); + + nvdimm_debug("Read Label Data: offset %#x length %#x.\n", + get_label_data->offset, get_label_data->length); + + status = nvdimm_rw_label_data_check(nvdimm, get_label_data->offset, + get_label_data->length); + if (status != 0 /* Success */) { + nvdimm_dsm_no_payload(status, dsm_mem_addr); + return; + } + + size = sizeof(*get_label_data_out) + get_label_data->length; + assert(size <= 4096); + get_label_data_out = g_malloc(size); + + get_label_data_out->len = cpu_to_le32(size); + get_label_data_out->func_ret_status = cpu_to_le32(0 /* Success */); + nvc->read_label_data(nvdimm, get_label_data_out->out_buf, + get_label_data->length, get_label_data->offset); + + cpu_physical_memory_write(dsm_mem_addr, get_label_data_out, size); + g_free(get_label_data_out); +} + +/* + * DSM Spec Rev1 4.6 Set Namespace Label Data (Function Index 6). 
+ */ +static void nvdimm_dsm_set_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in, + hwaddr dsm_mem_addr) +{ + NVDIMMClass *nvc = NVDIMM_GET_CLASS(nvdimm); + NvdimmFuncSetLabelDataIn *set_label_data; + uint32_t status; + + set_label_data = (NvdimmFuncSetLabelDataIn *)in->arg3; + + le32_to_cpus(&set_label_data->offset); + le32_to_cpus(&set_label_data->length); + + nvdimm_debug("Write Label Data: offset %#x length %#x.\n", + set_label_data->offset, set_label_data->length); + + status = nvdimm_rw_label_data_check(nvdimm, set_label_data->offset, + set_label_data->length); + if (status != 0 /* Success */) { + nvdimm_dsm_no_payload(status, dsm_mem_addr); + return; + } + + assert(sizeof(*in) + sizeof(*set_label_data) + set_label_data->length <= + 4096); + + nvc->write_label_data(nvdimm, set_label_data->in_buf, + set_label_data->length, set_label_data->offset); + nvdimm_dsm_no_payload(0 /* Success */, dsm_mem_addr); +} + +static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr) +{ + NVDIMMDevice *nvdimm = nvdimm_get_device_by_handle(in->handle); + + /* See the comments in nvdimm_dsm_root(). */ + if (!in->function) { + uint32_t supported_func = 0; + + if (nvdimm && nvdimm->label_size) { + supported_func |= 0x1 /* Bit 0 indicates whether there is + support for any functions other + than function 0. */ | + 1 << 4 /* Get Namespace Label Size */ | + 1 << 5 /* Get Namespace Label Data */ | + 1 << 6 /* Set Namespace Label Data */; + } + nvdimm_dsm_function0(supported_func, dsm_mem_addr); + return; + } + + if (!nvdimm) { + nvdimm_dsm_no_payload(2 /* Non-Existing Memory Device */, + dsm_mem_addr); + return; + } + + /* Encode DSM function according to DSM Spec Rev1. */ + switch (in->function) { + case 4 /* Get Namespace Label Size */: + if (nvdimm->label_size) { + nvdimm_dsm_label_size(nvdimm, dsm_mem_addr); + return; + } + break; + case 5 /* Get Namespace Label Data */: + if (nvdimm->label_size) { + nvdimm_dsm_get_label_data(nvdimm, in, dsm_mem_addr); + return; + } + break; + case 0x6 /* Set Namespace Label Data */: + if (nvdimm->label_size) { + nvdimm_dsm_set_label_data(nvdimm, in, dsm_mem_addr); + return; + } + break; + } + + nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr); +} + static uint64_t nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size) { @@ -436,26 +732,22 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size) nvdimm_debug("Revision %#x Handler %#x Function %#x.\n", in->revision, in->handle, in->function); - /* - * function 0 is called to inquire which functions are supported by - * OSPM - */ - if (in->function == 0) { - NvdimmDsmFunc0Out func0 = { - .len = cpu_to_le32(sizeof(func0)), - /* No function supported other than function 0 */ - .supported_func = cpu_to_le32(0), - }; - cpu_physical_memory_write(dsm_mem_addr, &func0, sizeof func0); - } else { - /* No function except function 0 is supported yet. */ - NvdimmDsmFuncNoPayloadOut out = { - .len = cpu_to_le32(sizeof(out)), - .func_ret_status = cpu_to_le32(1) /* Not Supported */, - }; - cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out)); + if (in->revision != 0x1 /* Currently we only support DSM Spec Rev1. */) { + nvdimm_debug("Revision %#x is not supported, expect %#x.\n", + in->revision, 0x1); + nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr); + goto exit; + } + + /* Handle 0 is reserved for NVDIMM Root Device. 
*/ + if (!in->handle) { + nvdimm_dsm_root(in, dsm_mem_addr); + goto exit; } + nvdimm_dsm_device(in, dsm_mem_addr); + +exit: g_free(in); } @@ -487,18 +779,39 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io, static void nvdimm_build_common_dsm(Aml *dev) { - Aml *method, *ifctx, *function, *dsm_mem, *unpatched, *result_size; + Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *result_size; + Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid; + Aml *pckg, *pckg_index, *pckg_buf; uint8_t byte_list[1]; - method = aml_method(NVDIMM_COMMON_DSM, 4, AML_SERIALIZED); + method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED); + uuid = aml_arg(0); function = aml_arg(2); + handle = aml_arg(4); dsm_mem = aml_name(NVDIMM_ACPI_MEM_ADDR); /* * do not support any method if DSM memory address has not been * patched. */ - unpatched = aml_if(aml_equal(dsm_mem, aml_int(0x0))); + unpatched = aml_equal(dsm_mem, aml_int(0x0)); + + expected_uuid = aml_local(0); + + ifctx = aml_if(aml_equal(handle, aml_int(0x0))); + aml_append(ifctx, aml_store( + aml_touuid("2F10E7A4-9E91-11E4-89D3-123B93F75CBA") + /* UUID for NVDIMM Root Device */, expected_uuid)); + aml_append(method, ifctx); + elsectx = aml_else(); + aml_append(elsectx, aml_store( + aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66") + /* UUID for NVDIMM Devices */, expected_uuid)); + aml_append(method, elsectx); + + uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid)); + + unsupport = aml_if(aml_or(unpatched, uuid_invalid, NULL)); /* * function 0 is called to inquire what functions are supported by @@ -507,24 +820,42 @@ static void nvdimm_build_common_dsm(Aml *dev) ifctx = aml_if(aml_equal(function, aml_int(0))); byte_list[0] = 0 /* No function Supported */; aml_append(ifctx, aml_return(aml_buffer(1, byte_list))); - aml_append(unpatched, ifctx); + aml_append(unsupport, ifctx); /* No function is supported yet. */ byte_list[0] = 1 /* Not Supported */; - aml_append(unpatched, aml_return(aml_buffer(1, byte_list))); - aml_append(method, unpatched); + aml_append(unsupport, aml_return(aml_buffer(1, byte_list))); + aml_append(method, unsupport); /* * The HDLE indicates the DSM function is issued from which device, - * it is not used at this time as no function is supported yet. - * Currently we make it always be 0 for all the devices and will set - * the appropriate value once real function is implemented. + * it reserves 0 for root device and is the handle for NVDIMM devices. + * See the comments in nvdimm_slot_to_handle(). */ - aml_append(method, aml_store(aml_int(0x0), aml_name("HDLE"))); + aml_append(method, aml_store(handle, aml_name("HDLE"))); aml_append(method, aml_store(aml_arg(1), aml_name("REVS"))); aml_append(method, aml_store(aml_arg(2), aml_name("FUNC"))); /* + * The fourth parameter (Arg3) of _DSM is a package which contains + * a buffer, the layout of the buffer is specified by UUID (Arg0), + * Revision ID (Arg1) and Function Index (Arg2) which are documented + * in the DSM Spec. + */ + pckg = aml_arg(3); + ifctx = aml_if(aml_and(aml_equal(aml_object_type(pckg), + aml_int(4 /* Package */)) /* It is a Package? */, + aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element? 
*/, + NULL)); + + pckg_index = aml_local(2); + pckg_buf = aml_local(3); + aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_index)); + aml_append(ifctx, aml_store(aml_derefof(pckg_index), pckg_buf)); + aml_append(ifctx, aml_store(pckg_buf, aml_name("ARG3"))); + aml_append(method, ifctx); + + /* * tell QEMU about the real address of DSM memory, then QEMU * gets the control and fills the result in DSM memory. */ @@ -542,13 +873,14 @@ static void nvdimm_build_common_dsm(Aml *dev) aml_append(dev, method); } -static void nvdimm_build_device_dsm(Aml *dev) +static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle) { Aml *method; method = aml_method("_DSM", 4, AML_NOTSERIALIZED); - aml_append(method, aml_return(aml_call4(NVDIMM_COMMON_DSM, aml_arg(0), - aml_arg(1), aml_arg(2), aml_arg(3)))); + aml_append(method, aml_return(aml_call5(NVDIMM_COMMON_DSM, aml_arg(0), + aml_arg(1), aml_arg(2), aml_arg(3), + aml_int(handle)))); aml_append(dev, method); } @@ -573,7 +905,7 @@ static void nvdimm_build_nvdimm_devices(GSList *device_list, Aml *root_dev) */ aml_append(nvdimm_dev, aml_name_decl("_ADR", aml_int(handle))); - nvdimm_build_device_dsm(nvdimm_dev); + nvdimm_build_device_dsm(nvdimm_dev, handle); aml_append(root_dev, nvdimm_dev); } } @@ -665,7 +997,9 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray *table_offsets, aml_append(dev, field); nvdimm_build_common_dsm(dev); - nvdimm_build_device_dsm(dev); + + /* 0 is reserved for root device. */ + nvdimm_build_device_dsm(dev, 0); nvdimm_build_nvdimm_devices(device_list, dev); diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index c48cb1b91a..2adc246b00 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -34,6 +34,7 @@ #include "hw/acpi/piix4.h" #include "hw/acpi/pcihp.h" #include "hw/acpi/cpu_hotplug.h" +#include "hw/acpi/cpu.h" #include "hw/hotplug.h" #include "hw/mem/pc-dimm.h" #include "hw/acpi/memory_hotplug.h" @@ -86,7 +87,9 @@ typedef struct PIIX4PMState { uint8_t disable_s4; uint8_t s4_val; + bool cpu_hotplug_legacy; AcpiCpuHotplug gpe_cpu; + CPUHotplugState cpuhp_state; MemHotplugState acpi_memory_hotplug; } PIIX4PMState; @@ -273,6 +276,32 @@ static const VMStateDescription vmstate_memhp_state = { } }; +static bool vmstate_test_use_cpuhp(void *opaque) +{ + PIIX4PMState *s = opaque; + return !s->cpu_hotplug_legacy; +} + +static int vmstate_cpuhp_pre_load(void *opaque) +{ + Object *obj = OBJECT(opaque); + object_property_set_bool(obj, false, "cpu-hotplug-legacy", &error_abort); + return 0; +} + +static const VMStateDescription vmstate_cpuhp_state = { + .name = "piix4_pm/cpuhp", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .needed = vmstate_test_use_cpuhp, + .pre_load = vmstate_cpuhp_pre_load, + .fields = (VMStateField[]) { + VMSTATE_CPU_HOTPLUG(cpuhp_state, PIIX4PMState), + VMSTATE_END_OF_LIST() + } +}; + /* qemu-kvm 1.2 uses version 3 but advertised as 2 * To support incoming qemu-kvm 1.2 migration, change version_id * and minimum_version_id to 2 below (which breaks migration from @@ -307,6 +336,7 @@ static const VMStateDescription vmstate_acpi = { }, .subsections = (const VMStateDescription*[]) { &vmstate_memhp_state, + &vmstate_cpuhp_state, NULL } }; @@ -352,7 +382,11 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev, } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { acpi_pcihp_device_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { - legacy_acpi_cpu_plug_cb(hotplug_dev, &s->gpe_cpu, dev, errp); + 
if (s->cpu_hotplug_legacy) { + legacy_acpi_cpu_plug_cb(hotplug_dev, &s->gpe_cpu, dev, errp); + } else { + acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp); + } } else { error_setg(errp, "acpi: device plug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -371,6 +405,9 @@ static void piix4_device_unplug_request_cb(HotplugHandler *hotplug_dev, } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { acpi_pcihp_device_unplug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) && + !s->cpu_hotplug_legacy) { + acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev, errp); } else { error_setg(errp, "acpi: device unplug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -385,6 +422,9 @@ static void piix4_device_unplug_cb(HotplugHandler *hotplug_dev, if (s->acpi_memory_hotplug.is_enabled && object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { acpi_memory_unplug_cb(&s->acpi_memory_hotplug, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) && + !s->cpu_hotplug_legacy) { + acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp); } else { error_setg(errp, "acpi: device unplug for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -560,6 +600,26 @@ static const MemoryRegionOps piix4_gpe_ops = { .endianness = DEVICE_LITTLE_ENDIAN, }; + +static bool piix4_get_cpu_hotplug_legacy(Object *obj, Error **errp) +{ + PIIX4PMState *s = PIIX4_PM(obj); + + return s->cpu_hotplug_legacy; +} + +static void piix4_set_cpu_hotplug_legacy(Object *obj, bool value, Error **errp) +{ + PIIX4PMState *s = PIIX4_PM(obj); + + assert(!value); + if (s->cpu_hotplug_legacy && value == false) { + acpi_switch_to_modern_cphp(&s->gpe_cpu, &s->cpuhp_state, + PIIX4_CPU_HOTPLUG_IO_BASE); + } + s->cpu_hotplug_legacy = value; +} + static void piix4_acpi_system_hot_add_init(MemoryRegion *parent, PCIBus *bus, PIIX4PMState *s) { @@ -570,6 +630,11 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent, acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent, s->use_acpi_pci_hotplug); + s->cpu_hotplug_legacy = true; + object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy", + piix4_get_cpu_hotplug_legacy, + piix4_set_cpu_hotplug_legacy, + NULL); legacy_acpi_cpu_hotplug_init(parent, OBJECT(s), &s->gpe_cpu, PIIX4_CPU_HOTPLUG_IO_BASE); @@ -583,6 +648,9 @@ static void piix4_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list) PIIX4PMState *s = PIIX4_PM(adev); acpi_memory_ospm_status(&s->acpi_memory_hotplug, list); + if (!s->cpu_hotplug_legacy) { + acpi_cpu_ospm_status(&s->cpuhp_state, list); + } } static void piix4_send_gpe(AcpiDeviceIf *adev, AcpiEventStatusBits ev) @@ -631,6 +699,7 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data) hc->unplug = piix4_device_unplug_cb; adevc->ospm_status = piix4_ospm_status; adevc->send_event = piix4_send_gpe; + adevc->madt_cpu = pc_madt_cpu_entry; } static const TypeInfo piix4_pm_info = { diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events index e95b2183ac..5aa3ba67c8 100644 --- a/hw/acpi/trace-events +++ b/hw/acpi/trace-events @@ -16,3 +16,17 @@ mhp_acpi_clear_insert_evt(uint32_t slot) "slot[0x%"PRIx32"] clear insert event" mhp_acpi_clear_remove_evt(uint32_t slot) "slot[0x%"PRIx32"] clear remove event" mhp_acpi_pc_dimm_deleted(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm deleted" mhp_acpi_pc_dimm_delete_failed(uint32_t slot) "slot[0x%"PRIx32"] pc-dimm delete failed" + +# hw/acpi/cpu.c 
+cpuhp_acpi_invalid_idx_selected(uint32_t idx) "0x%"PRIx32 +cpuhp_acpi_read_flags(uint32_t idx, uint8_t flags) "idx[0x%"PRIx32"] flags: 0x%"PRIx8 +cpuhp_acpi_write_idx(uint32_t idx) "set active cpu idx: 0x%"PRIx32 +cpuhp_acpi_write_cmd(uint32_t idx, uint8_t cmd) "idx[0x%"PRIx32"] cmd: 0x%"PRIx8 +cpuhp_acpi_read_cmd_data(uint32_t idx, uint32_t data) "idx[0x%"PRIx32"] data: 0x%"PRIx32 +cpuhp_acpi_cpu_has_events(uint32_t idx, bool ins, bool rm) "idx[0x%"PRIx32"] inserting: %d, removing: %d" +cpuhp_acpi_clear_inserting_evt(uint32_t idx) "idx[0x%"PRIx32"]" +cpuhp_acpi_clear_remove_evt(uint32_t idx) "idx[0x%"PRIx32"]" +cpuhp_acpi_ejecting_invalid_cpu(uint32_t idx) "0x%"PRIx32 +cpuhp_acpi_ejecting_cpu(uint32_t idx) "0x%"PRIx32 +cpuhp_acpi_write_ost_ev(uint32_t slot, uint32_t ev) "idx[0x%"PRIx32"] OST EVENT: 0x%"PRIx32 +cpuhp_acpi_write_ost_status(uint32_t slot, uint32_t st) "idx[0x%"PRIx32"] OST STATUS: 0x%"PRIx32 diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 2073f9a270..2041b0400b 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -79,7 +79,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, } /* Don't try if transport does not support notifiers. */ - if (!k->set_guest_notifiers || !k->set_host_notifier) { + if (!k->set_guest_notifiers || !k->ioeventfd_started) { error_setg(errp, "device is incompatible with dataplane " "(transport does not support notifiers)"); @@ -157,7 +157,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) s->guest_notifier = virtio_queue_get_guest_notifier(s->vq); /* Set up virtqueue notify */ - r = k->set_host_notifier(qbus->parent, 0, true); + r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, true); if (r != 0) { fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r); goto fail_host_notifier; @@ -217,7 +217,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) aio_context_release(s->ctx); - k->set_host_notifier(qbus->parent, 0, false); + virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, false); /* Clean up guest notifier (irq) */ k->set_guest_notifiers(qbus->parent, 1, false); diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 8ca203211a..5a594be8ee 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -33,6 +33,7 @@ #include "hw/timer/hpet.h" #include "hw/acpi/acpi-defs.h" #include "hw/acpi/acpi.h" +#include "hw/acpi/cpu.h" #include "hw/nvram/fw_cfg.h" #include "hw/acpi/bios-linker-loader.h" #include "hw/loader.h" @@ -43,6 +44,7 @@ #include "hw/acpi/tpm.h" #include "sysemu/tpm_backend.h" #include "hw/timer/mc146818rtc_regs.h" +#include "sysemu/numa.h" /* Supported chipsets: */ #include "hw/acpi/piix4.h" @@ -58,6 +60,8 @@ #include "qapi/qmp/qint.h" #include "qom/qom-qobject.h" +#include "hw/acpi/ipmi.h" + /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and * -M pc-i440fx-2.0. 
Even if the actual amount of AML generated grows * a little bit, there should be plenty of free space since the DSDT @@ -327,12 +331,38 @@ build_fadt(GArray *table_data, BIOSLinker *linker, AcpiPmInfo *pm, (void *)fadt, "FACP", sizeof(*fadt), 1, oem_id, oem_table_id); } +void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid, + CPUArchIdList *apic_ids, GArray *entry) +{ + int apic_id; + AcpiMadtProcessorApic *apic = acpi_data_push(entry, sizeof *apic); + + apic_id = apic_ids->cpus[uid].arch_id; + apic->type = ACPI_APIC_PROCESSOR; + apic->length = sizeof(*apic); + apic->processor_id = uid; + apic->local_apic_id = apic_id; + if (apic_ids->cpus[uid].cpu != NULL) { + apic->flags = cpu_to_le32(1); + } else { + /* ACPI spec says that LAPIC entry for non present + * CPU may be omitted from MADT or it must be marked + * as disabled. However omitting non present CPU from + * MADT breaks hotplug on linux. So possible CPUs + * should be put in MADT but kept disabled. + */ + apic->flags = cpu_to_le32(0); + } +} + static void build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms) { MachineClass *mc = MACHINE_GET_CLASS(pcms); CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms)); int madt_start = table_data->len; + AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev); + AcpiDeviceIf *adev = ACPI_DEVICE_IF(pcms->acpi_dev); AcpiMultipleApicTable *madt; AcpiMadtIoApic *io_apic; @@ -345,24 +375,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms) madt->flags = cpu_to_le32(1); for (i = 0; i < apic_ids->len; i++) { - AcpiMadtProcessorApic *apic = acpi_data_push(table_data, sizeof *apic); - int apic_id = apic_ids->cpus[i].arch_id; - - apic->type = ACPI_APIC_PROCESSOR; - apic->length = sizeof(*apic); - apic->processor_id = i; - apic->local_apic_id = apic_id; - if (apic_ids->cpus[i].cpu != NULL) { - apic->flags = cpu_to_le32(1); - } else { - /* ACPI spec says that LAPIC entry for non present - * CPU may be omitted from MADT or it must be marked - * as disabled. However omitting non present CPU from - * MADT breaks hotplug on linux. So possible CPUs - * should be put in MADT but kept disabled. 
- */ - apic->flags = cpu_to_le32(0); - } + adevc->madt_cpu(adev, i, apic_ids, table_data); } g_free(apic_ids); @@ -1334,8 +1347,10 @@ static Aml *build_com_device_aml(uint8_t uid) static void build_isa_devices_aml(Aml *table) { ISADevice *fdc = pc_find_fdc0(); + bool ambiguous; Aml *scope = aml_scope("_SB.PCI0.ISA"); + Object *obj = object_resolve_path_type("", TYPE_ISA_BUS, &ambiguous); aml_append(scope, build_rtc_device_aml()); aml_append(scope, build_kbd_device_aml()); @@ -1347,6 +1362,14 @@ static void build_isa_devices_aml(Aml *table) aml_append(scope, build_com_device_aml(1)); aml_append(scope, build_com_device_aml(2)); + if (ambiguous) { + error_report("Multiple ISA busses, unable to define IPMI ACPI data"); + } else if (!obj) { + error_report("No ISA bus, unable to define IPMI ACPI data"); + } else { + build_acpi_ipmi_devices(scope, BUS(obj)); + } + aml_append(table, scope); } @@ -1874,6 +1897,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, GPtrArray *mem_ranges = g_ptr_array_new_with_free_func(crs_range_free); GPtrArray *io_ranges = g_ptr_array_new_with_free_func(crs_range_free); PCMachineState *pcms = PC_MACHINE(machine); + PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); uint32_t nr_mem = machine->ram_slots; int root_bus_limit = 0xFF; PCIBus *bus = NULL; @@ -1929,7 +1953,15 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, build_q35_pci0_int(dsdt); } - build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base); + if (pcmc->legacy_cpu_hotplug) { + build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base); + } else { + CPUHotplugFeatures opts = { + .apci_1_compatible = true, .has_legacy_cphp = true + }; + build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base, + "\\_SB.PCI0", "\\_GPE._E02"); + } build_memory_hotplug_aml(dsdt, nr_mem, pm->mem_hp_io_base, pm->mem_hp_io_len); @@ -2297,7 +2329,6 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine) AcpiSratMemoryAffinity *numamem; int i; - uint64_t curnode; int srat_start, numa_start, slots; uint64_t mem_len, mem_base, next_base; MachineClass *mc = MACHINE_GET_CLASS(machine); @@ -2313,14 +2344,19 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine) srat->reserved1 = cpu_to_le32(1); for (i = 0; i < apic_ids->len; i++) { + int j; int apic_id = apic_ids->cpus[i].arch_id; core = acpi_data_push(table_data, sizeof *core); core->type = ACPI_SRAT_PROCESSOR_APIC; core->length = sizeof(*core); core->local_apic_id = apic_id; - curnode = pcms->node_cpu[apic_id]; - core->proximity_lo = curnode; + for (j = 0; j < nb_numa_nodes; j++) { + if (test_bit(i, numa_info[j].node_cpu)) { + core->proximity_lo = j; + break; + } + } memset(core->proximity_hi, 0, 3); core->local_sapic_eid = 0; core->flags = cpu_to_le32(1); diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index f9c901471d..98997d167c 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -36,8 +36,6 @@ #include "kvm_i386.h" #include "hw/pci/pci-assign.h" -#define MSIX_PAGE_SIZE 0x1000 - /* From linux/ioport.h */ #define IORESOURCE_IO 0x00000100 /* Resource type */ #define IORESOURCE_MEM 0x00000200 @@ -122,6 +120,7 @@ typedef struct AssignedDevice { int *msi_virq; MSIXTableEntry *msix_table; hwaddr msix_table_addr; + uint16_t msix_table_size; uint16_t msix_max; MemoryRegion mmio; char *configfd_name; @@ -1310,6 +1309,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp) bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK; msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK; 
dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry; + dev->msix_table_size = msix_max * sizeof(MSIXTableEntry); dev->msix_max = msix_max; } @@ -1633,7 +1633,7 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) return; } - memset(dev->msix_table, 0, MSIX_PAGE_SIZE); + memset(dev->msix_table, 0, dev->msix_table_size); for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) { entry->ctrl = cpu_to_le32(0x1); /* Masked */ @@ -1642,8 +1642,8 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) { - dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE, - MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); + dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); if (dev->msix_table == MAP_FAILED) { error_setg_errno(errp, errno, "failed to allocate msix_table"); dev->msix_table = NULL; @@ -1653,7 +1653,7 @@ static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) assigned_dev_msix_reset(dev); memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops, - dev, "assigned-dev-msix", MSIX_PAGE_SIZE); + dev, "assigned-dev-msix", dev->msix_table_size); } static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) @@ -1662,7 +1662,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) return; } - if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) { + if (munmap(dev->msix_table, dev->msix_table_size) == -1) { error_report("error unmapping msix_table! %s", strerror(errno)); } dev->msix_table = NULL; diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 7198ed533c..b8fead37e2 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1179,7 +1179,7 @@ void pc_machine_done(Notifier *notifier, void *data) void pc_guest_info_init(PCMachineState *pcms) { - int i, j; + int i; pcms->apic_xrupt_override = kvm_allows_irq0_override(); pcms->numa_nodes = nb_numa_nodes; @@ -1189,20 +1189,6 @@ void pc_guest_info_init(PCMachineState *pcms) pcms->node_mem[i] = numa_info[i].node_mem; } - pcms->node_cpu = g_malloc0(pcms->apic_id_limit * - sizeof *pcms->node_cpu); - - for (i = 0; i < max_cpus; i++) { - unsigned int apic_id = x86_cpu_apic_id_from_index(i); - assert(apic_id < pcms->apic_id_limit); - for (j = 0; j < nb_numa_nodes; j++) { - if (test_bit(i, numa_info[j].node_cpu)) { - pcms->node_cpu[apic_id] = j; - break; - } - } - } - pcms->machine_done.notify = pc_machine_done; qemu_add_machine_init_done_notifier(&pcms->machine_done); } @@ -1707,6 +1693,49 @@ static void pc_cpu_plug(HotplugHandler *hotplug_dev, out: error_propagate(errp, local_err); } +static void pc_cpu_unplug_request_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ + HotplugHandlerClass *hhc; + Error *local_err = NULL; + PCMachineState *pcms = PC_MACHINE(hotplug_dev); + + hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev); + hhc->unplug_request(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err); + + if (local_err) { + goto out; + } + + out: + error_propagate(errp, local_err); + +} + +static void pc_cpu_unplug_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ + HotplugHandlerClass *hhc; + Error *local_err = NULL; + PCMachineState *pcms = PC_MACHINE(hotplug_dev); + + hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev); + hhc->unplug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err); + + if (local_err) { + goto out; + } + + /* + * TODO: enable unplug once generic CPU remove bits land + * for now guest will be 
able to eject CPU ACPI wise but + * it will come back again on machine reset. + */ + /* object_unparent(OBJECT(dev)); */ + + out: + error_propagate(errp, local_err); +} static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) @@ -1723,6 +1752,8 @@ static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev, { if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { pc_dimm_unplug_request(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + pc_cpu_unplug_request_cb(hotplug_dev, dev, errp); } else { error_setg(errp, "acpi: device unplug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -1734,6 +1765,8 @@ static void pc_machine_device_unplug_cb(HotplugHandler *hotplug_dev, { if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { pc_dimm_unplug(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + pc_cpu_unplug_cb(hotplug_dev, dev, errp); } else { error_setg(errp, "acpi: device unplug for not supported device" " type: %s", object_get_typename(OBJECT(dev))); diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 53bc968bd0..c7d70af253 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -445,9 +445,11 @@ DEFINE_I440FX_MACHINE(v2_7, "pc-i440fx-2.7", NULL, static void pc_i440fx_2_6_machine_options(MachineClass *m) { + PCMachineClass *pcmc = PC_MACHINE_CLASS(m); pc_i440fx_2_7_machine_options(m); m->is_default = 0; m->alias = NULL; + pcmc->legacy_cpu_hotplug = true; SET_MACHINE_COMPAT(m, PC_COMPAT_2_6); } diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index e4b541f7b2..97a8835eea 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -294,8 +294,10 @@ DEFINE_Q35_MACHINE(v2_7, "pc-q35-2.7", NULL, static void pc_q35_2_6_machine_options(MachineClass *m) { + PCMachineClass *pcmc = PC_MACHINE_CLASS(m); pc_q35_2_7_machine_options(m); m->alias = NULL; + pcmc->legacy_cpu_hotplug = true; SET_MACHINE_COMPAT(m, PC_COMPAT_2_6); } diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 213741bc21..c1a4f1b34c 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -714,6 +714,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data) hc->unplug = ich9_pm_device_unplug_cb; adevc->ospm_status = ich9_pm_ospm_status; adevc->send_event = ich9_send_gpe; + adevc->madt_cpu = pc_madt_cpu_entry; } static const TypeInfo ich9_lpc_info = { diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c index 0a602f28ba..81896c0e84 100644 --- a/hw/mem/nvdimm.c +++ b/hw/mem/nvdimm.c @@ -23,20 +23,152 @@ */ #include "qemu/osdep.h" +#include "qapi/error.h" +#include "qapi/visitor.h" #include "hw/mem/nvdimm.h" +static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + NVDIMMDevice *nvdimm = NVDIMM(obj); + uint64_t value = nvdimm->label_size; + + visit_type_size(v, name, &value, errp); +} + +static void nvdimm_set_label_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + NVDIMMDevice *nvdimm = NVDIMM(obj); + Error *local_err = NULL; + uint64_t value; + + if (memory_region_size(&nvdimm->nvdimm_mr)) { + error_setg(&local_err, "cannot change property value"); + goto out; + } + + visit_type_size(v, name, &value, &local_err); + if (local_err) { + goto out; + } + if (value < MIN_NAMESPACE_LABEL_SIZE) { + error_setg(&local_err, "Property '%s.%s' (0x%" PRIx64 ") is required" + " at least 0x%lx", object_get_typename(obj), + name, value, MIN_NAMESPACE_LABEL_SIZE); + goto out; + } + + 
nvdimm->label_size = value; +out: + error_propagate(errp, local_err); +} + +static void nvdimm_init(Object *obj) +{ + object_property_add(obj, "label-size", "int", + nvdimm_get_label_size, nvdimm_set_label_size, NULL, + NULL, NULL); +} + +static MemoryRegion *nvdimm_get_memory_region(PCDIMMDevice *dimm) +{ + NVDIMMDevice *nvdimm = NVDIMM(dimm); + + return &nvdimm->nvdimm_mr; +} + +static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp) +{ + MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem, errp); + NVDIMMDevice *nvdimm = NVDIMM(dimm); + uint64_t align, pmem_size, size = memory_region_size(mr); + + align = memory_region_get_alignment(mr); + + pmem_size = size - nvdimm->label_size; + nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size; + pmem_size = QEMU_ALIGN_DOWN(pmem_size, align); + + if (size <= nvdimm->label_size || !pmem_size) { + HostMemoryBackend *hostmem = dimm->hostmem; + char *path = object_get_canonical_path_component(OBJECT(hostmem)); + + error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too " + "small to contain nvdimm label (0x%" PRIx64 ") and " + "aligned PMEM (0x%" PRIx64 ")", + path, memory_region_size(mr), nvdimm->label_size, align); + return; + } + + memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), + "nvdimm-memory", mr, 0, pmem_size); + nvdimm->nvdimm_mr.align = align; +} + +/* + * the caller should check the input parameters before calling + * label read/write functions. + */ +static void nvdimm_validate_rw_label_data(NVDIMMDevice *nvdimm, uint64_t size, + uint64_t offset) +{ + assert((nvdimm->label_size >= size + offset) && (offset + size > offset)); +} + +static void nvdimm_read_label_data(NVDIMMDevice *nvdimm, void *buf, + uint64_t size, uint64_t offset) +{ + nvdimm_validate_rw_label_data(nvdimm, size, offset); + + memcpy(buf, nvdimm->label_data + offset, size); +} + +static void nvdimm_write_label_data(NVDIMMDevice *nvdimm, const void *buf, + uint64_t size, uint64_t offset) +{ + MemoryRegion *mr; + PCDIMMDevice *dimm = PC_DIMM(nvdimm); + uint64_t backend_offset; + + nvdimm_validate_rw_label_data(nvdimm, size, offset); + + memcpy(nvdimm->label_data + offset, buf, size); + + mr = host_memory_backend_get_memory(dimm->hostmem, &error_abort); + backend_offset = memory_region_size(mr) - nvdimm->label_size + offset; + memory_region_set_dirty(mr, backend_offset, size); +} + +static MemoryRegion *nvdimm_get_vmstate_memory_region(PCDIMMDevice *dimm) +{ + return host_memory_backend_get_memory(dimm->hostmem, &error_abort); +} + static void nvdimm_class_init(ObjectClass *oc, void *data) { DeviceClass *dc = DEVICE_CLASS(oc); + PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc); + NVDIMMClass *nvc = NVDIMM_CLASS(oc); /* nvdimm hotplug has not been supported yet. 
*/ dc->hotpluggable = false; + + ddc->realize = nvdimm_realize; + ddc->get_memory_region = nvdimm_get_memory_region; + ddc->get_vmstate_memory_region = nvdimm_get_vmstate_memory_region; + + nvc->read_label_data = nvdimm_read_label_data; + nvc->write_label_data = nvdimm_write_label_data; } static TypeInfo nvdimm_info = { .name = TYPE_NVDIMM, .parent = TYPE_PC_DIMM, + .class_size = sizeof(NVDIMMClass), .class_init = nvdimm_class_init, + .instance_size = sizeof(NVDIMMDevice), + .instance_init = nvdimm_init, }; static void nvdimm_register_types(void) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index 6de2275986..249193a543 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -40,6 +40,8 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms, int slot; MachineState *machine = MACHINE(qdev_get_machine()); PCDIMMDevice *dimm = PC_DIMM(dev); + PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); + MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm); Error *local_err = NULL; uint64_t existing_dimms_capacity = 0; uint64_t addr; @@ -105,7 +107,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms, } memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr); - vmstate_register_ram(mr, dev); + vmstate_register_ram(vmstate_mr, dev); numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node); out: @@ -116,10 +118,12 @@ void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms, MemoryRegion *mr) { PCDIMMDevice *dimm = PC_DIMM(dev); + PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); + MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm); numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node); memory_region_del_subregion(&hpms->mr, mr); - vmstate_unregister_ram(mr, dev); + vmstate_unregister_ram(vmstate_mr, dev); } static int pc_existing_dimms_capacity_internal(Object *obj, void *opaque) @@ -424,6 +428,11 @@ static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm) return host_memory_backend_get_memory(dimm->hostmem, &error_abort); } +static MemoryRegion *pc_dimm_get_vmstate_memory_region(PCDIMMDevice *dimm) +{ + return host_memory_backend_get_memory(dimm->hostmem, &error_abort); +} + static void pc_dimm_class_init(ObjectClass *oc, void *data) { DeviceClass *dc = DEVICE_CLASS(oc); @@ -434,6 +443,7 @@ static void pc_dimm_class_init(ObjectClass *oc, void *data) dc->desc = "DIMM memory module"; ddc->get_memory_region = pc_dimm_get_memory_region; + ddc->get_vmstate_memory_region = pc_dimm_get_vmstate_memory_region; } static TypeInfo pc_dimm_info = { diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index 1625e6b38b..8b709e362e 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -69,92 +69,58 @@ VirtIODevice *virtio_ccw_get_vdev(SubchDev *sch) return vdev; } -static int virtio_ccw_set_guest2host_notifier(VirtioCcwDevice *dev, int n, - bool assign, bool set_handler) +static void virtio_ccw_start_ioeventfd(VirtioCcwDevice *dev) { - VirtIODevice *vdev = virtio_bus_get_device(&dev->bus); - VirtQueue *vq = virtio_get_queue(vdev, n); - EventNotifier *notifier = virtio_queue_get_host_notifier(vq); - int r = 0; - SubchDev *sch = dev->sch; - uint32_t sch_id = (css_build_subchannel_id(sch) << 16) | sch->schid; + virtio_bus_start_ioeventfd(&dev->bus); +} - if (assign) { - r = event_notifier_init(notifier, 1); - if (r < 0) { - error_report("%s: unable to init event notifier: %d", __func__, r); - return r; - } - virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler); - r = 
s390_assign_subch_ioeventfd(notifier, sch_id, n, assign); - if (r < 0) { - error_report("%s: unable to assign ioeventfd: %d", __func__, r); - virtio_queue_set_host_notifier_fd_handler(vq, false, false); - event_notifier_cleanup(notifier); - return r; - } - } else { - virtio_queue_set_host_notifier_fd_handler(vq, false, false); - s390_assign_subch_ioeventfd(notifier, sch_id, n, assign); - event_notifier_cleanup(notifier); - } - return r; +static void virtio_ccw_stop_ioeventfd(VirtioCcwDevice *dev) +{ + virtio_bus_stop_ioeventfd(&dev->bus); } -static void virtio_ccw_start_ioeventfd(VirtioCcwDevice *dev) +static bool virtio_ccw_ioeventfd_started(DeviceState *d) { - VirtIODevice *vdev; - int n, r; + VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); - if (!(dev->flags & VIRTIO_CCW_FLAG_USE_IOEVENTFD) || - dev->ioeventfd_disabled || - dev->ioeventfd_started) { - return; - } - vdev = virtio_bus_get_device(&dev->bus); - for (n = 0; n < VIRTIO_CCW_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - r = virtio_ccw_set_guest2host_notifier(dev, n, true, true); - if (r < 0) { - goto assign_error; - } - } - dev->ioeventfd_started = true; - return; + return dev->ioeventfd_started; +} - assign_error: - while (--n >= 0) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - r = virtio_ccw_set_guest2host_notifier(dev, n, false, false); - assert(r >= 0); +static void virtio_ccw_ioeventfd_set_started(DeviceState *d, bool started, + bool err) +{ + VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); + + dev->ioeventfd_started = started; + if (err) { + /* Disable ioeventfd for this device. */ + dev->flags &= ~VIRTIO_CCW_FLAG_USE_IOEVENTFD; } - dev->ioeventfd_started = false; - /* Disable ioeventfd for this device. */ - dev->flags &= ~VIRTIO_CCW_FLAG_USE_IOEVENTFD; - error_report("%s: failed. 
Fallback to userspace (slower).", __func__); } -static void virtio_ccw_stop_ioeventfd(VirtioCcwDevice *dev) +static bool virtio_ccw_ioeventfd_disabled(DeviceState *d) { - VirtIODevice *vdev; - int n, r; + VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); - if (!dev->ioeventfd_started) { - return; - } - vdev = virtio_bus_get_device(&dev->bus); - for (n = 0; n < VIRTIO_CCW_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - r = virtio_ccw_set_guest2host_notifier(dev, n, false, false); - assert(r >= 0); - } - dev->ioeventfd_started = false; + return dev->ioeventfd_disabled || + !(dev->flags & VIRTIO_CCW_FLAG_USE_IOEVENTFD); +} + +static void virtio_ccw_ioeventfd_set_disabled(DeviceState *d, bool disabled) +{ + VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); + + dev->ioeventfd_disabled = disabled; +} + +static int virtio_ccw_ioeventfd_assign(DeviceState *d, EventNotifier *notifier, + int n, bool assign) +{ + VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); + SubchDev *sch = dev->sch; + uint32_t sch_id = (css_build_subchannel_id(sch) << 16) | sch->schid; + + return s390_assign_subch_ioeventfd(notifier, sch_id, n, assign); } VirtualCssBus *virtual_css_bus_init(void) @@ -1157,19 +1123,6 @@ static bool virtio_ccw_query_guest_notifiers(DeviceState *d) return !!(dev->sch->curr_status.pmcw.flags & PMCW_FLAGS_MASK_ENA); } -static int virtio_ccw_set_host_notifier(DeviceState *d, int n, bool assign) -{ - VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d); - - /* Stop using the generic ioeventfd, we are doing eventfd handling - * ourselves below */ - dev->ioeventfd_disabled = assign; - if (assign) { - virtio_ccw_stop_ioeventfd(dev); - } - return virtio_ccw_set_guest2host_notifier(dev, n, assign, false); -} - static int virtio_ccw_get_mappings(VirtioCcwDevice *dev) { int r; @@ -1798,7 +1751,6 @@ static void virtio_ccw_bus_class_init(ObjectClass *klass, void *data) k->notify = virtio_ccw_notify; k->vmstate_change = virtio_ccw_vmstate_change; k->query_guest_notifiers = virtio_ccw_query_guest_notifiers; - k->set_host_notifier = virtio_ccw_set_host_notifier; k->set_guest_notifiers = virtio_ccw_set_guest_notifiers; k->save_queue = virtio_ccw_save_queue; k->load_queue = virtio_ccw_load_queue; @@ -1807,6 +1759,11 @@ static void virtio_ccw_bus_class_init(ObjectClass *klass, void *data) k->device_plugged = virtio_ccw_device_plugged; k->post_plugged = virtio_ccw_post_plugged; k->device_unplugged = virtio_ccw_device_unplugged; + k->ioeventfd_started = virtio_ccw_ioeventfd_started; + k->ioeventfd_set_started = virtio_ccw_ioeventfd_set_started; + k->ioeventfd_disabled = virtio_ccw_ioeventfd_disabled; + k->ioeventfd_set_disabled = virtio_ccw_ioeventfd_set_disabled; + k->ioeventfd_assign = virtio_ccw_ioeventfd_assign; } static const TypeInfo virtio_ccw_bus_info = { diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 1a49f1e4b7..18ced31493 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -31,7 +31,7 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread *iothread) s->ctx = iothread_get_aio_context(vs->conf.iothread); /* Don't try if transport does not support notifiers. 
*/ - if (!k->set_guest_notifiers || !k->set_host_notifier) { + if (!k->set_guest_notifiers || !k->ioeventfd_started) { fprintf(stderr, "virtio-scsi: Failed to set iothread " "(transport does not support notifiers)"); exit(1); @@ -69,11 +69,10 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue *vq, int n, void (*fn)(VirtIODevice *vdev, VirtQueue *vq)) { BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s))); - VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); int rc; /* Set up virtqueue notify */ - rc = k->set_host_notifier(qbus->parent, n, true); + rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), n, true); if (rc != 0) { fprintf(stderr, "virtio-scsi: Failed to set host notifier (%d)\n", rc); @@ -159,7 +158,7 @@ fail_vrings: virtio_scsi_clear_aio(s); aio_context_release(s->ctx); for (i = 0; i < vs->conf.num_queues + 2; i++) { - k->set_host_notifier(qbus->parent, i, false); + virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); } k->set_guest_notifiers(qbus->parent, vs->conf.num_queues + 2, false); fail_guest_notifiers: @@ -198,7 +197,7 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s) aio_context_release(s->ctx); for (i = 0; i < vs->conf.num_queues + 2; i++) { - k->set_host_notifier(qbus->parent, i, false); + virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); } /* Clean up guest notifier (irq) */ diff --git a/hw/smbios/Makefile.objs b/hw/smbios/Makefile.objs index f69a92f967..c3d3753602 100644 --- a/hw/smbios/Makefile.objs +++ b/hw/smbios/Makefile.objs @@ -1 +1,2 @@ common-obj-$(CONFIG_SMBIOS) += smbios.o +common-obj-$(call land,$(CONFIG_SMBIOS),$(CONFIG_IPMI)) += smbios_type_38.o diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c index cb8a111102..74c7102929 100644 --- a/hw/smbios/smbios.c +++ b/hw/smbios/smbios.c @@ -24,6 +24,8 @@ #include "hw/smbios/smbios.h" #include "hw/loader.h" #include "exec/cpu-common.h" +#include "smbios_build.h" +#include "hw/smbios/ipmi.h" /* legacy structures and constants for <= 2.0 machines */ struct smbios_header { @@ -53,10 +55,10 @@ static bool smbios_uuid_encoded = true; /* end: legacy structures & constants for <= 2.0 machines */ -static uint8_t *smbios_tables; -static size_t smbios_tables_len; -static unsigned smbios_table_max; -static unsigned smbios_table_cnt; +uint8_t *smbios_tables; +size_t smbios_tables_len; +unsigned smbios_table_max; +unsigned smbios_table_cnt; static SmbiosEntryPointType smbios_ep_type = SMBIOS_ENTRY_POINT_21; static SmbiosEntryPoint ep; @@ -429,7 +431,7 @@ uint8_t *smbios_get_table_legacy(size_t *length) /* end: legacy setup functions for <= 2.0 machines */ -static bool smbios_skip_table(uint8_t type, bool required_table) +bool smbios_skip_table(uint8_t type, bool required_table) { if (test_bit(type, have_binfile_bitmap)) { return true; /* user provided their own binary blob(s) */ @@ -443,65 +445,6 @@ static bool smbios_skip_table(uint8_t type, bool required_table) return true; } -#define SMBIOS_BUILD_TABLE_PRE(tbl_type, tbl_handle, tbl_required) \ - struct smbios_type_##tbl_type *t; \ - size_t t_off; /* table offset into smbios_tables */ \ - int str_index = 0; \ - do { \ - /* should we skip building this table ? 
*/ \ - if (smbios_skip_table(tbl_type, tbl_required)) { \ - return; \ - } \ - \ - /* use offset of table t within smbios_tables */ \ - /* (pointer must be updated after each realloc) */ \ - t_off = smbios_tables_len; \ - smbios_tables_len += sizeof(*t); \ - smbios_tables = g_realloc(smbios_tables, smbios_tables_len); \ - t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \ - \ - t->header.type = tbl_type; \ - t->header.length = sizeof(*t); \ - t->header.handle = cpu_to_le16(tbl_handle); \ - } while (0) - -#define SMBIOS_TABLE_SET_STR(tbl_type, field, value) \ - do { \ - int len = (value != NULL) ? strlen(value) + 1 : 0; \ - if (len > 1) { \ - smbios_tables = g_realloc(smbios_tables, \ - smbios_tables_len + len); \ - memcpy(smbios_tables + smbios_tables_len, value, len); \ - smbios_tables_len += len; \ - /* update pointer post-realloc */ \ - t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \ - t->field = ++str_index; \ - } else { \ - t->field = 0; \ - } \ - } while (0) - -#define SMBIOS_BUILD_TABLE_POST \ - do { \ - size_t term_cnt, t_size; \ - \ - /* add '\0' terminator (add two if no strings defined) */ \ - term_cnt = (str_index == 0) ? 2 : 1; \ - smbios_tables = g_realloc(smbios_tables, \ - smbios_tables_len + term_cnt); \ - memset(smbios_tables + smbios_tables_len, 0, term_cnt); \ - smbios_tables_len += term_cnt; \ - \ - /* update smbios max. element size */ \ - t_size = smbios_tables_len - t_off; \ - if (t_size > smbios_table_max) { \ - smbios_table_max = t_size; \ - } \ - \ - /* update smbios element count */ \ - smbios_table_cnt++; \ - } while (0) - static void smbios_build_type_0_table(void) { SMBIOS_BUILD_TABLE_PRE(0, 0x000, false); /* optional, leave up to BIOS */ @@ -906,6 +849,7 @@ void smbios_get_tables(const struct smbios_phys_mem_area *mem_array, } smbios_build_type_32_table(); + smbios_build_type_38_table(); smbios_build_type_127_table(); smbios_validate_table(); diff --git a/hw/smbios/smbios_build.h b/hw/smbios/smbios_build.h new file mode 100644 index 0000000000..68b8b72e09 --- /dev/null +++ b/hw/smbios/smbios_build.h @@ -0,0 +1,87 @@ +/* + * SMBIOS Support + * + * Copyright (C) 2009 Hewlett-Packard Development Company, L.P. + * Copyright (C) 2013 Red Hat, Inc. + * + * Authors: + * Alex Williamson <alex.williamson@hp.com> + * Markus Armbruster <armbru@redhat.com> + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Contributions after 2012-01-13 are licensed under the terms of the + * GNU GPL, version 2 or (at your option) any later version. + */ + +#ifndef QEMU_SMBIOS_BUILD_H +#define QEMU_SMBIOS_BUILD_H + +bool smbios_skip_table(uint8_t type, bool required_table); + +extern uint8_t *smbios_tables; +extern size_t smbios_tables_len; +extern unsigned smbios_table_max; +extern unsigned smbios_table_cnt; + +#define SMBIOS_BUILD_TABLE_PRE(tbl_type, tbl_handle, tbl_required) \ + struct smbios_type_##tbl_type *t; \ + size_t t_off; /* table offset into smbios_tables */ \ + int str_index = 0; \ + do { \ + /* should we skip building this table ? 
*/ \ + if (smbios_skip_table(tbl_type, tbl_required)) { \ + return; \ + } \ + \ + /* use offset of table t within smbios_tables */ \ + /* (pointer must be updated after each realloc) */ \ + t_off = smbios_tables_len; \ + smbios_tables_len += sizeof(*t); \ + smbios_tables = g_realloc(smbios_tables, smbios_tables_len); \ + t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \ + \ + t->header.type = tbl_type; \ + t->header.length = sizeof(*t); \ + t->header.handle = cpu_to_le16(tbl_handle); \ + } while (0) + +#define SMBIOS_TABLE_SET_STR(tbl_type, field, value) \ + do { \ + int len = (value != NULL) ? strlen(value) + 1 : 0; \ + if (len > 1) { \ + smbios_tables = g_realloc(smbios_tables, \ + smbios_tables_len + len); \ + memcpy(smbios_tables + smbios_tables_len, value, len); \ + smbios_tables_len += len; \ + /* update pointer post-realloc */ \ + t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \ + t->field = ++str_index; \ + } else { \ + t->field = 0; \ + } \ + } while (0) + +#define SMBIOS_BUILD_TABLE_POST \ + do { \ + size_t term_cnt, t_size; \ + \ + /* add '\0' terminator (add two if no strings defined) */ \ + term_cnt = (str_index == 0) ? 2 : 1; \ + smbios_tables = g_realloc(smbios_tables, \ + smbios_tables_len + term_cnt); \ + memset(smbios_tables + smbios_tables_len, 0, term_cnt); \ + smbios_tables_len += term_cnt; \ + \ + /* update smbios max. element size */ \ + t_size = smbios_tables_len - t_off; \ + if (t_size > smbios_table_max) { \ + smbios_table_max = t_size; \ + } \ + \ + /* update smbios element count */ \ + smbios_table_cnt++; \ + } while (0) + +#endif /* QEMU_SMBIOS_BUILD_H */ diff --git a/hw/smbios/smbios_type_38.c b/hw/smbios/smbios_type_38.c new file mode 100644 index 0000000000..56e8609c00 --- /dev/null +++ b/hw/smbios/smbios_type_38.c @@ -0,0 +1,117 @@ +/* + * IPMI SMBIOS firmware handling + * + * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "hw/ipmi/ipmi.h" +#include "hw/smbios/ipmi.h" +#include "hw/smbios/smbios.h" +#include "qemu/error-report.h" +#include "smbios_build.h" + +/* SMBIOS type 38 - IPMI */ +struct smbios_type_38 { + struct smbios_structure_header header; + uint8_t interface_type; + uint8_t ipmi_spec_revision; + uint8_t i2c_slave_address; + uint8_t nv_storage_device_address; + uint64_t base_address; + uint8_t base_address_modifier; + uint8_t interrupt_number; +} QEMU_PACKED; + +static void smbios_build_one_type_38(IPMIFwInfo *info) +{ + uint64_t baseaddr = info->base_address; + SMBIOS_BUILD_TABLE_PRE(38, 0x3000, true); + + t->interface_type = info->interface_type; + t->ipmi_spec_revision = ((info->ipmi_spec_major_revision << 4) + | info->ipmi_spec_minor_revision); + t->i2c_slave_address = info->i2c_slave_address; + t->nv_storage_device_address = 0; + + assert(info->ipmi_spec_minor_revision <= 15); + assert(info->ipmi_spec_major_revision <= 15); + + /* or 1 to set it to I/O space */ + switch (info->memspace) { + case IPMI_MEMSPACE_IO: + baseaddr |= 1; + break; + case IPMI_MEMSPACE_MEM32: + case IPMI_MEMSPACE_MEM64: + break; + case IPMI_MEMSPACE_SMBUS: + baseaddr <<= 1; + break; + } + + t->base_address = cpu_to_le64(baseaddr); + + t->base_address_modifier = 0; + if (info->irq_type == IPMI_LEVEL_IRQ) { + t->base_address_modifier |= 1; + } + switch (info->register_spacing) { + case 1: + break; + case 4: + t->base_address_modifier |= 1 << 6; + break; + case 16: + t->base_address_modifier |= 2 << 6; + break; + default: + error_report("IPMI register spacing %d is not compatible with" + " SMBIOS, ignoring this entry.", info->register_spacing); + return; + } + t->interrupt_number = info->interrupt_number; + + SMBIOS_BUILD_TABLE_POST; +} + +static void smbios_add_ipmi_devices(BusState *bus) +{ + BusChild *kid; + + QTAILQ_FOREACH(kid, &bus->children, sibling) { + DeviceState *dev = kid->child; + Object *obj = object_dynamic_cast(OBJECT(dev), TYPE_IPMI_INTERFACE); + BusState *childbus; + + if (obj) { + IPMIInterface *ii; + IPMIInterfaceClass *iic; + IPMIFwInfo info; + + ii = IPMI_INTERFACE(obj); + iic = IPMI_INTERFACE_GET_CLASS(obj); + memset(&info, 0, sizeof(info)); + iic->get_fwinfo(ii, &info); + smbios_build_one_type_38(&info); + continue; + } + + QLIST_FOREACH(childbus, &dev->child_bus, sibling) { + smbios_add_ipmi_devices(childbus); + } + } +} + +void smbios_build_type_38_table(void) +{ + BusState *bus; + + bus = sysbus_get_default(); + if (bus) { + smbios_add_ipmi_devices(bus); + } +} diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 81cc5b0ae3..a01394d5ac 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -1110,14 +1110,15 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev) VirtioBusState *vbus = VIRTIO_BUS(qbus); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus); int i, r, e; - if (!k->set_host_notifier) { + if (!k->ioeventfd_started) { fprintf(stderr, "binding does not support host notifiers\n"); r = -ENOSYS; goto fail; } for (i = 0; i < hdev->nvqs; ++i) { - r = k->set_host_notifier(qbus->parent, hdev->vq_index + i, true); + r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i, + true); if (r < 0) { fprintf(stderr, "vhost VQ %d notifier binding failed: %d\n", i, -r); goto fail_vq; @@ -1127,7 +1128,8 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev) return 0; fail_vq: while (--i >= 0) { - e = k->set_host_notifier(qbus->parent, hdev->vq_index + i, false); + e = 
virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i, + false); if (e < 0) { fprintf(stderr, "vhost VQ %d notifier cleanup error: %d\n", i, -r); fflush(stderr); @@ -1146,12 +1148,11 @@ fail: void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev) { BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev))); - VirtioBusState *vbus = VIRTIO_BUS(qbus); - VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus); int i, r; for (i = 0; i < hdev->nvqs; ++i) { - r = k->set_host_notifier(qbus->parent, hdev->vq_index + i, false); + r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i, + false); if (r < 0) { fprintf(stderr, "vhost VQ %d notifier cleanup failed: %d\n", i, -r); fflush(stderr); diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c index 574f0e23f8..131376027b 100644 --- a/hw/virtio/virtio-bus.c +++ b/hw/virtio/virtio-bus.c @@ -146,6 +146,138 @@ void virtio_bus_set_vdev_config(VirtioBusState *bus, uint8_t *config) } } +/* + * This function handles both assigning the ioeventfd handler and + * registering it with the kernel. + * assign: register/deregister ioeventfd with the kernel + * set_handler: use the generic ioeventfd handler + */ +static int set_host_notifier_internal(DeviceState *proxy, VirtioBusState *bus, + int n, bool assign, bool set_handler) +{ + VirtIODevice *vdev = virtio_bus_get_device(bus); + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus); + VirtQueue *vq = virtio_get_queue(vdev, n); + EventNotifier *notifier = virtio_queue_get_host_notifier(vq); + int r = 0; + + if (assign) { + r = event_notifier_init(notifier, 1); + if (r < 0) { + error_report("%s: unable to init event notifier: %d", __func__, r); + return r; + } + virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler); + r = k->ioeventfd_assign(proxy, notifier, n, assign); + if (r < 0) { + error_report("%s: unable to assign ioeventfd: %d", __func__, r); + virtio_queue_set_host_notifier_fd_handler(vq, false, false); + event_notifier_cleanup(notifier); + return r; + } + } else { + virtio_queue_set_host_notifier_fd_handler(vq, false, false); + k->ioeventfd_assign(proxy, notifier, n, assign); + event_notifier_cleanup(notifier); + } + return r; +} + +void virtio_bus_start_ioeventfd(VirtioBusState *bus) +{ + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus); + DeviceState *proxy = DEVICE(BUS(bus)->parent); + VirtIODevice *vdev; + int n, r; + + if (!k->ioeventfd_started || k->ioeventfd_started(proxy)) { + return; + } + if (k->ioeventfd_disabled(proxy)) { + return; + } + vdev = virtio_bus_get_device(bus); + for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { + if (!virtio_queue_get_num(vdev, n)) { + continue; + } + r = set_host_notifier_internal(proxy, bus, n, true, true); + if (r < 0) { + goto assign_error; + } + } + k->ioeventfd_set_started(proxy, true, false); + return; + +assign_error: + while (--n >= 0) { + if (!virtio_queue_get_num(vdev, n)) { + continue; + } + + r = set_host_notifier_internal(proxy, bus, n, false, false); + assert(r >= 0); + } + k->ioeventfd_set_started(proxy, false, true); + error_report("%s: failed. 
Fallback to userspace (slower).", __func__); +} + +void virtio_bus_stop_ioeventfd(VirtioBusState *bus) +{ + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus); + DeviceState *proxy = DEVICE(BUS(bus)->parent); + VirtIODevice *vdev; + int n, r; + + if (!k->ioeventfd_started || !k->ioeventfd_started(proxy)) { + return; + } + vdev = virtio_bus_get_device(bus); + for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { + if (!virtio_queue_get_num(vdev, n)) { + continue; + } + r = set_host_notifier_internal(proxy, bus, n, false, false); + assert(r >= 0); + } + k->ioeventfd_set_started(proxy, false, false); +} + +/* + * This function switches from/to the generic ioeventfd handler. + * assign==false means 'use generic ioeventfd handler'. + */ +int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign) +{ + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus); + DeviceState *proxy = DEVICE(BUS(bus)->parent); + VirtIODevice *vdev = virtio_bus_get_device(bus); + VirtQueue *vq = virtio_get_queue(vdev, n); + + if (!k->ioeventfd_started) { + return -ENOSYS; + } + if (assign) { + /* + * Stop using the generic ioeventfd, we are doing eventfd handling + * ourselves below + */ + k->ioeventfd_set_disabled(proxy, true); + } + /* + * Just switch the handler, don't deassign the ioeventfd. + * Otherwise, there's a window where we don't have an + * ioeventfd and we may end up with a notification where + * we don't expect one. + */ + virtio_queue_set_host_notifier_fd_handler(vq, assign, !assign); + if (!assign) { + /* Use generic ioeventfd handler again. */ + k->ioeventfd_set_disabled(proxy, false); + } + return 0; +} + static char *virtio_bus_get_dev_path(DeviceState *dev) { BusState *bus = qdev_get_parent_bus(dev); diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index d4cd91f8c4..eb84b74532 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -93,90 +93,59 @@ typedef struct { bool ioeventfd_started; } VirtIOMMIOProxy; -static int virtio_mmio_set_host_notifier_internal(VirtIOMMIOProxy *proxy, - int n, bool assign, - bool set_handler) +static bool virtio_mmio_ioeventfd_started(DeviceState *d) { - VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); - VirtQueue *vq = virtio_get_queue(vdev, n); - EventNotifier *notifier = virtio_queue_get_host_notifier(vq); - int r = 0; + VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d); - if (assign) { - r = event_notifier_init(notifier, 1); - if (r < 0) { - error_report("%s: unable to init event notifier: %d", - __func__, r); - return r; - } - virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler); - memory_region_add_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4, - true, n, notifier); - } else { - memory_region_del_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4, - true, n, notifier); - virtio_queue_set_host_notifier_fd_handler(vq, false, false); - event_notifier_cleanup(notifier); - } - return r; + return proxy->ioeventfd_started; } -static void virtio_mmio_start_ioeventfd(VirtIOMMIOProxy *proxy) +static void virtio_mmio_ioeventfd_set_started(DeviceState *d, bool started, + bool err) { - VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); - int n, r; + VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d); - if (!kvm_eventfds_enabled() || - proxy->ioeventfd_disabled || - proxy->ioeventfd_started) { - return; - } + proxy->ioeventfd_started = started; +} - for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } +static bool virtio_mmio_ioeventfd_disabled(DeviceState *d) +{ + VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d); - r 
= virtio_mmio_set_host_notifier_internal(proxy, n, true, true); - if (r < 0) { - goto assign_error; - } - } - proxy->ioeventfd_started = true; - return; + return !kvm_eventfds_enabled() || proxy->ioeventfd_disabled; +} -assign_error: - while (--n >= 0) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } +static void virtio_mmio_ioeventfd_set_disabled(DeviceState *d, bool disabled) +{ + VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d); - r = virtio_mmio_set_host_notifier_internal(proxy, n, false, false); - assert(r >= 0); - } - proxy->ioeventfd_started = false; - error_report("%s: failed. Fallback to a userspace (slower).", __func__); + proxy->ioeventfd_disabled = disabled; } -static void virtio_mmio_stop_ioeventfd(VirtIOMMIOProxy *proxy) +static int virtio_mmio_ioeventfd_assign(DeviceState *d, + EventNotifier *notifier, + int n, bool assign) { - int r; - int n; - VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); + VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d); - if (!proxy->ioeventfd_started) { - return; + if (assign) { + memory_region_add_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4, + true, n, notifier); + } else { + memory_region_del_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4, + true, n, notifier); } + return 0; +} - for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } +static void virtio_mmio_start_ioeventfd(VirtIOMMIOProxy *proxy) +{ + virtio_bus_start_ioeventfd(&proxy->bus); +} - r = virtio_mmio_set_host_notifier_internal(proxy, n, false, false); - assert(r >= 0); - } - proxy->ioeventfd_started = false; +static void virtio_mmio_stop_ioeventfd(VirtIOMMIOProxy *proxy) +{ + virtio_bus_stop_ioeventfd(&proxy->bus); } static uint64_t virtio_mmio_read(void *opaque, hwaddr offset, unsigned size) @@ -498,25 +467,6 @@ assign_error: return r; } -static int virtio_mmio_set_host_notifier(DeviceState *opaque, int n, - bool assign) -{ - VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque); - - /* Stop using ioeventfd for virtqueue kick if the device starts using host - * notifiers. This makes it easy to avoid stepping on each others' toes. - */ - proxy->ioeventfd_disabled = assign; - if (assign) { - virtio_mmio_stop_ioeventfd(proxy); - } - /* We don't need to start here: it's not needed because backend - * currently only stops on status change away from ok, - * reset, vmstop and such. If we do add code to start here, - * need to check vmstate, device state etc. 
*/ - return virtio_mmio_set_host_notifier_internal(proxy, n, assign, false); -} - /* virtio-mmio device */ static void virtio_mmio_realizefn(DeviceState *d, Error **errp) @@ -558,8 +508,12 @@ static void virtio_mmio_bus_class_init(ObjectClass *klass, void *data) k->notify = virtio_mmio_update_irq; k->save_config = virtio_mmio_save_config; k->load_config = virtio_mmio_load_config; - k->set_host_notifier = virtio_mmio_set_host_notifier; k->set_guest_notifiers = virtio_mmio_set_guest_notifiers; + k->ioeventfd_started = virtio_mmio_ioeventfd_started; + k->ioeventfd_set_started = virtio_mmio_ioeventfd_set_started; + k->ioeventfd_disabled = virtio_mmio_ioeventfd_disabled; + k->ioeventfd_set_disabled = virtio_mmio_ioeventfd_set_disabled; + k->ioeventfd_assign = virtio_mmio_ioeventfd_assign; k->has_variable_vring_alignment = true; bus_class->max_dev = 1; } diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 1a0278304b..2b34b43060 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -262,14 +262,44 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f) return 0; } +static bool virtio_pci_ioeventfd_started(DeviceState *d) +{ + VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); + + return proxy->ioeventfd_started; +} + +static void virtio_pci_ioeventfd_set_started(DeviceState *d, bool started, + bool err) +{ + VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); + + proxy->ioeventfd_started = started; +} + +static bool virtio_pci_ioeventfd_disabled(DeviceState *d) +{ + VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); + + return proxy->ioeventfd_disabled || + !(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD); +} + +static void virtio_pci_ioeventfd_set_disabled(DeviceState *d, bool disabled) +{ + VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); + + proxy->ioeventfd_disabled = disabled; +} + #define QEMU_VIRTIO_PCI_QUEUE_MEM_MULT 0x1000 -static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy, - int n, bool assign, bool set_handler) +static int virtio_pci_ioeventfd_assign(DeviceState *d, EventNotifier *notifier, + int n, bool assign) { + VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); VirtQueue *vq = virtio_get_queue(vdev, n); - EventNotifier *notifier = virtio_queue_get_host_notifier(vq); bool legacy = !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_LEGACY); bool modern = !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_MODERN); bool fast_mmio = kvm_ioeventfd_any_length_enabled(); @@ -280,16 +310,8 @@ static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy, hwaddr modern_addr = QEMU_VIRTIO_PCI_QUEUE_MEM_MULT * virtio_get_queue_index(vq); hwaddr legacy_addr = VIRTIO_PCI_QUEUE_NOTIFY; - int r = 0; if (assign) { - r = event_notifier_init(notifier, 1); - if (r < 0) { - error_report("%s: unable to init event notifier: %d", - __func__, r); - return r; - } - virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler); if (modern) { if (fast_mmio) { memory_region_add_eventfd(modern_mr, modern_addr, 0, @@ -325,68 +347,18 @@ static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy, memory_region_del_eventfd(legacy_mr, legacy_addr, 2, true, n, notifier); } - virtio_queue_set_host_notifier_fd_handler(vq, false, false); - event_notifier_cleanup(notifier); } - return r; + return 0; } static void virtio_pci_start_ioeventfd(VirtIOPCIProxy *proxy) { - VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); - int n, r; - - if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) || - 
proxy->ioeventfd_disabled || - proxy->ioeventfd_started) { - return; - } - - for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - - r = virtio_pci_set_host_notifier_internal(proxy, n, true, true); - if (r < 0) { - goto assign_error; - } - } - proxy->ioeventfd_started = true; - return; - -assign_error: - while (--n >= 0) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - - r = virtio_pci_set_host_notifier_internal(proxy, n, false, false); - assert(r >= 0); - } - proxy->ioeventfd_started = false; - error_report("%s: failed. Fallback to a userspace (slower).", __func__); + virtio_bus_start_ioeventfd(&proxy->bus); } static void virtio_pci_stop_ioeventfd(VirtIOPCIProxy *proxy) { - VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); - int r; - int n; - - if (!proxy->ioeventfd_started) { - return; - } - - for (n = 0; n < VIRTIO_QUEUE_MAX; n++) { - if (!virtio_queue_get_num(vdev, n)) { - continue; - } - - r = virtio_pci_set_host_notifier_internal(proxy, n, false, false); - assert(r >= 0); - } - proxy->ioeventfd_started = false; + virtio_bus_stop_ioeventfd(&proxy->bus); } static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) @@ -1110,24 +1082,6 @@ assign_error: return r; } -static int virtio_pci_set_host_notifier(DeviceState *d, int n, bool assign) -{ - VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); - - /* Stop using ioeventfd for virtqueue kick if the device starts using host - * notifiers. This makes it easy to avoid stepping on each others' toes. - */ - proxy->ioeventfd_disabled = assign; - if (assign) { - virtio_pci_stop_ioeventfd(proxy); - } - /* We don't need to start here: it's not needed because backend - * currently only stops on status change away from ok, - * reset, vmstop and such. If we do add code to start here, - * need to check vmstate, device state etc. */ - return virtio_pci_set_host_notifier_internal(proxy, n, assign, false); -} - static void virtio_pci_vmstate_change(DeviceState *d, bool running) { VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d); @@ -2488,12 +2442,16 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, void *data) k->load_extra_state = virtio_pci_load_extra_state; k->has_extra_state = virtio_pci_has_extra_state; k->query_guest_notifiers = virtio_pci_query_guest_notifiers; - k->set_host_notifier = virtio_pci_set_host_notifier; k->set_guest_notifiers = virtio_pci_set_guest_notifiers; k->vmstate_change = virtio_pci_vmstate_change; k->device_plugged = virtio_pci_device_plugged; k->device_unplugged = virtio_pci_device_unplugged; k->query_nvectors = virtio_pci_query_nvectors; + k->ioeventfd_started = virtio_pci_ioeventfd_started; + k->ioeventfd_set_started = virtio_pci_ioeventfd_set_started; + k->ioeventfd_disabled = virtio_pci_ioeventfd_disabled; + k->ioeventfd_set_disabled = virtio_pci_ioeventfd_set_disabled; + k->ioeventfd_assign = virtio_pci_ioeventfd_assign; } static const TypeInfo virtio_pci_bus_info = { diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h index a0c4a336f2..da4ef7fbd3 100644 --- a/include/hw/acpi/acpi_dev_interface.h +++ b/include/hw/acpi/acpi_dev_interface.h @@ -3,6 +3,7 @@ #include "qom/object.h" #include "qapi-types.h" +#include "hw/boards.h" /* These values are part of guest ABI, and can not be changed */ typedef enum { @@ -37,6 +38,10 @@ void acpi_send_event(DeviceState *dev, AcpiEventStatusBits event); * ospm_status: returns status of ACPI device objects, reported * via _OST method if device supports it. 
* send_event: inject a specified event into guest + * madt_cpu: fills @entry with Interrupt Controller Structure + * for CPU indexed by @uid in @apic_ids array, + * returned structure types are: + * 0 - Local APIC, 9 - Local x2APIC, 0xB - GICC * * Interface is designed for providing unified interface * to generic ACPI functionality that could be used without @@ -50,5 +55,7 @@ typedef struct AcpiDeviceIfClass { /* <public> */ void (*ospm_status)(AcpiDeviceIf *adev, ACPIOSTInfoList ***list); void (*send_event)(AcpiDeviceIf *adev, AcpiEventStatusBits ev); + void (*madt_cpu)(AcpiDeviceIf *adev, int uid, + CPUArchIdList *apic_ids, GArray *entry); } AcpiDeviceIfClass; #endif diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index 10c09ca29f..e7a1a4cefd 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -277,6 +277,8 @@ Aml *aml_call1(const char *method, Aml *arg1); Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2); Aml *aml_call3(const char *method, Aml *arg1, Aml *arg2, Aml *arg3); Aml *aml_call4(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4); +Aml *aml_call5(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4, + Aml *arg5); Aml *aml_gpio_int(AmlConsumerAndProducer con_and_pro, AmlLevelAndEdge edge_level, AmlActiveHighAndLow active_level, AmlShared shared, @@ -363,6 +365,7 @@ Aml *aml_refof(Aml *arg); Aml *aml_derefof(Aml *arg); Aml *aml_sizeof(Aml *arg); Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target); +Aml *aml_object_type(Aml *object); void build_header(BIOSLinker *linker, GArray *table_data, diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h new file mode 100644 index 0000000000..89ce172941 --- /dev/null +++ b/include/hw/acpi/cpu.h @@ -0,0 +1,67 @@ +/* + * QEMU ACPI hotplug utilities + * + * Copyright (C) 2016 Red Hat Inc + * + * Authors: + * Igor Mammedov <imammedo@redhat.com> + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ +#ifndef ACPI_CPU_H +#define ACPI_CPU_H + +#include "hw/qdev-core.h" +#include "hw/acpi/acpi.h" +#include "hw/acpi/aml-build.h" +#include "hw/hotplug.h" + +typedef struct AcpiCpuStatus { + struct CPUState *cpu; + uint64_t arch_id; + bool is_inserting; + bool is_removing; + uint32_t ost_event; + uint32_t ost_status; +} AcpiCpuStatus; + +typedef struct CPUHotplugState { + MemoryRegion ctrl_reg; + uint32_t selector; + uint8_t command; + uint32_t dev_count; + AcpiCpuStatus *devs; +} CPUHotplugState; + +void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev, + CPUHotplugState *cpu_st, DeviceState *dev, Error **errp); + +void acpi_cpu_unplug_request_cb(HotplugHandler *hotplug_dev, + CPUHotplugState *cpu_st, + DeviceState *dev, Error **errp); + +void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st, + DeviceState *dev, Error **errp); + +void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner, + CPUHotplugState *state, hwaddr base_addr); + +typedef struct CPUHotplugFeatures { + bool apci_1_compatible; + bool has_legacy_cphp; +} CPUHotplugFeatures; + +void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts, + hwaddr io_base, + const char *res_root, + const char *event_handler_method); + +void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list); + +extern const VMStateDescription vmstate_cpu_hotplug; +#define VMSTATE_CPU_HOTPLUG(cpuhp, state) \ + VMSTATE_STRUCT(cpuhp, state, 1, \ + vmstate_cpu_hotplug, CPUHotplugState) + +#endif diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h index 6fef67ec14..b995ef2ebd 100644 --- a/include/hw/acpi/cpu_hotplug.h +++ b/include/hw/acpi/cpu_hotplug.h @@ -16,8 +16,10 @@ #include "hw/acpi/pc-hotplug.h" #include "hw/acpi/aml-build.h" #include "hw/hotplug.h" +#include "hw/acpi/cpu.h" typedef struct AcpiCpuHotplug { + Object *device; MemoryRegion io; uint8_t sts[ACPI_GPE_PROC_LEN]; } AcpiCpuHotplug; @@ -28,6 +30,10 @@ void legacy_acpi_cpu_plug_cb(HotplugHandler *hotplug_dev, void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner, AcpiCpuHotplug *gpe_cpu, uint16_t base); +void acpi_switch_to_modern_cphp(AcpiCpuHotplug *gpe_cpu, + CPUHotplugState *cpuhp_state, + uint16_t io_port); + void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine, uint16_t io_base); #endif diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h index bbd657c59b..a352c94fde 100644 --- a/include/hw/acpi/ich9.h +++ b/include/hw/acpi/ich9.h @@ -23,6 +23,7 @@ #include "hw/acpi/acpi.h" #include "hw/acpi/cpu_hotplug.h" +#include "hw/acpi/cpu.h" #include "hw/acpi/memory_hotplug.h" #include "hw/acpi/acpi_dev_interface.h" #include "hw/acpi/tco.h" @@ -48,7 +49,9 @@ typedef struct ICH9LPCPMRegs { uint32_t pm_io_base; Notifier powerdown_notifier; + bool cpu_hotplug_legacy; AcpiCpuHotplug gpe_cpu; + CPUHotplugState cpuhp_state; MemHotplugState acpi_memory_hotplug; diff --git a/include/hw/acpi/ipmi.h b/include/hw/acpi/ipmi.h new file mode 100644 index 0000000000..ab2bb29048 --- /dev/null +++ b/include/hw/acpi/ipmi.h @@ -0,0 +1,22 @@ +/* + * QEMU IPMI ACPI handling + * + * Copyright (c) 2015,2016 Corey Minyard <cminyard@mvista.com> + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#ifndef HW_ACPI_IPMI_H +#define HW_ACPI_IPMI_H + +#include "qemu/osdep.h" +#include "hw/acpi/aml-build.h" + +/* + * Add ACPI IPMI entries for all registered IPMI devices whose parent + * bus matches the given bus. 
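For illustration: the cpu.h, cpu_hotplug.h and ich9.h hunks above give a PM device a second, register-block based CPU hotplug path next to the legacy GPE bitmap, selected per machine via the new 'cpu-hotplug-legacy' property. A hedged sketch of how a hotplug handler might route plug and unplug-request events; MyPMState and the example_* names are hypothetical (modelled loosely on the ICH9LPCPMRegs fields above), and only acpi_cpu_plug_cb()/acpi_cpu_unplug_request_cb() are taken from the header in this patch.

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "hw/acpi/cpu.h"

typedef struct MyPMState {
    bool cpu_hotplug_legacy;         /* mirrors the new ICH9/PIIX4 property */
    CPUHotplugState cpuhp_state;
} MyPMState;

static void example_cpu_plug(MyPMState *pm, HotplugHandler *hotplug_dev,
                             DeviceState *dev, Error **errp)
{
    if (pm->cpu_hotplug_legacy) {
        /* A board still on the GPE-bitmap interface would call
         * legacy_acpi_cpu_plug_cb() here instead. */
        return;
    }
    acpi_cpu_plug_cb(hotplug_dev, &pm->cpuhp_state, dev, errp);
}

static void example_cpu_unplug_request(MyPMState *pm,
                                       HotplugHandler *hotplug_dev,
                                       DeviceState *dev, Error **errp)
{
    if (pm->cpu_hotplug_legacy) {
        error_setg(errp, "CPU hot unplug requires the modern interface");
        return;
    }
    acpi_cpu_unplug_request_cb(hotplug_dev, &pm->cpuhp_state, dev, errp);
}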
The resource is the ACPI resource that + * contains the IPMI device, this is required for the I2C CRS. + */ +void build_acpi_ipmi_devices(Aml *table, BusState *bus); + +#endif /* HW_ACPI_IPMI_H */ diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 49566c89d4..948ed0c277 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -17,6 +17,7 @@ #include "hw/compat.h" #include "hw/mem/pc-dimm.h" #include "hw/mem/nvdimm.h" +#include "hw/acpi/acpi_dev_interface.h" #define HPET_INTCAP "hpet-intcap" @@ -71,7 +72,6 @@ struct PCMachineState { /* NUMA information: */ uint64_t numa_nodes; uint64_t *node_mem; - uint64_t *node_cpu; }; #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device" @@ -136,6 +136,8 @@ struct PCMachineClass { /* TSC rate migration: */ bool save_tsc_khz; + /* generate legacy CPU hotplug AML */ + bool legacy_cpu_hotplug; }; #define TYPE_PC_MACHINE "generic-pc-machine" @@ -345,6 +347,10 @@ void pc_system_firmware_init(MemoryRegion *rom_memory, /* pvpanic.c */ uint16_t pvpanic_port(void); +/* acpi-build.c */ +void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid, + CPUArchIdList *apic_ids, GArray *entry); + /* e820 types */ #define E820_RAM 1 #define E820_RESERVED 2 diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h index 60ee92b85a..1cfe9e01c4 100644 --- a/include/hw/mem/nvdimm.h +++ b/include/hw/mem/nvdimm.h @@ -34,7 +34,60 @@ } \ } while (0) -#define TYPE_NVDIMM "nvdimm" +/* + * The minimum label data size is required by NVDIMM Namespace + * specification, see the chapter 2 Namespaces: + * "NVDIMMs following the NVDIMM Block Mode Specification use an area + * at least 128KB in size, which holds around 1000 labels." + */ +#define MIN_NAMESPACE_LABEL_SIZE (128UL << 10) + +#define TYPE_NVDIMM "nvdimm" +#define NVDIMM(obj) OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM) +#define NVDIMM_CLASS(oc) OBJECT_CLASS_CHECK(NVDIMMClass, (oc), TYPE_NVDIMM) +#define NVDIMM_GET_CLASS(obj) OBJECT_GET_CLASS(NVDIMMClass, (obj), \ + TYPE_NVDIMM) +struct NVDIMMDevice { + /* private */ + PCDIMMDevice parent_obj; + + /* public */ + + /* + * the size of label data in NVDIMM device which is presented to + * guest via __DSM "Get Namespace Label Size" function. + */ + uint64_t label_size; + + /* + * the address of label data which is read by __DSM "Get Namespace + * Label Data" function and written by __DSM "Set Namespace Label + * Data" function. + */ + void *label_data; + + /* + * it's the PMEM region in NVDIMM device, which is presented to + * guest via ACPI NFIT and _FIT method if NVDIMM hotplug is supported. + */ + MemoryRegion nvdimm_mr; +}; +typedef struct NVDIMMDevice NVDIMMDevice; + +struct NVDIMMClass { + /* private */ + PCDIMMDeviceClass parent_class; + + /* public */ + + /* read @size bytes from NVDIMM label data at @offset into @buf. */ + void (*read_label_data)(NVDIMMDevice *nvdimm, void *buf, + uint64_t size, uint64_t offset); + /* write @size bytes from @buf to NVDIMM label data at @offset. */ + void (*write_label_data)(NVDIMMDevice *nvdimm, const void *buf, + uint64_t size, uint64_t offset); +}; +typedef struct NVDIMMClass NVDIMMClass; #define NVDIMM_DSM_MEM_FILE "etc/acpi/nvdimm-mem" diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h index 67e92d8f7b..1e483f2670 100644 --- a/include/hw/mem/pc-dimm.h +++ b/include/hw/mem/pc-dimm.h @@ -61,7 +61,9 @@ typedef struct PCDIMMDevice { * @realize: called after common dimm is realized so that the dimm based * devices get the chance to do specified operations. 
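For illustration: the NVDIMMClass hooks above abstract where namespace label storage lives. As a rough sketch (not the in-tree implementation), a backend that keeps labels in a plain memory buffer could satisfy them as below; the example_* names are made up, while the field names, the class macros and the hook signatures come from the nvdimm.h hunk above.

#include "qemu/osdep.h"
#include "hw/mem/nvdimm.h"

static void example_read_label_data(NVDIMMDevice *nvdimm, void *buf,
                                    uint64_t size, uint64_t offset)
{
    assert(offset + size <= nvdimm->label_size);
    memcpy(buf, (uint8_t *)nvdimm->label_data + offset, size);
}

static void example_write_label_data(NVDIMMDevice *nvdimm, const void *buf,
                                     uint64_t size, uint64_t offset)
{
    assert(offset + size <= nvdimm->label_size);
    memcpy((uint8_t *)nvdimm->label_data + offset, buf, size);
    /* A real backend must also persist the update to its backing store. */
}

/* ACPI _DSM handling is expected to reach these through the class:
 *     NVDIMMClass *nvc = NVDIMM_GET_CLASS(nvdimm);
 *     nvc->read_label_data(nvdimm, buf, size, offset);
 */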
* @get_memory_region: returns #MemoryRegion associated with @dimm which - * is directly mapped into the physical address space of guest + * is directly mapped into the physical address space of guest. + * @get_vmstate_memory_region: returns #MemoryRegion which indicates the + * memory of @dimm should be kept during live migration. */ typedef struct PCDIMMDeviceClass { /* private */ @@ -70,6 +72,7 @@ typedef struct PCDIMMDeviceClass { /* public */ void (*realize)(PCDIMMDevice *dimm, Error **errp); MemoryRegion *(*get_memory_region)(PCDIMMDevice *dimm); + MemoryRegion *(*get_vmstate_memory_region)(PCDIMMDevice *dimm); } PCDIMMDeviceClass; /** diff --git a/include/hw/smbios/ipmi.h b/include/hw/smbios/ipmi.h new file mode 100644 index 0000000000..1c9aae38f2 --- /dev/null +++ b/include/hw/smbios/ipmi.h @@ -0,0 +1,15 @@ +/* + * IPMI SMBIOS firmware handling + * + * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef QEMU_SMBIOS_IPMI_H +#define QEMU_SMBIOS_IPMI_H + +void smbios_build_type_38_table(void); + +#endif /* QEMU_SMBIOS_IPMI_H */ diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h index 3f2c1363d0..f3e5ef3f5b 100644 --- a/include/hw/virtio/virtio-bus.h +++ b/include/hw/virtio/virtio-bus.h @@ -52,7 +52,6 @@ typedef struct VirtioBusClass { bool (*has_extra_state)(DeviceState *d); bool (*query_guest_notifiers)(DeviceState *d); int (*set_guest_notifiers)(DeviceState *d, int nvqs, bool assign); - int (*set_host_notifier)(DeviceState *d, int n, bool assigned); void (*vmstate_change)(DeviceState *d, bool running); /* * transport independent init function. @@ -71,6 +70,29 @@ typedef struct VirtioBusClass { void (*device_unplugged)(DeviceState *d); int (*query_nvectors)(DeviceState *d); /* + * ioeventfd handling: if the transport implements ioeventfd_started, + * it must implement the other ioeventfd callbacks as well + */ + /* Returns true if the ioeventfd has been started for the device. */ + bool (*ioeventfd_started)(DeviceState *d); + /* + * Sets the 'ioeventfd started' state after the ioeventfd has been + * started/stopped for the device. err signifies whether an error + * had occurred. + */ + void (*ioeventfd_set_started)(DeviceState *d, bool started, bool err); + /* Returns true if the ioeventfd has been disabled for the device. */ + bool (*ioeventfd_disabled)(DeviceState *d); + /* Sets the 'ioeventfd disabled' state for the device. */ + void (*ioeventfd_set_disabled)(DeviceState *d, bool disabled); + /* + * Assigns/deassigns the ioeventfd backing for the transport on + * the device for queue number n. Returns an error value on + * failure. + */ + int (*ioeventfd_assign)(DeviceState *d, EventNotifier *notifier, + int n, bool assign); + /* * Does the transport have variable vring alignment? * (ie can it ever call virtio_queue_set_align()?) * Note that changing this will break migration for this transport. @@ -111,4 +133,11 @@ static inline VirtIODevice *virtio_bus_get_device(VirtioBusState *bus) return (VirtIODevice *)qdev; } +/* Start the ioeventfd. */ +void virtio_bus_start_ioeventfd(VirtioBusState *bus); +/* Stop the ioeventfd. 
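For illustration: the new get_vmstate_memory_region callback (declared just below) lets a DIMM subclass say which region should travel in live migration, independently of what get_memory_region() maps for the guest. A hedged sketch of how plug code might consume it; the function name is illustrative, and vmstate_register_ram() is assumed to be the usual migration helper.

#include "qemu/osdep.h"
#include "hw/mem/pc-dimm.h"
#include "migration/vmstate.h"

static void example_register_dimm_for_migration(PCDIMMDevice *dimm,
                                                DeviceState *dev)
{
    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
    /* Whatever the subclass reports here is what gets migrated; it may
     * differ from the guest-visible mapping (e.g. NVDIMM label data). */
    MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm);

    vmstate_register_ram(vmstate_mr, dev);
}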
*/ +void virtio_bus_stop_ioeventfd(VirtioBusState *bus); +/* Switch from/to the generic ioeventfd handler */ +int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign); + #endif /* VIRTIO_BUS_H */ diff --git a/qapi-schema.json b/qapi-schema.json index 0964eece6d..84b6708125 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -4079,8 +4079,9 @@ ## @ACPISlotType # # @DIMM: memory slot +# @CPU: logical CPU slot (since 2.7) # -{ 'enum': 'ACPISlotType', 'data': [ 'DIMM' ] } +{ 'enum': 'ACPISlotType', 'data': [ 'DIMM', 'CPU' ] } ## @ACPIOSTInfo # diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 4b258a6731..7cdcad4fdb 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -41,3 +41,6 @@ stub-obj-y += target-monitor-defs.o stub-obj-y += target-get-monitor-def.o stub-obj-y += vhost.o stub-obj-y += iohandler.o +stub-obj-y += smbios_type_38.o +stub-obj-y += ipmi.o +stub-obj-y += pc_madt_cpu_entry.o diff --git a/stubs/ipmi.c b/stubs/ipmi.c new file mode 100644 index 0000000000..98b6dcee0d --- /dev/null +++ b/stubs/ipmi.c @@ -0,0 +1,14 @@ +/* + * IPMI ACPI firmware handling + * + * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "hw/acpi/ipmi.h" + +void build_acpi_ipmi_devices(Aml *table, BusState *bus) +{ +} diff --git a/stubs/pc_madt_cpu_entry.c b/stubs/pc_madt_cpu_entry.c new file mode 100644 index 0000000000..427e772868 --- /dev/null +++ b/stubs/pc_madt_cpu_entry.c @@ -0,0 +1,7 @@ +#include "qemu/osdep.h" +#include "hw/i386/pc.h" + +void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid, + CPUArchIdList *apic_ids, GArray *entry) +{ +} diff --git a/stubs/smbios_type_38.c b/stubs/smbios_type_38.c new file mode 100644 index 0000000000..9528c2c28e --- /dev/null +++ b/stubs/smbios_type_38.c @@ -0,0 +1,14 @@ +/* + * IPMI SMBIOS firmware handling + * + * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
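For illustration: with the per-transport set_host_notifier callback removed, backends such as vhost or dataplane are expected to switch notifier ownership through virtio_bus_set_host_notifier() declared above. A minimal sketch, assuming the backend knows its VirtioBusState and queue count; the helper name is illustrative and error handling is simplified.

#include "qemu/osdep.h"
#include "hw/virtio/virtio-bus.h"

static int example_backend_claim_notifiers(VirtioBusState *bus, int nvqs)
{
    int i, r;

    for (i = 0; i < nvqs; i++) {
        r = virtio_bus_set_host_notifier(bus, i, true);
        if (r < 0) {
            goto rollback;
        }
    }
    return 0;

rollback:
    /* give back the notifiers we already claimed */
    while (--i >= 0) {
        virtio_bus_set_host_notifier(bus, i, false);
    }
    return r;
}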
+ */ + +#include "hw/smbios/ipmi.h" + +void smbios_build_type_38_table(void) +{ +} diff --git a/tests/acpi-test-data/pc/DSDT b/tests/acpi-test-data/pc/DSDT Binary files differindex 8b4f1a09b8..8053d71105 100644 --- a/tests/acpi-test-data/pc/DSDT +++ b/tests/acpi-test-data/pc/DSDT diff --git a/tests/acpi-test-data/pc/DSDT.bridge b/tests/acpi-test-data/pc/DSDT.bridge Binary files differindex 0d09b5cc61..850e71a973 100644 --- a/tests/acpi-test-data/pc/DSDT.bridge +++ b/tests/acpi-test-data/pc/DSDT.bridge diff --git a/tests/acpi-test-data/pc/DSDT.ipmikcs b/tests/acpi-test-data/pc/DSDT.ipmikcs Binary files differnew file mode 100644 index 0000000000..8ac48afb6a --- /dev/null +++ b/tests/acpi-test-data/pc/DSDT.ipmikcs diff --git a/tests/acpi-test-data/q35/DSDT b/tests/acpi-test-data/q35/DSDT Binary files differindex 67445428d9..58fbb3d2e2 100644 --- a/tests/acpi-test-data/q35/DSDT +++ b/tests/acpi-test-data/q35/DSDT diff --git a/tests/acpi-test-data/q35/DSDT.bridge b/tests/acpi-test-data/q35/DSDT.bridge Binary files differindex e85f5b1af9..c392802a95 100644 --- a/tests/acpi-test-data/q35/DSDT.bridge +++ b/tests/acpi-test-data/q35/DSDT.bridge diff --git a/tests/acpi-test-data/q35/DSDT.ipmibt b/tests/acpi-test-data/q35/DSDT.ipmibt Binary files differnew file mode 100644 index 0000000000..0ea38e1e72 --- /dev/null +++ b/tests/acpi-test-data/q35/DSDT.ipmibt diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c index 16d11aa854..92c90dd194 100644 --- a/tests/bios-tables-test.c +++ b/tests/bios-tables-test.c @@ -49,6 +49,8 @@ typedef struct { GArray *tables; uint32_t smbios_ep_addr; struct smbios_21_entry_point smbios_ep_table; + uint8_t *required_struct_types; + int required_struct_types_len; } test_data; #define ACPI_READ_FIELD(field, addr) \ @@ -334,7 +336,7 @@ static void test_acpi_tables(test_data *data) for (i = 0; i < tables_nr; i++) { AcpiSdtTable ssdt_table; - memset(&ssdt_table, 0 , sizeof(ssdt_table)); + memset(&ssdt_table, 0, sizeof(ssdt_table)); uint32_t addr = data->rsdt_tables_addr[i + 1]; /* fadt is first */ test_dst_table(&ssdt_table, addr); g_array_append_val(data->tables, ssdt_table); @@ -661,7 +663,6 @@ static void test_smbios_structs(test_data *data) uint32_t addr = ep_table->structure_table_address; int i, len, max_len = 0; uint8_t type, prv, crt; - uint8_t required_struct_types[] = {0, 1, 3, 4, 16, 17, 19, 32, 127}; /* walk the smbios tables */ for (i = 0; i < ep_table->number_of_structures; i++) { @@ -701,8 +702,8 @@ static void test_smbios_structs(test_data *data) g_assert_cmpuint(ep_table->max_structure_size, ==, max_len); /* required struct types must all be present */ - for (i = 0; i < ARRAY_SIZE(required_struct_types); i++) { - g_assert(test_bit(required_struct_types[i], struct_bitmap)); + for (i = 0; i < data->required_struct_types_len; i++) { + g_assert(test_bit(data->required_struct_types[i], struct_bitmap)); } } @@ -742,6 +743,10 @@ static void test_acpi_one(const char *params, test_data *data) g_free(args); } +static uint8_t base_required_struct_types[] = { + 0, 1, 3, 4, 16, 17, 19, 32, 127 +}; + static void test_acpi_piix4_tcg(void) { test_data data; @@ -751,6 +756,8 @@ static void test_acpi_piix4_tcg(void) */ memset(&data, 0, sizeof(data)); data.machine = MACHINE_PC; + data.required_struct_types = base_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types); test_acpi_one("-machine accel=tcg", &data); free_test_data(&data); } @@ -762,6 +769,8 @@ static void test_acpi_piix4_tcg_bridge(void) memset(&data, 0, 
sizeof(data)); data.machine = MACHINE_PC; data.variant = ".bridge"; + data.required_struct_types = base_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types); test_acpi_one("-machine accel=tcg -device pci-bridge,chassis_nr=1", &data); free_test_data(&data); } @@ -772,6 +781,8 @@ static void test_acpi_q35_tcg(void) memset(&data, 0, sizeof(data)); data.machine = MACHINE_Q35; + data.required_struct_types = base_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types); test_acpi_one("-machine q35,accel=tcg", &data); free_test_data(&data); } @@ -783,11 +794,50 @@ static void test_acpi_q35_tcg_bridge(void) memset(&data, 0, sizeof(data)); data.machine = MACHINE_Q35; data.variant = ".bridge"; + data.required_struct_types = base_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types); test_acpi_one("-machine q35,accel=tcg -device pci-bridge,chassis_nr=1", &data); free_test_data(&data); } +static uint8_t ipmi_required_struct_types[] = { + 0, 1, 3, 4, 16, 17, 19, 32, 38, 127 +}; + +static void test_acpi_q35_tcg_ipmi(void) +{ + test_data data; + + memset(&data, 0, sizeof(data)); + data.machine = MACHINE_Q35; + data.variant = ".ipmibt"; + data.required_struct_types = ipmi_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(ipmi_required_struct_types); + test_acpi_one("-machine q35,accel=tcg -device ipmi-bmc-sim,id=bmc0" + " -device isa-ipmi-bt,bmc=bmc0", + &data); + free_test_data(&data); +} + +static void test_acpi_piix4_tcg_ipmi(void) +{ + test_data data; + + /* Supplying -machine accel argument overrides the default (qtest). + * This is to make guest actually run. + */ + memset(&data, 0, sizeof(data)); + data.machine = MACHINE_PC; + data.variant = ".ipmikcs"; + data.required_struct_types = ipmi_required_struct_types; + data.required_struct_types_len = ARRAY_SIZE(ipmi_required_struct_types); + test_acpi_one("-machine accel=tcg -device ipmi-bmc-sim,id=bmc0" + " -device isa-ipmi-kcs,irq=0,bmc=bmc0", + &data); + free_test_data(&data); +} + int main(int argc, char *argv[]) { const char *arch = qtest_get_arch(); @@ -804,6 +854,8 @@ int main(int argc, char *argv[]) qtest_add_func("acpi/piix4/tcg/bridge", test_acpi_piix4_tcg_bridge); qtest_add_func("acpi/q35/tcg", test_acpi_q35_tcg); qtest_add_func("acpi/q35/tcg/bridge", test_acpi_q35_tcg_bridge); + qtest_add_func("acpi/piix4/tcg/ipmi", test_acpi_piix4_tcg_ipmi); + qtest_add_func("acpi/q35/tcg/ipmi", test_acpi_q35_tcg_ipmi); } ret = g_test_run(); boot_sector_cleanup(disk); |
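For illustration: the bios-tables-test changes above turn the required SMBIOS type list into per-test data so the new IPMI variants can demand type 38, and each variant diffs its tables against expected blobs named like tests/acpi-test-data/pc/DSDT.ipmikcs. Adding a further variant inside tests/bios-tables-test.c would follow the same pattern; in the sketch below the ".example" variant and the device on the command line are hypothetical, and matching expected table blobs would still have to be added under tests/acpi-test-data/.

static void test_acpi_q35_tcg_example(void)
{
    test_data data;

    memset(&data, 0, sizeof(data));
    data.machine = MACHINE_Q35;
    data.variant = ".example";                      /* hypothetical variant */
    data.required_struct_types = base_required_struct_types;
    data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
    test_acpi_one("-machine q35,accel=tcg -device example-acpi-device",
                  &data);
    free_test_data(&data);
}

/* ...registered next to the other cases in main():
 *     qtest_add_func("acpi/q35/tcg/example", test_acpi_q35_tcg_example);
 */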