aboutsummaryrefslogtreecommitdiff
path: root/hw/i386
AgeCommit message (Collapse)Author
2018-11-20hw/i386: add pc-i440fx-3.1 & pc-q35-3.1Marc-André Lureau
We have a couple of PC_COMPAT_3_0, so we should have 3.1 PC machines, and update the 3.0 machines to make use of those. Fixes a "Known issue" from https://wiki.qemu.org/Planning/3.1. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20181120132604.22854-1-marcandre.lureau@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-11-05x86_iommu/amd: Enable Guest virtual APIC supportSingh, Brijesh
Now that amd-iommu support interrupt remapping, enable the GASup in IVRS table and GASup in extended feature register to indicate that IOMMU support guest virtual APIC mode. GASup provides option to guest OS to make use of 128-bit IRTE. Note that the GAMSup is set to zero to indicate that amd-iommu does not support guest virtual APIC mode (aka AVIC) which would be used for the nested VMs. See Table 21 from IOMMU spec for interrupt virtualization controls Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is enabled. For more information refer: IOMMU spec rev 3.0 (section 2.2.5.2) When VAPIC is enabled, it uses interrupt remapping as defined in Table 22 and Figure 17 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05i386: acpi: add IVHD device entry for IOAPICSingh, Brijesh
When interrupt remapping is enabled, add a special IVHD device (type IOAPIC). Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is not enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is not enabled. For more info Refer: AMD IOMMU spec Rev 3.0 - section 2.2.5.1 When VAPIC is not enabled, it uses interrupt remapping as defined in Table 20 and Figure 15 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Prepare for interrupt remap supportSingh, Brijesh
Register the interrupt remapping callback and read/write ops for the amd-iommu-ir memory region. amd-iommu-ir is set to higher priority to ensure that this region won't be masked out by other memory regions. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: make the address space naming consistent with intel-iommuSingh, Brijesh
To be consistent with intel-iommu: - rename the address space to use '_' instead of '-' - update the memory region relationships Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: remove V=1 check from amdvi_validate_dte()Singh, Brijesh
Currently, the amdvi_validate_dte() assumes that a valid DTE will always have V=1. This is not true. The V=1 means that bit[127:1] are valid. A valid DTE can have IV=1 and V=0 (i.e address translation disabled and interrupt remapping enabled) Remove the V=1 check from amdvi_validate_dte(), make the caller responsible to check for V or IV bits. This also fixes a bug in existing code that when error is detected during the translation we'll fail the translation instead of assuming a passthrough mode. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move vtd_generate_msi_message in common fileSingh, Brijesh
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI Message from IRQ. A similar function will be needed when we add interrupt remapping support in amd-iommu. Moving the function in common file to avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message(). There is no logic changes in the code flow. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move the kernel-irqchip check in common codeSingh, Brijesh
Interrupt remapping needs kernel-irqchip={off|split} on both Intel and AMD platforms. Move the check in common place. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: handle invalid ce for shadow syncPeter Xu
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing shadow page tables. Having invalid context entry there is perfectly valid when we move a device out of an existing domain. When that happens, instead of posting an error we invalidate the whole region. Without this patch, QEMU will crash if we do these steps: (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe) (2) bind the NICs with vfio-pci in the guest (3) start testpmd with the NICs applied (4) stop testpmd (5) rebind the NIC back to ixgbe kernel driver The patch should fix it. Reported-by: Pei Zhang <pezhang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272 Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: move ce fetching out when sync shadowPeter Xu
There are two callers for vtd_sync_shadow_page_table_range(): one provided a valid context entry and one not. Move that fetching operation into the caller vtd_sync_shadow_page_table() where we need to fetch the context entry. Meanwhile, remove the error_report_once() directly since we're already tracing all the error cases in the previous call. Instead, return error number back to caller. This will not change anything functional since callers are dropping it after all. We do this move majorly because we want to do something more later in vtd_sync_shadow_page_table(). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: better handling of dmar state switchPeter Xu
QEMU is not handling the global DMAR switch well, especially when from "on" to "off". Let's first take the example of system reset. Assuming that a guest has IOMMU enabled. When it reboots, we will drop all the existing DMAR mappings to handle the system reset, however we'll still keep the existing memory layouts which has the IOMMU memory region enabled. So after the reboot and before the kernel reloads again, there will be no mapping at all for the host device. That's problematic since any software (for example, SeaBIOS) that runs earlier than the kernel after the reboot will assume the IOMMU is disabled, so any DMA from the software will fail. For example, a guest that boots on an assigned NVMe device might fail to find the boot device after a system reboot/reset and we'll be able to observe SeaBIOS errors if we capture the debugging log: WARNING - Timeout at nvme_wait:144! Meanwhile, we should see DMAR errors on the host of that NVMe device. It's the DMA fault that caused a NVMe driver timeout. The correct fix should be that we do proper switching of device DMA address spaces when system resets, which will setup correct memory regions and notify the backend of the devices. This might not affect much on non-assigned devices since QEMU VT-d emulation will assume a default passthrough mapping if DMAR is not enabled in the GCMD register (please refer to vtd_iommu_translate). However that's required for an assigned devices, since that'll rebuild the correct GPA to HPA mapping that is needed for any DMA operation during guest bootstrap. Besides the system reset, we have some other places that might change the global DMAR status and we'd better do the same thing there. For example, when we change the state of GCMD register, or the DMAR root pointer. Do the same refresh for all these places. For these two places we'll also need to explicitly invalidate the context entry cache and iotlb cache. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173 CC: QEMU Stable <qemu-stable@nongnu.org> Reported-by: Cong Li <coli@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> -- v2: - do the same for GCMD write, or root pointer update [Alex] - test is carried out by me this time, by observing the vtd_switch_address_space tracepoint after system reboot v3: - rewrite commit message as suggested by Alex Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: introduce vtd_reset_caches()Peter Xu
Provide the function and use it in vtd_init(). Used to reset both context entry cache and iotlb cache for the whole IOMMU unit. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-24pc-dimm: pass PCDIMMDevice to pc_dimm_.*plugDavid Hildenbrand
We're plugging/unplugging a PCDIMMDevice, so directly pass this type instead of a more generic DeviceState. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20181005092024.14344-5-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-19pc: Fix machine property nvdimm-persistence error handlingMarkus Armbruster
Calling error_report() in a function that takes an Error ** argument is suspicious. pc.c's pc_machine_set_nvdimm_persistence() does that, and then exit()s. Wrong. Attempting to set machine property nvdimm-persistence to a bad value instantly kills the VM: $ qemu-system-x86_64 -nodefaults -S -display none -qmp stdio {"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 3}, "package": "v3.0.0-837-gc5e4e49258"}, "capabilities": []}} {"execute": "qmp_capabilities"} {"return": {}} {"execute": "qom-set", "arguments": {"path": "/machine", "property": "nvdimm-persistence", "value": "instadeath"}} -machine nvdimm-persistence=instadeath: unsupported option $ echo $? 1 Broken when commit 11c39b5cd96 (v3.0.0) replaced error_propagate(); return by error_report(); exit() instead of error_setg(); return. Fix that. Fixes: 11c39b5cd966ddc067a1ca0c5392ec9b666c45b7 Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181017082702.5581-10-armbru@redhat.com>
2018-10-02kvmclock: run KVM_KVMCLOCK_CTRL ioctl in vcpu threadYongji Xie
According to KVM API Documentation, we should only run vcpu ioctls from the same thread that was used to create the vcpu. This patch makes KVM_KVMCLOCK_CTRL ioctl consistent with the Documentation. No functional change. Signed-off-by: Yongji Xie <xieyongji@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Message-Id: <1531315364-2551-1-git-send-email-xieyongji@baidu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yongji Xie <elohimes@gmail.com>
2018-10-02change get_image_size return type to int64_tLi Zhijian
Previously, if the size of initrd >=2G, qemu exits with error: root@haswell-OptiPlex-9020:/home/lizj# /home/lizhijian/lkp/qemu-colo/x86_64-softmmu/qemu-system-x86_64 -kernel ./vmlinuz-4.16.0-rc4 -initrd large.cgz -nographic qemu: error reading initrd large.cgz: No such file or directory root@haswell-OptiPlex-9020:/home/lizj# du -sh large.cgz 2.5G large.cgz this patch changes the caller side that use this function to calculate size of initrd file as well. v2: update error message and int64_t printing format Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Message-Id: <1536833233-14121-1-git-send-email-lizhijian@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-25Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-09-24' ↵Peter Maydell
into staging Error reporting & miscellaneous patches for 2018-09-24 # gpg: Signature made Mon 24 Sep 2018 16:16:50 BST # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2018-09-24: MAINTAINERS: Fix F: patterns that don't match anything Drop "qemu:" prefix from error_report() arguments qemu-error: make use of {error, warn}_report_once_cond qemu-error: add {error, warn}_report_once_cond Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-09-24Drop "qemu:" prefix from error_report() argumentsMao Zhongyi
error_report and friends already add a "qemu-system-xxx" prefix to the string, so a "qemu:" prefix is redundant in the string. Just drop it. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1537495530-580-1-git-send-email-maozhongyi@cmss.chinamobile.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-09-07pc: acpi: revert back to 1 SRAT entry for hotpluggable areaIgor Mammedov
Commit 10efd7e108 "pc: acpi: fix memory hotplug regression by reducing stub SRAT entry size" attemped to fix hotplug regression introduced by 848a1cc1e "hw/acpi-build: build SRAT memory affinity structures for DIMM devices" fixed issue for Windows/3.0+ linux kernels, however it regressed 2.6 based kernels (RHEL6) to the point where guest might crash at boot. Reason is that 2.6 kernel discards SRAT table due too small last entry which down the road leads to crashes. Hack I've tried in 10efd7e108 is also not ACPI spec compliant according to which whole possible RAM should be described in SRAT. Revert 10efd7e108 to fix regression for 2.6 based kernels. With 10efd7e108 reverted, I've also tried splitting SRAT table statically in different ways %/node and %/slot but Windows still fails to online 2nd pc-dimm hot-plugged into node 0 (as described in 10efd7e108) and sometimes even coldplugged pc-dimms where affected with static SRAT partitioning. The only known so far way where Windows stays happy is when we have 1 SRAT entry in the last node covering all hotplug area. Revert 848a1cc1e until we come up with a way to avoid regression on Windows with hotplug area split in several entries. Tested this with 2.6/3.0 based kernels (RHEL6/7) and WS20[08/12/12R2/16]). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-27intel-iommu: replace more vtd_err_* tracesPeter Xu
Replace all the trace_vtd_err_*() hooks with the new error_report_once() since they are similar to trace_vtd_err() - dumping the first error would be mostly enough, then we have them on by default too. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180815095328.32414-4-peterx@redhat.com> [Use "%x" instead of "%" PRIx16 to print uint16_t, whitespace tidied up] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-27intel-iommu: start to use error_report_oncePeter Xu
Replace existing trace_vtd_err() with error_report_once() then stderr will capture something if any of the error happens, meanwhile we don't suffer from any DDOS. Then remove the trace point. Since at it, provide more information where proper (now we can pass parameters into the report function). Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180815095328.32414-3-peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> [Two format strings fixed, whitespace tidied up] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-08-23pc-dimm: assign and verify the "addr" property during pre_plugDavid Hildenbrand
We can assign and verify the address before realizing and trying to plug. reading/writing the address property should never fail for DIMMs, so let's reduce error handling a bit by using &error_abort. Getting access to the memory region now might however fail. So forward errors from get_memory_region() properly. As all memory devices should use the alignment of the underlying memory region for guest physical address asignment, do detection of the alignment in pc_dimm_pre_plug(), but allow pc.c to overwrite the alignment for compatibility handling. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180801133444.11269-5-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23pc: drop memory region alignment check for 0David Hildenbrand
All applicable memory regions always have an alignment > 0. All memory backends result in file_ram_alloc() or qemu_anon_ram_alloc() getting called, setting the alignment to > 0. So a PCDIMM memory region always has an alignment > 0. NVDIMM copy the alignment of the original memory memory region into the handcrafted memory region that will be used at this place. So the check for 0 can be dropped and we can reduce the special handling. Dropping this check makes factoring out of alignment handling easier as compat handling only has to look at pcmc->enforce_aligned_dimm and not care about the alignment of the memory region. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180801133444.11269-4-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23pc-dimm: assign and verify the "slot" property during pre_plugDavid Hildenbrand
We can assign and verify the slot before realizing and trying to plug. reading/writing the slot property should never fail, so let's reduce error handling a bit by using &error_abort. To do this during pre_plug, add and use (x86, ppc) pc_dimm_pre_plug(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180801133444.11269-2-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-03pc: acpi: fix memory hotplug regression by reducing stub SRAT entry sizeIgor Mammedov
Commit 848a1cc1e (hw/acpi-build: build SRAT memory affinity structures for DIMM devices) broke the first dimm hotplug in following cases: 1: there is no coldplugged dimm in the last numa node but there is a coldplugged dimm in another node -m 4096,slots=4,maxmem=32G \ -object memory-backend-ram,id=m0,size=2G \ -device pc-dimm,memdev=m0,node=0 \ -numa node,nodeid=0 \ -numa node,nodeid=1 2: if order of dimms on CLI is: 1st plugged dimm in node1 2nd plugged dimm in node0 -m 4096,slots=4,maxmem=32G \ -object memory-backend-ram,size=2G,id=m0 \ -device pc-dimm,memdev=m0,node=1 \ -object memory-backend-ram,id=m1,size=2G \ -device pc-dimm,memdev=m1,node=0 \ -numa node,nodeid=0 \ -numa node,nodeid=1 (qemu) object_add memory-backend-ram,id=m2,size=1G (qemu) device_add pc-dimm,memdev=m2,node=0 the first DIMM hotplug to any node except the last one fails (Windows is unable to online it). Length reduction of stub hotplug memory SRAT entry, fixes issue for some reason. RHBZ: 1609234 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-03hw/acpi-build: Add a check for memory-less NUMA nodesDou Liyang
Currently, Qemu ACPI builder doesn't consider the memory-less NUMA nodes, eg: -m 4G,slots=4,maxmem=8G \ -numa node,nodeid=0 \ -numa node,nodeid=1,mem=2G \ -numa node,nodeid=2,mem=2G \ -numa node,nodeid=3\ Guest Linux will report [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0xffffffffffffffff] [ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x00000000-0x0009ffff] [ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x00100000-0x7fffffff] [ 0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x80000000-0xbfffffff] [ 0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x100000000-0x13fffffff] [ 0.000000] ACPI: SRAT: Node 3 PXM 3 [mem 0x140000000-0x13fffffff] [ 0.000000] ACPI: SRAT: Node 3 PXM 3 [mem 0x140000000-0x33fffffff] hotplug [mem 0x00000000-0xffffffffffffffff] and [mem 0x140000000-0x13fffffff] are bogus. Add a check to avoid building srat memory for memory-less NUMA nodes, also update the test file. Now the info in guest linux will be [ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x00000000-0x0009ffff] [ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x00100000-0x7fffffff] [ 0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x80000000-0xbfffffff] [ 0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x100000000-0x13fffffff] [ 0.000000] ACPI: SRAT: Node 3 PXM 3 [mem 0x140000000-0x33fffffff] hotplug Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-07-17i386: only parse the initrd_filename once for multiboot modulesDaniel P. Berrangé
The multiboot code parses the initrd_filename twice, first to count how many entries there are, and second to process each entry. This changes the first loop to store the parse module names in a list, and the second loop can now use these names. This avoids having to pass NULL to the get_opt_value() method which means it can safely assume a non-NULL param. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-3-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-17i386: fix regression parsing multiboot initrd modulesDaniel P. Berrangé
The logic for parsing the multiboot initrd modules was messed up in commit 950c4e6c94b15cd0d8b63891dddd7a8dbf458e6a Author: Daniel P. Berrangé <berrange@redhat.com> Date: Mon Apr 16 12:17:43 2018 +0100 opts: don't silently truncate long option values Causing the length to be undercounter, and the number of modules over counted. It also passes NULL to get_opt_value() which was not robust at accepting a NULL value. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180514171913.17664-2-berrange@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-16hyperv: ensure VP index equal to QEMU cpu_indexRoman Kagan
Hyper-V identifies vCPUs by Virtual Processor (VP) index which can be queried by the guest via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a sequential number which can't exceed the maximum number of vCPUs per VM. It has to be owned by QEMU in order to preserve it across migration. However, the initial implementation in KVM didn't allow to set this msr, and KVM used its own notion of VP index. Fortunately, the way vCPUs are created in QEMU/KVM makes it likely that the KVM value is equal to QEMU cpu_index. So choose cpu_index as the value for vp_index, and push that to KVM on kernels that support setting the msr. On older ones that don't, query the kernel value and assert that it's in sync with QEMU. Besides, since handling errors from vCPU init at hotplug time is impossible, disable vCPU hotplug. This patch also introduces accessor functions to encapsulate the mapping between a vCPU and its vp_index. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Message-Id: <20180702134156.13404-3-rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-03Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, virtio: fixes A couple of fixes to amd iommu, and a fix to virtio iommu. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 28 Jun 2018 02:46:45 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: virtio-rng: process pending requests on DRIVER_OK hw/i386: Fix AMDVI GATS and HATS encodings hw/i386: Fix IVHD entry length for AMD IOMMU Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-07-02hw/i386: Use the IEC binary prefix definitionsPaolo Bonzini
It eases code review, unit is explicit. Patch generated using: $ git grep -E '[<>][<>]=? ?[1-5]0' hw/ include/hw/ and modified manually. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02hw/xen: Use the IEC binary prefix definitionsPhilippe Mathieu-Daudé
It eases code review, unit is explicit. Patch generated using: $ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/ and modified manually. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alan Robinson <Alan.Robinson@ts.fujitsu.com> Message-Id: <20180625124238.25339-12-f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28hmp: obsolete "info ioapic"Peter Xu
Let's start to use "info pic" just like other platforms. For now we keep the command for a while so that old users can know what is the new command to use. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20171229073104.3810-6-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28ioapic: support "info irq"Peter Xu
This include both userspace and in-kernel ioapic. Note that the numbers can be inaccurate for kvm-ioapic. One reason is the same with kvm-i8259, that when irqfd is used, irqs can be delivered all inside kernel without our notice. Meanwhile, kvm-ioapic is specially treated when irq numbers <ISA_NUM_IRQS, those irqs will be delivered in kernel too via kvm-i8259 (please refer to kvm_pc_gsi_handler). Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20171229073104.3810-5-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28pc-dimm: get_memory_region() will not fail after realizeDavid Hildenbrand
Let's try to reduce error handling a bit. In the plug/unplug case, the device was realized and therefore we can assume that getting access to the memory region will not fail. For get_vmstate_memory_region() this is already handled that way. Document both cases. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180619134141.29478-13-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28pc: factor out pc specific dimm checks into pc_memory_pre_plug()David Hildenbrand
We can perform these checks before the device is actually realized. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180619134141.29478-6-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28pc-dimm: rename pc_dimm_memory_* to pc_dimm_*David Hildenbrand
Let's rename it to make it look more consistent. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180619134141.29478-4-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28pc: rename pc_dimm_(plug|unplug|...)* into pc_memory_(plug|unplug|...)*David Hildenbrand
Use a similar naming scheme as spapr. This way, we can go ahead and rename e.g. pc_dimm_memory_plug to pc_dimm_plug, which avoids confusion. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180619134141.29478-3-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-26hw/i386: Fix AMDVI GATS and HATS encodingsJan Kiszka
We support up to 6 levels, but those are encoded as 10b according to the AMD IOMMU spec (chapter 3.3.1, Extended Feature Register). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-06-26hw/i386: Fix IVHD entry length for AMD IOMMUJan Kiszka
Counting from the IVHD ID field to the all-devices entry, we have 28 bytes, not 36. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-06-25hw/i386: Deprecate the machine types pc-0.10 and pc-0.11Thomas Huth
The oldest machine type which is still used in a still maintained distro is a pc-0.12 based machine type in RHEL6, so everything that is older than pc-0.12 should not be used anymore. Thus let's deprecate pc-0.10 and pc-0.11 so that we can finally remove them in a future release. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1529917512-10528-1-git-send-email-thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-06-18hw/display: add standalone ramfb deviceGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Message-id: 20180613122948.18149-3-kraxel@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-06-15iommu: Add IOMMU index argument to translate methodPeter Maydell
Add an IOMMU index argument to the translate method of IOMMUs. Since all of our current IOMMU implementations support only a single IOMMU index, this has no effect on the behaviour. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
2018-06-15iommu: Add IOMMU index argument to notifier APIsPeter Maydell
Add support for multiple IOMMU indexes to the IOMMU notifier APIs. When initializing a notifier with iommu_notifier_init(), the caller must pass the IOMMU index that it is interested in. When a change happens, the IOMMU implementation must pass memory_region_notify_iommu() the IOMMU index that has changed and that notifiers must be called for. IOMMUs which support only a single index don't need to change. Callers which only really support working with IOMMUs with a single index can use the result of passing MEMTXATTRS_UNSPECIFIED to memory_region_iommu_attrs_to_index(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
2018-06-12Merge remote-tracking branch 'remotes/kraxel/tags/usb-20180612-pull-request' ↵Peter Maydell
into staging usb: bug fix collection, doc update. # gpg: Signature made Tue 12 Jun 2018 11:44:17 BST # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/usb-20180612-pull-request: usb-mtp: Return error on suspicious TYPE_DATA packet from initiator usb-hcd-xhci-test: add a test for ccid hotplug usb-ccid: fix bus leak object: fix OBJ_PROP_LINK_UNREF_ON_RELEASE ambivalence bus: do not unref the added child bus on realize usb/dev-mtp: Fix use of uninitialized values usb: correctly handle Zero Length Packets usb: update docs Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-12Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc: fixes A couple of fixes to acpi and nvdimm. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 11 Jun 2018 20:21:03 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: nvdimm: make persistence option symbolic hw/i386: Update SSDT table used by "make check" Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-12object: fix OBJ_PROP_LINK_UNREF_ON_RELEASE ambivalenceMarc-André Lureau
A link property can be set during creation, with object_property_add_link() and later with object_property_set_link(). add_link() doesn't add a reference to the target object, while set_link() does. Furthemore, OBJ_PROP_LINK_UNREF_ON_RELEASE flags, set during add_link, says whether a reference must be released when the property is destroyed. This can lead to leaks if the property was later set_link(), as the added reference is never released. Instead, rename OBJ_PROP_LINK_UNREF_ON_RELEASE to OBJ_PROP_LINK_STRONG and use that has an indication on how the link handle reference management in set_link(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20180531195119.22021-3-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-06-11nvdimm: make persistence option symbolicRoss Zwisler
Replace the "nvdimm-cap" option which took numeric arguments such as "2" with a more user friendly "nvdimm-persistence" option which takes symbolic arguments "cpu" or "mem-ctrl". Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>