aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-11-05hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned baseLaszlo Ersek
In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI hole", 2017-11-16), we meant to expose such a 64-bit PCI MMIO aperture in the ACPI DSDT that would be at least as large as the new "pci-hole64-size" property (2GB on i440fx, 32GB on q35). The goal was to offer "enough" 64-bit MMIO aperture to the guest OS for hotplug purposes. In that commit, we added or modified five functions: - pc_pci_hole64_start(): shared between i440fx and q35. Provides a default 64-bit base, which starts beyond the cold-plugged 64-bit RAM, and skips the DIMM hotplug area too (if any). - i440fx_pcihost_get_pci_hole64_start(), q35_host_get_pci_hole64_start(): board-specific 64-bit base property getters called abstractly by the ACPI generator. Both of these fall back to pc_pci_hole64_start() if the firmware didn't program any 64-bit hole (i.e. if the firmware didn't assign a 64-bit GPA to any MMIO BAR on any device). Otherwise, they honor the firmware's BAR assignments (i.e., they treat the lowest 64-bit GPA programmed by the firmware as the base address for the aperture). - i440fx_pcihost_get_pci_hole64_end(), q35_host_get_pci_hole64_end(): these intended to extend the aperture to our size recommendation, calculated relative to the base of the aperture. Despite the original intent, i440fx_pcihost_get_pci_hole64_end() and q35_host_get_pci_hole64_end() currently only extend the aperture relative to the default base (pc_pci_hole64_start()), ignoring any programming done by the firmware. This means that our size recommendation may not be met. Fix it by honoring the firmware's address assignments. The strange extension sizes were spotted by Alex, in the log of a guest kernel running on top of OVMF (which prefers to assign 64-bit GPAs to 64-bit BARs). This change only affects DSDT generation, therefore no new compat property is being introduced. Using an i440fx OVMF guest with 5GB RAM, an example _CRS change is: > @@ -881,9 +881,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000800000000, // Range Minimum > - 0x000000080001C0FF, // Range Maximum > + 0x000000087FFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x000000000001C100, // Length > + 0x0000000080000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) (On i440fx, the low RAM split is at 3GB, in this case. Therefore, with 5GB guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-3) = 6 GB. Adding the 2GB extension to that yields 8GB, which is below the firmware-programmed base of 32GB, before the patch. Therefore, before the patch, the extension is ineffective. After the patch, we add the 2GB extension to the firmware-programmed base, namely 32GB.) Using a q35 OVMF guest with 5GB RAM, an example _CRS change is: > @@ -3162,9 +3162,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000800000000, // Range Minimum > - 0x00000009BFFFFFFF, // Range Maximum > + 0x0000000FFFFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x00000001C0000000, // Length > + 0x0000000800000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) (On Q35, the low RAM split is at 2GB. Therefore, with 5GB guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-2) = 7 GB. Adding the 32GB extension to that yields 39GB (0x0000_0009_BFFF_FFFF + 1), before the patch. After the patch, we add the 32GB extension to the firmware-programmed base, namely 32GB.) The ACPI test data for the bios-tables-test case that we added earlier in this series are corrected too, as follows: > @@ -3339,9 +3339,9 @@ > QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, > 0x0000000000000000, // Granularity > 0x0000000200000000, // Range Minimum > - 0x00000009BFFFFFFF, // Range Maximum > + 0x00000009FFFFFFFF, // Range Maximum > 0x0000000000000000, // Translation Offset > - 0x00000007C0000000, // Length > + 0x0000000800000000, // Length > ,, , AddressRangeMemory, TypeStatic) > }) > Device (GPE0) Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Fixes: 9fa99d2519cbf71f871e46871df12cb446dc1c3e Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05hw/pci-host/x86: extract get_pci_hole64_start_value() helpersLaszlo Ersek
Expose the calculated "hole64 start" GPAs as plain uint64_t values, extracting the internals of the current property getters. This patch doesn't change behavior. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05pci-testdev: add optional memory barGerd Hoffmann
Add memory bar to pci-testdev. Size is configurable using the membar property. Setting the size to zero (default) turns it off. Can be used to check whether guests handle large pci bars correctly. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05MAINTAINERS: list "tests/acpi-test-data" files in ACPI/SMBIOS sectionLaszlo Ersek
The "tests/acpi-test-data" files are currently not covered by any section in MAINTAINERS, and "scripts/checkpatch.pl" complains when new data files are added. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Enable Guest virtual APIC supportSingh, Brijesh
Now that amd-iommu support interrupt remapping, enable the GASup in IVRS table and GASup in extended feature register to indicate that IOMMU support guest virtual APIC mode. GASup provides option to guest OS to make use of 128-bit IRTE. Note that the GAMSup is set to zero to indicate that amd-iommu does not support guest virtual APIC mode (aka AVIC) which would be used for the nested VMs. See Table 21 from IOMMU spec for interrupt virtualization controls Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is enabled. For more information refer: IOMMU spec rev 3.0 (section 2.2.5.2) When VAPIC is enabled, it uses interrupt remapping as defined in Table 22 and Figure 17 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05i386: acpi: add IVHD device entry for IOAPICSingh, Brijesh
When interrupt remapping is enabled, add a special IVHD device (type IOAPIC). Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Add interrupt remap support when VAPIC is not enabledSingh, Brijesh
Emulate the interrupt remapping support when guest virtual APIC is not enabled. For more info Refer: AMD IOMMU spec Rev 3.0 - section 2.2.5.1 When VAPIC is not enabled, it uses interrupt remapping as defined in Table 20 and Figure 15 from IOMMU spec. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: Prepare for interrupt remap supportSingh, Brijesh
Register the interrupt remapping callback and read/write ops for the amd-iommu-ir memory region. amd-iommu-ir is set to higher priority to ensure that this region won't be masked out by other memory regions. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: make the address space naming consistent with intel-iommuSingh, Brijesh
To be consistent with intel-iommu: - rename the address space to use '_' instead of '-' - update the memory region relationships Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu/amd: remove V=1 check from amdvi_validate_dte()Singh, Brijesh
Currently, the amdvi_validate_dte() assumes that a valid DTE will always have V=1. This is not true. The V=1 means that bit[127:1] are valid. A valid DTE can have IV=1 and V=0 (i.e address translation disabled and interrupt remapping enabled) Remove the V=1 check from amdvi_validate_dte(), make the caller responsible to check for V or IV bits. This also fixes a bug in existing code that when error is detected during the translation we'll fail the translation instead of assuming a passthrough mode. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move vtd_generate_msi_message in common fileSingh, Brijesh
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI Message from IRQ. A similar function will be needed when we add interrupt remapping support in amd-iommu. Moving the function in common file to avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message(). There is no logic changes in the code flow. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05x86_iommu: move the kernel-irqchip check in common codeSingh, Brijesh
Interrupt remapping needs kernel-irqchip={off|split} on both Intel and AMD platforms. Move the check in common place. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05vhost-user-blk: start vhost when guest kicksYongji Xie
Some old guests (before commit 7a11370e5: "virtio_blk: enable VQs early") kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. This violates the virtio spec. But virtio 1.0 transitional devices support this behaviour. So we should start vhost when guest kicks in this case. Signed-off-by: Yongji Xie <xieyongji@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Ni Xun <nixun@baidu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: handle invalid ce for shadow syncPeter Xu
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing shadow page tables. Having invalid context entry there is perfectly valid when we move a device out of an existing domain. When that happens, instead of posting an error we invalidate the whole region. Without this patch, QEMU will crash if we do these steps: (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe) (2) bind the NICs with vfio-pci in the guest (3) start testpmd with the NICs applied (4) stop testpmd (5) rebind the NIC back to ixgbe kernel driver The patch should fix it. Reported-by: Pei Zhang <pezhang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272 Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: move ce fetching out when sync shadowPeter Xu
There are two callers for vtd_sync_shadow_page_table_range(): one provided a valid context entry and one not. Move that fetching operation into the caller vtd_sync_shadow_page_table() where we need to fetch the context entry. Meanwhile, remove the error_report_once() directly since we're already tracing all the error cases in the previous call. Instead, return error number back to caller. This will not change anything functional since callers are dropping it after all. We do this move majorly because we want to do something more later in vtd_sync_shadow_page_table(). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: better handling of dmar state switchPeter Xu
QEMU is not handling the global DMAR switch well, especially when from "on" to "off". Let's first take the example of system reset. Assuming that a guest has IOMMU enabled. When it reboots, we will drop all the existing DMAR mappings to handle the system reset, however we'll still keep the existing memory layouts which has the IOMMU memory region enabled. So after the reboot and before the kernel reloads again, there will be no mapping at all for the host device. That's problematic since any software (for example, SeaBIOS) that runs earlier than the kernel after the reboot will assume the IOMMU is disabled, so any DMA from the software will fail. For example, a guest that boots on an assigned NVMe device might fail to find the boot device after a system reboot/reset and we'll be able to observe SeaBIOS errors if we capture the debugging log: WARNING - Timeout at nvme_wait:144! Meanwhile, we should see DMAR errors on the host of that NVMe device. It's the DMA fault that caused a NVMe driver timeout. The correct fix should be that we do proper switching of device DMA address spaces when system resets, which will setup correct memory regions and notify the backend of the devices. This might not affect much on non-assigned devices since QEMU VT-d emulation will assume a default passthrough mapping if DMAR is not enabled in the GCMD register (please refer to vtd_iommu_translate). However that's required for an assigned devices, since that'll rebuild the correct GPA to HPA mapping that is needed for any DMA operation during guest bootstrap. Besides the system reset, we have some other places that might change the global DMAR status and we'd better do the same thing there. For example, when we change the state of GCMD register, or the DMAR root pointer. Do the same refresh for all these places. For these two places we'll also need to explicitly invalidate the context entry cache and iotlb cache. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173 CC: QEMU Stable <qemu-stable@nongnu.org> Reported-by: Cong Li <coli@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> -- v2: - do the same for GCMD write, or root pointer update [Alex] - test is carried out by me this time, by observing the vtd_switch_address_space tracepoint after system reboot v3: - rewrite commit message as suggested by Alex Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05intel_iommu: introduce vtd_reset_caches()Peter Xu
Provide the function and use it in vtd_init(). Used to reset both context entry cache and iotlb cache for the whole IOMMU unit. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05virtio-blk: fix comment for virtio_blk_rw_completeYaowei Bai
Here should be submit_requests, there is no submit_merged_requests function. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05configure: Use LINKS loop for all build tree symlinksPeter Maydell
A few places in configure were doing ad-hoc calls to the symlink function to set up symlinks from the build tree back to the source tree. We have a loop that does this already for all files and directories listed in the LINKS environment variable; use that instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05configure: Rename FILES variable to LINKSPeter Maydell
The FILES variable is used to accumulate a list of things to symlink from the source tree into the build tree. These don't have to be individual files; symlinking an entire directory of data files is also fine. Rename it to something less confusing before we add a few directories to it. Improve the comment to clarify what DIRS and LINKS do and why it's not a good idea to add things to LINKS with wildcarding. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05tests: Move tests/hex-loader-check-data/ to tests/data/hex-loader/Peter Maydell
Currently tests/hex-loader-check-data contains data files used by the hexloader-test, and configure individually symlinks those data files into the build directory using a wildcard. Using a wildcard like this is a bad idea, because if a new data file is added, nothing causes configure to be rerun, and so no symlink is added for the new file. This can cause tests to spuriously fail when they can't find their data. Instead, it's better to symlink an entire directory of data files. We already have such a directory: tests/data. Move the data files from tests/hex-loader-check-data/ to tests/data/hex-loader/, and remove the unnecessary symlinking. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05tests: Move tests/acpi-test-data/ to tests/data/acpi/Peter Maydell
Currently tests/acpi-test-data contains data files used by the bios-tables-test, and configure individually symlinks those data files into the build directory using a wildcard. Using a wildcard like this is a bad idea, because if a new data file is added, nothing causes configure to be rerun, and so no symlink is added for the new file. This can cause tests to spuriously fail when they can't find their data. Instead, it's better to symlink an entire directory of data files. We already have such a directory: tests/data. Move the data files from tests/acpi-test-data/ to tests/data/acpi/, and remove the unnecessary symlinking. We can remove entirely the note in rebuild-expected-aml.sh about copying any new data files, because now they will be in the source directory, not the build directory, and no copying is required. (We can't just change the existing tests/acpi-test-data/ to being a symlinked directory, because if we did that and a developer switched git branches from one after that change to one before it then configure would end up trashing all the test files by making them symlinks to themselves. Changing their path avoids this annoyance.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches: - auto-read-only option to fix commit job when used with -blockdev - Fix help text related qemu-iotests failure (by improving the help text and updating the reference output) - quorum: Add missing checks when adding/removing child nodes - Don't take address of fields in packed structs - vvfat: Fix crash when reporting error about too many files in directory # gpg: Signature made Mon 05 Nov 2018 15:35:25 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (36 commits) include: Add a comment to explain the origin of sizes' lookup table vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE fw_cfg: Drop newline in @file description object: Make option help nicer to read qdev-monitor: Make device options help nicer chardev: Indent list of chardevs option: Make option help nicer to read qemu-iotests: Test auto-read-only with -drive and -blockdev block: Make auto-read-only=on default for -drive iscsi: Support auto-read-only option gluster: Support auto-read-only option curl: Support auto-read-only option file-posix: Support auto-read-only option nbd: Support auto-read-only option block: Require auto-read-only for existing fallbacks rbd: Close image in qemu_rbd_open() error path block: Add auto-read-only option block: Update flags in bdrv_set_read_only() iotest: Test x-blockdev-change on a Quorum quorum: Forbid adding children in blkverify mode ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-05include: Add a comment to explain the origin of sizes' lookup tableLeonid Bloch
The lookup table for power-of-two sizes was added in commit 540b8492618eb for the purpose of having convenient shortcuts for these sizes in cases when the literal number has to be present at compile time, and expressions as '(1 * KiB)' can not be used. One such case is the stringification of sizes. Beyond that, it is convenient to use these shortcuts for all power-of-two sizes, even if they don't have to be literal numbers. Despite its convenience, this table introduced 55 lines of "dumb" code, the purpose and origin of which are obscure without reading the message of the commit which introduced it. This patch fixes that by adding a comment to the code itself with a brief explanation for the reasoning behind this table. This comment includes the short AWK script that generated the table, so that anyone who's interested could make sure that the values in it are correct (otherwise these values look as if they were typed manually). Signed-off-by: Leonid Bloch <lbloch@janustech.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZELeonid Bloch
If an expression is used to define DEFAULT_CLUSTER_SIZE, when compiled, it will be embedded as a literal expression in the binary (as the default value) because it is stringified to mark the size of the default value. Now this is fixed by using a defined number to define this value. Signed-off-by: Leonid Bloch <lbloch@janustech.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05fw_cfg: Drop newline in @file descriptionMax Reitz
There is no good reason why there should be a newline in this description, so remove it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05object: Make option help nicer to readMax Reitz
Just like in qemu_opts_print_help(), print the object name as a caption instead of on every single line, indent all options, add angle brackets around types, and align the descriptions after 24 characters. Also, indent every object name in the list of available objects. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05qdev-monitor: Make device options help nicerMax Reitz
Just like in qemu_opts_print_help(), print the device name as a caption instead of on every single line, indent all options, add angle brackets around types, and align the descriptions after 24 characters. Also, separate the descriptions with " - " instead of putting them in parentheses, because that is what we do everywhere else. This does look a bit funny here because basically all bits have the description "on/off", but funny does not mean it is less readable. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05chardev: Indent list of chardevsMax Reitz
Following the example of qemu_opts_print_help(), indent all entries in the list of character devices. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05option: Make option help nicer to readMax Reitz
This adds some whitespace into the option help (including indentation) and puts angle brackets around the type names. Furthermore, the list name is no longer printed as part of every line, but only once in advance, and only if the caller did not print a caption already. This patch also restores the description alignment we had before commit 9cbef9d68ee1d8d0, just at 24 instead of 16 characters like we used to. This increase is because now we have the type and two spaces of indentation before the description, and with a usual type name length of three chracters, this sums up to eight additional characters -- which means that we now need 24 characters to get the same amount of padding for most options. Also, 24 is a third of 80, which makes it kind of a round number in terminal terms. Finally, this patch amends the reference output of iotest 082 to match the changes (and thus makes it pass again). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05qemu-iotests: Test auto-read-only with -drive and -blockdevKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05block: Make auto-read-only=on default for -driveKevin Wolf
While we want machine interfaces like -blockdev and QMP blockdev-add to add as little auto-detection as possible so that management tools are explicit about their needs, -drive is a convenience option for human users. Enabling auto-read-only=on by default there enables users to use read-only images for read-only guest devices without having to specify read-only=on explicitly. If they try to attach the image to a read-write device, they will still get an error message. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05iscsi: Support auto-read-only optionKevin Wolf
If read-only=off, but auto-read-only=on is given, open the volume read-write if we have the permissions, but instead of erroring out for read-only volumes, just degrade to read-only. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05gluster: Support auto-read-only optionKevin Wolf
If read-only=off, but auto-read-only=on is given, open the file read-write if we have the permissions, but instead of erroring out for read-only files, just degrade to read-only. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
2018-11-05curl: Support auto-read-only optionKevin Wolf
If read-only=off, but auto-read-only=on is given, just degrade to read-only. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05file-posix: Support auto-read-only optionKevin Wolf
If read-only=off, but auto-read-only=on is given, open the file read-write if we have the permissions, but instead of erroring out for read-only files, just degrade to read-only. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05nbd: Support auto-read-only optionKevin Wolf
If read-only=off, but auto-read-only=on is given, open a read-write NBD connection if the server provides a read-write export, but instead of erroring out for read-only exports, just degrade to read-only. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05block: Require auto-read-only for existing fallbacksKevin Wolf
Some block drivers have traditionally changed their node to read-only mode without asking the user. This behaviour has been marked deprecated since 2.11, expecting users to provide an explicit read-only=on option. Now that we have auto-read-only=on, enable these drivers to make use of the option. This is the only use of bdrv_set_read_only(), so we can make it a bit more specific and turn it into a bdrv_apply_auto_read_only() that is more convenient for drivers to use. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05rbd: Close image in qemu_rbd_open() error pathKevin Wolf
Commit e2b8247a322 introduced an error path in qemu_rbd_open() after calling rbd_open(), but neglected to close the image again in this error path. The error path should contain everything that the regular close function qemu_rbd_close() contains. This adds the missing rbd_close() call. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05block: Add auto-read-only optionKevin Wolf
If a management application builds the block graph node by node, the protocol layer doesn't inherit its read-only option from the format layer any more, so it must be set explicitly. Backing files should work on read-only storage, but at the same time, a block job like commit should be able to reopen them read-write if they are on read-write storage. However, without option inheritance, reopen only changes the read-only option for the root node (typically the format layer), but not the protocol layer, so reopening fails (the format layer wants to get write permissions, but the protocol layer is still read-only). A simple workaround for the problem in the management tool would be to open the protocol layer always read-write and to make only the format layer read-only for backing files. However, sometimes the file is actually stored on read-only storage and we don't know whether the image can be opened read-write (for example, for NBD it depends on the server we're trying to connect to). This adds an option that makes QEMU try to open the image read-write, but allows it to degrade to a read-only mode without returning an error. The documentation for this option is consciously phrased in a way that allows QEMU to switch to a better model eventually: Instead of trying when the image is first opened, making the read-only flag dynamic and changing it automatically whenever the first BLK_PERM_WRITE user is attached or the last one is detached would be much more useful behaviour. Unfortunately, this more useful behaviour is also a lot harder to implement, and libvirt needs a solution now before it can switch to -blockdev, so let's start with this easier approach for now. Instead of adding a new auto-read-only option, turning the existing read-only into an enum (with a bool alternate for compatibility) was considered, but it complicated the implementation to the point that it didn't seem to be worth it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-11-05block: Update flags in bdrv_set_read_only()Kevin Wolf
To fully change the read-only state of a node, we must not only change bs->read_only, but also update bs->open_flags. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2018-11-05iotest: Test x-blockdev-change on a QuorumAlberto Garcia
This patch tests that you can add and remove drives from a Quorum using the x-blockdev-change command. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05quorum: Forbid adding children in blkverify modeAlberto Garcia
The blkverify mode of Quorum only works when the number of children is exactly two, so any attempt to add a new one must return an error. quorum_del_child() on the other hand doesn't need any additional check because decreasing the number of children would make it go under the vote threshold. Signed-off-by: Alberto Garcia <berto@igalia.com> Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05iotest: Test the blkverify mode of the Quorum driverAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05quorum: Return an error if the blkverify mode has invalid settingsAlberto Garcia
The blkverify mode of Quorum can only be enabled if the number of children is exactly two and the value of vote-threshold is also two. If the user tries to enable it but the other settings are incorrect then QEMU simply prints an error message to stderr and carries on disabling the blkverify setting. This patch makes quorum_open() fail and return an error in this case. Signed-off-by: Alberto Garcia <berto@igalia.com> Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05quorum: Remove quorum_err()Alberto Garcia
This is a static function with only one caller, so there's no need to keep it. Inlining the code in quorum_compare() makes it much simpler. Signed-off-by: Alberto Garcia <berto@igalia.com> Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05block/vdi: Don't take address of fields in packed structsPeter Maydell
Taking the address of a field in a packed struct is a bad idea, because it might not be actually aligned enough for that pointer type (and thus cause a crash on dereference on some host architectures). Newer versions of clang warn about this. Avoid the bug by not using the "modify in place" byte swapping functions. There are a few places where the in-place swap function is used on something other than a packed struct field; we convert those anyway, for consistency. Patch produced with scripts/coccinelle/inplace-byteswaps.cocci. There are other places where we take the address of a packed member in this file for other purposes than passing it to a byteswap function (all the calls to qemu_uuid_*()); we leave those for now. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05block/vhdx: Don't take address of fields in packed structsPeter Maydell
Taking the address of a field in a packed struct is a bad idea, because it might not be actually aligned enough for that pointer type (and thus cause a crash on dereference on some host architectures). Newer versions of clang warn about this. Avoid the bug by not using the "modify in place" byte swapping functions. There are a few places where the in-place swap function is used on something other than a packed struct field; we convert those anyway, for consistency. Patch produced with scripts/coccinelle/inplace-byteswaps.cocci. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-05vpc: Don't leak opts in vpc_open()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>