aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-07-19msi/msix: added API to set MSI message address and dataAlexey Kardashevskiy
Added (msi|msix)_set_message() function for whoever might want to use them. Currently msi_notify()/msix_notify() write to these vectors to signal the guest about an interrupt so the correct values have to written there by the guest or QEMU. For example, POWER guest never initializes MSI/MSIX vectors, instead it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on POWER we have to initialize MSI/MSIX message from QEMU. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-19pci: Add INTx routing notifierJan Kiszka
This per-device notifier shall be triggered by any interrupt router along the path of a device's legacy interrupt signal on routing changes. For simplicity reasons and as this is a slow path anyway, no further details on the routing changes are provided. Instead, the callback is expected to use pci_device_route_intx_to_irq to check the effect of the change. Will be used by KVM PCI device assignment and VFIO. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-19pci: Add pci_device_route_intx_to_irqMichael S. Tsirkin
Device assigned on KVM needs to know the mode (enabled/inverted/disabled) and the IRQ number that a given device triggers in the attached interrupt controller. Add a PCI IRQ path discovery function that walks from a given device to the host bridge, and gets this information. For this purpose, a host bridge callback function is introduced: route_intx_to_irq. It is so far only implemented by the PIIX3, other host bridges can be added later on as required. Will be used for KVM PCI device assignment and VFIO. Based on patch by Jan Kiszka, with minor tweaks. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-04pci: Unregister BARs before device exitAlex Williamson
BARs are registered in init functions from memory regions created by the drivers. Exit functions destroy those memory regions. By unregistering the io regions after exit(), we're calling memory_region_del_subregion on freed memory. Don't do that. The option rom comes along for the ride because it's more symmetric to how it's created. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-07-04pci: convert PCIUnregisterFunc to voidAlex Williamson
Not a single driver has any possibility of failure on their exit function, let's keep it that way. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Switch msix_uninit to return voidAlex Williamson
It can't fail. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Allow full specification of MSIX layoutAlex Williamson
Finally, complete the fully specified interface. msix_add_config() gets folded into msix_init() because we now have quite a few parameters to pass and rolling it in let's us error earlier, avoiding the ugly unwind exit path. msix_mmio_setup() also gets rolled in, just because it's redundant to rediscover offsets when we already have them for such a tiny function. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Split PBA into it's own MemoryRegionAlex Williamson
These don't have to be contiguous. Size them to only what they need and use separate MemoryRegions for the vector table and PBA. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Note endian TODO itemAlex Williamson
MSIX, like PCI, is little endian. Specifying native is wrong here, but we need to check the rest of the file to determine if it's as simple as flipping this macro. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Move msix_mmio_readAlex Williamson
What's this doing so far from msix_mmio_ops? Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18virtio: Convert to msix_init_exclusive_bar() interfaceAlex Williamson
Simple conversion. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18ivshmem: Convert to msix_init_exclusive_bar() interfaceAlex Williamson
Trivial conversion, failed to have an uninit before and after. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: Add simple BAR allocation MSIX setup functionsAlex Williamson
msi_init() takes over a BAR without really specifying or allowing specification of how it does so. Instead, let's split it into two interfaces, one fully specified, and one trivially easy. This implements the latter. msix_init_exclusive_bar() takes over allocating and filling a PCI BAR _exclusively_ for the use of MSIX. When used, the matching msi_uninit_exclusive_bar() should be used to tear it down. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: fix PCIDevice naming inconsistencyAlex Williamson
msix.h calls the PCIDevice * parameter "dev" almost everywhere except the msix_write_config declaration. Fix the inconsistency. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-18msix: drop unused msix_bar_size, require valid bar_sizeJan Kiszka
No user in sight for msix_bar_size. bar_size for all users is aligned, let's simply require this instead of trying to fix up invalid input. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-11pci_bridge_dev: fix error path in pci_bridge_dev_initfn()Jason Baron
Currently, we do not properly cleanup, if pci_bridge_dev_initfn fails to initialize properly. Make sure to call pci_bridge_exitfn() in the error path. Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-11qdev: release parent properties on dc->init failureJason Baron
While looking into hot-plugging bridges, I can create a qemu segfault via: $ device_add pci-bridge Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0. ** ERROR:qom/object.c:389:object_delete: assertion failed: (obj->ref == 0) I'm proposing to fix this by adding a call to 'object_unparent()', before the call to qdev_free(). I see there is already a precedent for this usage pattern as seen in qdev_simple_unplug_cb(): /* can be used as ->unplug() callback for the simple cases */ int qdev_simple_unplug_cb(DeviceState *dev) { /* just zap it */ object_unparent(OBJECT(dev)); qdev_free(dev); return 0; } Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07msi: Use msi/msix_present more consistentlyJan Kiszka
Replace some open-coded msi/msix_present checks. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07msi: Invoke msi/msix_write_config from PCI coreJan Kiszka
Also this functions is better invoked by the core than by each and every device. This allows to drop the config_write callbacks from ich and intel-hda. CC: Alexander Graf <agraf@suse.de> CC: Gerd Hoffmann <kraxel@redhat.com> CC: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07msi: Guard msi/msix_write_config with msi_presentJan Kiszka
Terminate msi/msix_write_config early if support is not enabled. This allows to remove checks at the caller site if MSI is optional. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07msi: Invoke msi/msix_reset from PCI coreJan Kiszka
There is no point in pushing this burden to the devices, they tend to forget to call them (like intel-hda, ahci, xhci did). Instead, reset functions are now called from pci_device_reset. They do nothing if MSI/MSI-X is not in use. CC: Alexander Graf <agraf@suse.de> CC: Gerd Hoffmann <kraxel@redhat.com> CC: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07msi: Guard msi_reset with msi_presentJan Kiszka
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07ahci: Clean up reset functionsJan Kiszka
Properly register reset functions via the device class. CC: Alexander Graf <agraf@suse.de> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07intel-hda: Fix reset of MSI functionJan Kiszka
Call msi_reset on device reset as still required by the core. CC: Gerd Hoffmann <kraxel@redhat.com> CC: qemu-stable@nongnu.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07ahci: Fix reset of MSI functionJan Kiszka
Call msi_reset on device reset as still required by the core. CC: Alexander Graf <agraf@suse.de> CC: qemu-stable@nongnu.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07rtl8139: honor RxOverflow flag in can_receive methodFernando Luis Vazquez Cao
Some drivers (Linux' 8139too among them) rely on the NIC injecting an interrupt in the event of a receive buffer overflow and, accordingly, set the RxOverflow bit in the interrupt mask. Unfortunately rtl8139's can_receive method ignores the RxOverflow flag, which may lead to a situation where rtl8139 stops receiving packets (can_receive returns 0) when the receive buffer becomes full. If the driver eventually read from the receive buffer or reset the card the emulator could recover from this situation. However some implementations only do this upon receiving an interrupt with either RxOK or RxOverflow set in the ISR; interrupt that will never come because QEMU's flow control mechanisms would prevent rtl8139 from receiving any packet. Letting packets go through when the overflow interrupt is enabled makes the QEMU emulator compliant to the spec and solves the problem. This patch should fix a relatively common (in our experience) network stall observed when running enterprise distros with rtl8139 as the NIC; in some cases the 8139too device driver gets loaded and when under heavy load the network eventually stops working. Reported-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp> Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp> Acked-by: Igor Kovalenko <igor.v.kovalenko@gmail.com> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-07shpc: unparent device before freeMichael S. Tsirkin
Recent core change removed unparent so we need to do this in all callers now. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-06-04target-microblaze: lwx/swx: first implementationPeter A. G. Crosthwaite
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2012-06-04Revert "rtl8139: do the network/host communication only in normal operating ↵Jason Wang
mode" This reverts commit ff71f2e8cacefae99179993204172bc65e4303df. This is because the linux 8139cp driver would leave the card in "Config Register Write Enable" mode after the eeprom were read or write ( which is unexpected in the spec ). Also a physical 8139 card can still DMA into host memory in modes other than Normal mode, so we need revert this commit to align with the behavior of physical card. The issue of 8139cp driver should be fixed in linux seperately. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-06-03Merge remote-tracking branch 'qemu-kvm/uq/master' into stagingAnthony Liguori
* qemu-kvm/uq/master: virtio/vhost: Add support for KVM in-kernel MSI injection msix: Add msix_nr_vectors_allocated kvm: Enable use of kvm_irqchip_in_kernel in hwlib code kvm: Introduce kvm_irqchip_add/remove_irqfd kvm: Make kvm_irqchip_commit_routes an internal service kvm: Publicize kvm_irqchip_release_virq kvm: Introduce kvm_irqchip_add_msi_route kvm: Rename kvm_irqchip_add_route to kvm_irqchip_add_irq_route msix: Introduce vector notifiers msix: Invoke msix_handle_mask_update on msix_mask_all msix: Factor out msix_get_message kvm: update vmxcap for EPT A/D, INVPCID, RDRAND, VMFUNC kvm: Enable in-kernel irqchip support by default kvm: Add support for direct MSI injections kvm: Update kernel headers kvm: x86: Wire up MSI support for in-kernel irqchip pc: Enable MSI support at APIC level kvm: Introduce basic MSI support for in-kernel irqchips Introduce MSIMessage structure kvm: Refactor KVMState::max_gsi to gsi_count
2012-06-03Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori
* kwolf/for-anthony: ahci: SATA FIS is 20 bytes, not 0x20 virtio-blk: Fix geometry sector calculation block: prevent snapshot mode $TMPDIR symlink attack sheepdog: fix return value of do_load_save_vm_state virtio: Fix compiler warning for non Linux hosts
2012-06-01Update version to open the 1.2 development branchAnthony Liguori
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-06-01Update version for 1.1.0 releasev1.1.0Anthony Liguori
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-31Update version for 1.1.0-rc4 releasev1.1.0-rc4Anthony Liguori
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-31Merge remote-tracking branch 'origin/master' into stagingAnthony Liguori
* origin/master: pc-bios: Update OpenBIOS images
2012-05-30pc-bios: Update OpenBIOS imagesBlue Swirl
Update OpenBIOS images to r1060 built from submodule. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-30ahci: SATA FIS is 20 bytes, not 0x20Daniel Verkamp
As in the SATA and AHCI specifications, a FIS is 5 Dwords of 4 bytes each, which comes to 20 bytes (decimal), not 0x20. Signed-off-by: Daniel Verkamp <daniel@drv.nu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30virtio-blk: Fix geometry sector calculationChristian Borntraeger
Currently the sector value for the geometry is masked, even if the user usesa command line parameter that explicitely gives a number. This breaks dasd devices on s390. A dasd device can have a physical block size of 4096 (== same for logical block size) and a typcial geometry of 15 heads and 12 sectors per cyl. The ibm partition detection relies on a correct geometry reported by the device. Unfortunately the current code changes 12 to 8. This would be necessary if the total size is not a multiple of logical sector size, but for dasd this is not the case. This patch checks the device size and only applies sector mask if necessary. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> CC: Christoph Hellwig <hch@lst.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30block: prevent snapshot mode $TMPDIR symlink attackJim Meyering
In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void), so that it can inform its two callers of those failures. Also avoid the risk of buffer overrun and do not ignore mkstemp or close failure. Update both callers (in block.c and vvfat.c) to propagate temp-file-creation failure to their callers. get_tmp_filename creates and closes an empty file, while its callers later open that presumed-existing file with O_CREAT. The problem was that a malicious user could provoke mkstemp failure and race to create a symlink with the selected temporary file name, thus causing the qemu process (usually root owned) to open through the symlink, overwriting an attacker-chosen file. This addresses CVE-2012-2652. http://bugzilla.redhat.com/CVE-2012-2652 Signed-off-by: Jim Meyering <meyering@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30sheepdog: fix return value of do_load_save_vm_stateMORITA Kazutaka
bdrv_save_vmstate and bdrv_load_vmstate should return the vmstate size on success, and -errno on error. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30virtio: Fix compiler warning for non Linux hostsStefan Weil
The local variables ret, i are only used if __linux__ is defined. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-30Merge remote-tracking branch 'mdroth/qga-pull-5-29-12-v2' into stagingAnthony Liguori
* mdroth/qga-pull-5-29-12-v2: qemu-ga: avoid blocking on atime update when reading /etc/mtab qemu-ga: Fix use of environ on Darwin
2012-05-30block: prevent snapshot mode $TMPDIR symlink attackJim Meyering
In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void), so that it can inform its two callers of those failures. Also avoid the risk of buffer overrun and do not ignore mkstemp or close failure. Update both callers (in block.c and vvfat.c) to propagate temp-file-creation failure to their callers. get_tmp_filename creates and closes an empty file, while its callers later open that presumed-existing file with O_CREAT. The problem was that a malicious user could provoke mkstemp failure and race to create a symlink with the selected temporary file name, thus causing the qemu process (usually root owned) to open through the symlink, overwriting an attacker-chosen file. This addresses CVE-2012-2652. http://bugzilla.redhat.com/CVE-2012-2652 Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-30xhci: add usage info to docsGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-30vnc: fix segfault in vnc_display_pw_expire()Gerd Hoffmann
NULL pointer dereference in case no vnc server is configured. Catch this and return -EINVAL like vnc_display_password() does. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-30Expose CPUID leaf 7 only for -cpu hostEduardo Habkost
Changes v2 -> v3; - Check for kvm_enabled() before setting cpuid_7_0_ebx_features Changes v1 -> v2: - Use kvm_arch_get_supported_cpuid() instead of host_cpuid() on cpu_x86_fill_host(). We should use GET_SUPPORTED_CPUID for all bits on "-cpu host" eventually, but I am not changing all the other CPUID leaves because we may not be able to test such an intrusive change in time for 1.1. Description of the bug: Since QEMU 0.15, the CPUID information on CPUID[EAX=7,ECX=0] is being returned unfiltered to the guest, directly from the GET_SUPPORTED_CPUID return value. The problem is that this makes the resulting CPU feature flags unpredictable and dependent on the host CPU and kernel version. This breaks live-migration badly if migrating from a host CPU that supports some features on that CPUID leaf (running a recent kernel) to a kernel or host CPU that doesn't support it. Migration also is incorrect (the virtual CPU changes under the guest's feet) if you migrate in the opposite direction (from an old CPU/kernel to a new CPU/kernel), but with less serious consequences (guests normally query CPUID information only once on boot). Fortunately, the bug affects only users using cpudefs with level >= 7. The right behavior should be to explicitly enable those features on [cpudef] config sections or on the "-cpu" command-line arguments. Right now there is no predefined CPU model on QEMU that has those features: the latest Intel model we have is Sandy Bridge. I would like to get this fixed on 1.1, so I am submitting this patch, that enables those features only if "-cpu host" is being used (as we don't have any pre-defined CPU model that actually have those features). After 1.1 is released, we can make those features properly configurable on [cpudef] and -cpu configuration. One problem is: with this patch, users with the following setup: - Running QEMU 1.0; - Using a cpudef having level >= 7; - Running a kernel that supports the features on CPUID leaf 7; and - Running on a CPU that supports some features on CPUID leaf 7 won't be able to live-migrate to QEMU 1.1. But for these users live-migration is already broken (they can't live-migrate to hosts with older CPUs or older kernels, already), I don't see how to avoid this problem. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29qemu-ga: avoid blocking on atime update when reading /etc/mtabMichael Roth
Currently we re-read/re-process /etc/mtab to get an updated list of mounts when guest-fsfreeze-thaw is called. This can cause an atime update on /etc/mtab, which will block if we're in a frozen state. Instead, use /proc's version of mtab, which may not be up-to-date with options passed via -o remount, but is compatible for our use cases since we only care about the filesystem type. Reported-by: Matsuda, Daiki <matsudadik@intellilink.co.jp> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-05-29qemu-ga: Fix use of environ on DarwinAndreas Färber
Use _NSGetEnviron() helper to access the environment. Signed-off-by: Andreas Färber <andreas.faerber@web.de> Cc: Charlie Somerville <charlie@charliesomerville.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-05-29pci: call object_unparent() before free_qdev()Amos Kong
Start VM with 8 multiple-function block devs, hot-removing those block devs by 'device_del ...' would cause qemu abort. | (qemu) device_del virti0-0-0 | (qemu) ** |ERROR:qom/object.c:389:object_delete: assertion failed: (obj->ref == 0) It's a regression introduced by commit 57c9fafe The whole PCI slot should be removed once. Currently only one func is cleaned in pci_unplug_device(), if you try to remove a single func by monitor cmd. free_qdev() are called for all functions in slot, but unparent_delete() is only called for one function. Signed-off-by: XXXX Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29fix multiboot loading if load_end_addr == 0Scott Moser
The previous multiboot load code did not treat the case where load_end_addr was 0 specially. The multiboot specification says the following: * load_end_addr Contains the physical address of the end of the data segment. (load_end_addr - load_addr) specifies how much data to load. This implies that the text and data segments must be consecutive in the OS image; this is true for existing a.out executable formats. If this field is zero, the boot loader assumes that the text and data segments occupy the whole OS image file. Signed-off-by: Scott Moser <smoser@ubuntu.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>