aboutsummaryrefslogtreecommitdiff
path: root/kvm-all.c
AgeCommit message (Collapse)Author
2015-10-12kvm-all: Align to qemu_real_host_page_size in kvm_set_phys_memAlexey Kardashevskiy
As the comment in kvm_set_phys_mem() says, KVM works in page size chunks. However it uses hardcoded TARGET_PAGE_SIZE which is 4K on most platforms while actual host may use different page size, for example, PPC64 hosts use 64K system pages. This replaces static TARGET_PAGE_SIZE with run-time calculated qemu_real_host_page_size. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <1444102257-17405-1-git-send-email-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-24intc/gic: Extract some reusable vGIC codePavel Fedin
Some functions previously used only by vGICv2 are useful also for vGICv3 implementation. Untie them from GICState and make accessible from within other modules: - kvm_arm_gic_set_irq() - kvm_gic_supports_attr() - moved to common code and renamed to kvm_device_check_attr() - kvm_gic_access() - turned into GIC-independent kvm_device_access(). Data pointer changed to void * because some GICv3 registers are 64-bit wide Some of these changes are not used right now, but they will be helpful for implementing live migration. Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but they would require two extra parameters (s->dev_fd and s->num_cpu) as well as lots of typecasts of 's' to DeviceState * and back to GICState *. This makes the code very ugly so i decided to stop at this point. I tried also an approach with making a base class for all possible GICs, but it would contain only three variables (dev_fd, cpu_num and irq_num), and accessing them through the rest of the code would be again tedious (either ugly casts or qemu-style separate object pointer). So i disliked it too. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Tested-by: Ashok kumar <ashoks@broadcom.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 2ef56d1dd64ffb75ed02a10dcdaf605e5b8ff4f8.1441784344.git.p.fedin@samsung.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-16kvm: Add kvm system event crash handlerAndrey Smetanin
KVM kernel can send guest crash events into userspace. Appropriate guest crash handler is called when kernel guest crash event received. Guest crash event recognized by a KVM_SYSTEM_EVENT_CRASH type of system event. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Andreas Färber <afaerber@suse.de> Message-Id: <1435924905-8926-11-git-send-email-den@openvz.org> [Rebase: add lock/unlock iothread around qemu_system_guest_panicked - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-07s390x/kvm: make setting of in-kernel irq routes more efficientJens Freimann
When we add new adapter routes we call kvm_irqchip_add_route() for every virtqueue and in the same step also do the KVM_SET_GSI_ROUTING ioctl. This is unnecessary costly as the interface allows us to set multiple routes in one go. Let's first add all routes to the table stored in the global kvm_state and then do the ioctl to commit the routes to the in-kernel irqchip. This saves us several ioctls to the kernel where for each call a list is reallocated and populated. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-07-07Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150706.0' ↵Peter Maydell
into staging VFIO updates for 2.4-rc0 - "real" host page size API (Peter Crosthwaite) - platform device irqfd support (Eric Auger) - spapr container disconnect fix (Alexey Kardashevskiy) - quirk for broken Chelsio hardware (Gabriel Laupre) - coverity fix (Paolo Bonzini) # gpg: Signature made Mon Jul 6 19:23:49 2015 BST using RSA key ID 3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" # gpg: aka "Alex Williamson <alex@shazbot.org>" # gpg: aka "Alex Williamson <alwillia@redhat.com>" # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" * remotes/awilliam/tags/vfio-update-20150706.0: vfio/pci : Add pba_offset PCI quirk for Chelsio T5 devices vfio: Unregister IOMMU notifiers when container is destroyed hw/vfio/platform: add irqfd support kvm: some fixes to kvm_resamplefds_allowed sysbus: add irq_routing_notifier intc: arm_gic_kvm: set the qemu_irq/gsi mapping kvm-all.c: add qemu_irq/gsi hash table and utility routines kvm: rename kvm_irqchip_[add,remove]_irqfd_notifier with gsi suffix vfio: cpu: Use "real" page size API cpu-all: complete "real" host page size API vfio: fix return type of pread Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: kvm-all.c
2015-07-06kvm-all.c: add qemu_irq/gsi hash table and utility routinesEric Auger
VFIO platform device needs to setup irqfd but it does not know the gsi corresponding to the device qemu_irq. This patch proposes to store a hash table in kvm_state using the qemu_irq as key and the gsi as a value. kvm_irqchip_set_qemuirq_gsi allows to insert such a pair. The interrupt controller is supposed to use it. kvm_irqchip_[add, remove]_irqfd_notifier allows to setup/tear down irqfd directly from the qemu_irq. Signed-off-by: Eric Auger <eric.auger@linaro.org> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06kvm: rename kvm_irqchip_[add,remove]_irqfd_notifier with gsi suffixEric Auger
Anticipating for the introduction of new add/remove functions taking a qemu_irq parameter, let's rename existing ones with a gsi suffix. Signed-off-by: Eric Auger <eric.auger@linaro.org> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06kvm-all: kvm_irqchip_create is not expected to failPaolo Bonzini
KVM_CREATE_IRQCHIP should never fail, and so should its userspace wrapper kvm_irqchip_create. The function does not do anything if the irqchip capability is not available, as is the case for PPC. With this patch, kvm_arch_init can allocate memory and it will not be leaked. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06kvm-all: add support for multiple address spacesPaolo Bonzini
Make kvm_memory_listener_register public, and assign a kernel address space id to each KVMMemoryListener. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06kvm-all: make KVM's memory listener more genericPaolo Bonzini
No semantic change, but s->slots moves into a new struct KVMMemoryListener. KVM's memory listener becomes a member of struct KVMState, and becomes of type KVMMemoryListener. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06kvm-all: move internal types to kvm_int.hPaolo Bonzini
i386 code will have to define a different KVMMemoryListener. Create an internal header so that KVMSlot is not exposed outside. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06kvm-all: remove useless typedefPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06kvm-all: put kvm_mem_flags to more workAndrew Jones
Currently kvm_mem_flags just translates bools to bits, let's make it also determine the bools first. This avoids its parameter list growing each time we add a flag. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01kvm: Switch to unlocked MMIOPaolo Bonzini
Do not take the BQL before dispatching MMIO requests of KVM VCPUs. Instead, address_space_rw will do it if necessary. This enables completely BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained locking. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1434646046-27150-10-git-send-email-pbonzini@redhat.com>
2015-07-01kvm: Switch to unlocked PIOJan Kiszka
Do not take the BQL before dispatching PIO requests of KVM VCPUs. Instead, address_space_rw will do it if necessary. This enables completely BQL-free PIO handling in KVM mode for upcoming devices with fine-grained locking. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1434646046-27150-8-git-send-email-pbonzini@redhat.com>
2015-07-01kvm: First step to push iothread lock out of inner run loopJan Kiszka
This opens the path to get rid of the iothread lock on vmexits in KVM mode. On x86, the in-kernel irqchips has to be used because we otherwise need to synchronize APIC and other per-cpu state accesses that could be changed concurrently. Regarding pre/post-run callbacks, s390x and ARM should be fine without specific locking as the callbacks are empty. MIPS and POWER require locking for the pre-run callback. For the handle_exit callback, it is non-empty in x86, POWER and s390. Some POWER cases could do without the locking, but it is left in place for now. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
2015-07-01Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES马文霜
Last month, we experienced several guests crash(6cores-8cores), qemu logs display the following messages: qemu-system-x86_64: /build/qemu-2.1.2/kvm-all.c:976: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. After analysis and verification, we can confirm it's irq-balance daemon(in guest) leads to the assertion failure. Start a 8 core guest with two disks, execute the following scripts will reproduce the BUG quickly: irq_affinity.sh ======================================================================== vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq > /proc/irq/$vda_irq_num/smp_affinity echo $irq > /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done ======================================================================== QEMU setup static irq route entries in kvm_pc_setup_irq_routing(), PIC and IOAPIC share the first 15 GSI numbers, take up 23 GSI numbers, but take up 38 irq route entries. When change irq smp_affinity in guest, a dynamic route entry may be setup, the current logic is: if allocate GSI number succeeds, a new route entry can be added. The available dynamic GSI numbers is 1021(KVM_MAX_IRQ_ROUTES-23), but available irq route entries is only 986(KVM_MAX_IRQ_ROUTES-38), GSI numbers greater than route entries. irq-balance's behavior will eventually leads to total irq route entries exceed KVM_MAX_IRQ_ROUTES, ioctl(KVM_SET_GSI_ROUTING) fail and kvm_irqchip_commit_routes() trigger assertion failure. This patch fix the BUG. Signed-off-by: Wenshuang Ma <kevinnma@tencent.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05kvm: remove special handling of DIRTY_MEMORY_MIGRATION in the dirty log maskPaolo Bonzini
One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02). Another performance problem is that KVM keeps tracking dirty pages after a failed live migration, which causes bad performance due to disallowing huge page mapping. Thanks to the previous patch, KVM can now stop hooking into log_global_start/stop. This simplifies the KVM code noticeably. Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reported-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05kvm: accept non-mapped memory in kvm_dirty_pages_log_changePaolo Bonzini
It is okay if memory is not mapped into the guest but has dirty logging enabled. When this happens, KVM will not do anything and only accesses from the host will be logged. This can be triggered by iofuzz. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05memory: prepare for multiple bits in the dirty log maskPaolo Bonzini
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA, some listeners may be interested in the overall zero/non-zero value of the dirty log mask; others may be interested in the value of single bits. For this reason, always call log_start/log_stop if bits have respectively appeared or disappeared, and pass the old and new values of the dirty log mask so that listeners can distinguish the kinds of change. For example, KVM checks if dirty logging used to be completely disabled (in log_start) or is now completely disabled (in log_stop). On the other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed, since that is the only bit it cares about. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05memory: differentiate memory_region_is_logging and ↵Paolo Bonzini
memory_region_get_dirty_log_mask For now memory regions only track DIRTY_MEMORY_VGA individually, but this will change soon. To support this, split memory_region_is_logging in two functions: one that returns a given bit from dirty_log_mask, and one that returns the entire mask. memory_region_is_logging gets an extra parameter so that the compiler flags misuse. While VGA-specific users (including the Xen listener!) will want to keep checking that bit, KVM and vhost check for "any bit except migration" (because migration is handled via the global start/stop listener callbacks). Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05ppc: add helpful message when KVM fails to start VCPULaurent Vivier
On POWER8 systems, KVM checks if VCPU is running on primary threads, and that secondary threads are offline. If this is not the case, ioctl() fails with errno set to EBUSY. QEMU aborts with a non explicit error message: $ ./qemu-system-ppc64 --nographic -machine pseries,accel=kvm error: kvm run failed Device or resource busy To help user to diagnose the problem, this patch adds an informative error message. There is no easy way to check if SMT is enabled before starting the VCPU, and as this case is the only one setting errno to EBUSY, we just check the errno value to display a message. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <1431976007-20503-1-git-send-email-lvivier@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-02kvm: introduce kvm_arch_msi_data_to_gsiEric Auger
On ARM the MSI data corresponds to the shared peripheral interrupt (SPI) ID. This latter equals to the SPI index + 32. to retrieve the SPI index, matching the gsi, an architecture specific function is introduced. Signed-off-by: Eric Auger <eric.auger@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-30kvm: add support for memory transaction attributesPaolo Bonzini
Let kvm_arch_post_run convert fields in the kvm_run struct to MemTxAttrs. These are then passed to address_space_rw. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-30kvm: Silence warning from valgrindThomas Huth
valgrind complains here about uninitialized bytes with the following message: ==17814== Syscall param ioctl(generic) points to uninitialised byte(s) ==17814== at 0x466A780: ioctl (in /usr/lib64/power8/libc-2.17.so) ==17814== by 0x100735B7: kvm_vm_ioctl (kvm-all.c:1920) ==17814== by 0x10074583: kvm_set_ioeventfd_mmio (kvm-all.c:574) Let's fix it by using a proper struct initializer in kvm_set_ioeventfd_mmio(). Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1430153944-24368-1-git-send-email-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-30kvm: better advice for failed s390x startupCornelia Huck
If KVM_CREATE failed on s390x, we print a hint to enable the switch_amode kernel parameter. This only applies to old kernels, and only if the error was -EINVAL. Moreover, with new kernels, the most likely reason for -EINVAL is that pgstes were not enabled. Let's update the error message to give a better hint on where things may need fixing. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-04-28Convert ffs() != 0 callers to ctz32()Stefan Hajnoczi
There are a number of ffs(3) callers that do roughly: bit = ffs(val); if (bit) { do_something(bit - 1); } This pattern can be converted to ctz32() like this: zeroes = ctz32(val); if (zeroes != 32) { do_something(zeroes); } Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427124571-28598-6-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-26exec.c: Make address_space_rw take transaction attributesPeter Maydell
Make address_space_rw take transaction attributes, rather than always using the 'unspecified' attributes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-02kvm-all: Sync dirty-bitmap from kvm before kvm destroy the corresponding ↵zhanghailiang
dirty_bitmap Sometimes, we destroy the dirty_bitmap in kvm_memory_slot before any sync action occur, this bit in dirty_bitmap will be missed, and which will lead the corresponding dirty pages to be missed in migration. This usually happens when do migration during VM's Start-up or Reboot. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> [Use s->migration_log instead of exec.c's in_migration. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-18kvm: fix ioeventfd endianness on bi-endian architecturesGreg Kurz
KVM expects host endian values. Hosts that don't use the default endianness need to negate the swap performed in adjust_endianness(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Message-Id: <20150313212337.31142.3991.stgit@bahia.local> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-16kvm: encapsulate HAS_DEVICE for vm attrsDominik Dingel
More and more virtual machine specifics between kvm and qemu will be transferred with vm attributes. So we encapsulate the common logic in a generic function. Additionally we need only to check during initialization if kvm supports virtual machine attributes. Cc: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Message-Id: <1426164834-38648-2-git-send-email-jfrei@linux.vnet.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-03-11kvm: add machine state to kvm_arch_initMarcel Apfelbaum
Needed to query machine's properties. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-11machine: query kernel-irqchip propertyMarcel Apfelbaum
Running x86_64-softmmu/qemu-system-x86_64 -machine pc,kernel_irqchip=on -enable-kvm leads to crash: qemu-system-x86_64: qemu/util/qemu-option.c:387: qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed. Aborted (core dumped) This happens because the commit e79d5a6 ("machine: remove qemu_machine_opts global list") removed the global option descriptions and moved them to MachineState's QOM properties. Fix this by querying machine properties through designated wrappers. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-03-10fix GCC 5.0.0 logical-not-parentheses warningsRadim Krčmář
man gcc: Warn about logical not used on the left hand side operand of a comparison. This option does not warn if the RHS operand is of a boolean type. By preferring bool over int where sensible, but without modifying any depending code, make GCC happy in cases like this, qemu-img.c: In function ‘compare_sectors’: qemu-img.c:992:39: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!!memcmp(buf1, buf2, 512) != res) { hw/ide/core.c:1836 doesn't throw an error, assert(!!s->error == !!(s->status & ERR_STAT)); even thought the second operand is int (and first hunk of this patch has a very similar case), maybe GCC developers still have a little faith in C programmers. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-02-10kvm: g_malloc() can't fail, bury dead error handlingMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-01-12kvm: extend kvm_irqchip_add_msi_route to work on s390Frank Blaschka
on s390 MSI-X irqs are presented as thin or adapter interrupts for this we have to reorganize the routing entry to contain valid information for the adapter interrupt code on s390. To minimize impact on existing code we introduce an architecture function to fixup the routing entry. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-15coverity/s390x: avoid false positive in kvm_irqchip_add_adapter_routeChristian Borntraeger
Paolo Bonzini reported that Coverity reports an uninitialized pad value. Let's use a designated initializer for kvm_irq_routing_entry to avoid this false positive. This is similar to kvm_irqchip_add_msi_route and other users of kvm_irq_routing_entry. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15valgrind: avoid false positives in KVM_GET_DIRTY_LOG ioctlChristian Borntraeger
struct kvm_dirty_log contains padding fields that trigger false positives in valgrind. Let's use a designated initializer to avoid false positives from valgrind/memcheck. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checksEric Auger
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension. Remove direct settings in architecture specific files. Add a new kvm_resamplefds_allowed variable, initialized by checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding kvm_resamplefds_enabled() function. A special notice for s390 where KVM_CAP_IRQFD was not immediatly advirtised when irqfd capability was introduced in the kernel. KVM_CAP_IRQ_ROUTING was advertised instead. This was fixed in "KVM: s390: announce irqfd capability", ebc3226202d5956a5963185222982d435378b899 whereas irqfd support was brought in 84223598778ba08041f4297fda485df83414d57e, "KVM: s390: irq routing for adapter interrupts". Both commits first appear in 3.15 so there should not be any kernel version impacted by this QEMU modification. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23pc: kvm: check if KVM has free memory slots to avoid abort()Igor Mammedov
When more memory devices are used than available KVM memory slots, QEMU crashes with: kvm_alloc_slot: no free slot available Aborted (core dumped) Fix this by checking that KVM has a free slot before attempting to map memory in guest address space. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-20kvm: Fix memory slot page alignment logicAlexander Graf
Memory slots have to be page aligned to get entered into KVM. There is existing logic that tries to ensure that we pad memory slots that are not page aligned to the biggest region that would still fit in the alignment requirements. Unfortunately, that logic is broken. It tries to calculate the start offset based on the region size. Fix up the logic to do the thing it was intended to do and document it properly in the comment above it. With this patch applied, I can successfully run an e500 guest with more than 3GB RAM (at which point RAM starts overlapping subpage memory regions). Cc: qemu-stable@nongnu.org Signed-off-by: Alexander Graf <agraf@suse.de>
2014-10-10kvm fix compilation with GCC 4.3.4Paolo Bonzini
As usual, SLES11's GCC complained about double typedefs: /home/cohuck/git/qemu/kvm-all.c:110: error: redefinition of typedef ‘KVMState’ /home/cohuck/git/qemu/include/sysemu/kvm.h:161: error: previous declaration of ‘KVMState’ was here Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-09kvm: Make KVMState be the TYPE_KVM_ACCEL instance structEduardo Habkost
Now that we create an accel object before calling machine_init, we can simply use the accel object to save all KVMState data, instead of allocationg KVMState manually. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09accel: Pass MachineState object to accel init functionsEduardo Habkost
Most of the machine options and machine state information is in the MachineState object, not on the MachineClass. This will allow init functions to use the MachineState object directly instead of qemu_get_machine_opts() or the current_machine global. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04accel: Rename 'init' method to 'init_machine'Eduardo Habkost
Today, all accelerator init functions affect some global state: * tcg_init() calls tcg_exec_init() and affects globals such as tcg_tcx, page size globals, and possibly others; * kvm_init() changes the kvm_state global, cpu_interrupt_handler, and possibly others; * xen_init() changes the xen_xc global, and registers a change state handler. With the new accelerator QOM classes, initialization may now be split in two steps: * instance_init() will do basic initialization that doesn't affect any global state and don't need MachineState or MachineClass data. This will allow probing code to safely create multiple accelerator objects on the fly just for reporting host/accelerator capabilities, for example. * accel_init_machine()/init_machine() will save the accelerator object in MachineState, and do initialization steps which still affect global state, machine state, or that need data from MachineClass or MachineState. To clarify the difference between those two steps, rename init() to init_machine(). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04accel: Move KVM accel registration to kvm-all.cEduardo Habkost
Note that this has an user-visible side-effect: instead of reporting "KVM is not supported for this target", QEMU binaries not supporting KVM will report "kvm accelerator does not exist". As kvm_availble() always return 1 when CONFIG_KVM is enabled, we don't need to set AccelClass.available anymore. kvm_enabled() is not being completely removed yet only because qmp_query_kvm() still uses it. This also allows us to make kvm_init() static. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-26kvm/valgrind: don't mark memory as initializedChristian Borntraeger
since commit 7dda5dc82a77 ("migration: initialize RAM to zero") the guest memory is defined zero. No need to call valgrind on guest memory. This reverts commit 62fe83318d2f ("qemu: Use valgrind annotations to mark kvm guest memory as defined") thus speeding up kvm start if <includedir>/valgrind/valgrind.h is available. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16Introduce cpu_clean_all_dirtyMarcelo Tosatti
Introduce cpu_clean_all_dirty, to force subsequent cpu_synchronize_all_states to read in-kernel register state. Cc: qemu-stable@nongnu.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-12Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
- Memory: improve error reporting and avoid crashes on hotplug - Build: fixing block/iscsi.so and ranlib warnings on Mac OS X - Migration fixes for x86 - The odd KVM patch. # gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2 # gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>" # gpg: aka "Paolo Bonzini <bonzini@gnu.org>" * remotes/bonzini/tags/for-upstream: (21 commits) gdbstub: init mon_chr through qemu_chr_alloc pckbd: adding new fields to vmstate mc146818rtc: add missed field to vmstate piix: do not set irq while loading vmstate serial: fixing vmstate for save/restore parallel: adding vmstate for save/restore fdc: adding vmstate for save/restore cpu: init vmstate for ticks and clock offset apic_common: vapic_paddr synchronization fix vl: use QLIST_FOREACH_SAFE to visit change state handlers exec: add parameter errp to gethugepagesize exec: report error when memory < hpagesize hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big memory: add parameter errp to memory_region_init_rom_device memory: add parameter errp to memory_region_init_ram exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr rules.mak: Fix DSO build by pulling in archive symbols util: Don't link host-utils.o if it's empty util: Move general qemu_getauxval to util/getauxval.c trace: Only link generated-tracers.o with "simple" backend ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-09kvm: do not abort if KVM_RUN failsPaolo Bonzini
Just go to the internal error runstate. This lets you use the "x", "dump-guest-memory" or "info register" commands. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>