path: root/cpus.c
Age | Commit message | Author
2012-08-04 | Fixes related to processing of qemu's -numa option | Chegu Vinod
The -numa option to qemu is used to create [fake] NUMA nodes and expose them to the guest OS instance.

There are a couple of issues with the -numa option:

a) The maximum number of VCPUs that can be specified for a guest while using qemu's -numa option is 64. Due to a typecasting issue, when the number of VCPUs is > 32 the VCPUs don't show up under the specified [fake] NUMA nodes.

b) KVM currently supports 160 VCPUs per guest, but qemu's -numa option supports only up to 64 VCPUs per guest.

This patch addresses these two issues.

Below are examples of (a) and (b).

a) >32 VCPUs are specified with the -numa option:

/usr/local/bin/qemu-system-x86_64 \
-enable-kvm \
71:01:01 \
-net tap,ifname=tap0,script=no,downscript=no \
-vnc :4
...

Upstream qemu :
--------------

QEMU 1.1.50 monitor - type 'help' for more information
(qemu) info numa
6 nodes
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 32 33 34 35 36 37 38 39 40 41
node 0 size: 131072 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 42 43 44 45 46 47 48 49 50 51
node 1 size: 131072 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29 52 53 54 55 56 57 58 59
node 2 size: 131072 MB
node 3 cpus: 30
node 3 size: 131072 MB
node 4 cpus:
node 4 size: 131072 MB
node 5 cpus: 31
node 5 size: 131072 MB

With the patch applied :
-----------------------

QEMU 1.1.50 monitor - type 'help' for more information
(qemu) info numa
6 nodes
node 0 cpus: 0 1 2 3 4 5 6 7 8 9
node 0 size: 131072 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19
node 1 size: 131072 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29
node 2 size: 131072 MB
node 3 cpus: 30 31 32 33 34 35 36 37 38 39
node 3 size: 131072 MB
node 4 cpus: 40 41 42 43 44 45 46 47 48 49
node 4 size: 131072 MB
node 5 cpus: 50 51 52 53 54 55 56 57 58 59
node 5 size: 131072 MB

b) >64 VCPUs specified with the -numa option:

/usr/local/bin/qemu-system-x86_64 \
-enable-kvm \
-cpu Westmere,+rdtscp,+pdpe1gb,+dca,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+d-vnc :4
...

Upstream qemu :
--------------

only 63 CPUs in NUMA mode supported.
only 64 CPUs in NUMA mode supported.
QEMU 1.1.50 monitor - type 'help' for more information
(qemu) info numa
8 nodes
node 0 cpus: 6 7 8 9 38 39 40 41 70 71 72 73
node 0 size: 65536 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 42 43 44 45 46 47 48 49 50 51 74 75 76 77 78 79
node 1 size: 65536 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29 52 53 54 55 56 57 58 59 60 61
node 2 size: 65536 MB
node 3 cpus: 30 62
node 3 size: 65536 MB
node 4 cpus:
node 4 size: 65536 MB
node 5 cpus:
node 5 size: 65536 MB
node 6 cpus: 31 63
node 6 size: 65536 MB
node 7 cpus: 0 1 2 3 4 5 32 33 34 35 36 37 64 65 66 67 68 69
node 7 size: 65536 MB

With the patch applied :
-----------------------

QEMU 1.1.50 monitor - type 'help' for more information
(qemu) info numa
8 nodes
node 0 cpus: 0 1 2 3 4 5 6 7 8 9
node 0 size: 65536 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19
node 1 size: 65536 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29
node 2 size: 65536 MB
node 3 cpus: 30 31 32 33 34 35 36 37 38 39
node 3 size: 65536 MB
node 4 cpus: 40 41 42 43 44 45 46 47 48 49
node 4 size: 65536 MB
node 5 cpus: 50 51 52 53 54 55 56 57 58 59
node 5 size: 65536 MB
node 6 cpus: 60 61 62 63 64 65 66 67 68 69
node 6 size: 65536 MB
node 7 cpus: 70 71 72 73 74 75 76 77 78 79

Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>, Jim Hull <jim.hull@hp.com>, Craig Hada <craig.hada@hp.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
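The wrap-around in the upstream output (CPU 32 listed next to CPU 0) is consistent with a per-node CPU mask that is narrower than the number of VCPUs. The following is a minimal sketch of one way to avoid a fixed 32- or 64-bit ceiling, using a multi-word bitmap per node. It is not taken from the patch: `MAX_CPUS`, `MAX_NODES`, `CpuMask`, `node_add_cpu_range()` and `node_has_cpu()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical limits for illustration; not taken from QEMU. */
#define MAX_CPUS  160
#define MAX_NODES 8

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS    ((MAX_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bitmap per node, wide enough to hold MAX_CPUS bits. */
typedef struct {
    unsigned long bits[MASK_WORDS];
} CpuMask;

static CpuMask node_cpus[MAX_NODES];

/* Assign the inclusive CPU range [first, last] to a node. */
static void node_add_cpu_range(int node, int first, int last)
{
    assert(node >= 0 && node < MAX_NODES);
    assert(first >= 0 && first <= last && last < MAX_CPUS);

    for (int cpu = first; cpu <= last; cpu++) {
        /* 1UL keeps the shift in unsigned long, and the shift count
         * is always smaller than the word width. */
        node_cpus[node].bits[cpu / BITS_PER_WORD] |=
            1UL << (cpu % BITS_PER_WORD);
    }
}

/* Report whether a CPU has been assigned to a node. */
static bool node_has_cpu(int node, int cpu)
{
    assert(node >= 0 && node < MAX_NODES);
    assert(cpu >= 0 && cpu < MAX_CPUS);

    return (node_cpus[node].bits[cpu / BITS_PER_WORD]
            >> (cpu % BITS_PER_WORD)) & 1UL;
}
```

With this layout, calling `node_add_cpu_range(3, 30, 39)` marks CPUs 30 to 39 on node 3 only, which matches the "with the patch applied" listing for example (a).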
2012-08-02 | cpu: Move thread_kicked to CPUState | Andreas Färber
Change field type to bool. Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-08-02 | cpu: Move thread field into CPUState | Andreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-08-02 | cpu: Move CPU_COMMON_THREAD into CPUState | Andreas Färber
CPU_COMMON_THREAD was only used for Windows, adding an hThread field to CPU_COMMON. Move the field into QOM CPUState and change its type to HANDLE, which it is assigned from. This requires Windows headers, pulled in through qemu-thread.h. Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-07-21 | cpus.c: Make all_cpu_threads_idle() static | Peter Maydell
Commit 946fb27c1 moved all the uses of all_cpu_threads_idle() into cpus.c. This means we can mark the function 'static' (again), if we shuffle it a bit earlier in the source file. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-04-12 | kvm: Drop redundant kvm_enabled from cpu_thread_is_idle | Jan Kiszka
This is now implied by kvm_irqchip_in_kernel. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-30 | qtest: add clock management | Paolo Bonzini
This patch combines qtest and -icount together to turn the vm_clock into a source that can be fully managed by the client. To this end new commands clock_step and clock_set are added. Hooking them with libqtest is left as an exercise to the reader. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-30 | qtest: add test framework | Anthony Liguori
The idea behind qtest is pretty simple. Instead of executing a CPU via TCG or KVM, rely on an external process to send events to the device model that the CPU would normally generate. qtest presents itself as an accelerator. In addition, a new option is added to establish a qtest server (-qtest) that takes a character device. This is what allows the external process to send CPU events to the device model. qtest uses a simple line based protocol to send the events. Documentation of that protocol is in qtest.c. I considered reusing the monitor for this job. Adding interrupts would be a bit difficult. In addition, logging would also be difficult. qtest has extensive logging support. All protocol commands are logged with time stamps using a new command line option (-qtest-log). Logging is important since ultimately, this is a feature for debugging. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 | Rename CPUState -> CPUArchState | Andreas Färber
Scripted conversion:

  for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
    sed -i "s/CPUState/CPUArchState/g" $file
  done

All occurrences of CPUArchState are expected to be replaced by QOM CPUState, once all targets are QOM'ified and common fields have been extracted.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2012-02-18 | Allow to use pause_all_vcpus from VCPU context | Jan Kiszka
In order to perform critical manipulations on the VM state in the context of a VCPU, specifically code patching, stopping and resuming of all VCPUs may be necessary. resume_all_vcpus is already compatible, now enable pause_all_vcpus for this use case by stopping the calling context before starting to wait for the whole gang. CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-18 | Process pending work while waiting for initial kick-off in TCG mode | Jan Kiszka
When the TCG thread has been started but the machine has not, we wait in qemu_tcg_cpu_thread_fn on tcg_halt_cond. To allow run_on_cpu already at this time, we need to process pending requests in that loop. CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-18 | Remove useless casts from cpu iterators | Jan Kiszka
CPUState::next_cpu is already CPUState *. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-18 | kvm: Set cpu_single_env only once | Jan Kiszka
As we have thread-local cpu_single_env now and KVM uses exactly one thread per VCPU, we can drop the cpu_single_env updates from the loop and initialize this variable only once during setup. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-19 | apic: Inject external NMI events via LINT1 | Jan Kiszka
On real hardware, NMI button events are injected via the LINT1 line of the APICs. E.g. kdump expects this wiring and gets upset if the per-APIC LINT1 mask is not respected, i.e. if NMIs are injected to VCPUs that should not receive them. Change the APIC emulation code to reflect this. Based on qemu-kvm patch by Lai Jiangshan. CC: Lai Jiangshan <laijs@cn.fujitsu.com> Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-12 | cleanup, Remove duplicated code | Lai Jiangshan
These two blocks of code are exactly the same, remove one. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-15 | fix win32 build | Paolo Bonzini
On Windows, cpus.c needs access to the hThread. Add a Windows-specific function to grab it. This requires changing the CPU threads to joinable. There is no substantial change because the threads run in an infinite loop. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-14 | Merge remote-tracking branch 'stefanha/trivial-patches-next' into staging | Anthony Liguori
2011-12-12 | qemu-thread: add API for joinable threads | Jan Kiszka
Split from Jan's original qemu-thread-posix.c patch. No semantic change, just introduce the new API that POSIX and Win32 implementations will conform to. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-06 | qapi: Convert inject-nmi | Luiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-12-06 | qapi: Convert pmemsave | Luiz Capitulino
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-12-06 | qapi: Convert memsave | Luiz Capitulino
Please note that the QMP command has a new 'cpu-index' parameter. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-12-06 | fix typo: delete redundant semicolon | Dong Xu Wang
Double semicolons should be single. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05 | Merge remote-tracking branch 'kwolf/for-anthony' into staging | Anthony Liguori
2011-12-05 | block: convert qemu_aio_flush() calls to bdrv_drain_all() | Stefan Hajnoczi
Many places in QEMU call qemu_aio_flush() to complete all pending asynchronous I/O. Most of these places actually want to drain all block requests but there is no block layer API to do so. This patch introduces the bdrv_drain_all() API to wait for requests across all BlockDriverStates to complete. As a bonus we perform checks after qemu_aio_wait() to ensure that requests really have finished. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-02 | fix spelling in main directory | Dong Xu Wang
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-11-07 | reenable vm_clock when resuming all vcpus | Wen Congyang
We disable vm_clock when pausing all vcpus, but we forget to reenable it when resuming all vcpus. As a result, the guest cannot be rebooted. Tested-by: Zhi Yong Wu <zwu.kernel@gmai.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-01 | Simplify cpu_exec_all to tcg_exec_all | Jan Kiszka
After the removal of the non-threaded mode, cpu_exec_all is now only used by TCG. Refactor it accordingly, also dropping its unused return value. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-10-27 | qapi: Convert query-cpus | Luiz Capitulino
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-21 | simplify main loop functions | Paolo Bonzini
Provide a clean example of how to use the main loop in the tools. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-10-21 | main-loop: create main-loop.c | Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-10-21 | main-loop: create main-loop.h | Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-10-21 | qemu-timer: do not refer to runstate_is_running() | Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-10-21 | qemu-timer: move icount to cpus.c | Paolo Bonzini
None of this is needed by tools, and most of it can even be made static inside cpus.c. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-10-19 | runstate: Allow user to migrate twice | Luiz Capitulino
It should be a matter of allowing the transition POSTMIGRATE -> FINISH_MIGRATE, but it turns out that the VM won't do the transition the second time because it's already stopped. So this commit also adds vm_stop_force_state() which performs the transition even if the VM is already stopped. While there also allow other states to migrate. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
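The difference between a plain stop and the forced variant described above can be sketched as follows. This is an illustration under stated assumptions, not QEMU's code: the `Mig*` and `mig_*` names are hypothetical, and the real vm_stop_force_state() may do more than record a state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of states for illustration only. */
typedef enum {
    MIG_RUNNING,
    MIG_POSTMIGRATE,
    MIG_FINISH_MIGRATE,
} MigState;

static MigState mig_state = MIG_RUNNING;

static bool mig_is_running(void)
{
    return mig_state == MIG_RUNNING;
}

/* Plain stop: does nothing if the VM is already stopped. */
static void mig_vm_stop(MigState target)
{
    assert(target != MIG_RUNNING);
    if (mig_is_running()) {
        mig_state = target;
    }
}

/* Forced variant: records the requested state even when already stopped. */
static void mig_vm_stop_force_state(MigState target)
{
    assert(target != MIG_RUNNING);
    if (mig_is_running()) {
        mig_vm_stop(target);
    } else {
        mig_state = target;
    }
}
```

Starting from `MIG_POSTMIGRATE`, `mig_vm_stop(MIG_FINISH_MIGRATE)` leaves the state unchanged, while `mig_vm_stop_force_state(MIG_FINISH_MIGRATE)` performs the transition, which is the behaviour a second migration needs.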
2011-09-20 | Merge remote-tracking branch 'kwolf/for-anthony' into staging | Anthony Liguori
2011-09-20 | block: avoid SIGUSR2 | Frediano Ziglio
Now that iothread is always compiled in, sending a signal is only an additional step. This patch also avoids writing to two pipes (one from the signal and one in qemu_service_io). Works with kvm enabled or disabled. strace output is more readable (fewer syscalls). [ kwolf: Merged build fix by Paolo Bonzini ] Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-15 | Drop the vm_running global variable | Luiz Capitulino
Use runstate_is_running() instead, which is introduced by this commit. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 | RunState: Add additional states | Luiz Capitulino
Currently, only vm_start() and vm_stop() change the VM state. That is, the state is only changed when starting or stopping the VM. This commit adds the runstate_set() function, which makes it possible to also do state transitions when the VM is stopped or running. Additional states are also added and the current state is stored. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-09-15 | Replace the VMSTOP macros with a proper state type | Luiz Capitulino
Today, when notifying a VM state change with vm_state_notify(), we pass a VMSTOP macro as the 'reason' argument. This is not ideal because the VMSTOP macros tell why qemu stopped and not exactly what the current VM state is. One example to demonstrate this problem is that vm_start() calls vm_state_notify() with reason=0, which turns out to be VMSTOP_USER. This commit fixes that by replacing the VMSTOP macros with a proper state type called RunState. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
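The ambiguity described above, where a "reason" of 0 passed on start collides with the value of a stop reason, can be illustrated with a small sketch. The `DEMO_*` macros, the `DemoRunState` members and the `demo_*` functions are hypothetical and are not the identifiers introduced by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Old style (illustrative): stop reasons only, so 0 means "stopped by user". */
#define DEMO_VMSTOP_USER  0
#define DEMO_VMSTOP_DEBUG 1

/* New style (illustrative): every state, including running, has its own value. */
typedef enum {
    DEMO_STATE_RUNNING,
    DEMO_STATE_PAUSED,
    DEMO_STATE_DEBUG,
} DemoRunState;

/* What the most recent notification told its listeners. */
typedef struct {
    bool running;
    DemoRunState state;
} DemoNotification;

static DemoNotification demo_last;

static void demo_state_notify(bool running, DemoRunState state)
{
    demo_last.running = running;
    demo_last.state = state;
}

/* Starting reports an explicit "running" state rather than a bare 0. */
static void demo_vm_start(void)
{
    demo_state_notify(true, DEMO_STATE_RUNNING);
}

/* Stopping reports the state the VM is now in. */
static void demo_vm_stop(DemoRunState state)
{
    assert(state != DEMO_STATE_RUNNING);
    demo_state_notify(false, state);
}
```

With a state type, a listener can read the current state directly from the notification instead of inferring it from a stop reason that was never meant to describe a running VM.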
2011-09-02 | main: force enabling of I/O thread | Anthony Liguori
Enabling the I/O thread by default seems like an important part of declaring 1.0. Besides allowing true SMP support with KVM, the I/O thread means that the TCG VCPU doesn't have to multiplex itself with the I/O dispatch routines, which currently requires a (racy) signal-based alarm system. I know there have been concerns about performance. I think so far the ones that have come up (virtio-net) are most likely due to secondary reasons like decreased batching. I think we ought to force enabling the I/O thread early in 1.0 development and commit to resolving any lingering issues. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22 | Replace qemu_system_cond with VCPU stop mechanism | Jan Kiszka
We can express the VCPU thread wakeup with the stop mechanism, saving both qemu_system_ready and the qemu_system_cond. For KVM threads, we can just enter the main loop as long as the thread is stopped. The central TCG thread is better held back before the loop as there can be side effects of the services called even when all CPUs are stopped. Creating VCPUs in stopped state will also be required for proper CPU hotplugging support. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22 | Do not kick vcpus in TCG mode | Jan Kiszka
In TCG mode, iothread and vcpus run in lock-step. So it's pointless to send a signal from qemu_cpu_kick to the vcpu thread - if we got here, the receiver already left the vcpu loop. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 | Use glib memory allocation and free functions | Anthony Liguori
qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-07-23 | iothread: replace fair_mutex with a condition variable | Paolo Bonzini
This conveys the intention better, and scales to more than one thread contending for the mutex with the iothread (as long as all of them have a "quiescent point" like the TCG thread has). Also, on Mac OS X the fair_mutex somehow didn't work as intended and deadlocked. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-07-16 | Fix signal handling of SIG_IPI when io-thread is enabled | Alexandre Raymond
Both the signal thread (via sigwait()) and the cpu thread (via a normal signal handler) were attempting to catch SIG_IPI. This resulted in random freezes under Darwin. This patch separates SIG_IPI from the rest of the signals handled by the signal thread, because it is independently caught by the cpu thread. Signed-off-by: Alexandre Raymond <cerbere@gmail.com> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-07-16 | Fix signal handling when io-thread is disabled | Alexandre Raymond
Changes since v1:
- take pthread_sigmask() out of the ifdef as it is now common to both parts.

This fix effectively blocks, in the main thread, the signals handled by signalfd or the compatibility signal thread. This way, such signals are received synchronously in the main thread through sigfd_handler() instead of triggering the signal handler directly, asynchronously.

Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
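The effect of blocking a signal so that it is picked up synchronously can be shown with plain POSIX calls. The sketch below is not QEMU's code and does not use signalfd or sigfd_handler(); it uses sigwait() as a stand-in for the synchronous reader, and `demo_signal_deferred_when_blocked()` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>

/*
 * Block `signo` in the calling thread, raise it, and report whether it
 * stays pending (and can be collected synchronously) instead of being
 * delivered asynchronously.
 */
static bool demo_signal_deferred_when_blocked(int signo)
{
    sigset_t set, old, pending;

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0) {
        return false;
    }

    /* Without the block above, this would trigger the handler or the
     * default action right away. */
    raise(signo);

    sigemptyset(&pending);
    sigpending(&pending);
    bool deferred = sigismember(&pending, signo) == 1;

    if (deferred) {
        /* Collect the signal synchronously while it is still blocked. */
        int got = 0;
        deferred = sigwait(&set, &got) == 0 && got == signo;
    }

    /* Restore the caller's original signal mask. */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return deferred;
}
```

Calling `demo_signal_deferred_when_blocked(SIGUSR1)` from the main thread shows the pattern: the signal is held back by the mask and consumed at a point the program chooses.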
2011-06-27 | Merge remote-tracking branch 'stefanha/trivial-patches' into staging | Anthony Liguori
2011-06-26 | Remove exec-all.h include directives | Blue Swirl
Most exec-all.h include directives are now useless, remove them. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-06-24 | Do not include compatfd for WIN32 | Jan Kiszka
sigset_t, used by that header, is not available in mingw32 environments. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-20 | Fix typo in cpus.c | Alexandre Raymond
filed -> failed Signed-off-by: Alexandre Raymond <cerbere@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>