slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2016-10-26	target-i386: remove helper_lock()	Emilio G. Cota
	It's been superseded by the atomic helpers. The use of the atomic helpers provides a significant performance and scalability improvement. Below is the result of running the atomic_add-test microbenchmark with: $ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n , where $n is the number of threads and $r is the allowed range for the additions. The scenarios measured are: - atomic: implements x86' ADDL with the atomic_add helper (i.e. this patchset) - cmpxchg: implement x86' ADDL with a TCG loop using the cmpxchg helper - master: before this patchset Results sorted in ascending range, i.e. descending degree of contention. Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64 Opteron 6376 cores. atomic_add-bench: 5000000 ops/thread, [0,1] range 25 ++---------+----------+---------+----------+----------+----------+---++ + atomic +-E--+ + + + + + \| \|cmpxchg +-H--+ \| 20 +Emaster +-N--+ ++ \|\| \| \|++ \| \|\| \| 15 +++ ++ \|N\| \| \|+\| \| 10 ++\| ++ \|+\|+ \| \| \| -+E+------ +++ ---+E+------+E+------+E+-----+E+------+E\| \|+E+E+- +++ +E+------+E+-- \| 5 ++\|+ ++ \|+N+H+--- +++ \| ++++N+--+H++----+++ + +++ --++H+------+H+------+H++----+H+---+--- \| 0 ++---------+-----H----+---H-----+----------+----------+----------+---H+ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,2] range 25 ++---------+----------+---------+----------+----------+----------+---++ ++atomic +-E--+ + + + + + \| \|cmpxchg +-H--+ \| 20 ++master +-N--+ ++ \|E\| \| \|++ \| \|\|E \| 15 ++\| ++ \|N\|\| \| \|+\|\| ---+E+------+E+-----+E+------+E\| 10 ++\| \| ---+E+------+E+-----+E+--- +++ +++ \|\|H+E+--+E+-- \| \|+++++ \| \| \|\| \| 5 ++\|+H+-- +++ ++ \|+N+ - ---+H+------+H+------ \| + +N+--+H++----+H+---+--+H+----++H+--- + + +H+---+--+H\| 0 ++---------+----------+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,8] range 40 ++---------+----------+---------+----------+----------+----------+---++ ++atomic +-E--+ + + + + + \| 35 +cmpxchg +-H--+ ++ \| master +-N--+ ---+E+------+E+------+E+-----+E+------+E\| 30 ++\| ---+E+-- +++ ++ \| \| -+E+--- \| 25 ++E ---- +++ ++ \|+++++ -+E+ \| 20 +E+ E-- +++ ++ \|H\|+++ \| \|+\| +H+------- \| 15 ++H+ ---+++ +H+------ ++ \|N++H+-- +++--- +H+------++\| 10 ++ +++ - +++ ---+H+ +++ +H+ \| \| +H+-----+H+------+H+-- \| 5 ++\| +++ ++ ++N+N+--+N++ + + + + + \| 0 ++---------+----------+---------+----------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,128] range 160 ++---------+---------+----------+---------+----------+----------+---++ + atomic +-E--+ + + + + + \| 140 +cmpxchg +-H--+ +++ +++ ++ \| master +-N--+ E--------E------+E+------++\| 120 ++ --\| \| +++ E+ \| -- +++ +++ ++\| 100 ++ - ++ \| +++- +++ ++\| 80 ++ -+E+ -+H+------+H+------H--------++ \| ---- ---- +++ H\| \| ---+E+-----+E+- ---+H+ ++\| 60 ++ +E+--- +++ ---+H+--- ++ \| --+++ ---+H+-- \| 40 ++ +E+-+H+--- ++ \| +H+ \| 20 +EE+ ++ +N+ + + + + + + \| 0 ++N-N---N--+---------+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads atomic_add-bench: 5000000 ops/thread, [0,1024] range 350 ++---------+---------+----------+---------+----------+----------+---++ + atomic +-E--+ + + + + + \| 300 +cmpxchg +-H--+ +++ \| master +-N--+ +++ \|\| \| +++ \| ----E\| 250 ++ \| ----E---- ++ \| ----E--- \| ---+H\| 200 ++ -+E+--- +++ ---+H+--- ++ \| ---- -+H+-- \| \| +E+ +++ ---- +++ \| 150 ++ ---+++ ---+H+- ++ \| --- -+H+-- \| 100 ++ ---+E+ ---- +++ ++ \| +++ ---+E+-----+H+- \| \| -+E+------+H+-- \| 50 ++ +E+ ++ +EE+ + + + + + + \| 0 ++N-N---N--+---------+----------+---------+----------+----------+---++ 0 10 20 30 40 50 60 Number of threads hi-res: http://imgur.com/a/fMRmq For master I stopped measuring master after 8 threads, because there is little point in measuring the well-known performance collapse of a contended lock. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26	target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers	Emilio G. Cota
	The diff here is uglier than necessary. All this does is to turn FOO into: if (s->prefix & PREFIX_LOCK) { BAR } else { FOO } where FOO is the original implementation of an unlocked cmpxchg. [rth: Adjust unlocked cmpxchg to use movcond instead of branches. Adjust helpers to use atomic helpers.] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-24	target-i386: implement PKE for TCG	Paolo Bonzini
	Tested with kvm-unit-tests. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-15	target-i386: Implement FSGSBASE	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15	target-i386: Clear bndregs during legacy near jumps	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15	target-i386: Implement BNDLDX, BNDSTX	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15	target-i386: Implement BNDCL, BNDCU, BNDCN	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13	target-i386: Perform set/reset_inhibit_irq inline	Richard Henderson
	With helpers that can be reused for other things. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13	target-i386: Implement XSAVEOPT	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13	target-i386: Add XSAVE extension	Richard Henderson
	This includes XSAVE, XRSTOR, XGETBV, XSETBV, which are all related, as well as the associate cpuid bits. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13	target-i386: Split fxsave/fxrstor implementation	Richard Henderson
	We will be able to reuse these pieces for XSAVE/XRSTOR. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09	target-i386: Rewrite gen_enter inline	Richard Henderson
	Use gen_lea_v_seg for centralized segment base knowledge. Unify code across 32- and 64-bit. Fix note about "must save state" before using the out-of-line helpers. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1450379966-28198-8-git-send-email-rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-23	target-i386: Check CR4[DE] for processing DR4/DR5	Richard Henderson
	Introduce helper_get_dr so that we don't have to put CR4[DE] into the scarce HFLAGS resource. At the same time, rename helper_movl_drN_T0 to helper_set_dr and set the helper flags. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-10-23	target-i386: Handle I/O breakpoints	Eduardo Habkost
	Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-09-15	target-i386: exception handling for seg_helper functions	Pavel Dovgalyuk
	This patch fixes exception handling for seg_helper functions. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-06-05	target-i386: Use correct memory attributes for ioport accesses	Paolo Bonzini
	In order to do this, stop using the cpu_in/out helpers, and instead access address_space_io directly. cpu_in* and cpu_out* remain for usage in the monitor, in qtest, and in Xen. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-28	tcg: Invert the inclusion of helper.h	Richard Henderson
	Rather than include helper.h with N values of GEN_HELPER, include a secondary file that sets up the macros to include helper.h. This minimizes the files that must be rebuilt when changing the macros for file N. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-21	target-i386: yield to another VCPU on PAUSE	Paolo Bonzini
	After commit b1bbfe7 (aio / timers: On timer modification, qemu_notify or aio_notify, 2013-08-21) FreeBSD guests report a huge slowdown. The problem shows up as soon as FreeBSD turns out its periodic (~1 ms) tick, but the timers are only the trigger for a pre-existing problem. Before the offending patch, setting a timer did a timer_settime system call. After, setting the timer exits the event loop (which uses poll) and reenters it with a new deadline. This does not cause any slowdown; the difference is between one system call (timer_settime and a signal delivery (SIGALRM) before the patch, and two system calls afterwards (write to a pipe or eventfd + calling poll again when re-entering the event loop). Unfortunately, the exit/enter causes the main loop to grab the iothread lock, which in turns kicks the VCPU thread out of execution. This causes TCG to execute the next VCPU in its round-robin scheduling of VCPUS. When the second VCPU is mostly unused, FreeBSD runs a "pause" instruction in its idle loop which only burns cycles without any progress. As soon as the timer tick expires, the first VCPU runs the interrupt handler but very soon it sets it again---and QEMU then goes back doing nothing in the second VCPU. The fix is to make the pause instruction do "cpu_loop_exit". Cc: Richard Henderson <rth@twiddle.net> Reported-by: Luigi Rizzo <rizzo@iet.unipi.it> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1384948442-24217-1-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-02-27	target-i386: Use mulu2 and muls2	Richard Henderson
	These correspond very closely to the insns that we're emulating. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-19	target-i386: Implement tzcnt and fix lzcnt	Richard Henderson
	We weren't computing flags for lzcnt at all. At the same time, adjust the implementation of bsf/bsr to avoid the local branch, using movcond instead. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19	target-i386: Use clz/ctz for bsf/bsr helpers	Richard Henderson
	And mark the helpers as NO_RWG_SE. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18	target-i386: Implement PDEP, PEXT	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18	target-i386: Implement MULX	Richard Henderson
	Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18	target-i386: Use CC_SRC2 for ADC and SBB	Richard Henderson
	Add another slot in ENV and store two of the three inputs. This lets us do less work when carry-out is not needed, and avoids the unpredictable CC_OP after translating these insns. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18	target-i386: Make helper_cc_compute_{all,c} const	Richard Henderson
	Pass the data in explicitly, rather than indirectly via env. This avoids all sorts of unnecessary register spillage. Signed-off-by: Richard Henderson <rth@twiddle.net>
2012-12-19	exec: move include files to include/exec/	Paolo Bonzini
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-28	target-i386: rename helper flags	Aurelien Jarno
	Rename helper flags to the new ones. This is purely a mechanical change, it's possible to use better flags by looking at the helpers. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-01	x86: Implement SMEP and SMAP	H. Peter Anvin
	This patch implements Supervisor Mode Execution Prevention (SMEP) and Supervisor Mode Access Prevention (SMAP) for x86. The purpose of the patch, obviously, is to help kernel developers debug the support for those features. A fair bit of the code relates to the handling of CPUID features. The CPUID code probably would get greatly simplified if all the feature bit words were unified into a single vector object, but in the interest of producing a minimal patch for SMEP/SMAP, and because I had very limited time for this project, I followed the existing style. [ v2: don't change the definition of the qemu64 CPU shorthand, since that breaks loading old snapshots. Per Anthony Liguori this can be fixed once the CPU feature set is snapshot. Change the coding style slightly to conform to checkpatch.pl. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-14	x86: switch to AREG0 free mode	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Remove temporary wrappers and switch to AREG0 free mode. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 in segmentation helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Rename remains of op_helper.c to seg_helper.c. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for misc helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for SMM helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for SVM helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for integer helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for condition code helpers	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-14	x86: avoid AREG0 for FPU helpers	Blue Swirl
	Make FPU helpers take a parameter for CPUState instead of relying on global env. Introduce temporary wrappers for FPU load and store ops. Remove wrappers for non-AREG0 code. Don't call unconverted helpers directly. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-06-28	x86: avoid AREG0 for exceptions	Blue Swirl
	Add an explicit CPUX86State parameter instead of relying on AREG0. Merge raise_exception_env() to raise_exception(), likewise with raise_exception_err_env() and raise_exception_err(). Introduce cpu_svm_check_intercept_param() and cpu_vmexit() as wrappers. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-01-11	target-i386: fix SSE rounding and flush to zero	Aurelien Jarno
	SSE rounding and flush to zero control has never been implemented. However given that softfloat-native was using a single state for FPU and SSE and given that glibc is setting both FPU and SSE state in fesetround(), this was working correctly up to the switch to softfloat. Fix that by adding an update_sse_status() function similar to update_fpu_status(), and callin git on write to mxcsr. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-10-23	target-i386: implement lzcnt emulation	Andre Przywara
	lzcnt is a AMD Phenom/Barcelona added instruction returning the number of leading zero bits in a word. As this is similar to the "bsr" instruction, reuse the existing code. There need to be some more changes, though, as lzcnt always returns a valid value (in opposite to bsr, which has a special case when the operand is 0). lzcnt is guarded by the ABM CPUID bit (Fn8000_0001:ECX_5). Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-10-04	target-i386: add RDTSCP support	Andre Przywara
	RDTSCP reads the time stamp counter and atomically also the content of a 32-bit MSR, which can be freely set by the OS. This allows CPU local data to be queried by userspace. Linux uses this to allow a fast implementation of the getcpu() syscall, which uses the vsyscall page to avoid a context switch. AMD CPUs since K8RevF and Intel CPUs since Nehalem support this instruction. RDTSCP is guarded by the RDTSCP CPUID bit (Fn8000_0001:EDX[27]). Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-05-22	x86: Add support for resume flag	Jan Kiszka
	Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2008-11-17	TCG variable type checking.	pbrook
	Signed-off-by: Paul Brook <paul@codesourcery.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5729 c046a42c-6fe2-441c-8c8c-71466251a162
2008-09-25	SYSENTER/SYSEXIT IA-32e implementation (Alexander Graf).	balrog
	On Intel CPUs, sysenter and sysexit are valid in 64-bit mode. This patch makes both 64-bit aware and enables them for Intel CPUs. Add cpu save/load for 64-bit wide sysenter variables. Signed-off-by: Alexander Graf <agraf@suse.de> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5318 c046a42c-6fe2-441c-8c8c-71466251a162
2008-08-30	Fix some warnings that would be generated by gcc -Wredundant-decls	blueswir1
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5115 c046a42c-6fe2-441c-8c8c-71466251a162
2008-06-18	HLT, MWAIT and MONITOR insn fixes (initial patch by Alexander Graf)	bellard
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4746 c046a42c-6fe2-441c-8c8c-71466251a162
2008-06-04	reworked SVM interrupt handling logic - fixed vmrun EIP saved value - ↵	bellard
	reworked cr8 handling - added CPUState.hflags2 git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4662 c046a42c-6fe2-441c-8c8c-71466251a162
2008-06-04	32 bit SVM fixes - INVLPG and INVLPGA updates	bellard
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4660 c046a42c-6fe2-441c-8c8c-71466251a162
2008-05-28	SVM rework	bellard
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4605 c046a42c-6fe2-441c-8c8c-71466251a162
2008-05-22	proper helper definition registering (all targets must do that)	bellard
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4530 c046a42c-6fe2-441c-8c8c-71466251a162
2008-05-22	cmpxchg8b fix - added cmpxchg16b	bellard
	git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4522 c046a42c-6fe2-441c-8c8c-71466251a162