Age | Commit message (Collapse) | Author |
|
mmu access looks something like:
<check tlb>
if miss goto slow_path
<fast path>
done:
...
; end of the TB
slow_path:
<pre process>
mr r3, r27 ; move areg0 to r3
; (r3 holds the first argument for all the PPC32 ABIs)
<call mmu_helper>
b $+8
.long done
<post process>
b done
On ppc32 <call mmu_helper> is:
(SysV and Darwin)
mmu_helper is most likely not within direct branching distance from
the call site, necessitating
a. moving 32 bit offset of mmu_helper into a GPR ; 8 bytes
b. moving GPR to CTR/LR ; 4 bytes
c. (finally) branching to CTR/LR ; 4 bytes
r3 setting - 4 bytes
call - 16 bytes
dummy jump over retaddr - 4 bytes
embedded retaddr - 4 bytes
Total overhead - 28 bytes
(PowerOpen (AIX))
a. moving 32 bit offset of mmu_helper's TOC into a GPR1 ; 8 bytes
b. loading 32 bit function pointer into GPR2 ; 4 bytes
c. moving GPR2 to CTR/LR ; 4 bytes
d. loading 32 bit small area pointer into R2 ; 4 bytes
e. (finally) branching to CTR/LR ; 4 bytes
r3 setting - 4 bytes
call - 24 bytes
dummy jump over retaddr - 4 bytes
embedded retaddr - 4 bytes
Total overhead - 36 bytes
Following is done to trim the code size of slow path sections:
In tcg_target_qemu_prologue trampolines are emitted that look like this:
trampoline:
mfspr r3, LR
addi r3, 4
mtspr LR, r3 ; fixup LR to point over embedded retaddr
mr r3, r27
<jump mmu_helper> ; tail call of sorts
And slow path becomes:
slow_path:
<pre process>
<call trampoline>
.long done
<post process>
b done
call - 4 bytes (trampoline is within code gen buffer
and most likely accessible via
direct branch)
embedded retaddr - 4 bytes
Total overhead - 8 bytes
In the end the icache pressure is decreased by 20/28 bytes at the cost
of an extra jump to trampoline and adjusting LR (to skip over embedded
retaddr) once inside.
Signed-off-by: malc <av1474@comtv.ru>
|
|
Signed-off-by: malc <av1474@comtv.ru>
|
|
* 'trivial-patches' of git://github.com/stefanha/qemu:
pc: Drop redundant test for ROM memory region
exec: make some functions static
target-ppc: make some functions static
ppc: add missing static
vnc: add missing static
vl.c: add missing static
target-sparc: make do_unaligned_access static
m68k: Return semihosting errno values correctly
cadence_uart: More debug information
Conflicts:
target-m68k/m68k-semi.c
|
|
Add GETPC_EXT which is used by MMU helpers to selectively calculate the code
address of accessing guest memory when called from a qemu_ld/st optimized code
or a C function. Currently, it supports only i386 and x86-64 hosts.
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific). Replace it with a finger-friendly,
standards conformant hwaddr.
Outstanding patchsets can be fixed up with the command
git rebase -i --exec 'find -name "*.[ch]"
| xargs s/target_phys_addr_t/hwaddr/g' origin
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
|
It is used nowhere else, and the corresponding MAX_CODE_GEN_BUFFER_SIZE
also lives there.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
commit c28ae41 introduced GETPC() usage for sparc, which is currently
not defined when building with --enable-tcg-interpreter. Add sparc to
the list of targets we selectively define GETPC() for.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.
Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.
Remove unused HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
DEF_HELPER_FLAGS_5 was added some time ago without adjusting
MAX_OPC_PARAM_IARGS.
Fixing the definition becomes more important as QEMU is using
an increasing number of helper functions called with 5 arguments.
Add also a comment to avoid future problems when DEF_HELPER_FLAGS_6
will be added.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
There are no users left for previous exception handler returned from
cpu_set_debug_excp_handler. It should simplify code a little.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
|
|
If we execute linux-user code that does the following:
* A = mmap()
* execute code in A
* munmap(A)
* B = mmap(), but mmap returns the same address as A
* execute code in B
we end up executing a stale cached tb that contains translated code
from A, while we want new code from B.
This patch adds a TB flush for mmap'ed regions, before we return them,
avoiding the whole issue. It also adds a flush for munmap, so that we
don't execute stale TBs instead of getting a segfault.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Move TLB handling and softmmu code load helpers to cputlb.c,
compile only for softmmu targets.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Change the data type of tci_tb_ptr, so GETPC() returns an
uintptr_t now (like for all other TCG targets).
This completes commit 2050396801ca0c8359364d61eaadece951006057
and fixes builds with TCI.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Allow TB invalidation by its physical address, extract implementation
from the breakpoint_invalidate function.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Use uintptr_t instead of void * or unsigned long in
several op related functions, env->mem_io_pc and
GETPC() macro.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
cpu_io_recompile terminates by calling either cpu_abort or
cpu_resume_from_signal which both never return.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
cpu_resume_from_signal terminates by calling longjmp.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
QEMU host addresses must use uintptr_t to be portable for hosts with
an unusual size of long (w64).
tb_jmp_offset is an uint16_t value, therefore the local variable offset
in function tb_set_jmp_target was changed from unsigned long to uint16_t.
The type cast to long in function tb_add_jump now also uses uintptr_t.
For the bit operation used here, the signedness of the type cast does
not matter.
Some remaining unsigned long values are either only used for ARM assembler
code or will be fixed in a later patch for PPC.
v2:
Fix signature of tb_find_pc in exec.c, too (hint from Blue Swirl, thanks).
There remain lots of other long / unsigned long in exec.c which must be
replaced by uintptr_t. This will be done in a separate patch. Here
only one of these type casts is fixed.
v3:
Also fix signature of page_unprotect.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
|
|
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
A step towards eliminating io indices.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Now that all mmio goes through MemoryRegions, we can convert
io_mem_opaque to be a MemoryRegion pointer, and remove the thunks
that convert from old-style CPU{Read,Write}MemoryFunc to MemoryRegionOps.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
Its use of IO_MEM_ROM and friends will later cause #include loops; and it
is too large to merit inlining.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
The code sometimes uses range comparisons on io indexes (e.g.
index =< IO_MEM_ROM). Avoid these as they make moving to objects harder.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
Currently mmio access goes directly to the io_mem_{read,write} arrays.
In preparation for eliminating them, add indirection via a function.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
Unlike other tcg target code generators, this one does not generate
machine code for some cpu. It generates machine independent bytecode
which is interpreted later.
This allows running QEMU on any host.
Interpreted bytecode is slower than direct execution of generated
machine code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
|
|
Adding an offset to a void pointer works with gcc but is not allowed
by the current C standards. With -pedantic, gcc complains:
exec-all.h:344: error: pointer of type ‘void *’ used in arithmetic
Fix this, and also replace (unsigned long) by (uintptr_t) in the same
statement.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
None of this is needed by tools, and most of it can even be made static
inside cpus.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
GETPC() can be used even from outside of helper code. Move the macro to
a more accessible location. Avoid a compile warning from redefining it in exec.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Pass CPUState pointer to tlb_fill() instead of architecture local
cpu_single_env hacks.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
cea5f9a28faa528b6b1b117c9ab2d8828f473fef exposed bugs in unassigned memory
access handling. Fix them by always passing CPUState to the handlers.
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
The target-arm frontend's worst-case TCG ops per instr is 194 (and in
general many of the "load multiple registers" ARM instructions generate
more than 100 TCG ops). Raise MAX_OP_PER_INSTR accordingly to avoid
possible buffer overruns.
Since it doesn't make any sense for the "64 bit guest on 32 bit host"
case to have a smaller limit than the normal case, we collapse the
two cases back into each other again.
(This increase costs us about 14K in extra static buffer space and
21K of extra margin at the end of a 32MB codegen buffer.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Move functions cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h. This is
needed by later patches.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Make cpu_loop_exit() take a parameter for CPUState instead of relying
on global env.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
* 's390-next' of git://repo.or.cz/qemu/agraf:
s390x: complain when allocating ram fails
s390x: fix memory detection for guests > 64GB
s390x: change mapping base to allow guests > 2GB
s390x: Fix debugging for unknown sigp order codes
s390x: build s390x by default
s390x: remove compatibility cc field
s390x: Adjust GDB stub
s390x: translate engine for s390x CPU
s390x: Adjust internal kvm code
s390x: Implement opcode helpers
s390x: helper functions for system emulation
s390x: Shift variables in CPUState for memset(0)
s390x: keep hint on virtio managing size
s390x: make kvm exported functions conditional on kvm
s390x: s390x-linux-user support
tcg: extend max tcg opcodes when using 64-on-32bit
s390x: fix smp support for kvm
|
|
tb_invalidate_page_range() was intended to be used to invalidate an
area of a TB which the guest explicitly flushes from i-cache. However,
QEMU detects writes to code areas where TBs have been generated, so
his has never been useful.
Delete the function, adjust callers.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
When running a 64 bit guest on a 32 bit host, we tend to use more TCG ops
than on a 64 bit host. Reflect that in the reserved opcode amount constant.
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
The previous patch removed the need for parameter puc.
Is is now unused, so remove it.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
|
|
Function gen_pc_load was introduced in commit
d2856f1ad4c259e5766847c49acbb4e390731bd4.
The only reason for parameter searched_pc was
a debug statement in target-i386/translate.c.
Parameter puc was needed by target-sparc until
commit d7da2a10402f1644128b66414ca8f86bdea9ae7c.
Remove searched_pc from the debug statement and remove both
parameters from the parameter list of gen_pc_load.
As the function name gen_pc_load was also misleading,
it is now called restore_state_to_opc. This new name
was suggested by Peter Maydell, thanks.
v2: Remove last parameter, too, and rename the function.
v3: Fix [] typo in target-arm/translate.c.
Fix wrong SHA1 object name in commit message (copy+paste error).
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
|
|
This function is only used within exec.c, so no need to make it public.
Signed-off-by: Tristan Gingold <gingold@adacore.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
'extern' qualifier is useless for function declarations. Delete
them.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Most of emulated CPU have instructions aligned on 16 or 32 bits, while
on others GCC tries to align the target jump location. This means that
1/2 or 3/4 of tb_phys_hash entries are never used.
Update the hash function tb_phys_hash_func() to ignore the two lowest
bits of the address. This brings a 6% speed-up when booting a MIPS
image.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Use __builtin___clear_cache() instead of __clear_cache() to avoid having
to define the function as extern. Fix the following warning:
| In file included from qemu/cpus.c:34:
| qemu/exec-all.h: In function 'tb_set_jmp_target1':
| qemu/exec-all.h:208: error: nested extern declaration of '__clear_cache'
| make[1]: *** [cpus.o] Error 1
| make: *** [subdir-i386-softmmu] Error 2
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
To be used by next patches.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
this patch removes unused function cpu_restore_state_copy().
Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
|