aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-05-29exec: hide mr->ram_addr from qemu_get_ram_ptr usersPaolo Bonzini
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an address that is relative to the MemoryRegion. This basically means what address_space_translate returns. Because the semantics of the second parameter change, rename the function to qemu_map_ram_ptr. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29memory: split memory_region_from_host from qemu_ram_addr_from_hostPaolo Bonzini
Move the old qemu_ram_addr_from_host to memory_region_from_host and make it return an offset within the region. For qemu_ram_addr_from_host return the ram_addr_t directly, similar to what it was before commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host", 2013-07-04). Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29exec: remove ram_addr argument from qemu_ram_block_from_hostPaolo Bonzini
Of the two callers, one does not use it, and the other can compute it itself based on the other output argument (offset) and the RAMBlock. Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptrPaolo Bonzini
Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd now that a MemoryRegion knows its RAMBlock directly. Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-generic: Merge block max xfer len in INQUIRY responseFam Zheng
The rationale is similar to the above mode sense response interception: this is practically the only channel to communicate restraints from elsewhere such as host and block driver. The scsi bus we attach onto can have a larger max xfer len than what is accepted by the host file system (guarding between the host scsi LUN and QEMU), in which case the SG_IO we generate would get -EINVAL. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1464243305-10661-3-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-block: always use SG_IOPaolo Bonzini
Using pread/pwrite or io_submit has the advantage of eliminating the bounce buffer, but drops the SCSI status. This keeps the guest from seeing unit attention codes, as well as statuses such as RESERVATION CONFLICT. Because we know scsi-block operates on an SBC device we can still use the DMA helpers with SG_IO; just remember to patch the CDBs if the transfer is split into multiple segments. This means that scsi-block will always use the thread-pool unfortunately, instead of respecting aio=native. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-disk: introduce scsi_disk_req_check_errorPaolo Bonzini
Commonize all the checks for canceled requests and errors. The next patch will add another case to check for, in order to handle passthrough commands. There is no semantic change here; the only nontrivial modification is in scsi_write_do_fua, where cancellation has been checked earlier by both callers. Thus, the check is replaced with an assertion. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-disk: add need_fua_emulation to SCSIDiskClassPaolo Bonzini
scsi-block will be able to do FUA just by passing the request through to the LUN (which is also more efficient); there is no need to emulate it like we do for scsi-disk. Add a new method to distinguish this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-disk: introduce dma_readv and dma_writevPaolo Bonzini
These are replacements for blk_aio_readv and blk_aio_writev that allow customization of the data path. They reuse the DMA helpers' DMAIOFunc callback type, so that the same function can be used in either the QEMUSGList or the bounce-buffered case. This customization will be needed in the next patch to do zero-copy SG_IO on scsi-block. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi-disk: introduce a common base classPaolo Bonzini
This will be the place to add DMAIOFuncs in the next patch. There are also a couple DeviceClass members that can be moved to the abstract class's initialization function. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29xen-hvm: ignore background I/O sectionsPaul Durrant
Since Xen will correctly handle accesses to unimplemented I/O ports (by returning all 1's for reads and ignoring writes) there is no need for QEMU to register backgroud I/O sections. This patch therefore adds checks to xen_io_add/del so that sections with memory-region ops pointing at 'unassigned_io_ops' are ignored. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1462811480-16295-1-git-send-email-paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29docs/atomics: update comparison with LinuxPaolo Bonzini
Over time, some differences between QEMU and Linux atomics are getting smoothed. In particular, Linux grew atomic_fetch_or (and in general the differences regarding RMW operations were not described accurately) and smp_load_acquire/smp_store_release. Also, set_mb was renamed to smp_store_mb(). Include these changes in the documentation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29atomics: do not emit consume barrier for atomic_rcu_readEmilio G. Cota
Currently we emit a consume-load in atomic_rcu_read. Because of limitations in current compilers, this is overkill for non-Alpha hosts and it is only useful to make Thread Sanitizer work. This patch leaves the consume-load in atomic_rcu_read when compiling with Thread Sanitizer enabled, and resorts to a relaxed load + smp_read_barrier_depends otherwise. On an RMO host architecture, such as aarch64, the performance improvement of this change is easily measurable. For instance, qht-bench performs an atomic_rcu_read on every lookup. Performance before and after applying this patch: $ tests/qht-bench -d 5 -n 1 Before: 9.78 MT/s After: 10.96 MT/s Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1464120374-8950-4-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29atomics: emit an smp_read_barrier_depends() barrier only for Alpha and ↵Emilio G. Cota
Thread Sanitizer For correctness, smp_read_barrier_depends() is only required to emit a barrier on Alpha hosts. However, we are currently emitting a consume fence unconditionally, and most compilers currently treat consume and acquire fences as equivalent. Fix it by keeping the consume fence if we're compiling with Thread Sanitizer, since this might help prevent false warnings. Otherwise, only emit the barrier for Alpha hosts. Note that we still guarantee that smp_read_barrier_depends() is a compiler barrier. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1464120374-8950-3-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29docs/atomics: update atomic_read/set comparison with LinuxEmilio G. Cota
Recently Linux did a mass conversion of its atomic_read/set calls so that they at least are READ/WRITE_ONCE. See Linux's commit 62e8a325 ("atomic, arch: Audit atomic_{read,set}()"). It seems though that their documentation hasn't been updated to reflect this. The appended updates our documentation to reflect the change, which means there is effectively no difference between our atomic_read/set and the current Linux implementation. While at it, fix the statement that a barrier is implied by atomic_read/set, which is incorrect. Volatile/atomic semantics prevent transformations pertaining the variable they apply to; this, however, has no effect on surrounding statements like barriers do. For more details on this, see: https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1464120374-8950-2-git-send-email-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29bt: rewrite csrhci_write to avoid out-of-bounds writesPaolo Bonzini
The usage of INT_MAX in this function confuses Coverity. I think the defect is bogus, however there is no protection against getting more than sizeof(s->inpkt) bytes from the character device backend. Rewrite the function to only fill in as much data as needed from buf into s->inpkt. The plen variable is replaced by a simple state machine and there is no need anymore to shift contents to the beginning of s->inpkt. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29block/iscsi: avoid potential overflow of acb->task->cdbPeter Lieven
at least in the path via virtio-blk the maximum size is not restricted. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi: megasas: check 'read_queue_head' index valuePrasad J Pandit
While doing MegaRAID SAS controller command frame lookup, routine 'megasas_lookup_frame' uses 'read_queue_head' value as an index into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value within array bounds to avoid any OOB access. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi: megasas: initialise local configuration data bufferPrasad J Pandit
When reading MegaRAID SAS controller configuration via MegaRAID Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read uses an uninitialised local data buffer. Initialise this buffer to avoid stack information leakage. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi: megasas: use appropriate property buffer sizePrasad J Pandit
When setting MegaRAID SAS controller properties via MegaRAID Firmware Interface(MFI) commands, a user supplied size parameter is used to set property value. Use appropriate size value to avoid OOB access issues. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi: mptsas: infinite loop while fetching requestsPrasad J Pandit
The LSI SAS1068 Host Bus Adapter emulator in Qemu, periodically looks for requests and fetches them. A loop doing that in mptsas_fetch_requests() could run infinitely if 's->state' was not operational. Move check to avoid such a loop. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Message-Id: <1464077264-25473-1-git-send-email-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)Prasad J Pandit
Vmware Paravirtual SCSI emulation uses command descriptors to process SCSI commands. These descriptors come with their ring buffers. A guest could set the ring buffer size to an arbitrary value leading to OOB access issue. Add check to avoid it. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-stable@nongnu.org Message-Id: <1464000485-27041-1-git-send-email-ppandit@redhat.com> Reviewed-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29kvm_stat: RemovePaolo Bonzini
The source has moved to the Linux kernel tree. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29nbd: Don't trim unrequested bytesEric Blake
Similar to commit df7b97ff, we are mishandling clients that give an unaligned NBD_CMD_TRIM request, and potentially trimming bytes that occur before their request; which in turn can cause potential unintended data loss (unlikely in practice, since most clients are sane and issue aligned trim requests). However, while we fixed read and write by switching to the byte interfaces of blk_, we don't yet have a byte interface for discard. On the other hand, trim is advisory, so rounding the user's request to simply ignore the first and last unaligned sectors (or the entire request, if it is sub-sector in length) is just fine. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1464173965-9694-1-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29hw/char: QOM'ify milkymist-uart.cxiaoqiang zhao
drop the qemu_char_get_next_serial and use chardev prop instead Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-6-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29hw/char: QOM'ify lm32_uart.cxiaoqiang zhao
* Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial * Add lm32_uart_create function to create lm32 uart device Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-5-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29hw/char: QOM'ify lm32_juart.cxiaoqiang zhao
* Drop the old SysBus init function * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-4-git-send-email-zxq_yx_007@163.com> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Michael Walle <michael@walle.cc> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29hw/char: QOM'ify etraxfs_ser.cxiaoqiang zhao
* Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback * Use qdev chardev prop instead of qemu_char_get_next_serial * Add etraxfs_ser_create function to create etraxfs serial device Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-3-git-send-email-zxq_yx_007@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29hw/char: QOM'ify escc.cxiaoqiang zhao
* Drop the old SysBus init function and use instance_init * Call qemu_chr_add_handlers in the realize callback Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Message-Id: <1464158344-12266-2-git-send-email-zxq_yx_007@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29Revert "memory: Drop FlatRange.romd_mode"Paolo Bonzini
This reverts commit 5b5660adf1fdb61db14ec681b10463b8cba633f1, as it breaks the UEFI guest firmware (known as ArmVirtPkg or AAVMF) running in the "virt" machine type of "qemu-system-aarch64": Contrary to the commit message, (a->mr == b->mr) does *not* imply that (a->romd_mode == b->romd_mode): the pflash device model calls memory_region_rom_device_set_romd() -- for switching between the above modes --, and that function changes mr->romd_mode but the current AddressSpaceDispatch's FlatRange keeps the old value. Therefore region_del/region_add are not called on the KVM MemoryListener. Reported-by: Drew Jones <drjones@redhat.com> Tested-by: Drew Jones <drjones@redhat.com> Analyzed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-27Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160527' ↵Peter Maydell
into staging linux-user pull request v2 for may 2016 # gpg: Signature made Fri 27 May 2016 12:51:10 BST using RSA key ID DE3C9BC0 # gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>" # gpg: aka "Riku Voipio <riku.voipio@linaro.org>" * remotes/riku/tags/pull-linux-user-20160527: (38 commits) linux-user,target-ppc: fix use of MSR_LE linux-user/signal.c: Use s390 target space address instead of host space linux-user/signal.c: Use target address instead of host address for microblaze restorer linux-user/signal.c: Generate opcode data for restorer in setup_rt_frame linux-user: arm: Remove ARM_cpsr and similar #defines linux-user: Use direct syscalls for setuid(), etc linux-user: x86_64: Don't use 16-bit UIDs linux-user: Use g_try_malloc() in do_msgrcv() linux-user: Handle msgrcv error case correctly linux-user: Handle negative values in timespec conversion linux-user: Use safe_syscall for futex syscall linux-user: Use safe_syscall for pselect, select syscalls linux-user: Use safe_syscall for execve syscall linux-user: Use safe_syscall for wait system calls linux-user: Use safe_syscall for open and openat system calls linux-user: Use safe_syscall for read and write system calls linux-user: Provide safe_syscall for fixing races between signals and syscalls linux-user: Add debug code to exercise restarting system calls linux-user: Support for restarting system calls for Microblaze targets linux-user: Set r14 on exit from microblaze syscall ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-27linux-user,target-ppc: fix use of MSR_LELaurent Vivier
setup_frame()/setup_rt_frame()/restore_user_regs() are using MSR_LE as the similar kernel functions do: as a bitmask. But in QEMU, MSR_LE is a bit position, so change this accordingly. The previous code was doing nothing as MSR_LE is 0, and "env->msr &= ~MSR_LE" doesn't change the value of msr. And yes, a user process can change its endianness, see linux kernel commit: fab5db9 [PATCH] powerpc: Implement support for setting little-endian mode via prctl and prctl(2): PR_SET_ENDIAN, PR_GET_ENDIAN Reviewed-by: Thomas Huth <huth@tuxfamily.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user/signal.c: Use s390 target space address instead of host spaceChen Gang
The return address is in target space, so the restorer address needs to be target space, too. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-05-27linux-user/signal.c: Use target address instead of host address for ↵Chen Gang
microblaze restorer The return address is in target space, so the restorer address needs to be target space, too. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user/signal.c: Generate opcode data for restorer in setup_rt_frameChen Gang
Original implementation uses do_rt_sigreturn directly in host space, when a guest program is in unwind procedure in guest space, it will get an incorrect restore address, then causes unwind failure. Also cleanup the original incorrect indentation. Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: arm: Remove ARM_cpsr and similar #definesPeter Maydell
The #defines of ARM_cpsr and friends in linux-user/arm/target-syscall.h can clash with versions in the system headers if building on an ARM or AArch64 build (though this seems to be dependent on the version of the system headers). The QEMU defines are not very useful (it's not clear that they're intended for use with the target_pt_regs struct rather than (say) the CPUARMState structure) and we only use them in one function in elfload.c anyway. So just remove the #defines and directly access regs->uregs[]. Reported-by: Christopher Covington <cov@codeaurora.org> Tested-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use direct syscalls for setuid(), etcPeter Maydell
On Linux the setuid(), setgid(), etc system calls have different semantics from the libc functions. The libc functions follow POSIX and update the credentials for all threads in the process; the system calls update only the thread which makes the call. (This impedance mismatch is worked around in libc by signalling all threads to tell them to do a syscall, in a byzantine and fragile way; see http://ewontfix.com/17/.) Since in linux-user we are trying to emulate the system call semantics, we must implement all these syscalls to directly call the underlying host syscall, rather than calling the host libc function. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: x86_64: Don't use 16-bit UIDsPeter Maydell
The 64-bit x86 syscall ABI uses 32-bit UIDs; only define USE_UID16 for 32-bit x86. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use g_try_malloc() in do_msgrcv()Peter Maydell
In do_msgrcv() we want to allocate a message buffer, whose size is passed to us by the guest. That means we could legitimately fail, so use g_try_malloc() and handle the error case, in the same way that do_msgsnd() does. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Handle msgrcv error case correctlyPeter Maydell
The msgrcv ABI is a bit odd -- the msgsz argument is a size_t, which is unsigned, but it must fail EINVAL if the value is negative when cast to a long. We were incorrectly passing the value through an "unsigned int", which meant that if the guest was 32-bit longs and the host was 64-bit longs an input of 0xffffffff (which should trigger EINVAL) would simply be passed to the host msgrcv() as 0xffffffff, where it does not cause the host kernel to reject it. Follow the same approach as do_msgsnd() in using a ssize_t and doing the check for negative values by hand, so we correctly fail in this corner case. This fixes the msgrcv03 Linux Test Project test case, which otherwise hangs. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Handle negative values in timespec conversionPeter Maydell
In a struct timespec, both fields are signed longs. Converting them from guest to host with code like host_ts->tv_sec = tswapal(target_ts->tv_sec); mishandles negative values if the guest has 32-bit longs and the host has 64-bit longs because tswapal()'s return type is abi_ulong: the assignment will zero-extend into the host long type rather than sign-extending it. Make the conversion routines use __get_user() and __set_user() instead: this automatically picks up the signedness of the field type and does the correct kind of sign or zero extension. It also handles the possibility that the target struct is not sufficiently aligned for the host's requirements. In particular, this fixes a hang when running the Linux Test Project mq_timedsend01 and mq_timedreceive01 tests: one of the test cases sets the timeout to -1 and expects an EINVAL failure, but we were setting a very long timeout instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for futex syscallPeter Maydell
Use the safe_syscall wrapper for the futex syscall. In particular, this fixes hangs when using programs that link against the Boehm garbage collector, including the Mono runtime. (We don't change the sys_futex() call in the implementation of the exit syscall, because as the FIXME comment there notes that should be handled by disabling signals, since we can't easily back out if the futex were to return ERESTARTSYS.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for pselect, select syscallsPeter Maydell
Use the safe_syscall wrapper for the pselect and select syscalls. Since not every architecture has the select syscall, we now have to implement select in terms of pselect, which means doing timeval<->timespec conversion. (Five years on from the initial patch that added pselect support to QEMU and a decade after pselect6 went into the kernel, it seems safe to not try to support hosts with header files which don't define __NR_pselect6.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for execve syscallTimothy E Baldwin
Wrap execve() in the safe-syscall handling. Although execve() is not an interruptible syscall, it is a special case: if we allow a signal to happen before we make the host$ syscall then we will 'lose' it, because at the point of execve the process leaves QEMU's control. So we use the safe syscall wrapper to ensure that we either take the signal as a guest signal, or else it does not happen before the execve completes and makes it the other program's problem. The practical upshot is that without this SIGTERM could fail to terminate the process. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-25-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: expanded commit message to explain in more detail why this is needed, and add comment about it too] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for wait system callsTimothy E Baldwin
Use safe_syscall for waitpid, waitid and wait4 syscalls. Note that this change allows us to implement support for waitid's fifth (rusage) argument in future; for the moment we ignore it as we have done up til now. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-18-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: Adjust to new safe_syscall convention. Add fifth waitid syscall argument (which isn't present in the libc interface but is in the syscall ABI)] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for open and openat system callsTimothy E Baldwin
Restart open() and openat() if signals occur before, or during with SA_RESTART. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-17-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: Adjusted to follow new -1-and-set-errno safe_syscall convention] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for read and write system callsTimothy E Baldwin
Restart read() and write() if signals occur before, or during with SA_RESTART Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-15-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: Update to new safe_syscall() convention of setting errno] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Provide safe_syscall for fixing races between signals and syscallsTimothy E Baldwin
If a signal is delivered immediately before a blocking system call the handler will only be called after the system call returns, which may be a long time later or never. This is fixed by using a function (safe_syscall) that checks if a guest signal is pending prior to making a system call, and if so does not call the system call and returns -TARGET_ERESTARTSYS. If a signal is received between the check and the system call host_signal_handler() rewinds execution to before the check. This rewinding has the effect of closing the race window so that safe_syscall will reliably either (a) go into the host syscall with no unprocessed guest signals pending or or (b) return -TARGET_ERESTARTSYS so that the caller can deal with the signals. Implementing this requires a per-host-architecture assembly language fragment. This will also resolve the mishandling of the SA_RESTART flag where we would restart a host system call and not call the guest signal handler until the syscall finally completed -- syscall restarting now always happens at the guest syscall level so the guest signal handler will run. (The host syscall will never be restarted because if the host kernel rewinds the PC to point at the syscall insn for a restart then our host_signal_handler() will see this and arrange the guest PC rewind.) This commit contains the infrastructure for implementing safe_syscall and the assembly language fragment for x86-64, but does not change any syscalls to use it. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-14-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: * Avoid having an architecture if-ladder in configure by putting linux-user/host/$(ARCH) on the include path and including safe-syscall.inc.S from it * Avoid ifdef ladder in signal.c by creating new hostdep.h to hold host-architecture-specific things * Added copyright/license header to safe-syscall.inc.S * Rewrote commit message * Added comments to safe-syscall.inc.S * Changed calling convention of safe_syscall() to match syscall() (returns -1 and host error in errno on failure) * Added a long comment in qemu.h about how to use safe_syscall() to implement guest syscalls. ] RV: squashed Peters "fixup! linux-user: compile on non-x86-64 hosts" patch Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-27linux-user: Add debug code to exercise restarting system callsTimothy E Baldwin
If DEBUG_ERESTARTSYS is set restart all system calls once. This is pure debug code for exercising the syscall restart code paths in the per-architecture cpu main loops. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-10-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: Add comment and a commented-out #define next to the commented-out generic DEBUG #define; remove the check on TARGET_USE_ERESTARTSYS; tweak comment message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Support for restarting system calls for Microblaze targetsTimothy E Baldwin
Update the Microblaze main loop and sigreturn code: * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn * set all guest CPU state within signal.c code on sigreturn * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication that the main loop should not touch any guest CPU state Note that this in passing fixes a bug where we were corrupting the guest r[3] on sigreturn with the guest's r[10] because do_sigreturn() was returning env->regs[10] but the register for syscall return values is env->regs[3]. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-11-git-send-email-T.E.Baldwin99@members.leeds.ac.uk Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define; drop whitespace changes] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>