aboutsummaryrefslogtreecommitdiff
path: root/linux-user/syscall.c
AgeCommit message (Collapse)Author
2016-06-30linux-user: Fix compilation when F_SETPIPE_SZ isn't definedPeter Maydell
Older kernels don't have F_SETPIPE_SZ and F_GETPIPE_SZ (in particular RHEL6's system headers don't define these). Add ifdefs so that we can gracefully fall back to not supporting those guest ioctls rather than failing to build. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-id: 1467304429-21470-1-git-send-email-peter.maydell@linaro.org
2016-06-29Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' ↵Peter Maydell
into staging # gpg: Signature made Tue 28 Jun 2016 22:27:20 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/tracing-pull-request: trace: [*-user] Add events to trace guest syscalls in syscall emulation mode trace: enable tracing in qemu-img qemu-img: move common options parsing before commands processing trace: enable tracing in qemu-nbd trace: enable tracing in qemu-io trace: move qemu_trace_opts to trace/control.c doc: move text describing --trace to specific .texi file doc: sync help description for --trace with man for qemu.1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-28trace: [*-user] Add events to trace guest syscalls in syscall emulation modeLluís Vilanova
Adds two events to trace syscalls in syscall emulation mode (*-user): * guest_user_syscall: Emitted before the syscall is emulated; contains the syscall number and arguments. * guest_user_syscall_ret: Emitted after the syscall is emulated; contains the syscall number and return value. Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Message-id: 146651712411.12388.10024905980452504938.stgit@fimbulvetr.bsc.es Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-26linux-user: don't swap NLMSG_DATA() fieldsLaurent Vivier
If the structure pointed by NLMSG_DATA() is bigger than the size of NLMSG_DATA(), don't swap its fields to avoid memory corruption. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26linux-user: fd_trans_host_to_target_data() must process only received dataLaurent Vivier
if we process the whole buffer, the netlink helpers can try to swap invalid data. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26linux-user: add missing return in netlink switch statementLaurent Vivier
Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26linux-user: Support F_GETPIPE_SZ and F_SETPIPE_SZ fcntlsPeter Maydell
Support the F_GETPIPE_SZ and F_SETPIPE_SZ fcntl operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26linux-user: Fix wrong type used for argument to rt_sigqueueinfoPeter Maydell
The third argument to the rt_sigqueueinfo syscall is a pointer to a siginfo_t, not a pointer to a sigset_t. Fix the error in the arguments to lock_user(), which meant that we would not have detected some faults that we should. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26linux-user: Don't use sigfillset() on uc->uc_sigmaskPeter Maydell
The kernel and libc have different ideas about what a sigset_t is -- for the kernel it is only _NSIG / 8 bytes in size (usually 8 bytes), but for libc it is much larger, 128 bytes. In most situations the difference doesn't matter, because if you pass a pointer to a libc sigset_t to the kernel it just acts on the first 8 bytes of it, but for the ucontext_t* argument to a signal handler it trips us up. The kernel allocates this ucontext_t on the stack according to its idea of the sigset_t type, but the type of the ucontext_t defined by the libc headers uses the libc type, and so do the manipulator functions like sigfillset(). This means that (1) sizeof(uc->uc_sigmask) is much larger than the actual space used on the stack (2) sigfillset(&uc->uc_sigmask) will write garbage 0xff bytes off the end of the structure, which can trash data that was on the stack before the signal handler was invoked, and may result in a crash after the handler returns To avoid this, we use a memset() of the correct size to fill the signal mask rather than using the libc function. This fixes a problem where we would crash at least some of the time on an i386 host when a signal was taken. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26linux-user: Use safe_syscall wrapper for fcntlPeter Maydell
Use the safe_syscall wrapper for fcntl. This is straightforward now that we always use 'struct fcntl64' on the host, as we don't need to select whether to call the host's fcntl64 or fcntl syscall (a detail that the libc previously hid for us). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26linux-user: Use __get_user() and __put_user() to handle structs in do_fcntl()Peter Maydell
Use the __get_user() and __put_user() to handle reading and writing the guest structures in do_ioctl(). This has two benefits: * avoids possible errors due to misaligned guest pointers * correctly sign extends signed fields (like l_start in struct flock) which might be different sizes between guest and host To do this we abstract out into copy_from/to_user functions. We also standardize on always using host flock64 and the F_GETLK64 etc flock commands, as this means we always have 64 bit offsets whether the host is 64-bit or 32-bit and we don't need to support conversion to both host struct flock and struct flock64. In passing we fix errors in converting l_type from the host to the target (where we were doing a byteswap of the host value before trying to do the convert-bitmasks operation rather than otherwise, and inexplicably shifting left by 1); these were accidentally left over when the original simple "just shift by 1" arm<->x86 conversion of commit 43f238d was changed to the more general scheme of using target_to_host_bitmask() functions in 2ba7f73. [RV: fixed ifdef guard for eabi functions] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-16os-posix: include sys/mman.hPaolo Bonzini
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check is bogus without a previous inclusion of sys/mman.h. Include it in sysemu/os-posix.h and remove it from everywhere else. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-08Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160608' ↵Peter Maydell
into staging linux-user pull request for June 2016 # gpg: Signature made Wed 08 Jun 2016 14:27:14 BST # gpg: using RSA key 0xB44890DEDE3C9BC0 # gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>" # gpg: aka "Riku Voipio <riku.voipio@linaro.org>" * remotes/riku/tags/pull-linux-user-20160608: (44 commits) linux-user: In fork_end(), remove correct CPUs from CPU list linux-user: Special-case ERESTARTSYS in target_strerror() linux-user: Make target_strerror() return 'const char *' linux-user: Correct signedness of target_flock l_start and l_len fields linux-user: Use safe_syscall wrapper for ioctl linux-user: Use safe_syscall wrapper for accept and accept4 syscalls linux-user: Use safe_syscall wrapper for semop linux-user: Use safe_syscall wrapper for epoll_wait syscalls linux-user: Use safe_syscall wrapper for poll and ppoll syscalls linux-user: Use safe_syscall wrapper for sleep syscalls linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall linux-user: Use safe_syscall wrapper for flock linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive linux-user: Use safe_syscall wrapper for msgsnd and msgrcv linux-user: Use safe_syscall wrapper for send* and recv* syscalls linux-user: Use safe_syscall wrapper for connect syscall linux-user: Use safe_syscall wrapper for readv and writev syscalls linux-user: Fix error conversion in 64-bit fadvise syscall linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests linux-user: Fix handling of arm_fadvise64_64 syscall ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: configure scripts/qemu-binfmt-conf.sh
2016-06-08linux-user: Special-case ERESTARTSYS in target_strerror()Peter Maydell
Since TARGET_ERESTARTSYS and TARGET_ESIGRETURN are internal-to-QEMU error numbers, handle them specially in target_strerror(), to avoid confusing strace output like: 9521 rt_sigreturn(14,8,274886297808,8,0,268435456) = -1 errno=513 (Unknown error 513) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Make target_strerror() return 'const char *'Peter Maydell
Make target_strerror() return 'const char *' rather than just 'char *'; this will allow us to return constant strings from it for some special cases. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-06-08linux-user: Use safe_syscall wrapper for ioctlPeter Maydell
Use the safe_syscall wrapper to implement the ioctl syscall. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for accept and accept4 syscallsPeter Maydell
Use the safe_syscall wrapper for the accept and accept4 syscalls. accept4 has been in the kernel since 2.6.28 so we can assume it is always present. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for semopPeter Maydell
Use the safe_syscall wrapper for the semop syscall or IPC operation. (We implement via the semtimedop syscall to make it easier to implement the guest semtimedop syscall later.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for epoll_wait syscallsPeter Maydell
Use the safe_syscall wrapper for epoll_wait and epoll_pwait syscalls. Since we now directly use the host epoll_pwait syscall for both epoll_wait and epoll_pwait, we don't need the configure machinery to check whether glibc supports epoll_pwait(). (The kernel has supported the syscall since 2.6.19 so we can assume it's always there.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for poll and ppoll syscallsPeter Maydell
Use the safe_syscall wrapper for the poll and ppoll syscalls. Since not all host architectures will have a poll syscall, we have to rewrite the TARGET_NR_poll handling to use ppoll instead (we can assume everywhere has ppoll by now). We take the opportunity to switch to the code structure already used in the implementation of epoll_wait and epoll_pwait, which uses a switch() to avoid interleaving #if and if (), and to stop using a variable with a leading '_' which is in the implementation's namespace. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for sleep syscallsPeter Maydell
Use the safe_syscall wrapper for the clock_nanosleep and nanosleep syscalls. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscallPeter Maydell
Use the safe_syscall wrapper for the rt_sigtimedwait syscall. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for flockPeter Maydell
Use the safe_syscall wrapper for the flock syscall. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceivePeter Maydell
Use the safe_syscall wrapper for mq_timedsend and mq_timedreceive syscalls. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for msgsnd and msgrcvPeter Maydell
Use the safe_syscall wrapper for msgsnd and msgrcv syscalls. This is made slightly awkward by some host architectures providing only a single 'ipc' syscall rather than separate syscalls per operation; we provide safe_msgsnd() and safe_msgrcv() as wrappers around safe_ipc() to handle this if needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for send* and recv* syscallsPeter Maydell
Use the safe_syscall wrapper for the send, sendto, sendmsg, recv, recvfrom and recvmsg syscalls. RV: adjusted to apply Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for connect syscallPeter Maydell
Use the safe_syscall wrapper for the connect syscall. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Use safe_syscall wrapper for readv and writev syscallsPeter Maydell
Use the safe_syscall wrapper for readv and writev syscalls. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Fix error conversion in 64-bit fadvise syscallPeter Maydell
Fix a missing host-to-target errno conversion in the 64-bit fadvise syscall emulation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guestsPeter Maydell
Fix errors in the implementation of NR_fadvise64 and NR_fadvise64_64 for 32-bit guests, which pass their off_t values in register pairs. We can't use the 64-bit code path for this, so split out the 32-bit cases, so that we can correctly handle the "only offset is 64-bit" and "both offset and length are 64-bit" syscall flavours, and "uses aligned register pairs" and "does not" flavours of target. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08linux-user: Fix handling of arm_fadvise64_64 syscallPeter Maydell
32-bit ARM has an odd variant of the fadvise syscall which has rearranged arguments, which we try to implement. Unfortunately we got the rearrangement wrong. This is a six-argument syscall whose arguments are: * fd * advise parameter * offset high half * offset low half * len high half * len low half Stop trying to share code with the standard fadvise syscalls, and just implement the syscall with the correct argument order. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Use DIV_ROUND_UPLaurent Vivier
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d). This patch is the result of coccinelle script scripts/coccinelle/round.cocci CC: Riku Voipio <riku.voipio@iki.fi> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07linux-user: Restart fork() if signals pendingTimothy E Baldwin
If there is a signal pending during fork() the signal handler will erroneously be called in both the parent and child, so handle any pending signals first. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-20-git-send-email-T.E.Baldwin99@members.leeds.ac.uk Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Use safe_syscall for kill, tkill and tgkill syscallsPeter Maydell
Use the safe_syscall wrapper for the kill, tkill and tgkill syscalls. Without this, if a thread sent a SIGKILL to itself it could kill the thread before we had a chance to process a signal that arrived just before the SIGKILL, and that signal would get lost. We drop all the ifdeffery for tkill and tgkill, because every guest architecture we support implements them, and they've been in Linux since 2003 so we can assume the host headers define the __NR_tkill and __NR_tgkill constants. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Restart exit() if signal pendingTimothy E Baldwin
Without this a signal could vanish on thread exit. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-26-git-send-email-T.E.Baldwin99@members.leeds.ac.uk Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: pause() should not pause if signal pendingTimothy E Baldwin
Fix races between signal handling and the pause syscall by reimplementing it using block_signals() and sigsuspend(). (Using safe_syscall(pause) would also work, except that the pause syscall doesn't exist on all architectures.) Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-28-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: tweaked commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Fix race between multiple signalsPeter Maydell
If multiple host signals are received in quick succession they would be queued in TaskState then delivered to the guest in spite of signals being supposed to be blocked by the guest signal handler's sa_mask. Fix this by decoupling the guest signal mask from the host signal mask, so we can have protected sections where all host signals are blocked. In particular we block signals from when host_signal_handler() queues a signal from the guest until process_pending_signals() has unqueued it. We also block signals while we are manipulating the guest signal mask in emulation of sigprocmask and similar syscalls. Blocking host signals also ensures the correct behaviour with respect to multiple threads and the overrun count of timer related signals. Alas blocking and queuing in qemu is still needed because of virtual processor exceptions, SIGSEGV and SIGBUS. Blocking signals inside process_pending_signals() protects against concurrency problems that would otherwise happen if host_signal_handler() ran and accessed the signal data structures while process_pending_signals() was manipulating them. Since we now track the guest signal mask separately from that of the host, the sigsuspend system calls must track the signal mask passed to them, because when we process signals as we leave the sigsuspend the guest signal mask in force is that passed to sigsuspend. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-19-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: make signal_pending a simple flag rather than a word with two flag bits; ensure we don't call block_signals() twice in sigreturn codepaths; document and assert() the guarantee that using do_sigprocmask() to get the current mask never fails; use the qemu atomics.h functions rather than raw volatile variable access; add extra commentary and documentation; block SIGSEGV/SIGBUS in block_signals() and in process_pending_signals() because they can't occur synchronously here; check the right do_sigprocmask() call for errors in ssetmask syscall; expand commit message; fixed sigsuspend() hanging] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Use safe_syscall for sigsuspend syscallsPeter Maydell
Use the safe_syscall wrapper for sigsuspend syscalls. This means that we will definitely deliver a signal that arrives before we do the sigsuspend call, rather than blocking first and delivering afterwards. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: Define macro for size of host kernel sigset_tPeter Maydell
Some host syscalls take an argument specifying the size of a host kernel's sigset_t (which isn't necessarily the same as that of the host libc's type of that name). Instead of hardcoding _NSIG / 8 where we do this, define and use a SIGSET_T_SIZE macro. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: check if NETLINK_ROUTE is availableLaurent Vivier
Some IFLA_* symbols can be missing in the host linux/if_link.h, but as they are enums and not "#defines", check in "configure" if last known (IFLA_PROTO_DOWN) is available and if not, disable management of NETLINK_ROUTE protocol. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: add netlink auditLaurent Vivier
This is, for instance, needed to log in a container. Without this, the user cannot be identified and the console login fails with "Login incorrect". Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: support netlink protocol NETLINK_KOBJECT_UEVENTLaurent Vivier
This is the protocol used by udevd to manage kernel events. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07linux-user: add rtnetlink(7) supportLaurent Vivier
rtnetlink is needed to use iproute package (ip addr, ip route) and dhcp client. Examples: Without this patch: # ip link Cannot open netlink socket: Address family not supported by protocol # ip addr Cannot open netlink socket: Address family not supported by protocol # ip route Cannot open netlink socket: Address family not supported by protocol # dhclient eth0 Cannot open netlink socket: Address family not supported by protocol Cannot open netlink socket: Address family not supported by protocol With this patch: # ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT qlen 1000 link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff # ip addr show eth0 51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000 link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff inet 192.168.122.197/24 brd 192.168.122.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::216:3eff:fe89:6bd7/64 scope link valid_lft forever preferred_lft forever # ip route default via 192.168.122.1 dev eth0 192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.197 # ip addr flush eth0 # ip addr add 192.168.122.10 dev eth0 # ip addr show eth0 51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000 link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff inet 192.168.122.10/32 scope global eth0 valid_lft forever preferred_lft forever # ip route add 192.168.122.0/24 via 192.168.122.10 # ip route 192.168.122.0/24 via 192.168.122.10 dev eth0 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use direct syscalls for setuid(), etcPeter Maydell
On Linux the setuid(), setgid(), etc system calls have different semantics from the libc functions. The libc functions follow POSIX and update the credentials for all threads in the process; the system calls update only the thread which makes the call. (This impedance mismatch is worked around in libc by signalling all threads to tell them to do a syscall, in a byzantine and fragile way; see http://ewontfix.com/17/.) Since in linux-user we are trying to emulate the system call semantics, we must implement all these syscalls to directly call the underlying host syscall, rather than calling the host libc function. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use g_try_malloc() in do_msgrcv()Peter Maydell
In do_msgrcv() we want to allocate a message buffer, whose size is passed to us by the guest. That means we could legitimately fail, so use g_try_malloc() and handle the error case, in the same way that do_msgsnd() does. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Handle msgrcv error case correctlyPeter Maydell
The msgrcv ABI is a bit odd -- the msgsz argument is a size_t, which is unsigned, but it must fail EINVAL if the value is negative when cast to a long. We were incorrectly passing the value through an "unsigned int", which meant that if the guest was 32-bit longs and the host was 64-bit longs an input of 0xffffffff (which should trigger EINVAL) would simply be passed to the host msgrcv() as 0xffffffff, where it does not cause the host kernel to reject it. Follow the same approach as do_msgsnd() in using a ssize_t and doing the check for negative values by hand, so we correctly fail in this corner case. This fixes the msgrcv03 Linux Test Project test case, which otherwise hangs. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Handle negative values in timespec conversionPeter Maydell
In a struct timespec, both fields are signed longs. Converting them from guest to host with code like host_ts->tv_sec = tswapal(target_ts->tv_sec); mishandles negative values if the guest has 32-bit longs and the host has 64-bit longs because tswapal()'s return type is abi_ulong: the assignment will zero-extend into the host long type rather than sign-extending it. Make the conversion routines use __get_user() and __set_user() instead: this automatically picks up the signedness of the field type and does the correct kind of sign or zero extension. It also handles the possibility that the target struct is not sufficiently aligned for the host's requirements. In particular, this fixes a hang when running the Linux Test Project mq_timedsend01 and mq_timedreceive01 tests: one of the test cases sets the timeout to -1 and expects an EINVAL failure, but we were setting a very long timeout instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for futex syscallPeter Maydell
Use the safe_syscall wrapper for the futex syscall. In particular, this fixes hangs when using programs that link against the Boehm garbage collector, including the Mono runtime. (We don't change the sys_futex() call in the implementation of the exit syscall, because as the FIXME comment there notes that should be handled by disabling signals, since we can't easily back out if the futex were to return ERESTARTSYS.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for pselect, select syscallsPeter Maydell
Use the safe_syscall wrapper for the pselect and select syscalls. Since not every architecture has the select syscall, we now have to implement select in terms of pselect, which means doing timeval<->timespec conversion. (Five years on from the initial patch that added pselect support to QEMU and a decade after pselect6 went into the kernel, it seems safe to not try to support hosts with header files which don't define __NR_pselect6.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27linux-user: Use safe_syscall for execve syscallTimothy E Baldwin
Wrap execve() in the safe-syscall handling. Although execve() is not an interruptible syscall, it is a special case: if we allow a signal to happen before we make the host$ syscall then we will 'lose' it, because at the point of execve the process leaves QEMU's control. So we use the safe syscall wrapper to ensure that we either take the signal as a guest signal, or else it does not happen before the execve completes and makes it the other program's problem. The practical upshot is that without this SIGTERM could fail to terminate the process. Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Message-id: 1441497448-32489-25-git-send-email-T.E.Baldwin99@members.leeds.ac.uk [PMM: expanded commit message to explain in more detail why this is needed, and add comment about it too] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>