Age | Commit message | Author
2018-10-19 | Revert some patches from recent [PATCH v6] "Fixing record/replay and adding reverse debugging" | Artem Pisarenko
That patch series introduced a new virtual clock type for use in external subsystems. It breaks the desired behavior in non-record/replay usage scenarios due to a small change to existing behavior. Processing of virtual timers belonging to the new clock type is kicked off to the main loop, which makes these timers asynchronous with the vCPU thread and, in icount mode, with the whole guest execution. This breaks the expected determinism in the non-record/replay icount mode of emulation, where these "external subsystems" are isolated from the host (i.e. they are external only to the guest core, not to the entire emulation environment). Example for slirp (the "user" backend for network devices): a user runs qemu in icount mode with rtc clock=vm, without any external communication interfaces but with "-netdev user,restrict=on". The user expects deterministic execution, because network services are emulated inside qemu and isolated from the host. There is no reason to get a reply from the DHCP server with a different delay or anything like that. The next patches reimplement the same changes in a better way. This reverts commit 87f4fe7653baf55b5c2f2753fe6003f473c07342. This reverts commit 775a412bf83f6bc0c5c02091ee06cf649b34c593. This reverts commit 9888091404a702d7ec79d51b088d994b9fc121bd. Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com> Message-Id: <18b1e7c8f155fe26976f91be06bde98eef6f8751.1539764043.git.artem.k.pisarenko@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 | es1370: more fixes for ADC_FRAMEADR and ADC_FRAMECNT | Paolo Bonzini
They are not consecutive with DAC1_FRAME* and DAC2_FRAME*; Coverity still complains about es1370_read, while es1370_write was fixed in commit cf9270e5220671f49cc238deaf6136669cc07ae1. Fixes: 154c1d1f960c5147a3f8ef00907504112f271cd8 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 | Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging | Peter Maydell
Add a workaround for clang bug and remove misleading comment (sparc)

# gpg: Signature made Thu 18 Oct 2018 20:00:17 BST
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-3.1-pull-request:
  linux-user/sparc/signal.c: Remove unnecessary comment
  linux-user: Suppress address-of-packed-member warnings in __get/put_user_e

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 | Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-october-2018-part1-v2' into staging | Peter Maydell
MIPS queue October 2018, part1, v2

# gpg: Signature made Thu 18 Oct 2018 19:39:00 BST
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-october-2018-part1-v2: (28 commits)
  target/mips: Add opcodes for nanoMIPS EVA instructions
  target/mips: Fix misplaced 'break' in handling of NM_SHRA_R_PH
  target/mips: Fix emulation of microMIPS R6 <SELEQZ|SELNEZ>.<D|S>
  target/mips: Implement hardware page table walker for MIPS32
  target/mips: Add reset state for PWSize and PWField registers
  target/mips: Add CP0 PWCtl register
  target/mips: Add CP0 PWSize register
  target/mips: Add CP0 PWField register
  target/mips: Add CP0 PWBase register
  target/mips: Add CP0 Config2 to DisasContext
  target/mips: Improve DSP R2/R3-related naming
  target/mips: Add availability control for DSP R3 ASE
  target/mips: Add bit definitions for DSP R3 ASE
  target/mips: Reorganize bit definitions for insn_flags (ISAs/ASEs flags)
  target/mips: Increase 'supported ISAs/ASEs' flag holder size
  target/mips: Add opcode values of MXU ASE
  target/mips: Add organizational chart of MXU ASE
  target/mips: Add assembler mnemonics list for MXU ASE
  target/mips: Add basic description of MXU ASE
  target/mips: Add a comment before each CP0 register section in cpu.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 | qemu-options: Fix bad "macaddr" property in the documentation | Thomas Huth
When using the "-device" option, the property is called "mac". "macaddr" is only used for the legacy "-net nic" option. Reported-by: Harald Hoyer <harald@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | e1000: indicate dropped packets in HW counters | Jason Wang
The e1000 emulation silently discards RX packets if there's insufficient space in the ring buffer. This leads to errors on higher-level protocols in the guest, with no indication about the error cause. This patch increments the "Missed Packets Count" (MPC) and "Receive No Buffers Count" (RNBC) HW counters in this case. As the emulation has no FIFO for buffering packets that can't immediately be pushed to the guest, these two registers are practically equivalent (see 10.2.7.4, 10.2.7.33 in https://www.intel.com/content/www/us/en/embedded/products/networking/82574l-gbe-controller-datasheet.html). On a Linux guest, the register content will be reflected in the "rx_missed_errors" and "rx_no_buffer_count" stats from "ethtool -S", and in the "missed" stat from "ip -s -s link show", giving at least some hint about the error cause inside the guest. If the cause is known, problems like this can often be avoided easily by increasing the number of RX descriptors in the guest e1000 driver (e.g. under Linux, "e1000.RxDescriptors=1024"). The patch also adds a qemu trace message for this condition. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
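The accounting described above can be sketched roughly as follows. This is an illustration only, not the e1000 emulation code: `struct nic_sketch`, its fields, and `nic_sketch_receive` are hypothetical names, and the ring is reduced to a count of free descriptors.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified NIC state (not the real e1000 device model). */
struct nic_sketch {
    size_t free_rx_descriptors; /* space left in the RX ring */
    uint32_t mpc;               /* "Missed Packets Count" */
    uint32_t rnbc;              /* "Receive No Buffers Count" */
};

/*
 * Try to receive a packet that needs `descriptors_needed` RX descriptors.
 * With no FIFO to park the packet in, a full ring means the packet is
 * dropped, so both counters are bumped together.
 */
static bool nic_sketch_receive(struct nic_sketch *nic, size_t descriptors_needed)
{
    if (descriptors_needed > nic->free_rx_descriptors) {
        nic->mpc++;
        nic->rnbc++;
        return false; /* dropped, but now visible to the guest */
    }
    nic->free_rx_descriptors -= descriptors_needed;
    return true;
}
```

A guest driver reading the two counters would then see the drop instead of only observing errors in higher-level protocols.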
2018-10-19 | net: ignore packet size greater than INT_MAX | Jason Wang
There should not be a reason for passing a packet size greater than INT_MAX. It usually hints at a bug somewhere, so ignore packet sizes greater than INT_MAX in qemu_deliver_packet_iov(). CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
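A minimal sketch of such a size guard, assuming a POSIX `struct iovec`. The helper names `total_iov_size` and `packet_size_ok` are hypothetical and do not reproduce the body of qemu_deliver_packet_iov().

```c
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>

/* Sum the lengths of all segments (hypothetical helper for this sketch). */
static size_t total_iov_size(const struct iovec *iov, int iovcnt)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/* Returns 1 if the packet may be delivered, 0 if it should be ignored. */
static int packet_size_ok(const struct iovec *iov, int iovcnt)
{
    /* Compare in size_t so an oversized total is never narrowed to int first. */
    return total_iov_size(iov, iovcnt) <= (size_t)INT_MAX;
}
```

The comparison is done on the unsigned total, so a caller that later stores the size in an `int` only ever sees values that fit.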
2018-10-19 | pcnet: fix possible buffer overflow | Jason Wang
In pcnet_receive(), we assign size_ to size, which converts from size_t to int. This causes trouble when size_ is greater than INT_MAX: size becomes negative and can then satisfy the check size < MIN_BUF_SIZE, which may lead to out-of-bounds access for both buf and buf1. Fix this by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | rtl8139: fix possible out of bound access | Jason Wang
In rtl8139_do_receive(), we assign size_ to size, which converts from size_t to int. This causes trouble when size_ is greater than INT_MAX: size becomes negative and can then satisfy the check size < MIN_BUF_SIZE, which may lead to out-of-bounds access for both buf and buf1. Fix this by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | ne2000: fix possible out of bound access in ne2000_receive | Jason Wang
In ne2000_receive(), we assign size_ to size, which converts from size_t to int. This causes trouble when size_ is greater than INT_MAX: size becomes negative and can then satisfy the check size < MIN_BUF_SIZE, which may lead to out-of-bounds access for both buf and buf1. Fix this by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
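The pcnet, rtl8139 and ne2000 fixes share one pattern, which the following sketch tries to isolate. It is not taken from any of the three device models: the function names are hypothetical, and the value of `MIN_BUF_SIZE` here is only illustrative.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MIN_BUF_SIZE 60 /* illustrative value, not taken from the commits */

/*
 * Buggy pattern: the length is narrowed to int before the check.
 * For size_ > INT_MAX the conversion is implementation-defined and
 * typically yields a negative number, which then compares as "short".
 */
static bool looks_short_buggy(size_t size_)
{
    int size = (int)size_;
    return size < MIN_BUF_SIZE;
}

/* Fixed pattern: keep the length in size_t for the whole comparison. */
static bool looks_short_fixed(size_t size_)
{
    size_t size = size_;
    return size < MIN_BUF_SIZE;
}
```

With the fixed variant, a length of `(size_t)INT_MAX + 1` is never mistaken for a short frame that needs padding, so the padding path cannot be reached with a bogus size.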
2018-10-19 | clean up callback when del virtqueue | liujunjie
Before, we did not clear callbacks such as handle_output when deleting a virtqueue, which may result in a segmentation fault. The scenario is as follows: 1. Start a VM with multiqueue vhost-net. 2. Write VIRTIO_PCI_GUEST_FEATURES in the PCI configuration to trigger multiqueue disable in this VM, which deletes the virtqueue. In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh still exists. 3. Finally, write VIRTIO_PCI_QUEUE_NOTIFY in the PCI configuration to notify the deleted virtqueue. virtio_net_handle_tx_bh is then called and qemu crashes. Although the scenario described above is uncommon, we had better harden against it. CC: qemu-stable@nongnu.org Signed-off-by: liujunjie <liujunjie23@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
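The idea behind the fix can be sketched like this. The types and functions below (`struct vq_sketch`, `vq_sketch_delete`, `vq_sketch_notify`) are hypothetical stand-ins, not the virtio code; the point is only that deletion clears the callback and notification checks it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified virtqueue (not the real VirtQueue struct). */
struct vq_sketch {
    int num;                                     /* ring size, 0 once deleted */
    int kicks;                                   /* how often the handler ran */
    void (*handle_output)(struct vq_sketch *vq); /* notify callback */
};

/* Example callback standing in for something like a TX handler. */
static void vq_sketch_count_kick(struct vq_sketch *vq)
{
    vq->kicks++;
}

/* Deleting the queue also drops the callback, so it cannot fire later. */
static void vq_sketch_delete(struct vq_sketch *vq)
{
    vq->num = 0;
    vq->handle_output = NULL;
}

/* A guest notification on a deleted queue becomes a harmless no-op. */
static bool vq_sketch_notify(struct vq_sketch *vq)
{
    if (vq->handle_output == NULL) {
        return false;
    }
    vq->handle_output(vq);
    return true;
}
```

Without the `NULL` assignment in the delete path, step 3 of the scenario would call a handler whose backing state (the bottom half) is already gone.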
2018-10-19 | docs: Add COLO status diagram to COLO-FT.txt | Zhang Chen
This diagram helps users better understand COLO. Suggested by Markus Armbruster. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: quick failover process by kick COLO thread | zhanghailiang
The COLO thread may be sleeping at qemu_sem_wait(&s->colo_checkpoint_sem) when failover work begins. It is better to wake it up to speed up the process. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: notify net filters about checkpoint/failover event | zhanghailiang
Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | filter-rewriter: handle checkpoint and failover event | Zhang Chen
After one round of checkpoint, the states of the PVM and SVM become consistent, so it is unnecessary to adjust the sequence of net packets for old connections. Besides, when failover happens, filter-rewriter goes into failover mode, in which it need not handle new TCP connections. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | filter: Add handle_event method for NetFilterClass | Zhang Chen
Filters need to process checkpoint/failover events and other events passed by the COLO frame. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: flush host dirty ram from cache | zhanghailiang
There is no need to flush all of the VM's RAM from the cache; only flush the pages dirtied since the last checkpoint. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | savevm: split the process of different stages for loadvm/savevm | Zhang Chen
There are several stages during the loadvm/savevm process. In different stages, the incoming migration processes different types of sections. We want to control these stages more accurately, which will benefit COLO performance: we don't have to save sections of type QEMU_VM_SECTION_START every time we do a checkpoint. Besides, we want to separate the process of saving/loading memory and device state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different processes during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the code of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | qapi: Add new command to query colo status | Zhang Chen
Libvirt or other high-level software can use this command to query COLO status. You can test this command like this: {'execute':'query-colo-status'} Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | qapi/migration.json: Rename COLO unknown mode to none mode. | Zhang Chen
As suggested by Markus Armbruster, rename the COLO unknown mode to none mode. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | qmp event: Add COLO_EXIT event to notify users while exited COLO | zhanghailiang
If errors happen during the VM's COLO FT stage, it's important to notify users of this event. Together with 'x-colo-lost-heartbeat', users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify them that we exited COLO mode. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: Flush memory data from ram cache | Zhang Chen
While the VM is running, the PVM may dirty some pages. We transfer the PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the next checkpoint. So the content of the SVM's RAM cache is always the same as the PVM's memory after a checkpoint. Instead of flushing the entire cache of the PVM's RAM into the SVM's memory, we do this in a more efficient way: only flush pages dirtied by the PVM since the last checkpoint. In this way, we can ensure the SVM's memory is the same as the PVM's. Besides, we must ensure the RAM cache is flushed before loading the device state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
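The selective flush can be sketched as below. This is a toy model under stated assumptions, not the COLO code: page size and page count are made-up constants, the dirty "bitmap" is one byte per page for readability, and `flush_dirty_pages` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16 /* toy page size for the example */
#define SKETCH_NUM_PAGES 8  /* toy guest RAM size in pages */

/*
 * Copy only the pages marked dirty from the RAM cache into the
 * secondary VM's memory, clearing each mark as it is handled.
 * Returns the number of pages flushed.
 */
static size_t flush_dirty_pages(uint8_t *svm_ram, const uint8_t *ram_cache,
                                uint8_t *dirty)
{
    size_t flushed = 0;

    for (size_t page = 0; page < SKETCH_NUM_PAGES; page++) {
        if (!dirty[page]) {
            continue; /* unchanged since the last checkpoint */
        }
        memcpy(svm_ram + page * SKETCH_PAGE_SIZE,
               ram_cache + page * SKETCH_PAGE_SIZE,
               SKETCH_PAGE_SIZE);
        dirty[page] = 0;
        flushed++;
    }
    return flushed;
}
```

In this model the flush would run once all checkpoint data has arrived and before device state is loaded, matching the ordering requirement stated in the commit message.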
2018-10-19 | ram/COLO: Record the dirty pages that SVM received | Zhang Chen
We record the addresses of the dirty pages received, which helps with flushing pages cached in the SVM. The trick here is that we record dirty pages by reusing the migration dirty bitmap. In a later patch, we will start the dirty log for the SVM, just like migration. In this way, we can record the dirty pages caused by both the PVM and the SVM, and we only flush those dirty pages from the RAM cache when doing a checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: Load dirty pages into SVM's RAM cache firstly | Zhang Chen
We should not load the PVM's state directly into the SVM, because errors may happen while the SVM is receiving data, which would break the SVM. We need to ensure all data is received before loading the state into the SVM. We use extra memory to cache this data (the PVM's RAM). The RAM cache on the secondary side is initially the same as the SVM/PVM's memory. During a checkpoint, we first cache the PVM's dirty pages in this RAM cache, so the RAM cache is always the same as the PVM's memory at every checkpoint. We then flush the cached RAM to the SVM after we have received all of the PVM's state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: Remove colo_state migration struct | Zhang Chen
We need to know whether migration is going into COLO state on the incoming side before starting normal migration. Instead of using the VMStateDescription to send colo_state from the source side to the destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: Add block replication into colo process | Zhang Chen
Make sure the master starts block replication after the slave's block replication has started. Besides, we need to activate the VM's blocks before going into COLO state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | COLO: integrate colo compare with colo frame | Zhang Chen
For COLO FT, both the PVM and SVM run at the same time and only sync state when needed. So here, let the SVM run while not doing a checkpoint, and change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_sem and colo_delay_timer; fix that here. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | colo-compare: use notifier to notify packets comparing result | Zhang Chen
It's a good idea to use a notifier to notify the COLO frame when packet comparison finds an inconsistency. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | colo-compare: implement the process of checkpoint | Zhang Chen
When doing a checkpoint, we need to flush all the unhandled packets. By using the filter notifier mechanism, we can easily notify every compare object to do this, which runs inside the compare threads as a coroutine. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 | filter-rewriter: Add TCP state machine and fix memory leak in connection_track_table | Zhang Chen
We add an almost full TCP state machine to filter-rewriter, except for TCPS_LISTEN and some simplifications in the VM active-close FIN states. The reason for this simplification is that the guest kernel tracks the TCP status and waits 2MSL too; if the client resends the FIN packet, the guest resends the last ACK, so we needn't wait 2MSL in filter-rewriter. After a net connection was closed, we didn't clear its related resources in connection_track_table, which leads to a memory leak. Let's track the state of each net connection; once it is closed, its related resources are cleaned up. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-18 | cputlb: read CPUTLBEntry.addr_write atomically | Emilio G. Cota
Updates can come from other threads, so readers that do not take tlb_lock must use atomic_read to avoid undefined behaviour (UB). This completes the conversion to tlb_lock. This conversion results on average in no performance loss, as the following experiments (run on an Intel i7-6700K CPU @ 4.00GHz) show.

1. aarch64 bootup+shutdown test:

- Before:
 Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

       7487.087786 task-clock (msec)   # 0.998 CPUs utilized     ( +- 0.12% )
    31,574,905,303 cycles              # 4.217 GHz               ( +- 0.12% )
    57,097,908,812 instructions        # 1.81 insns per cycle    ( +- 0.08% )
    10,255,415,367 branches            # 1369.747 M/sec          ( +- 0.08% )
       173,278,962 branch-misses       # 1.69% of all branches   ( +- 0.18% )

       7.504481349 seconds time elapsed                          ( +- 0.14% )

- After:
 Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

       7462.441328 task-clock (msec)   # 0.998 CPUs utilized     ( +- 0.07% )
    31,478,476,520 cycles              # 4.218 GHz               ( +- 0.07% )
    57,017,330,084 instructions        # 1.81 insns per cycle    ( +- 0.05% )
    10,251,929,667 branches            # 1373.804 M/sec          ( +- 0.05% )
       173,023,787 branch-misses       # 1.69% of all branches   ( +- 0.11% )

       7.474970463 seconds time elapsed                          ( +- 0.07% )

2. SPEC06int:

[Figure: SPEC06int (test set), bar chart. Y axis: speedup over master (ticks from 0.75 to 1.15). Series: tlb-lock-v2 (mutex), tlb-lock-v3 (spinlock). X axis labels, truncated in the source: 400.perlben, 401.bzip2, 403.gcc, 429.m, 445.gob, 456.hmme, 45, 462.libqua, 464.h26, 471.omnet, 473, 483.xalancbmk, geomean.]
png: https://imgur.com/a/BHzpPTW

Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 corresponds to the current implementation, i.e. a spinlock and a single lock acquisition in tlb_set_page_with_attrs.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181016153840.25877-1-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/s390x: Check HAVE_ATOMIC128 and HAVE_CMPXCHG128 at translate | Richard Henderson
Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/s390x: Skip wout, cout helpers if op helper does not return | Richard Henderson
When op raises an exception, it may not have initialized the output temps that would be written back by wout or cout. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/s390x: Split do_cdsg, do_lpq, do_stpq | Richard Henderson
Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/s390x: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128 | Richard Henderson
Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/ppc: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128 | Richard Henderson
Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/arm: Check HAVE_CMPXCHG128 at translate time | Richard Henderson
Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/arm: Convert to HAVE_CMPXCHG128 | Richard Henderson
Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/i386: Convert to HAVE_CMPXCHG128 | Richard Henderson
Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | tcg: Split CONFIG_ATOMIC128 | Richard Henderson
GCC7+ will no longer advertise support for 16-byte __atomic operations if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still has support for __sync_compare_and_swap_16 and we can make use of that. AArch64 does not have, nor ever has had such support, so open-code it. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | tcg: Add tlb_index and tlb_entry helpers | Richard Henderson
Isolate the computation of an index from an address into a helper before we change that function. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009175129.17888-2-cota@braap.org>
2018-10-18 | cputlb: serialize tlb updates with env->tlb_lock | Emilio G. Cota
Currently we rely on atomic operations for cross-CPU invalidations. There are two cases that these atomics miss: cross-CPU invalidations can race with either (1) vCPU threads flushing their TLB, which happens via memset, or (2) vCPUs calling tlb_reset_dirty on their TLB, which updates .addr_write with a regular store. This results in undefined behaviour, since we're mixing regular and atomic ops on concurrent accesses. Fix it by using tlb_lock, a per-vCPU lock. All updaters of tlb_table and the corresponding victim cache now hold the lock. The readers that do not hold tlb_lock must use atomic reads when reading .addr_write, since this field can be updated by other threads; the conversion to atomic reads is done in the next patch. Note that an alternative fix would be to expand the use of atomic ops. However, in the case of TLB flushes this would have a huge performance impact, since (1) TLB flushes can happen very frequently and (2) we currently use a full memory barrier to flush each TLB entry, and a TLB has many entries. Instead, acquiring the lock is barely slower than a full memory barrier since it is uncontended, and with a single lock acquisition we can flush the entire TLB. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-6-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
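The locking discipline described above (updaters hold a per-vCPU lock, lock-free readers use atomic reads) can be sketched with C11 atomics. This is not the cputlb code: the `tlb_sketch_*` names, the table size, and the "invalid" marker are hypothetical, the lock is a plain `atomic_flag` spinlock, and the real flush path described in the commit uses memset rather than per-entry stores.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SKETCH_SIZE 4             /* toy number of entries */
#define TLB_SKETCH_INVALID UINT64_MAX /* hypothetical "no mapping" marker */

struct tlb_sketch {
    atomic_flag lock;                             /* stands in for tlb_lock */
    _Atomic uint64_t addr_write[TLB_SKETCH_SIZE]; /* may be updated cross-thread */
};

static void tlb_sketch_lock(struct tlb_sketch *tlb)
{
    while (atomic_flag_test_and_set_explicit(&tlb->lock, memory_order_acquire)) {
        /* spin; expected to be uncontended almost always */
    }
}

static void tlb_sketch_unlock(struct tlb_sketch *tlb)
{
    atomic_flag_clear_explicit(&tlb->lock, memory_order_release);
}

static void tlb_sketch_init(struct tlb_sketch *tlb)
{
    atomic_flag_clear(&tlb->lock);
    for (size_t i = 0; i < TLB_SKETCH_SIZE; i++) {
        atomic_init(&tlb->addr_write[i], TLB_SKETCH_INVALID);
    }
}

/* Updater: one lock acquisition covers the flush of the whole table. */
static void tlb_sketch_flush(struct tlb_sketch *tlb)
{
    tlb_sketch_lock(tlb);
    for (size_t i = 0; i < TLB_SKETCH_SIZE; i++) {
        atomic_store_explicit(&tlb->addr_write[i], TLB_SKETCH_INVALID,
                              memory_order_relaxed);
    }
    tlb_sketch_unlock(tlb);
}

/* Updater: single-entry update, also under the lock. */
static void tlb_sketch_set(struct tlb_sketch *tlb, size_t idx, uint64_t addr)
{
    tlb_sketch_lock(tlb);
    atomic_store_explicit(&tlb->addr_write[idx], addr, memory_order_relaxed);
    tlb_sketch_unlock(tlb);
}

/* Reader that does not take the lock: must use an atomic read. */
static uint64_t tlb_sketch_read(struct tlb_sketch *tlb, size_t idx)
{
    return atomic_load_explicit(&tlb->addr_write[idx], memory_order_relaxed);
}
```

The design point carried over from the commit message is that the flush pays for one lock acquisition in total instead of one memory barrier per entry.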
2018-10-18 | cputlb: fix assert_cpu_is_self macro | Emilio G. Cota
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-5-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | exec: introduce tlb_init | Emilio G. Cota
Paves the way for the addition of a per-TLB lock. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/unicore32: remove tlb_flush from uc32_init_fn | Emilio G. Cota
As far as I can tell tlb_flush does not need to be called this early. tlb_flush is eventually called after the CPU has been realized. This change paves the way to the introduction of tlb_init, which will be called from cpu_exec_realizefn. Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | target/alpha: remove tlb_flush from alpha_cpu_initfn | Emilio G. Cota
As far as I can tell tlb_flush does not need to be called this early. tlb_flush is eventually called after the CPU has been realized. This change paves the way to the introduction of tlb_init, which will be called from cpu_exec_realizefn. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | tcg: distribute tcg_time into TCG contexts | Emilio G. Cota
When we implemented per-vCPU TCG contexts, we forgot to also distribute the tcg_time counter, which has remained as a global accessed without any serialization, leading to potentially missed counts. Fix it by distributing the field over the TCG contexts, embedding it into TCGProfile with a field called "cpu_exec_time", which is more descriptive than "tcg_time". Add a function to query this value directly, and for completeness, fill in the field in tcg_profile_snapshot, even though its callers do not use it. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-5-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
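Distributing a counter over per-thread contexts and summing it on query can be sketched as follows. Only the field name `cpu_exec_time` comes from the commit message; `struct ctx_sketch` and the three functions are hypothetical and much simpler than TCGProfile.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-context profile: one instance per vCPU thread. */
struct ctx_sketch {
    _Atomic int64_t cpu_exec_time;
};

static void ctx_sketch_init(struct ctx_sketch *ctx)
{
    atomic_init(&ctx->cpu_exec_time, 0);
}

/*
 * Called only by the thread that owns `ctx`. Because each counter has a
 * single writer, a relaxed load followed by a relaxed store cannot lose
 * an increment, unlike unsynchronized updates to one shared global.
 */
static void ctx_sketch_add_time(struct ctx_sketch *ctx, int64_t delta)
{
    int64_t old = atomic_load_explicit(&ctx->cpu_exec_time, memory_order_relaxed);
    atomic_store_explicit(&ctx->cpu_exec_time, old + delta, memory_order_relaxed);
}

/* Query helper: sum the per-context values with atomic reads. */
static int64_t ctx_sketch_total_time(struct ctx_sketch *ctxs, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += atomic_load_explicit(&ctxs[i].cpu_exec_time, memory_order_relaxed);
    }
    return total;
}
```

The total is a snapshot: contexts that are still running may advance their counters while the sum is being computed.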
2018-10-18 | tcg: plug holes in struct TCGProfile | Emilio G. Cota
This plugs two 4-byte holes in 64-bit. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
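How field order creates and removes such holes can be shown with a small sketch. The two structs below are hypothetical and unrelated to the actual layout of TCGProfile; on a typical 64-bit ABI the first has two 4-byte padding holes and the second has none.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * On a typical LP64 ABI: a (4) + hole (4) + b (8) + c (4) + hole (4) + d (8)
 * gives 32 bytes, 8 of which are padding.
 */
struct profile_with_holes {
    int32_t a;
    int64_t b;
    int32_t c;
    int64_t d;
};

/* Same fields, 64-bit members first: 8 + 8 + 4 + 4 = 24 bytes, no holes. */
struct profile_reordered {
    int64_t b;
    int64_t d;
    int32_t a;
    int32_t c;
};

/* Bytes saved by the reordering on the current ABI (0 if none). */
static size_t bytes_saved_by_reordering(void)
{
    return sizeof(struct profile_with_holes) - sizeof(struct profile_reordered);
}
```

A tool such as pahole reports these holes directly; the sketch only makes the arithmetic visible.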
2018-10-18 | tcg: fix use of uninitialized variable under CONFIG_PROFILER | Emilio G. Cota
We forgot to initialize n in commit 15fa08f845 ("tcg: Dynamically allocate TCGOps", 2017-12-29). Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 | tcg: access cpu->icount_decr.u16.high with atomics | Emilio G. Cota
Consistently access u16.high with atomics to avoid undefined behaviour in MTTCG. Note that icount_decr.u16.low is only used in icount mode, so regular accesses to it are OK. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
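A minimal sketch of the access rule, assuming C11 atomics rather than QEMU's own atomic macros. `struct icount_sketch` and its helper functions are hypothetical, and the real field is part of a union that this sketch does not reproduce.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split counter modelled on the two halves named in the commit. */
struct icount_sketch {
    _Atomic uint16_t high; /* may be written by other threads: atomics only */
    uint16_t low;          /* used only in icount mode per the commit: plain access */
};

static void icount_sketch_init(struct icount_sketch *d)
{
    atomic_init(&d->high, 0);
    d->low = 0;
}

/* Another thread asks the vCPU to leave its execution loop. */
static void icount_sketch_request_exit(struct icount_sketch *d)
{
    atomic_store_explicit(&d->high, UINT16_MAX, memory_order_relaxed);
}

/* The vCPU thread polls the flag, again with an atomic access. */
static bool icount_sketch_exit_requested(struct icount_sketch *d)
{
    return atomic_load_explicit(&d->high, memory_order_relaxed) != 0;
}
```

Using atomics on both the writing and the reading side is what removes the mix of plain and atomic accesses to the same location.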