aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-03-15pci: introduce a bus master containerJason Wang
96a8821d2141 ("virtio: unbreak virtio-pci with IOMMU after caching ring translations") tries to make IOMMU works with virtio memory region cache, but it requires IOMMU to be created before any virtio devices. This is sub optimal, fixing this by introduce a bus master container to make sure address space can be initialized during device registering, and then we can safely set alias and make bus_master_enable_region as its subregion during bus master initialization. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-14icount: process QEMU_CLOCK_VIRTUAL timers in vCPU threadPaolo Bonzini
icount has become much slower after tcg_cpu_exec has stopped using the BQL. There is also a latent bug that is masked by the slowness. The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL timer now has to wake up the I/O thread and wait for it. The rendez-vous is mediated by the BQL QemuMutex: - handle_icount_deadline wakes up the I/O thread with BQL taken - the I/O thread wakes up and waits on the BQL - the VCPU thread releases the BQL a little later - the I/O thread raises an interrupt, which calls qemu_cpu_kick - the VCPU thread notices the interrupt, takes the BQL to process it and waits on it All this back and forth is extremely expensive, causing a 6 to 8-fold slowdown when icount is turned on. One may think that the issue is that the VCPU thread is too dependent on the BQL, but then the latent bug comes in. I first tried removing the BQL completely from the x86 cpu_exec, only to see everything break. The only way to fix it (and make everything slow again) was to add a dummy BQL lock/unlock pair. This is because in -icount mode you really have to process the events before the CPU restarts executing the next instruction. Therefore, this series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in the vCPU thread when running in icount mode. The required changes include: - make the timer notification callback wake up TCG's single vCPU thread when run from another thread. By using async_run_on_cpu, the callback can override all_cpu_threads_idle() when the CPU is halted. - move handle_icount_deadline after qemu_tcg_wait_io_event, so that the timer notification callback is invoked after the dummy work item wakes up the vCPU thread - make handle_icount_deadline run the timers instead of just waking the I/O thread. - stop processing the timers in the main loop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14cpus: define QEMUTimerListNotifyCB for QEMU system emulationPaolo Bonzini
There is no change for now, because the callback just invokes qemu_notify_event. Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.hPaolo Bonzini
This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14mem-prealloc: reduce large guest start-up and migration time.Jitendra Kolhe
Using "-mem-prealloc" option for a large guest leads to higher guest start-up and migration time. This is because with "-mem-prealloc" option qemu tries to map every guest page (create address translations), and make sure the pages are available during runtime. virsh/libvirt by default, seems to use "-mem-prealloc" option in case the guest is configured to use huge pages. The patch tries to map all guest pages simultaneously by spawning multiple threads. Currently limiting the change to QEMU library functions on POSIX compliant host only, as we are not sure if the problem exists on win32. Below are some stats with "-mem-prealloc" option for guest configured to use huge pages. ------------------------------------------------------------------------ Idle Guest | Start-up time | Migration time ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - single threaded (existing code) ------------------------------------------------------------------------ 64 Core - 4TB | 54m11.796s | 75m43.843s 64 Core - 1TB | 8m56.576s | 14m29.049s 64 Core - 256GB | 2m11.245s | 3m26.598s ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - map guest pages using 8 threads ------------------------------------------------------------------------ 64 Core - 4TB | 5m1.027s | 34m10.565s 64 Core - 1TB | 1m10.366s | 8m28.188s 64 Core - 256GB | 0m19.040s | 2m10.148s ----------------------------------------------------------------------- Guest stats with 2M HugePage usage - map guest pages using 16 threads ----------------------------------------------------------------------- 64 Core - 4TB | 1m58.970s | 31m43.400s 64 Core - 1TB | 0m39.885s | 7m55.289s 64 Core - 256GB | 0m11.960s | 2m0.135s ----------------------------------------------------------------------- Changed in v2: - modify number of memset threads spawned to min(smp_cpus, 16). - removed 64GB memory restriction for spawning memset threads. Changed in v3: - limit number of threads spawned based on min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus) - implement memset thread specific siglongjmp in SIGBUS signal_handler. Changed in v4 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread as main thread no longer touches any pages. - simplify code my returning memset_thread_failed status from touch_all_pages. Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14memory_region: Fix name commentsDr. David Alan Gilbert
The 'name' parameter to memory_region_init_* had been marked as debug only, however vmstate_region_ram uses it as a parameter to qemu_ram_set_idstr to set RAMBlock names and these form part of the migration stream. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170309152708.30635-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170314' ↵Peter Maydell
into staging ppc patch queue for 2017-03-14 This set has a handful og bugfixes to go into qemu-2.9. This includes an update to the dtc/libfdt submodule which will fix the build errors seen on some distributions. # gpg: Signature made Tue 14 Mar 2017 04:00:41 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170314: dtc: Update submodule to avoid build errors pseries: Don't expose PCIe extended config space on older machine types target/ppc: fix cpu_ov setting for 32-bit target/ppc: Fix wrong number of UAMR register Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14build: include sys/sysmacros.h for major() and minor()Christopher Covington
The definition of the major() and minor() macros are moving within glibc to <sys/sysmacros.h>. Include this header when it is available to avoid the following sorts of build-stopping messages: qga/commands-posix.c: In function ‘dev_major_minor’: qga/commands-posix.c:656:13: error: In the GNU C Library, "major" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "major", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "major", you should undefine it after including <sys/types.h>. [-Werror] *devmajor = major(st.st_rdev); ^~~~~~~~~~~~~~~~~~~~~~~~~~ qga/commands-posix.c:657:13: error: In the GNU C Library, "minor" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "minor", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "minor", you should undefine it after including <sys/types.h>. [-Werror] *devminor = minor(st.st_rdev); ^~~~~~~~~~~~~~~~~~~~~~~~~~ The additional include allows the build to complete on Fedora 26 (Rawhide) with glibc version 2.24.90. Signed-off-by: Christopher Covington <cov@codeaurora.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14pseries: Don't expose PCIe extended config space on older machine typesDavid Gibson
bb9986452 "spapr_pci: Advertise access to PCIe extended config space" allowed guests to access the extended config space of PCI Express devices via the PAPR interfaces, even though the paravirtualized bus mostly acts like plain PCI. However, that patch enabled access unconditionally, including for existing machine types, which is an unwise change in behaviour. This patch limits the change to pseries-2.9 (and later) machine types. Suggested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-10i386: Change stepping of Haswell to non-blacklisted valueEduardo Habkost
glibc blacklists TSX on Haswell CPUs with model==60 and stepping < 4. To make the Haswell CPU model more useful, make those guests actually use TSX by changing CPU stepping to 4. References: * glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359 https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170309181212.18864-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-08Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer fixes for 2.9.0-rc0 # gpg: Signature made Tue 07 Mar 2017 14:59:18 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (27 commits) commit: Don't use error_abort in commit_start block: Don't use error_abort in blk_new_open sheepdog: Support blockdev-add qapi-schema: Rename SocketAddressFlat's variant tcp to inet qapi-schema: Rename GlusterServer to SocketAddressFlat gluster: Plug memory leaks in qemu_gluster_parse_json() gluster: Don't duplicate qapi-util.c's qapi_enum_parse() gluster: Drop assumptions on SocketTransport names sheepdog: Implement bdrv_parse_filename() sheepdog: Use SocketAddress and socket_connect() sheepdog: Report errors in pseudo-filename more usefully sheepdog: Don't truncate long VDI name in _open(), _create() sheepdog: Fix snapshot ID parsing in _open(), _create, _goto() sheepdog: Mark sd_snapshot_delete() lossage FIXME sheepdog: Fix error handling sd_create() sheepdog: Fix error handling in sd_snapshot_delete() sheepdog: Defuse time bomb in sd_open() error handling block: Fix error handling in bdrv_replace_in_backing_chain() block: Handle permission errors in change_parent_backing_link() block: Ignore multiple children in bdrv_check_update_perm() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-07qapi: New qobject_input_visitor_new_str() for convenienceMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-21-git-send-email-armbru@redhat.com>
2017-03-07qapi: New parse_qapi_name()Markus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488317230-26248-19-git-send-email-armbru@redhat.com>
2017-03-07qobject: Propagate parse errors through qobject_from_json()Markus Armbruster
The next few commits will put the errors to use where appropriate. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488317230-26248-13-git-send-email-armbru@redhat.com>
2017-03-07qobject: Propagate parse errors through qobject_from_jsonv()Markus Armbruster
The next few commits will put the errors to use where appropriate. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-9-git-send-email-armbru@redhat.com>
2017-03-07qapi: qobject input visitor variant for use with keyval_parse()Daniel P. Berrange
Currently the QObjectInputVisitor assumes that all scalar values are directly represented as the final types declared by the thing being visited. i.e. it assumes an 'int' is using QInt, and a 'bool' is using QBool, etc. This is good when QObjectInputVisitor is fed a QObject that came from a JSON document on the QMP monitor, as it will strictly validate correctness. To allow QObjectInputVisitor to be reused for visiting a QObject originating from keyval_parse(), an alternative mode is needed where all the scalars types are represented as QString and converted on the fly to the final desired type. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1475246744-29302-8-git-send-email-berrange@redhat.com> Rebased, conflicts resolved, commit message updated to refer to keyval_parse(). autocast replaced by keyval in identifiers, noautocast replaced by fail in tests. Fix qobject_input_type_uint64_keyval() not to reject '-', for QemuOpts compatibility: replace parse_uint_full() by open-coded parse_option_number(). The next commit will add suitable tests. Leave out the fancy ERANGE error reporting for now, but add a TODO comment. Add it qobject_input_type_int64_keyval() and qobject_input_type_number_keyval(), too. Open code parse_option_bool() and parse_option_size() so we have to call qobject_input_get_name() only when actually needed. Again, leave out ERANGE error reporting for now. QAPI/QMP downstream extension prefixes __RFQDN_ don't work, because keyval_parse() splits them at '.'. This will be addressed later in the series. qobject_input_type_int64_keyval(), qobject_input_type_uint64_keyval(), qobject_input_type_number_keyval() tweaked for style. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-5-git-send-email-armbru@redhat.com>
2017-03-07keyval: New keyval_parse()Markus Armbruster
keyval_parse() parses KEY=VALUE,... into a QDict. Works like qemu_opts_parse(), except: * Returns a QDict instead of a QemuOpts (d'oh). * Supports nesting, unlike QemuOpts: a KEY is split into key fragments at '.' (dotted key convention; the block layer does something similar on top of QemuOpts). The key fragments are QDict keys, and the last one's value is updated to VALUE. * Each key fragment may be up to 127 bytes long. qemu_opts_parse() limits the entire key to 127 bytes. * Overlong key fragments are rejected. qemu_opts_parse() silently truncates them. * Empty key fragments are rejected. qemu_opts_parse() happily accepts empty keys. * It does not store the returned value. qemu_opts_parse() stores it in the QemuOptsList. * It does not treat parameter "id" specially. qemu_opts_parse() ignores all but the first "id", and fails when its value isn't id_wellformed(), or duplicate (a QemuOpts with the same ID is already stored). It also screws up when a value contains ",id=". * Implied value is not supported. qemu_opts_parse() desugars "foo" to "foo=on", and "nofoo" to "foo=off". * An implied key's value can't be empty, and can't contain ','. I intend to grow this into a saner replacement for QemuOpts. It'll take time, though. Note: keyval_parse() provides no way to do lists, and its key syntax is incompatible with the __RFQDN_ prefix convention for downstream extensions, because it blindly splits at '.', even in __RFQDN_. Both issues will be addressed later in the series. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
2017-03-07block: Fix error handling in bdrv_replace_in_backing_chain()Kevin Wolf
When adding an Error parameter, bdrv_replace_in_backing_chain() would become nothing more than a wrapper around change_parent_backing_link(). So make the latter public, renamed as bdrv_replace_node(), and remove bdrv_replace_in_backing_chain(). Most of the callers just remove a node from the graph that they just inserted, so they can use &error_abort, but completion of a mirror job with 'replaces' set can actually fail. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07block: Ignore multiple children in bdrv_check_update_perm()Kevin Wolf
change_parent_backing_link() will need to update multiple BdrvChild objects at once. Checking permissions reference by reference doesn't work because permissions need to be consistent only with all parents moved to the new child. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-06Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Mon 06 Mar 2017 04:15:17 GMT # gpg: using RSA key 0xEF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: net/filter-mirror: Follow CODING_STYLE COLO-compare: Fix icmp and udp compare different packet always dump bug COLO-compare: Optimize compare_common and compare_tcp COLO-compare: Rename compare function and remove duplicate codes filter-rewriter: skip net_checksum_calculate() while offset = 0 net/colo: fix memory double free error vmxnet3: VMStatify rx/tx q_descr and int_state vmxnet3: Convert ring values to uint32_t's net/colo-compare: Fix memory free error colo-compare: Fix removing fds been watched incorrectly in finalization char: remove the right fd been watched in qemu_chr_fe_set_handlers() colo-compare: kick compare thread to exit after some cleanup in finalization colo-compare: use g_timeout_source_new() to process the stale packets NetRxPkt: Remove code duplication in net_rx_pkt_pull_data() NetRxPkt: Account buffer with ETH header in IOV length NetRxPkt: Do not try to pull more data than present NetRxPkt: Fix memory corruption on VLAN header stripping eth: Extend vlan stripping functions net: Remove useless local var pkt Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-06eth: Extend vlan stripping functionsDmitry Fleytman
Make VLAN stripping functions return number of bytes copied to given Ethernet header buffer. This information should be used to re-compose packet IOV after VLAN stripping. Cc: qemu-stable@nongnu.org Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-05qapi: Improve qobject visitor documentationMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-29-git-send-email-armbru@redhat.com>
2017-03-05qapi: Make input visitors detect unvisited list tailsMarkus Armbruster
Fix the design flaw demonstrated in the previous commit: new method check_list() lets input visitors report that unvisited input remains for a list, exactly like check_struct() lets them report that unvisited input remains for a struct or union. Implement the method for the qobject input visitor (straightforward), and the string input visitor (less so, due to the magic list syntax there). The opts visitor's list magic is even more impenetrable, and all I can do there today is a stub with a FIXME comment. No worse than before. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488544368-30622-26-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05qapi: Drop unused non-strict qobject input visitorMarkus Armbruster
The split between tests/test-qobject-input-visitor.c and tests/test-qobject-input-strict.c now makes less sense than ever. The next commit will take care of that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-20-git-send-email-armbru@redhat.com>
2017-03-05qapi: Drop string input visitor method optional()Markus Armbruster
visit_optional() is to be called only between visit_start_struct() and visit_end_struct(). Visitors that don't support struct visits, i.e. don't implement start_struct(), end_struct(), have no use for it. Clarify documentation. The string input visitor doesn't support struct visits. Its parse_optional() is therefore useless. Drop it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-16-git-send-email-armbru@redhat.com>
2017-03-05qapi: Improve qobject input visitor error reportingMarkus Armbruster
Error messages refer to nodes of the QObject being visited by name. Trouble is the names are sometimes less than helpful: * The name of the root QObject is whatever @name argument got passed to the visitor, except NULL gets mapped to "null". We commonly pass NULL. Not good. Avoiding errors "at the root" mitigates. For instance, visit_start_struct() can only fail when the visited object is not a dictionary, and we commonly ensure it is beforehand. * The name of a QDict's member is the member key. Good enough only when this happens to be unique. * The name of a QList's member is "null". Not good. Improve error messages by referring to nodes by path instead, as follows: * The path of the root QObject is whatever @name argument got passed to the visitor, except NULL gets mapped to "<anonymous>". * The path of a root QDict's member is the member key. * The path of a root QList's member is "[%u]", where %u is the list index, starting at zero. * The path of a non-root QDict's member is the path of the QDict concatenated with "." and the member key. * The path of a non-root QList's member is the path of the QList concatenated with "[%u]", where %u is the list index. For example, the incorrect QMP command { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "raw", "file": {"driver": "file" } } } now fails with {"error": {"class": "GenericError", "desc": "Parameter 'file.filename' is missing"}} instead of {"error": {"class": "GenericError", "desc": "Parameter 'filename' is missing"}} and { "execute": "input-send-event", "arguments": { "device": "bar", "events": [ [] ] } } now fails with {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'events[0]', expected: object"}} instead of {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'null', expected: QDict"}} Aside: calling the thing "parameter" is suboptimal for QMP, because the root object is "arguments" there. The qobject output visitor doesn't have this problem because it should not fail. Same for dealloc and clone visitors. The string visitors don't have this problem because they visit just one value, whose name needs to be passed to the visitor as @name. The string output visitor shouldn't fail anyway. The options visitor uses QemuOpts names. Their name space is flat, so the use of QDict member keys as names is fine. NULL names used with roots and lists could conceivably result in bad error messages. Left for another day. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-15-git-send-email-armbru@redhat.com>
2017-03-05qmp: Eliminate silly QERR_QMP_* macrosMarkus Armbruster
The QERR_ macros are leftovers from the days of "rich" error objects. QERR_QMP_BAD_INPUT_OBJECT, QERR_QMP_BAD_INPUT_OBJECT_MEMBER, QERR_QMP_EXTRA_MEMBER are used in just one place now, except for one use that has crept into qobject-input-visitor.c. Drop these macros, to make the (bad) error messages more visible. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-10-git-send-email-armbru@redhat.com>
2017-03-05qapi: Support multiple command registries per programMarkus Armbruster
The command registry encapsulates a single command list. Give the functions using it a parameter instead. Define suitable command lists in monitor, guest agent and test-qmp-commands. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488544368-30622-6-git-send-email-armbru@redhat.com> [Debugging turds buried] Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05qmp: Dumb down how we run QMP command registrationMarkus Armbruster
The way we get QMP commands registered is high tech: * qapi-commands.py generates qmp_init_marshal() that does the actual work * it also generates the magic to register it as a MODULE_INIT_QAPI function, so it runs when someone calls module_call_init(MODULE_INIT_QAPI) * main() calls module_call_init() QEMU needs to register a few non-qapified commands. Same high tech works: monitor.c has its own qmp_init_marshal() along with the magic to make it run in module_call_init(MODULE_INIT_QAPI). QEMU also needs to unregister commands that are not wanted in this build's configuration (commit 5032a16). Simple enough: qmp_unregister_commands_hack(). The difficulty is to make it run after the generated qmp_init_marshal(). We can't simply run it in monitor.c's qmp_init_marshal(), because the order in which the registered functions run is indeterminate. So qmp_init_marshal() registers qmp_unregister_commands_hack() separately. Since registering *appends* to the list of registered functions, this will make it run after all the functions that have been registered already. I suspect it takes a long and expensive computer science education to not find this silly. Dumb it down as follows: * Drop MODULE_INIT_QAPI entirely * Give the generated qmp_init_marshal() external linkage. * Call it instead of module_call_init(MODULE_INIT_QAPI) * Except in QEMU proper, call new monitor_init_qmp_commands() that in turn calls the generated qmp_init_marshal(), registers the additional commands and unregisters the unwanted ones. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1488544368-30622-5-git-send-email-armbru@redhat.com>
2017-03-04Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' ↵Peter Maydell
into staging ppc patch queuye for 2017-03-03 This will probably be my last pull request before the hard freeze. It has some new work, but that has all been posted in draft before the soft freeze, so I think it's reasonable to include in qemu-2.9. This batch has: * A substantial amount of POWER9 work * Implements the legacy (hash) MMU for POWER9 * Some more preliminaries for implementing the POWER9 radix MMU * POWER9 has_work * Basic POWER9 compatibility mode handling * Removal of some premature tests * Some cleanups and fixes to the existing MMU code to make the POWER9 work simpler * A bugfix for TCG multiply adds on power * Allow pseries guests to access PCIe extended config space This also includes a code-motion not strictly in ppc code - moving getrampagesize() from ppc code to exec.c. This will make some future VFIO improvements easier, Paolo said it was ok to merge via my tree. # gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170303: target/ppc: rewrite f[n]m[add,sub] using float64_muladd spapr: Small cleanup of PPC MMU enums spapr_pci: Advertise access to PCIe extended config space target/ppc: Rework hash mmu page fault code and add defines for clarity target/ppc: Move no-execute and guarded page checking into new function target/ppc: Add execute permission checking to access authority check target/ppc: Add Instruction Authority Mask Register Check hw/ppc/spapr: Add POWER9 to pseries cpu models target/ppc/POWER9: Add cpu_has_work function for POWER9 target/ppc/POWER9: Add POWER9 pa-features definition target/ppc/POWER9: Add POWER9 mmu fault handler target/ppc: Don't gen an SDR1 on POWER9 and rework register creation target/ppc: Add patb_entry to sPAPRMachineState target/ppc/POWER9: Add POWERPC_MMU_V3 bit powernv: Don't test POWER9 CPU yet exec, kvm, target-ppc: Move getrampagesize() to common code target/ppc: Add POWER9/ISAv3.00 to compat_table Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04ppc: avoid typedef redefinitionsPaolo Bonzini
These cause compilation failures on CentOS 6 or other operating systems with older GCCs. Cc: David Gibson <dgibson@redhat.com> Cc: qemu-ppc@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1488558530-21016-3-git-send-email-pbonzini@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03x86: Work around SMI migration breakagesDr. David Alan Gilbert
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU due to a disagreement about SM (System management) interrupts. 2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI and this gets into the migration stream, but on 2.3.0 it never got delivered. ~2.4.0 SMI interrupt support was added but was broken - so that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI but never actually caused an interrupt. The SMI delivery was recently fixed by 68c6efe07a, but the effect now is that an incoming 2.3.0 stream takes the interrupt it had flagged but it's bios can't actually handle it(I think partly due to the original interrupt not being taken during boot?). The consequence is a triple(?) fault and a reboot. Tested from: 2.3.1 -M 2.3.0 2.7.0 -M 2.3.0 2.8.0 -M 2.3.0 2.8.0 -M 2.8.0 This corresponds to RH bugzilla entry 1420679. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170223133441.16010-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03memory: Introduce DEVICE_HOST_ENDIAN for ram deviceYongji Xie
At the moment ram device's memory regions are DEVICE_NATIVE_ENDIAN. It's incorrect. This memory region is backed by a MMIO area in host, so the uint64_t data that MemoryRegionOps read from/write to this area should be host-endian rather than target-endian. Hence, current code does not work when target and host endianness are different which is the most common case on PPC64. To fix it, this introduces DEVICE_HOST_ENDIAN for the ram device. This has been tested on PPC64 BE/LE host/guest in all possible combinations including TCG. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> Message-Id: <1488171164-28319-1-git-send-email-xyjxie@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03KVM: move SIG_IPI handling to kvm-all.cPaolo Bonzini
This lets us remove a bunch of CONFIG_LINUX defines. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03KVM: do not use sigtimedwait to catch SIGBUSPaolo Bonzini
Call kvm_on_sigbus_vcpu asynchronously from the VCPU thread. Information for the SIGBUS can be stored in thread-local variables and processed later in kvm_cpu_exec. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03KVM: remove kvm_arch_on_sigbusPaolo Bonzini
Build it on kvm_arch_on_sigbus_vcpu instead. They do the same for "action optional" SIGBUSes, and the main thread should never get "action required" SIGBUSes because it blocks the signal. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03cpus: reorganize signal handling codePaolo Bonzini
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately, without passing it to KVM. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03cpus: remove ugly cast on sigbus_handlerPaolo Bonzini
The cast is there because sigbus_handler is invoked via sigfd_handler. But it feels just wrong to use struct qemu_signalfd_siginfo in the prototype of a function that is passed to sigaction. Instead, do a simple-minded conversion of qemu_signalfd_siginfo to siginfo_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03Merge branch 'icount-update' into HEADPaolo Bonzini
Merge the original development branch due to breakage caused by the MTTCG merge. Conflicts: cpu-exec.c translate-common.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio, pc: fixes, features virtio support for region caches broke a bunch of stuff - fixing most of it though it's not ideal. Still pondering the right way to fix it. New: VM gen ID and hotplug for PXB. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 02 Mar 2017 06:19:17 GMT # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: hw/pxb-pcie: fix PCI Express hotplug support tests/acpi: update DSDT after last patch acpi: simplify _OSC virtio: unbreak virtio-pci with IOMMU after caching ring translations virtio: add missing region cache init in virtio_load() virtio: invalidate memory in vring_set_avail_event() virtio: guard vring access when setting notification virtio: check for vring setup in virtio_queue_empty MAINTAINERS: Add VM Generation ID entries tests: Move reusable ACPI code into a utility file qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands ACPI: Add Virtual Machine Generation ID support ACPI: Add vmgenid blob storage to the build tables docs: VM Generation ID device description linker-loader: Add new 'write pointer' command Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03target/ppc: Add patb_entry to sPAPRMachineStateSuraj Jitindar Singh
ISA v3.00 adds the idea of a partition table which is used to store the address translation details for all partitions on the system. The partition table consists of double word entries indexed by partition id where the second double word contains the location of the process table in guest memory. The process table is registered by the guest via a h-call. We need somewhere to store the address of the process table so we add an entry to the sPAPRMachineState struct called patb_entry to represent the second doubleword of a single partition table entry corresponding to the current guest. We need to store this value so we know if the guest is using radix or hash translation and the location of the corresponding process table in guest memory. Since we only have a single guest per qemu instance, we only need one entry. Since the partition table is technically a hypervisor resource we require that access to it is abstracted by the virtual hypervisor through the get_patbe() call. Currently the value of the entry is never set (and thus defaults to 0 indicating hash), but it will be required to both implement POWER9 kvm support and tcg radix support. We also add this field to be migrated as part of the sPAPRMachineState as we will need it on the receiving side as the guest will never tell us this information again and we need it to perform translation. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03exec, kvm, target-ppc: Move getrampagesize() to common codeAlexey Kardashevskiy
getrampagesize() returns the largest supported page size and mainly used to know if huge pages are enabled. However is implemented in target-ppc/kvm.c and not available in TCG or other architectures. This renames and moves gethugepagesize() to mmap-alloc.c where fd-based analog of it is already implemented. This renames and moves getrampagesize() to exec.c as it seems to be the common place for helpers like this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-02Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20170301-1' ↵Peter Maydell
into staging audio: replay support, sdl2 fix. # gpg: Signature made Wed 01 Mar 2017 15:38:09 GMT # gpg: using RSA key 0x4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/pull-audio-20170301-1: audio/sdlaudio: Allow audio playback with SDL2 audio: make audio poll timer deterministic replay: add record/replay for audio passthrough Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02Merge remote-tracking branch ↵Peter Maydell
'remotes/dgilbert/tags/pull-migration-20170228a' into staging Migration pull Note: The 'postcopy: Update userfaultfd.h header' is part of Paolo's header update and will disappear if applied after it. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT # gpg: using RSA key 0x0516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-migration-20170228a: (27 commits) postcopy: Add extra check for COPY function postcopy: Add doc about hugepages and postcopy postcopy: Check for userfault+hugepage feature postcopy: Update userfaultfd.h header postcopy: Allow hugepages postcopy: Send whole huge pages postcopy: Mask fault addresses to huge page boundary postcopy: Load huge pages in one go postcopy: Use temporary for placing zero huge pages postcopy: Plumb pagesize down into place helpers postcopy: Record largest page size postcopy: enhance ram_block_discard_range for hugepages exec: ram_block_discard_range postcopy: Chunk discards for hugepages postcopy: Transmit and compare individual page sizes postcopy: Transmit ram size summary word migration: fix use-after-free of to_dst_file migration: Update docs to discourage version bumps migration: fix id leak regression migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT # gpg: using RSA key 0xDAE8E10975969CE5 # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/leak-pull-request: (28 commits) tests: fix virtio-blk-test leaks tests: add specialized device_find function tests: fix usb-test leaks tests: allows to run single test in usb-hcd-ehci-test usb: release the created buses bus: do not unref hotplug handler tests: fix virtio-9p-test leaks tests: fix virtio-scsi-test leak tests: fix e1000e leaks tests: fix i440fx-test leaks tests: fix e1000-test leak tests: fix tco-test leaks tests: fix eepro100-test leak pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice tests: fix ipmi-bt-test leak tests: fix ipmi-kcs-test leak tests: fix bios-tables-test leak tests: fix hd-geo-test leaks tests: fix ide-test leaks tests: fix vhost-user-test leaks ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170301' ↵Peter Maydell
into staging ppc patch queue for 2017-03-01 I was hoping to get this pull request squeezed in before the soft freeze, but I ran into some difficulties during testing. Everything here was at least posted before the soft freeze, so I'm hoping we can still merge it for 2.9. The biggest things here are: * Cleanups to handling of hashed page tables, that will make adding support for the POWER9 MMU easier * Cleanups to the XICS interrupt controller that will make implementing the powernv machine easier * TCG implementation of extended overflow and carry handling for POWER9 It also includes: * Increasing the CPU limit for pseries to 1024 vCPUs * Generating proper OF node names in qemu (making hotplug and coldplug logic closer together) # gpg: Signature made Wed 01 Mar 2017 04:43:06 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170301: (50 commits) Add PowerPC 32-bit guest memory dump support ppc/xics: rename 'ICPState *' variables to 'icp' ppc/xics: move InterruptStatsProvider to the sPAPR machine ppc/xics: move ics-simple post_load under the machine ppc/xics: remove the XICSState classes ppc/xics: export the XICS init routines ppc/xics: move the ICP array under the sPAPR machine ppc/xics: register the reset handler of ICP objects ppc/xics: simplify spapr_dt_xics() interface ppc/xics: use the QOM interface to grab an ICP ppc/xics: move the cpu_setup() handler under the ICPState class ppc/xics: simplify the cpu_setup() handler ppc/xics: move kernel_xics_fd out of KVMXICSState ppc/xics: extend the QOM interface to handle ICPs ppc/xics: remove the XICS list of ICS ppc/xics: register the reset handler of ICS objects ppc/xics: remove xics_find_source() ppc/xics: use the QOM interface to resend irqs ppc/xics: use the QOM interface to get irqs ppc/xics: use the QOM interface under the sPAPR machine ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02ACPI: Add Virtual Machine Generation ID supportBen Warren
This implements the VM Generation ID feature by passing a 128-bit GUID to the guest via a fw_cfg blob. Any time the GUID changes, an ACPI notify event is sent to the guest The user interface is a simple device with one parameter: - guid (string, must be "auto" or in UUID format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) Signed-off-by: Ben Warren <ben@skyportsystems.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02ACPI: Add vmgenid blob storage to the build tablesBen Warren
This allows them to be centrally initialized and destroyed The "AcpiBuildTables.vmgenid" array will be used to construct the "etc/vmgenid_guid" fw_cfg blob. Its contents will be linked into fw_cfg after being built on the pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped without use on the subsequent, guest triggered, acpi_build_update() -> acpi_build() call path. Signed-off-by: Ben Warren <ben@skyportsystems.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02linker-loader: Add new 'write pointer' commandBen Warren
This is similar to the existing 'add pointer' functionality, but instead of instructing the guest (BIOS or UEFI) to patch memory, it instructs the guest to write the pointer back to QEMU via a writeable fw_cfg file. Signed-off-by: Ben Warren <ben@skyportsystems.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-01Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches # gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (46 commits) block: Add Error parameter to bdrv_append() block: Add Error parameter to bdrv_set_backing_hd() block: Assertions for resize permission block: Assertions for write permissions block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read tests: Remove FIXME comments nbd/server: Use real permissions for NBD exports migration/block: Use real permissions hmp: Request permissions in qemu-io commit: Add filter-node-name to block-commit mirror: Add filter-node-name to blockdev-mirror stream: Use real permissions in streaming block job mirror: Use real permissions in mirror/active commit block job blockjob: Factor out block_job_remove_all_bdrv() block: Allow backing file links in change_parent_backing_link() block: BdrvChildRole.attach/detach() callbacks block: Fix pending requests check in bdrv_append() backup: Use real permissions in backup block job commit: Use real permissions for HMP 'commit' commit: Use real permissions in commit block job ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>