aboutsummaryrefslogtreecommitdiff
path: root/docs
AgeCommit message (Collapse)Author
2015-10-22vhost user: add rarp sending after live migration for legacy guestThibaut Collet
A new vhost user message is added to allow QEMU to ask to vhost user backend to broadcast a fake RARP after live migration for guest without GUEST_ANNOUNCE capability. This new message is sent only if the backend supports the new VHOST_USER_PROTOCOL_F_RARP protocol feature. The payload of this new message is the MAC address of the guest (not known by the backend). The MAC address is copied in the first 6 bytes of a u64 to avoid to create a new payload message type. This new message has no equivalent ioctl so a new callback is added in the userOps structure to send the request. Upon reception of this new message the vhost user backend must generate and broadcast a fake RARP request to notify the migration is terminated. Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> [Rebased and fixed checkpatch errors - Marc-André] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-22vhost-user: document migration logMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-22virtio: add some migration docCornelia Huck
Try to cover the basics of virtio migration. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>
2015-10-12qapi: Consistent generated code: prefer visitor 'v'Eric Blake
We had some pointless differences in the generated code for visit, command marshalling, and events; unifying them makes it easier for future patches to consolidate to common helper functions. This is one patch of a series to clean up these differences. This patch names the local visitor variable 'v' rather than 'm'. Related objects, such as 'QapiDeallocVisitor', are also named by their initials instead of an unrelated leading m. No change in semantics to the generated code. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1443565276-4535-12-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-10-12qapi: Consistent generated code: prefer error 'err'Eric Blake
We had some pointless differences in the generated code for visit, command marshalling, and events; unifying them makes it easier for future patches to consolidate to common helper functions. This is one patch of a series to clean up these differences. This patch consistently names the local error variable 'err' rather than 'local_err'. No change in semantics to the generated code. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1443565276-4535-11-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-10-12docs: Move files from docs/qmp/ to docs/Markus Armbruster
Giving QMP its own subdirectory in docs/ is hardly worthwhile when we have just four files, and one of them isn't even in the subdirectory. Move the files from docs/qmp/ to docs/, renaming docs/qmp/README to docs/qmp-intro. Update MAINTAINERS. The new pattern also captures the fourth file docs/writing-qmp-commands.txt. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1443111117-29831-2-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-10-09docs: update the usage example of "dtrace" backend in tracing.txtLin Ma
The usage example of dtrace is quite ancient, We have tracetool.py with different parameters instead of the original tracetool shell script for a long time, So update the old information. Signed-off-by: Lin Ma <lma@suse.com> Message-id: 1441954730-17341-1-git-send-email-lma@suse.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-25Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* First batch of MAINTAINERS updates * IOAPIC fixes (to pass kvm-unit-tests with -machine kernel_irqchip=off) * NBD API upgrades from Daniel * strtosz fixes from Marc-André * improved support for readonly=on on scsi-generic devices * new "info ioapic" and "info lapic" monitor commands * Peter Crosthwaite's ELF_MACHINE cleanups * docs patches from Thomas and Daniel # gpg: Signature made Fri 25 Sep 2015 11:20:52 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (52 commits) doc: Refresh URLs in the qemu-tech documentation docs: describe the QEMU build system structure / design typedef: add typedef for QemuOpts i386: interrupt poll processing i386: partial revert of interrupt poll fix ppc: Rename ELF_MACHINE to be PPC specific i386: Rename ELF_MACHINE to be x86 specific alpha: Remove ELF_MACHINE from cpu.h mips: Remove ELF_MACHINE from cpu.h sparc: Remove ELF_MACHINE from cpu.h s390: Remove ELF_MACHINE from cpu.h sh4: Remove ELF_MACHINE from cpu.h xtensa: Remove ELF_MACHINE from cpu.h tricore: Remove ELF_MACHINE from cpu.h or32: Remove ELF_MACHINE from cpu.h lm32: Remove ELF_MACHINE from cpu.h unicore: Remove ELF_MACHINE from cpu.h moxie: Remove ELF_MACHINE from cpu.h cris: Remove ELF_MACHINE from cpu.h m68k: Remove ELF_MACHINE from cpu.h ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio,pc features, fixes New features: vhost-user multiqueue support virtio-ccw virtio 1 support Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 25 Sep 2015 07:40:35 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: MAINTAINERS: add more devices to the PCI section MAINTAINERS: add more devices to the PC section vhost-user: add a new message to disable/enable a specific virt queue. vhost-user: add multiple queue support vhost: introduce vhost_backend_get_vq_index method vhost-user: add VHOST_USER_GET_QUEUE_NUM message vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE vhost-user: add protocol feature negotiation vhost-user: use VHOST_USER_XXX macro for switch statement virtio-ccw: enable virtio-1 virtio-ccw: feature bits > 31 handling virtio-ccw: support ring size changes virtio: ring sizes vs. reset pc: Introduce pc-*-2.5 machine classes q35: Move options common to all classes to pc_i440fx_machine_options() q35: Move options common to all classes to pc_q35_machine_options() virtio-net: unbreak self announcement and guest offloads after migration virtio: right size for virtio_queue_get_avail_size Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25docs: describe the QEMU build system structure / designDaniel P. Berrange
Developers who are new to QEMU, or have a background familiarity with GNU autotools, can have trouble getting their head around the home-grown QEMU build system. This document attempts to explain the structure / design of the configure script and the various Makefile pieces that live across the source tree. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1443102098-13642-1-git-send-email-berrange@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-24qemu-thread: add a fast path to the Win32 QemuEventPaolo Bonzini
QemuEvents are used heavily by call_rcu. We do not want them to be slow, but the current implementation does a kernel call on every invocation of qemu_event_* and won't cut it. So, wrap a Win32 manual-reset event with a fast userspace path. The states and transitions are the same as for the futex and mutex/condvar implementations, but the slow path is different of course. The idea is to reset the Win32 event lazily, as part of a test-reset-test-wait sequence. Such a sequence is, indeed, how QemuEvents are used by RCU and other subsystems! The patch includes a formal model of the algorithm. Tested-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
2015-09-24vhost-user: add a new message to disable/enable a specific virt queue.Changchun Ouyang
Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable a specific virt queue, which is similar to attach/detach queue for tap device. virtio driver on guest doesn't have to use max virt queue pair, it could enable any number of virt queue ranging from 1 to max virt queue pair. Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24vhost-user: add multiple queue supportChangchun Ouyang
This patch is initially based a patch from Nikolay Nikolaev. This patch adds vhost-user multiple queue support, by creating a nc and vhost_net pair for each queue. Qemu exits if find that the backend can't support the number of requested queues (by providing queues=# option). The max number is queried by a new message, VHOST_USER_GET_QUEUE_NUM, and is sent only when protocol feature VHOST_USER_PROTOCOL_F_MQ is present first. The max queue check is done at vhost-user initiation stage. We initiate one queue first, which, in the meantime, also gets the max_queues the backend supports. In older version, it was reported that some messages are sent more times than necessary. Here we came an agreement with Michael that we could categorize vhost user messages to 2 types: non-vring specific messages, which should be sent only once, and vring specific messages, which should be sent per queue. Here I introduced a helper function vhost_user_one_time_request(), which lists following messages as non-vring specific messages: VHOST_USER_SET_OWNER VHOST_USER_RESET_DEVICE VHOST_USER_SET_MEM_TABLE VHOST_USER_GET_QUEUE_NUM For above messages, we simply ignore them when they are not sent the first time. Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24vhost-user: add VHOST_USER_GET_QUEUE_NUM messageYuanhan Liu
This is for querying how many queues the backend supports if it has mq support(when VHOST_USER_PROTOCOL_F_MQ flag is set from the quried protocol features). vhost_net_get_max_queues() is the interface to export that value, and to tell if the backend supports # of queues user requested, which is done in the following patch. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICEYuanhan Liu
Quote from Michael: We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24vhost-user: add protocol feature negotiationMichael S. Tsirkin
Support a separate bitmask for vhost-user protocol features, and messages to get/set protocol features. Invoke them at init. No features are defined yet. [ leverage vhost_user_call for request handling -- Yuanhan Liu ] Signed-off-by: Michael S. Tsirkin <address@hidden> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-23libcacard: use the standalone projectMarc-André Lureau
libcacard is now a standalone project hosted with the Spice project (see the 2.5.0 release announcement), remove it from qemu tree. Use the library if found during configure or if --enable-smartcard. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-23spapr: Support ibm,dynamic-reconfiguration-memoryBharata B Rao
Parse ibm,architecture.vec table obtained from the guest and enable memory node configuration via ibm,dynamic-reconfiguration-memory if guest supports it. This is in preparation to support memory hotplug for sPAPR guests. This changes the way memory node configuration is done. Currently all memory nodes are built upfront. But after this patch, only memory@0 node for RMA is built upfront. Guest kernel boots with just that and rest of the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory) are built when guest does ibm,client-architecture-support call. Note: This patch needs a SLOF enhancement which is already part of SLOF binary in QEMU. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-21qapi-introspect: Hide type namesMarkus Armbruster
To eliminate the temptation for clients to look up types by name (which are not ABI), replace all type names by meaningless strings. Reduces output of query-schema by 13 out of 85KiB. As a debugging aid, provide option -u to suppress the hiding. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1442401589-24189-27-git-send-email-armbru@redhat.com>
2015-09-21qapi: New QMP command query-qmp-schema for QMP introspectionMarkus Armbruster
qapi/introspect.json defines the introspection schema. It's designed for QMP introspection, but should do for similar uses, such as QGA. The introspection schema does not reflect all the rules and restrictions that apply to QAPI schemata. A valid QAPI schema has an introspection value conforming to the introspection schema, but the converse is not true. Introspection lowers away a number of schema details, and makes implicit things explicit: * The built-in types are declared with their JSON type. All integer types are mapped to 'int', because how many bits we use internally is an implementation detail. It could be pressed into external interface service as very approximate range information, but that's a bad idea. If we need range information, we better do it properly. * Implicit type definitions are made explicit, and given auto-generated names: - Array types, named by appending "List" to the name of their element type, like in generated C. - The enumeration types implicitly defined by simple union types, named by appending "Kind" to the name of their simple union type, like in generated C. - Types that don't occur in generated C. Their names start with ':' so they don't clash with the user's names. * All type references are by name. * The struct and union types are generalized into an object type. * Base types are flattened. * Commands take a single argument and return a single result. Dictionary argument or list result is an implicit type definition. The empty object type is used when a command takes no arguments or produces no results. The argument is always of object type, but the introspection schema doesn't reflect that. The 'gen': false directive is omitted as implementation detail. The 'success-response' directive is omitted as well for now, even though it's not an implementation detail, because it's not used by QMP. * Events carry a single data value. Implicit type definition and empty object type use, just like for commands. The value is of object type, but the introspection schema doesn't reflect that. * Types not used by commands or events are omitted. Indirect use counts as use. * Optional members have a default, which can only be null right now Instead of a mandatory "optional" flag, we have an optional default. No default means mandatory, default null means optional without default value. Non-null is available for optional with default (possible future extension). * Clients should *not* look up types by name, because type names are not ABI. Look up the command or event you're interested in, then follow the references. TODO Should we hide the type names to eliminate the temptation? New generator scripts/qapi-introspect.py computes an introspection value for its input, and generates a C variable holding it. It can generate awfully long lines. Marked TODO. A new test-qmp-input-visitor test case feeds its result for both tests/qapi-schema/qapi-schema-test.json and qapi-schema.json to a QmpInputVisitor to verify it actually conforms to the schema. New QMP command query-qmp-schema takes its return value from that variable. Its reply is some 85KiBytes for me right now. If this turns out to be too much, we have a couple of options: * We can use shorter names in the JSON. Not the QMP style. * Optionally return the sub-schema for commands and events given as arguments. Right now qmp_query_schema() sends the string literal computed by qmp-introspect.py. To compute sub-schema at run time, we'd have to duplicate parts of qapi-introspect.py in C. Unattractive. * Let clients cache the output of query-qmp-schema. It changes only on QEMU upgrades, i.e. rarely. Provide a command query-qmp-schema-hash. Clients can have a cache indexed by hash, and re-query the schema only when they don't have it cached. Even simpler: put the hash in the QMP greeting. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-21qapi: Pseudo-type '**' is now unused, drop itMarkus Armbruster
'gen': false needs to stay for now, because netdev_add is still using it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1442401589-24189-25-git-send-email-armbru@redhat.com>
2015-09-21qapi-schema: Fix up misleading specification of netdev_addMarkus Armbruster
It doesn't take a 'props' argument, let alone one in the format "NAME=VALUE,..." The bogus arguments specification doesn't matter due to 'gen': false. Clean it up to be incomplete rather than wrong, and document the incompleteness. While there, improve netdev_add usage example in the manual: add a device option to show how it's done. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1442401589-24189-24-git-send-email-armbru@redhat.com>
2015-09-21qapi: Introduce a first class 'any' typeMarkus Armbruster
It's first class, because unlike '**', it actually works, i.e. doesn't require 'gen': false. '**' will go away next. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-09-21qapi: Improve built-in type documentationMarkus Armbruster
Clarify how they map to JSON. Add how they map to C. Fix the reference to StringInputVisitor. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1442401589-24189-20-git-send-email-armbru@redhat.com>
2015-09-21qapi-commands: De-duplicate output marshaling functionsMarkus Armbruster
gen_marshal_output() uses its parameter name only for name of the generated function. Name it after the type being marshaled instead of its caller, and drop duplicates. Saves 7 copies of qmp_marshal_output_int() in qemu-ga, and one copy of qmp_marshal_output_str() in qemu-system-*. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1442401589-24189-19-git-send-email-armbru@redhat.com>
2015-09-21qapi: Rename qmp_marshal_input_FOO() to qmp_marshal_FOO()Markus Armbruster
These functions marshal both input and output. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1442401589-24189-17-git-send-email-armbru@redhat.com>
2015-09-21qapi: Clean up after recent conversions to QAPISchemaVisitorMarkus Armbruster
Generate just 'FOO' instead of 'struct FOO' when possible. Drop helper functions that are now unused. Make pep8 and pylint reasonably happy. Rename generate_FOO() functions to gen_FOO() for consistency. Use more consistent and sensible variable names. Consistently use c_ for mapping keys when their value is a C identifier or type. Simplify gen_enum() and gen_visit_union() Consistently use single quotes for C text string literals. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1442401589-24189-14-git-send-email-armbru@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-21qapi: De-duplicate enum code generationMarkus Armbruster
Duplicated in commit 21cd70d. Yes, we can't import qapi-types, but that's no excuse. Move the helpers from qapi-types.py to qapi.py, and replace the duplicates in qapi-event.py. The generated event enumeration type's lookup table becomes const-correct (see commit 2e4450f), and uses explicit indexes instead of relying on order (see commit 912ae9c). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1442401589-24189-10-git-send-email-armbru@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-21qapi-types: Convert to QAPISchemaVisitor, fixing flat unionsMarkus Armbruster
Fixes flat unions to get the base's base members. Test case is from commit 2fc0043, in qapi-schema-test.json: { 'union': 'UserDefFlatUnion', 'base': 'UserDefUnionBase', 'discriminator': 'enum1', 'data': { 'value1' : 'UserDefA', 'value2' : 'UserDefB', 'value3' : 'UserDefB' } } { 'struct': 'UserDefUnionBase', 'base': 'UserDefZero', 'data': { 'string': 'str', 'enum1': 'EnumOne' } } { 'struct': 'UserDefZero', 'data': { 'integer': 'int' } } Patch's effect on UserDefFlatUnion: struct UserDefFlatUnion { /* Members inherited from UserDefUnionBase: */ + int64_t integer; char *string; EnumOne enum1; /* Own members: */ union { /* union tag is @enum1 */ void *data; UserDefA *value1; UserDefB *value2; UserDefB *value3; }; }; Flat union visitors remain broken. They'll be fixed next. Code is generated in a different order now, but that doesn't matter. The two guards QAPI_TYPES_BUILTIN_STRUCT_DECL and QAPI_TYPES_BUILTIN_CLEANUP_DECL are replaced by just QAPI_TYPES_BUILTIN. Two ugly special cases for simple unions now stand out like sore thumbs: 1. The type tag is named 'type' everywhere, except in generated C, where it's 'kind'. 2. QAPISchema lowers simple unions to semantically equivalent flat unions. However, the C generated for a simple unions differs from the C generated for its equivalent flat union, and we therefore need special code to preserve that pointless difference for now. Mark both TODO. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-15qapi: allow override of default enum prefix namingDaniel P. Berrange
The camel_to_upper() method applies some heuristics to turn a mixed case type name into an all-uppercase name. This is used for example, to generate enum constant name prefixes. The heuristics don't also generate a satisfactory name though. eg { 'enum': 'QCryptoTLSCredsEndpoint', 'data': ['client', 'server']} Results in Q_CRYPTOTLS_CREDS_ENDPOINT_CLIENT. This has an undesirable _ after the initial Q and is missing an _ between the CRYPTO & TLS strings. Rather than try to add more and more heuristics to try to cope with this, simply allow the QAPI schema to specify the desired enum constant prefix explicitly. eg { 'enum': 'QCryptoTLSCredsEndpoint', 'prefix': 'QCRYPTO_TLS_CREDS_ENDPOINT', 'data': ['client', 'server']} Now gives the QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT name. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-09-11typofixes - v4Veres Lajos
Signed-off-by: Veres Lajos <vlajos@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11maint: remove / fix many doubled wordsDaniel P. Berrange
Many source files have doubled words (eg "the the", "to to", and so on). Most of these can simply be removed, but a couple were actual mis-spellings (eg "to to" instead of "to do"). There was even one triple word score "to to to" :-) Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-04docs: document how to configure the qcow2 L2/refcount cachesAlberto Garcia
QEMU has options to configure the size of the L2 and refcount caches for the qcow2 format. However, choosing the right sizes for a particular disk image is not a straightforward operation since the ratio between the cache size and the allocated disk space is not obvious and depends on the size of the cluster and the refcount entries. This document attempts to give an overview of both caches and how to configure their sizes. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 55de928e139b1ba3f3d40fe9c6c88f30b1f36410.1438690126.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-09-04docs/qapi-code-gen.txt: Fix QAPI schema examplesMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi: Generated code cleanupMarkus Armbruster
Clean up white-space, brace placement, and superfluous #ifdef QAPI_TYPES_BUILTIN_CLEANUP_DEF. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi-commands: Drop useless initializationMarkus Armbruster
In generated command handlers, the assignment to retval dominates its only use. Therefore, its initialization is useless. Drop it. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi: Command returning anonymous type doesn't work, outlawMarkus Armbruster
Reproducer: with { 'command': 'user_def_cmd4', 'returns': { 'a': 'int' } } added to qapi-schema-test.json, qapi-commands.py dies when it tries to generate the command handler function Traceback (most recent call last): File "/work/armbru/qemu/scripts/qapi-commands.py", line 359, in <module> ret = generate_command_decl(cmd['command'], arglist, ret_type) + "\n" File "/work/armbru/qemu/scripts/qapi-commands.py", line 29, in generate_command_decl ret_type=c_type(ret_type), name=c_name(name), File "/work/armbru/qemu/scripts/qapi.py", line 927, in c_type assert isinstance(value, str) and value != "" AssertionError because the return type doesn't exist. Simply outlaw this usage, and drop or dumb down test cases accordingly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi: Fix to reject union command and event argumentsMarkus Armbruster
A command's or event's 'data' must be a struct type, given either as a dictionary, or as struct type name. Commit dd883c6 tightened the checking there, but not enough: we still accept 'union'. Fix to reject it. We may want to support union types there, but we'll have to extend qapi-commands.py and qapi-events.py for it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi-event: Clean up how name of enum QAPIEvent is madeMarkus Armbruster
Use c_name() instead of ad hoc code. Doesn't upcase the -p prefix, which is an improvement in my book. Unbreaks prefix containing '.', but other funny characters remain broken. To be fixed next. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-09-04qapi: Clarify docs on including the same file multiple timesMarkus Armbruster
It's idempotent. While there, update examples to current code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-07-22AioContext: optimize clearing the EventNotifierPaolo Bonzini
It is pretty rare for aio_notify to actually set the EventNotifier. It can happen with worker threads such as thread-pool.c's, but otherwise it should never be set thanks to the ctx->notify_me optimization. The previous patch, unfortunately, added an unconditional call to event_notifier_test_and_clear; now add a userspace fast path that avoids the call. Note that it is not possible to do the same with event_notifier_set; it would break, as proved (again) by the included formal model. This patch survived over 3000 reboots on aarch64 KVM. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 1437487673-23740-7-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-22AioContext: fix broken placement of event_notifier_test_and_clearPaolo Bonzini
event_notifier_test_and_clear must be called before processing events. Otherwise, an aio_poll could "eat" the notification before the main I/O thread invokes ppoll(). The main I/O thread then never wakes up. This is an example of what could happen: i/o thread vcpu thread worker thread --------------------------------------------------------------------- lock_iothread notify_me = 1 ... unlock_iothread bh->scheduled = 1 event_notifier_set lock_iothread notify_me = 3 ppoll notify_me = 1 aio_dispatch aio_bh_poll thread_pool_completion_bh bh->scheduled = 1 event_notifier_set node->io_read(node->opaque) event_notifier_test_and_clear ppoll *** hang *** "Tracing" with qemu_clock_get_ns shows pretty much the same behavior as in the previous bug, so there are no new tricks here---just stare more at the code until it is apparent. One could also use a formal model, of course. The included one shows this with three processes: notifier corresponds to a QEMU thread pool worker, temporary_waiter to a VCPU thread that invokes aio_poll(), waiter to the main I/O thread. I would be happy to say that the formal model found the bug for me, but actually I wrote it after the fact. This patch is a bit of a big hammer. The next one optimizes it, with help (this time for real rather than a posteriori :)) from another, similar formal model. Reported-by: Richard W. M. Jones <rjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 1437487673-23740-6-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-22AioContext: fix broken ctx->dispatching optimizationPaolo Bonzini
This patch rewrites the ctx->dispatching optimization, which was the cause of some mysterious hangs that could be reproduced on aarch64 KVM only. The hangs were indirectly caused by aio_poll() and in particular by flash memory updates's call to blk_write(), which invokes aio_poll(). Fun stuff: they had an extremely short race window, so much that adding all kind of tracing to either the kernel or QEMU made it go away (a single printf made it half as reproducible). On the plus side, the failure mode (a hang until the next keypress) made it very easy to examine the state of the process with a debugger. And there was a very nice reproducer from Laszlo, which failed pretty often (more than half of the time) on any version of QEMU with a non-debug kernel; it also failed fast, while still in the firmware. So, it could have been worse. For some unknown reason they happened only with virtio-scsi, but that's not important. It's more interesting that they disappeared with io=native, making thread-pool.c a likely suspect for where the bug arose. thread-pool.c is also one of the few places which use bottom halves across threads, by the way. I hope that no other similar bugs exist, but just in case :) I am going to describe how the successful debugging went... Since the likely culprit was the ctx->dispatching optimization, which mostly affects bottom halves, the first observation was that there are two qemu_bh_schedule() invocations in the thread pool: the one in the aio worker and the one in thread_pool_completion_bh. The latter always causes the optimization to trigger, the former may or may not. In order to restrict the possibilities, I introduced new functions qemu_bh_schedule_slow() and qemu_bh_schedule_fast(): /* qemu_bh_schedule_slow: */ ctx = bh->ctx; bh->idle = 0; if (atomic_xchg(&bh->scheduled, 1) == 0) { event_notifier_set(&ctx->notifier); } /* qemu_bh_schedule_fast: */ ctx = bh->ctx; bh->idle = 0; assert(ctx->dispatching); atomic_xchg(&bh->scheduled, 1); Notice how the atomic_xchg is still in qemu_bh_schedule_slow(). This was already debated a few months ago, so I assumed it to be correct. In retrospect this was a very good idea, as you'll see later. Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't trigger the assertion (as expected). Changing the worker's invocation to qemu_bh_schedule_slow() didn't hide the bug (another assumption which luckily held). This already limited heavily the amount of interaction between the threads, hinting that the problematic events must have triggered around thread_pool_completion_bh(). As mentioned early, invoking a debugger to examine the state of a hung process was pretty easy; the iothread was always waiting on a poll(..., -1) system call. Infinite timeouts are much rarer on x86, and this could be the reason why the bug was never observed there. With the buggy sequence more or less resolved to an interaction between thread_pool_completion_bh() and poll(..., -1), my "tracing" strategy was to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and qemu_bh_schedule_fast() would provide some hint. The output was: (gdb) p last_prepare $3 = 103885451 (gdb) p last_dispatch $4 = 103876492 (gdb) p last_poll $5 = 115909333 (gdb) p last_schedule $6 = 115925212 Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch(). This makes little sense unless there is an aio_poll() call involved, and indeed with a slightly different instrumentation you can see that there is one: (gdb) p last_prepare $3 = 107569679 (gdb) p last_dispatch $4 = 107561600 (gdb) p last_aio_poll $5 = 110671400 (gdb) p last_schedule $6 = 110698917 So the scenario becomes clearer: iothread VCPU thread -------------------------------------------------------------------------- aio_ctx_prepare aio_ctx_check qemu_poll_ns(timeout=-1) aio_poll aio_dispatch thread_pool_completion_bh qemu_bh_schedule() At this point bh->scheduled = 1 and the iothread has not been woken up. The solution must be close, but this alone should not be a problem, because the bottom half is only rescheduled to account for rare situations (see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll() calls, 2014-07-15). Introducing a third thread---a thread pool worker thread, which also does qemu_bh_schedule()---does bring out the problematic case. The third thread must be awakened *after* the callback is complete and thread_pool_completion_bh has redone the whole loop, explaining the short race window. And then this is what happens: thread pool worker -------------------------------------------------------------------------- <I/O completes> qemu_bh_schedule() Tada, bh->scheduled is already 1, so qemu_bh_schedule() does nothing and the iothread is never woken up. This is where the bh->scheduled optimization comes into play---it is correct, but removing it would have masked the bug. So, what is the bug? Well, the question asked by the ctx->dispatching optimization ("is any active aio_poll dispatching?") was wrong. The right question to ask instead is "is any active aio_poll *not* dispatching", i.e. in the prepare or poll phases? In that case, the aio_poll is sleeping or might go to sleep anytime soon, and the EventNotifier must be invoked to wake it up. In any other case (including if there is *no* active aio_poll at all!) we can just wait for the next prepare phase to pick up the event (e.g. a bottom half); the prepare phase will avoid the blocking and service the bottom half. Expressing the invariant with a logic formula, the broken one looked like: !(exists(thread): in_dispatching(thread)) => !optimize or equivalently: !(exists(thread): in_aio_poll(thread) && in_dispatching(thread)) => !optimize In the correct one, the negation is in a slightly different place: (exists(thread): in_aio_poll(thread) && !in_dispatching(thread)) => !optimize or equivalently: (exists(thread): in_prepare_or_poll(thread)) => !optimize Even if the difference boils down to moving an exclamation mark :) the implementation is quite different. However, I think the new one is simpler to understand. In the old implementation, the "exists" was implemented with a boolean value. This didn't really support well the case of multiple concurrent event loops, but I thought that this was okay: aio_poll holds the AioContext lock so there cannot be concurrent aio_poll invocations, and I was just considering nested event loops. However, aio_poll _could_ indeed be concurrent with the GSource. This is why I came up with the wrong invariant. In the new implementation, "exists" is computed simply by counting how many threads are in the prepare or poll phases. There are some interesting points to consider, but the gist of the idea remains: 1) AioContext can be used through GSource as well; as mentioned in the patch, bit 0 of the counter is reserved for the GSource. 2) the counter need not be updated for a non-blocking aio_poll, because it won't sleep forever anyway. This is just a matter of checking the "blocking" variable. This requires some changes to the win32 implementation, but is otherwise not too complicated. 3) as mentioned above, the new implementation will not call aio_notify when there is *no* active aio_poll at all. The tests have to be adjusted for this change. The calls to aio_notify in async.c are fine; they only want to kick aio_poll out of a blocking wait, but need not do anything if aio_poll is not running. 4) nested aio_poll: these just work with the new implementation; when a nested event loop is invoked, the outer event loop is never in the prepare or poll phases. The outer event loop thus has already decremented the counter. Reported-by: Richard W. M. Jones <rjones@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 1437487673-23740-5-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-20Revert "vhost-user: add multi queue support"Michael S. Tsirkin
This reverts commit 830d70db692e374b55555f4407f96a1ceefdcc97. The interface isn't fully backwards-compatible, which is bad. Let's redo this properly after 2.4. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-07Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' ↵Peter Maydell
into staging Patch queue for ppc - 2015-07-07 A few last minute PPC changes for 2.4: - spapr: Update SLOF - spapr: Fix a few bugs - spapr: Preparation for hotplug - spapr: Minor code cleanups - linux-user: Add mftb handling - kvm: Enable hugepage support with memory-backend-file - mac99: Remove nonexistent interrupt pin (Mac OS 9 fix) # gpg: Signature made Tue Jul 7 16:48:41 2015 BST using RSA key ID 03FEDC60 # gpg: Good signature from "Alexander Graf <agraf@suse.de>" # gpg: aka "Alexander Graf <alex@csgraf.de>" * remotes/agraf/tags/signed-ppc-for-upstream: (30 commits) sPAPR: Clear stale MSIx table during EEH reset sPAPR: Reenable EEH functionality on reboot sPAPR: Don't enable EEH on emulated PCI devices spapr-vty: Use TYPE_ definition instead of hardcoding spapr_vty: lookup should only return valid VTY objects spapr_pci: drop redundant args in spapr_[populate, create]_pci_child_dt spapr_pci: populate ibm,loc-code spapr_pci: enumerate and add PCI device tree xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled ppc: Update cpu_model in MachineState spapr: Consolidate cpu init code into a routine spapr: Reorganize CPU dt generation code cpus: Add a macro to walk CPUs in reverse spapr: Support ibm, lrdr-capacity device tree property spapr: Consider max_cpus during xics initialization Revert "hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 (g_hash_table_iter_*)" spapr_iommu: translate sPAPRTCEAccess to IOMMUAccessFlags spapr_iommu: drop erroneous check in h_put_tce_indirect() spapr_pci: set device node unit address as hex spapr_pci: encode class code including Prog IF register ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-07Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20150707' ↵Peter Maydell
into staging migration/next for 20150707 # gpg: Signature made Tue Jul 7 13:56:30 2015 BST using RSA key ID 5872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" * remotes/juanquintela/tags/migration/20150707: (28 commits) migration: extend migration_bitmap migration: protect migration_bitmap check_section_footers: Check the correct section_id migration: Add migration events on target side migration: Make events a capability migration: create migration event migration: No need to call trace_migrate_set_state() migration: Use always helper to set state migration: ensure we start in NONE state migration: Use cmpxchg correctly migration: Add configuration section vmstate: Create optional sections global_state: Make section optional migration: create new section to store global state runstate: migration allows more transitions now runstate: Add runstate store Fix older machine type compatibility on power with section footers Fail more cleanly in mismatched RAM cases Sanity check RDMA remote data Sort destination RAMBlocks to be the same as the source ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-07spapr: Support ibm, lrdr-capacity device tree propertyBharata B Rao
Add support for ibm,lrdr-capacity since this is needed by the guest kernel to know about the possible hot-pluggable CPUs and Memory. With this, pseries kernels will start reporting correct maxcpus in /sys/devices/system/cpu/possible. Also define the minimum hotpluggable memory size as 256MB. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: Fix compile error on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07migration: create migration eventJuan Quintela
We have one argument that tells us what event has happened. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-07-07rocker: mark copy-to-cpu pkts as forwarding offloadedScott Feldman
For pkts copied to the CPU (to be processed by guest driver), mark the Rx descriptor with flag "OFFLOAD_FWD" to indicate device has already forwarded pkt. The guest driver will use this indicator to avoid duplicate forwarding in the guest OS. Examples include bcast/mcast/unknown ucast pkts flooded to bridged ports. We want to avoid both the device and the guest bridge driver flooding these pkts, which would result in duplicates pkts on the wire. Packet sampling, such as sFlow, can also use this technique to mark pkts for the guest OS to record but otherwise drop. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Message-id: 1435746792-41278-5-git-send-email-sfeldma@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-03update pci-bridge-seat section in docs/multiseat.txtGerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>