The Xen mapcache is able to create long term mappings, they are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies if a mapping is a "locked" mapping.
From the QEMU point of view there are two kinds of long term mappings:
[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends
After certain operations, ballooning a VM in particular, Xen kindly asks
QEMU to destroy all mappings. However, [a] mappings are certainly
present and cannot be removed. That's not a problem, as they are not
affected by ballooning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.
However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.
This patch introduces a new "dma" bool field to MapCacheRev entries, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.
Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].
The goal of the patch is to make debugging and system understanding
easier.
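As an illustration of the mechanism (a sketch, not the literal patch;
the struct layout follows the existing xen-mapcache code, the warning
text is made up):

    typedef struct MapCacheRev {
        uint8_t *vaddr_req;
        hwaddr paddr_index;
        hwaddr size;
        bool dma;            /* true for [b] mappings from address_space_map */
        QTAILQ_ENTRY(MapCacheRev) next;
    } MapCacheRev;

    void xen_invalidate_map_cache(void)
    {
        MapCacheRev *reventry;

        QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
            if (!reventry->dma) {
                continue;    /* [a] mappings are expected to stay */
            }
            fprintf(stderr, "Locked DMA mapping while invalidating mapcache: "
                    TARGET_FMT_plx " -> %p\n",
                    reventry->paddr_index, reventry->vaddr_req);
        }
        /* ... proceed with tearing down the unlocked entries ... */
    }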
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 1ff7c5986a515d2d936eba026ff19947bbc7cb92)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
cannot_instantiate_with_device_add_yet was introduced by commit
efec3dd631d94160288392721a5f9c39e50fb2bc to replace no_user. It was
supposed to be a temporary measure.
When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrunk: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.
Instead of a long field name that misleads people to believe it
is temporary, replace it with a shorter and less misleading field:
user_creatable.
Except for code comments, changes were generated using the
following Coccinelle patch:
@@
expression DC;
@@
(
-DC->cannot_instantiate_with_device_add_yet = false;
+DC->user_creatable = true;
|
-DC->cannot_instantiate_with_device_add_yet = true;
+DC->user_creatable = false;
)
@@
typedef ObjectClass;
expression dc;
identifier class, data;
@@
static void device_class_init(ObjectClass *class, void *data)
{
    ...
    dc->hotpluggable = true;
+   dc->user_creatable = true;
    ...
}
@@
@@
struct DeviceClass {
    ...
-   bool cannot_instantiate_with_device_add_yet;
+   bool user_creatable;
    ...
}
@@
expression DC;
@@
(
-!DC->cannot_instantiate_with_device_add_yet
+DC->user_creatable
|
-DC->cannot_instantiate_with_device_add_yet
+!DC->user_creatable
)
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-2-ehabkost@redhat.com>
[ehabkost: kept "TODO remove once we're there" comment]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit e90f2a8c3e0e677eeea46a9b401c3f98425ffa37)
Conflicts:
include/hw/qdev-core.h
* remove context dep on 08f00df
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
Commits 0db832f and 6cdbceb introduced the automatic insertion of filter
nodes above the top layer of mirror and commit block jobs. The
assumption made there was that since libvirt doesn't do node-level
management of the block layer yet, it shouldn't be affected by added
nodes.
This is true as far as commands issued by libvirt are concerned. It only
uses BlockBackend names to address nodes, so any operations it performs
still operate on the root of the tree as intended.
However, the assumption breaks down when you consider query commands,
which return data for the wrong node now. These commands also return
information on some child nodes (bs->file and/or bs->backing), which
libvirt does make use of, and which refer to the wrong nodes, too.
One of the consequences is that oVirt gets wrong information about the
image size and stops the VM in response as long as a mirror or commit
job is running:
https://bugzilla.redhat.com/show_bug.cgi?id=1470634
This patch fixes the problem by hiding the implicit nodes created
automatically by the mirror and commit block jobs in the output of
query-block and BlockBackend-based query-blockstats as long as the user
doesn't indicate that they are aware of those nodes by providing a node
name for them in the QMP command to start the block job.
The node-based commands query-named-block-nodes and query-blockstats
with query-nodes=true still show all nodes, including implicit ones.
This ensures that users that are capable of node-level management can
still access the full information; users that only know BlockBackends
won't use these commands.
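A minimal sketch of the filtering idea, assuming an implicit flag on
BlockDriverState that the jobs set when the user did not supply a node
name for the filter (the loop below is illustrative):

    /* query-block path: report the node the user knows about,
     * not the automatically inserted mirror/commit filter */
    BlockDriverState *bs = blk_bs(blk);

    while (bs && bs->drv && bs->implicit) {
        bs = backing_bs(bs);   /* the filter sits on top of its filtered child */
    }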
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit d3c8c67469ee70fcae116d5abc277a7ebc8a19fd)
Conflicts:
block/qapi.c
include/block/block_int.h
* fix context deps on 46eade7b and 5a9347c6
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
server would not quit, regardless of how many probe connections
came and went, until a connection actually negotiated). But we
broke that in commit ee7d7aa when removing the return value from
nbd_client_new(), although that patch also introduced a bug causing
an assertion failure on a client that fails negotiation. We then
made it worse during refactoring in commit 1a6245a (a segfault
before we could even assert); the (masked) assertion was cleaned
up in d3780c2 (still in 2.6), and just recently we finally fixed
the segfault ("nbd: Fully intialize client in case of failed
negotiation"). But that still means that ever since we added
TLS support to qemu-nbd, we have been vulnerable to an ill-timed
port-scan being able to cause a denial of service by taking down
qemu-nbd before a real client has a chance to connect.
Since negotiation is now handled asynchronously via coroutines,
we can no longer report failure synchronously simply by re-adding a
return value to nbd_client_new(). So this patch instead wires
things up to pass the negotiation status through the close_fn
callback function.
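Sketched interface change (the callback body below is schematic;
request_shutdown() is an illustrative helper, not a real function):

    /* negotiation status now travels through the close callback */
    void nbd_client_new(NBDExport *exp,
                        QIOChannelSocket *sioc,
                        QCryptoTLSCreds *tlscreds,
                        const char *tlsaclname,
                        void (*close_fn)(NBDClient *, bool negotiated));

    static void nbd_client_closed(NBDClient *client, bool negotiated)
    {
        /* a port probe that never finished negotiation must not take
         * down a transient qemu-nbd server */
        if (negotiated && !persistent) {
            request_shutdown();   /* illustrative only */
        }
        nbd_client_put(client);
    }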
Simple test across two terminals:
$ qemu-nbd -f raw -p 30001 file
$ nmap 127.0.0.1 -p 30001 && \
qemu-io -c 'r 0 512' -f raw nbd://localhost:30001
Note that this patch does not change what constitutes successful
negotiation (thus, a client must enter transmission phase before
that client can be considered as a reason to terminate the server
when the connection ends). Perhaps we may want to tweak things
in a later patch to also treat a client that uses NBD_OPT_ABORT
as being a 'successful' negotiation (the client correctly talked
the NBD protocol, and informed us it was not going to use our
export after all), but that's a discussion for another day.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170608222617.20376-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c9390d978cbf61e8f16c9f580fa96b305c43568)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
Since the automatic cpuid-level code was introduced in commit
c39c0edf9bb3b968ba95484465a50c7b19f4aa3a ("target-i386: Automatically
set level/xlevel/xlevel2 when needed"), the CPU model tables just define
the default CPUID level (set using "min-level"). Setting
"[x]level" forces the CPUID level to a specific value and disables the
automatic-level logic.
But the PC compat code was not updated and the existing "[x]level"
compat properties broke compatibility for people using features that
triggered the auto-level code. To keep previous behavior, we should set
"min-[x]level" instead of "[x]level" on compat_props.
This was not a problem for most cases, because old machine-types don't
have full-cpuid-auto-level enabled. The only common use case it broke
was the CPUID[7] auto-level code, which has been enabled since the
first CPUID[7] feature was introduced (in QEMU 1.4.0).
This causes the regression reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1454641
Change the PC compat code to use "min-[x]level" instead of "[x]level" on
compat_props, and add new test cases to ensure we don't break this
again.
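An illustrative compat_props entry of the kind being changed (CPU model
and value are placeholders, not taken from the actual patch):

    /* before: pinning "level" outright defeats full-cpuid-auto-level */
    {
        .driver   = "qemu64-" TYPE_X86_CPU,
        .property = "min-level",   /* was "level" */
        .value    = "4",
    },
    /* "min-level" only sets the floor; the auto-level code can still
     * raise the CPUID level, e.g. to 7 when a CPUID[7] feature is set */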
Reported-by: "Guo, Zhiyi" <zhguo@redhat.com>
Fixes: c39c0edf9bb ("target-i386: Automatically set level/xlevel/xlevel2 when needed")
Cc: qemu-stable@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1f43571604da85c62f25f3ba6d275b1b5ea76ca2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().
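The interface this implies (a sketch; the caller below is illustrative):

    /* callers pass an Error object instead of formatting their own message */
    int bdrv_truncate(BdrvChild *child, int64_t offset, Error **errp);

    Error *err = NULL;
    if (bdrv_truncate(child, new_size, &err) < 0) {
        error_report_err(err);   /* message now produced by bdrv_truncate() */
    }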
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ed3d2ec98a33fbdeabc471b11ff807075f07e996)
* prereq for 698bdfa
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
The main loop uses aio_disable_external()/aio_enable_external() to
temporarily disable processing of external AioContext clients like
device emulation.
This allows monitor commands to quiesce I/O and prevent the guest from
submitting new requests while a monitor command is in progress.
The aio_enable_external() API is currently broken when an IOThread is in
aio_poll() waiting for fd activity while the main loop re-enables
external clients. Decrementing ctx->external_disable_cnt does not wake
the IOThread from ppoll(2), so fd processing remains suspended and leads
to unresponsive emulated devices.
This patch adds an aio_notify() call to aio_enable_external() so the
IOThread is kicked out of ppoll(2) and will re-arm the file descriptors.
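A sketch of the fix, following the description above (details written
from memory and may differ slightly from the tree):

    static inline void aio_enable_external(AioContext *ctx)
    {
        int old = atomic_fetch_dec(&ctx->external_disable_cnt);

        assert(old > 0);
        if (old == 1) {
            /* external clients just became enabled again: kick the event
             * loop out of ppoll(2) so it re-arms their file descriptors */
            aio_notify(ctx);
        }
    }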
The bug can be reproduced as follows:
$ qemu -M accel=kvm -m 1024 \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,iothread=iothread0,id=virtio-scsi-pci0 \
-drive if=none,id=drive0,aio=native,cache=none,format=raw,file=test.img \
-device scsi-hd,id=scsi-hd0,drive=drive0 \
-qmp tcp::5555,server,nowait
$ scripts/qmp/qmp-shell localhost:5555
(qemu) blockdev-snapshot-sync device=drive0 snapshot-file=sn1.qcow2
mode=absolute-paths format=qcow2
After blockdev-snapshot-sync completes the SCSI disk will be
unresponsive. This leads to request timeouts inside the guest.
Reported-by: Qianqian Zhu <qizhu@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170508180705.20609-1-stefanha@redhat.com
Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 321d1dba8bef9676a77e9399484e3cd8bf2cf16a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
Rather than making lots of callers wrap a scalar in a QInt, QString,
or QBool, provide helper macros that do the wrapping automatically.
Update the Coccinelle script to make mass conversions easy, although
the conversion itself will be done as separate patches to ease
review and backport efforts.
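The helpers are along these lines (a sketch, not necessarily the exact
set added by the patch):

    /* wrap scalar insertions so callers don't build QInt/QBool/QString by hand */
    #define qdict_put_int(qdict, key, value) \
            qdict_put(qdict, key, qint_from_int(value))
    #define qdict_put_bool(qdict, key, value) \
            qdict_put(qdict, key, qbool_from_bool(value))
    #define qdict_put_str(qdict, key, value) \
            qdict_put(qdict, key, qstring_from_str(value))

    /* before: qdict_put(opts, "enable", qbool_from_bool(true));
     * after:  qdict_put_bool(opts, "enable", true);             */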
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-6-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit a92c21591b5bb9543996538f14854ca6b528318b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
|
|
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is trouble, such as in this code path:
qmp_block_commit
commit_active_start
bdrv_reopen
bdrv_reopen_multiple
bdrv_reopen_prepare
bdrv_flush
aio_poll
aio_bh_poll
aio_bh_call
block_job_defer_to_main_loop_bh
stream_complete
bdrv_reopen
block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.
Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well. To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.
As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:
#0 0x0000555555aa79f3 in bdrv_drain_recurse
#1 0x0000555555aa825d in bdrv_drained_begin
#2 0x0000555555aa8449 in bdrv_drain
#3 0x0000555555a9c356 in blk_drain
#4 0x0000555555aa3cfd in mirror_drain
#5 0x0000555555a66e11 in block_job_detach_aio_context
#6 0x0000555555a62f4d in bdrv_detach_aio_context
#7 0x0000555555a63116 in bdrv_set_aio_context
#8 0x0000555555a9d326 in blk_set_aio_context
#9 0x00005555557e38da in virtio_blk_data_plane_stop
#10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
#11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
#12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
#13 0x00005555559f6a18 in virtio_pci_reset
#14 0x00005555559139a9 in qdev_reset_one
#15 0x0000555555916738 in qbus_walk_children
#16 0x0000555555913318 in qdev_walk_children
#17 0x0000555555916738 in qbus_walk_children
#18 0x00005555559168ca in qemu_devices_reset
#19 0x000055555581fcbb in pc_machine_reset
#20 0x00005555558a4d96 in qemu_system_reset
#21 0x000055555577157a in main_loop_should_exit
#22 0x000055555577157a in main_loop
#23 0x000055555577157a in main
The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.
Reported-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-3-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
|
|
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
|
They start the coroutine on the specified context.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
|
It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
|
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
|
Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1' into staging
Final icount and misc MTTCG fixes for 2.9
Minor differences from:
Message-Id: <20170405132503.32125-1-alex.bennee@linaro.org>
- dropped new feature patches
- last minute typo fix from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
# gpg: Signature made Mon 10 Apr 2017 11:38:10 BST
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1:
replay: assert time only goes forward
cpus: call cpu_update_icount on read
cpu-exec: update icount after each TB_EXIT
cpus: introduce cpu_update_icount helper
cpus: don't credit executed instructions before they have run
cpus: move icount preparation out of tcg_exec_cpu
cpus: check cpu->running in cpu_get_icount_raw()
cpus: remove icount handling from qemu_tcg_cpu_thread_fn
target/i386/misc_helper: wrap BQL around another IRQ generator
cpus: fix wrong define name
scripts/qemugdb/mtree.py: fix up mtree dump
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
By holding off updates to timers_state.qemu_icount we can run into
trouble when a non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has currently been executed.
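Roughly (a simplified sketch; the real helper also has to cope with
hosts lacking 64-bit atomics):

    /* fold the instructions executed so far into the global icount total */
    void cpu_update_icount(CPUState *cpu)
    {
        int64_t executed = cpu_get_icount_executed(cpu);

        cpu->icount_budget -= executed;
        atomic_add(&timers_state.qemu_icount, executed);
    }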
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Outside of the vCPU thread, icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for the passage of
time by looking at how many instructions have run so far. This is only
valid inside the vCPU thread while it is running.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.
This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.
Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.
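Sketch of the mechanism (simplified; the resume-time hook that applies
the stored permissions once migration completes is omitted):

    int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
                     Error **errp)
    {
        blk->perm = perm;
        blk->shared_perm = shared_perm;

        if (blk->disable_perm) {
            /* incoming migration: remember what the device wants, but do
             * not restrict the root node yet, so that block migration can
             * still write to the image */
            return 0;
        }
        return bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
    }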
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
|
|
This behavior is not indicated in the datasheet and can confuse the OS.
The TCO can trap NMIs from SERR# or IOCHK# and convert them to SMIs; but
any other TCO event is either delivered as an SMI or completely disabled.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
* MemoryRegionCache revert
* glib optimization workaround
* fix "info lapic" segfault on isapc
* fix QIOChannel memory leak
# gpg: Signature made Mon 03 Apr 2017 18:17:00 BST
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
main-loop: Acquire main_context lock around os_host_main_loop_wait.
exec: revert MemoryRegionCache
nbd: fix memory leak on socket_connect failed
ipmi: Fix macro issues
target-i386: fix "info lapic" segfault on isapc
iscsi: drop unused IscsiAIOCB.qiov field
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C. I intend to limit its use to
existing external interfaces. New ones should use SocketAddressFlat.
I further intend to convert all internal interfaces to
SocketAddressFlat. This helper should go away then.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time). Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We assumed the iommu_ops were attached to the root region of the
address space. This may not be true for all kinds of IOMMU
implementations, especially after commit 3716d5902d74 ("pci: introduce
a bus master container"). So fix this by not assuming as->root has
iommu_ops, and instead depending on the regions reported by the memory
listener:
- register a memory listener on dma_as
- during region_add, if it is an IOMMU region, register a specific
IOMMU notifier, and store all notifiers in a list
- during region_del, look up and delete the IOMMU notifier from the list
This is also a must for making the vhost device IOTLB work for all
types of IOMMUs. Note that since we register one notifier during each
.region_add, the IOTLB may be flushed more than once; this is
suboptimal and could be optimized in the future.
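A sketch of the region_add side (the vhost_iommu struct and helper
names are illustrative; the memory API calls are the standard ones):

    static void vhost_iommu_region_add(MemoryListener *listener,
                                       MemoryRegionSection *section)
    {
        struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                             iommu_listener);
        struct vhost_iommu *iommu;
        Int128 end;

        if (!memory_region_is_iommu(section->mr)) {
            return;                  /* only IOMMU regions get a notifier */
        }

        iommu = g_malloc0(sizeof(*iommu));
        end = int128_sub(int128_add(int128_make64(section->offset_within_region),
                                    section->size),
                         int128_one());
        iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
                            IOMMU_NOTIFIER_UNMAP,
                            section->offset_within_region,
                            int128_get64(end));
        iommu->mr = section->mr;
        memory_region_register_iommu_notifier(section->mr, &iommu->n);
        QLIST_INSERT_HEAD(&dev->iommu_list, iommu, iommu_next);
        /* region_del does the reverse: find the entry, unregister, free it */
    }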
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d74 ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
|
|
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170329' into staging
ppc patch queue for 2017-03-29
Two more bugfixes of sufficient severity to warrant going into 2.9.
# gpg: Signature made Wed 29 Mar 2017 04:33:19 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.9-20170329:
spapr: fix memory hot-unplugging
spapr: fix buffer-overflow
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
If, once the kernel has booted, we try to remove memory that was
hotplugged while the kernel was not yet started, QEMU crashes on
an assert:
qemu-system-ppc64: hw/virtio/vhost.c:651:
vhost_commit: Assertion `r >= 0' failed.
...
#4 in vhost_commit
#5 in memory_region_transaction_commit
#6 in pc_dimm_memory_unplug
#7 in spapr_memory_unplug
#8 spapr_machine_device_unplug
#9 in hotplug_handler_unplug
#10 in spapr_lmb_release
#11 in detach
#12 in set_allocation_state
#13 in rtas_set_indicator
...
If we take a closer look at the guest kernel log, we can see this when
we try to unplug the memory:
pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)
What happens:
1- The kernel ignored the memory hotplug event because
it was not started when the event was generated.
2- When we hot-unplug the memory,
QEMU starts to remove the memory,
generates a hot-unplug event,
and signals the kernel that a new event is pending.
3- As the kernel is now running, it reacts to the QEMU signal, reads
the event list, decodes the stale hotplug event and tries to
finish the hot-add.
4- QEMU receives the hotplug notification while it
is trying to hot-unplug the memory. This moves the memory
DRC to an invalid state.
This patch prevents this by not allowing the allocation state to be
set to USABLE while the DRC is awaiting release.
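The guard amounts to something like this in the allocation-state
handler (a sketch; field and constant names follow the spapr DRC code
but may not match it exactly):

    case SPAPR_DR_ALLOCATION_STATE_USABLE:
        /* deny the stale hot-add completion: the DRC is already on its
         * way out and must not be marked usable again */
        if (drc->awaiting_release) {
            return RTAS_OUT_NO_SUCH_INDICATOR;
        }
        break;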
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
A long is 32 bits on 64-bit Windows, which caused the top half of the
address to be truncated; this patch changes the code to use the
QEMU_ALIGN_UP macro, which does not suffer from the same problem.
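Illustrative fragment (addresses and the alignment constant are made up):

    uint64_t addr = 0x123456000ULL;               /* an address above 4 GB */
    /* rounding through 'long' truncates the top half on LLP64 hosts
     * (64-bit Windows), yielding 0x23456000 there */
    uint64_t bad  = ((long)addr + 4095) & ~4095L;
    /* QEMU_ALIGN_UP keeps the operand's full width: 0x123456000 */
    uint64_t good = QEMU_ALIGN_UP(addr, 4096);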
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
|
|
* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix
# gpg: Signature made Mon 27 Mar 2017 17:03:04 BST
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
Revert "apic: save apic_delivered flag"
nbd: drop unused NBDClientSession.is_unix field
win32: replace custom mutex and condition variable with native primitives
mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
tcg/i386: Check the size of instruction being translated
virtio-scsi: Fix acquire/release in dataplane handlers
virtio-scsi: Make virtio_scsi_acquire/release public
clear pending status before calling memory commit
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
This reverts commit 07bfa354772f2de67008dc66c201b627acff0106.
The global variable is only read as part of a
apic_reset_irq_delivered();
qemu_irq_raise(s->irq);
if (!apic_get_irq_delivered()) {
sequence, so the value never matters at migration time.
Reported-by: Dr. David Alan Gilbert <dglibert@redhat.com>
Cc: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.
This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWLocks and
condition variables) that have the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.
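The shape of the replacement, assuming the Vista-era native APIs (the
QemuMutex/QemuCond field names are illustrative):

    void qemu_mutex_lock(QemuMutex *mutex)
    {
        AcquireSRWLockExclusive(&mutex->lock);
    }

    void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
    {
        /* atomically releases the lock and re-acquires it on wakeup,
         * matching pthread_cond_wait() semantics */
        SleepConditionVariableSRW(&cond->var, &mutex->lock, INFINITE, 0);
    }

    void qemu_cond_broadcast(QemuCond *cond)
    {
        /* does not wait on receivers - this is what fixes the deadlock */
        WakeAllConditionVariable(&cond->var);
    }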
Signed-off-by: Andrey Shedel <ashedel@microsoft.com>
[AB: edited commit message]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
virtio_input_send buffers input events until it sees a SYNC. Then it
either sends or drops the entire batch, depending on whether eventq
has enough space available. The case to avoid here is partial sends
where only part of the batch would get to the guest.
Using virtqueue_get_avail_bytes to check the state of eventq was not
correct. The queue may have a smaller number of larger buffers
available so bytes may be enough but the batch would still not be
possible to send, leading to the "Huh? No vq elem available" error.
Instead of checking available bytes, this patch optimistically pops
buffers from the queue and puts them back in case it runs out of
space and the batch needs to be dropped.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1490365490-4854-3-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
|
|
They will be used in virtio-scsi-dataplane.c as well, so move them to
the header.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170317061447.16243-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' into staging
ppc patch queue for 2017-03-23
Just a single bugfix in this batch. It's not strictly in ppc code,
though it's for the pseries machine's benefit. Eduardo suggested it
go through my tree however.
# gpg: Signature made Thu 23 Mar 2017 10:09:17 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.9-20170323:
numa,spapr: align default numa node memory size to 256MB
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
# gpg: Signature made Wed 22 Mar 2017 17:28:56 GMT
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/block-pull-request:
blockjob: add devops to blockjob backends
block-backend: add drained_begin / drained_end ops
blockjob: add block_job_start_shim
blockjob: avoid recursive AioContext locking
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-3-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
|
|
A system with multiple VMGENID devices is undefined in the VMGENID spec by
omission.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
|
|
The WRITE_POINTER linker/loader command that underlies VMGENID depends on
commit baf2d5bfbac0 ("fw-cfg: support writeable blobs", 2017-01-12), which
in turn depends on fw_cfg DMA.
DMA for fw_cfg is enabled in 2.5+ machine types only (see commit
e6915b5f3a87, "fw_cfg: unbreak migration compatibility for 2.4 and earlier
machines", 2016-02-18).
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
|
|
Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
But when "-numa" option is provided without "mem" parameter,
the memory is equally divided between nodes, but 8MB aligned.
This can be not valid for pseries.
In that case we can have:
$ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
With this patch, we have:
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 1280 MB
node 1 cpus:
node 1 size: 1280 MB
node 2 cpus:
node 2 size: 1536 MB
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
qemu-ga's socket activation support was not obeying the LISTEN_PID
environment variable, which prevents a process from using a
socket-activation file descriptor meant for its parent.
A mess can ensue, for example, if a process forks a child before
consuming the socket-activation file descriptor and therefore before
setting O_CLOEXEC on it.
Luckily, qemu-nbd also got socket activation code, and its copy does
support LISTEN_PID. Some extra fixups are needed to ensure that the
code can be used for both, but that's what this patch does. The
main change is to replace get_listen_fds's "consume" argument with
the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code.
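The LISTEN_PID check itself is small (an illustrative helper, not the
actual function name used in the tree):

    /* the fds are only meant for the PID recorded by the parent, not for
     * any child that merely inherited the environment */
    static bool socket_activation_is_for_us(void)
    {
        const char *pid_str = getenv("LISTEN_PID");

        return pid_str && strtol(pid_str, NULL, 10) == (long)getpid();
    }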
Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
bdrv_child_set_perm alone is not very usable because the caller must
call bdrv_child_check_perm first. This is already encapsulated
conveniently in bdrv_child_try_set_perm, so remove the other prototypes
from the header and fix the one wrong caller, block/mirror.c.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Merge remote-tracking branch 'remotes/kraxel/tags/pull-cirrus-20170316-1' into staging
cirrus: blitter fixes.
# gpg: Signature made Thu 16 Mar 2017 09:05:22 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/pull-cirrus-20170316-1:
cirrus: stop passing around src pointers in the blitter
cirrus: stop passing around dst pointers in the blitter
cirrus: fix cirrus_invalidate_region
cirrus: add option to disable blitter
cirrus: switch to 4 MB video memory by default
cirrus/vnc: zap bitblit support from console code.
fix :cirrus_vga fix OOB read case qemu Segmentation fault
# Conflicts:
# include/hw/compat.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170316' into staging
migration/next for 20170316
# gpg: Signature made Thu 16 Mar 2017 08:21:51 GMT
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20170316:
postcopy: Check for shared memory
RAMBlocks: qemu_ram_is_shared
vmstate: fix failed iotests case 68 and 91
migration/block: Avoid invoking blk_drain too frequently
migration: use "" as the default for tls-creds/hostname
Change the method to calculate dirty-pages-rate
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
virtio, pci: fixes
More fixes missed in the previous pull request.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 16 Mar 2017 02:29:49 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio-serial-bus: Delete timer from list before free it
hw/virtio: fix Power Management Control Register for PCI Express virtio devices
hw/virtio: fix Link Control Register for PCI Express virtio devices
hw/virtio: fix error enabling flags in Device Control register
hw/pcie: fix Extended Configuration Space for devices with no Extended Capabilities
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Provide a helper to say whether a RAMBlock was created as a
shared mapping.
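The helper reduces to a flag test (sketch based on the description,
assuming the existing RAM_SHARED allocation flag):

    bool qemu_ram_is_shared(RAMBlock *rb)
    {
        return rb->flags & RAM_SHARED;
    }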
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Quoting cirrus source code:
Follow real hardware, cirrus card emulated has 4 MB video memory.
Also accept 8 MB/16 MB for backward compatibility.
So just use 4 MB by default. We decided to leave that at 8 MB by default
a while ago, for live migration compatibility reasons. But we have
compat properties to handle that, so that isn't a compelling reason.
This also removes some sanity check inconsistencies in the cirrus code.
Some places check against the allocated video memory, some places check
against the 4 MB the physical hardware has. Guest code can trigger asserts
because of that.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494514-15606-1-git-send-email-kraxel@redhat.com
|
|
There is a special code path (dpy_gfx_copy) to allow graphic emulation
to notify user interface code about bitblit operations carried out by
guests. It is supported by cirrus and the vnc server. The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.
This is rarely used these days though, because modern guests simply don't
use the cirrus blitter any more. Any Linux guest using the cirrus drm
driver doesn't. Any Windows guest newer than Windows XP doesn't ship with a
cirrus driver any more and thus uses the cirrus as a simple framebuffer.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".
Also, the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full). This doesn't work with
dpy_gfx_copy: for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client-side blit on outdated data and thereby corrupt the
display. So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.
Let's kill it once and for all.
Oh, and one more reason: Turns out (after writing the patch) we have a
security bug in that code path ...
Fixes: CVE-2016-9603
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
|
|
In function cpu_physical_memory_sync_dirty_bitmap, file
include/exec/ram_addr.h:
    if (src[idx][offset]) {
        unsigned long bits = atomic_xchg(&src[idx][offset], 0);
        unsigned long new_dirty;
        new_dirty = ~dest[k];
        dest[k] |= bits;
        new_dirty &= bits;
        num_dirty += ctpopl(new_dirty);
    }
After this code executes, only the pages that are not dirty in the
bitmap (dest) but are dirty in dirty_memory[DIRTY_MEMORY_MIGRATION]
will be counted.
For example:
When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111,
and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011,
new_dirty will be 0b00001100, and this function will return 2 instead
of the expected 4.
The dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new,
so they should all be counted.
Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
As the PCI AHCI device can be hotplugged and unplugged, its unrealize
function should free all the resources allocated in the realize
function. This patch adds ide_exit to free those resources.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
|
|
Make Power Management State flag writable to conform
with the PCI Express spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Make several Link Control Register flags writable to conform
with the PCI Express spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When the virtio devices are PCI Express, make error-enabling flags
writable to respect the PCIe spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Absence of any Extended Capabilities is required to be
indicated by an Extended Capability header with a Capability ID of
0000h, a Capability Version of 0h, and a Next Capability Offset of 000h.
Instead of inserting a 'NULL' capability, it is simpler to mark the start
of the Extended Configuration Space as read-only to achieve the same
behaviour.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|