Age | Commit message (Collapse) | Author |
|
Handling errors in memory hotunplug in the pSeries machine is more
complex than any other device type, because there are all the
complications that other devices has, and more.
For instance, determining a timeout for a DIMM hotunplug must consider
if it's a Hash-MMU or a Radix-MMU guest, because Hash guests takes
longer to hotunplug DIMMs. The size of the DIMM is also a factor, given
that longer DIMMs naturally takes longer to be hotunplugged from the
kernel. And there's also the guest memory usage to be considered: if
there's a process that is consuming memory that would be lost by the
DIMM unplug, the kernel will postpone the unplug process until the
process finishes, and then initiate the regular hotunplug process. The
first two considerations are manageable, but the last one is a deal
breaker.
There is no sane way for the pSeries machine to determine the memory
load in the guest when attempting a DIMM hotunplug - and even if there
was a way, the guest can start using all the RAM in the middle of the
unplug process and invalidate our previous assumptions - and in result
we can't even begin to calculate a timeout for the operation. This means
that we can't implement a viable timeout mechanism for memory unplug in
pSeries.
Going back to why we would consider an unplug timeout, the reason is
that we can't know if the kernel is giving up the unplug. Turns out
that, sometimes, we can. Consider a failed memory hotunplug attempt
where the kernel will error out with the following message:
'pseries-hotplug-mem: Memory indexed-count-remove failed, adding any
removed LMBs'
This happens when there is a LMB that the kernel gave up in removing,
and the LMBs previously marked for removal are now being added back.
This happens in the pseries kernel in [1], dlpar_memory_remove_by_ic()
into dlpar_add_lmb(), and after that update_lmb_associativity_index().
In this function, the kernel is configuring the LMB DRC connector again.
Note that this is a valid usage in LOPAR, as stated in section
"ibm,configure-connector RTAS Call":
'A subsequent sequence of calls to ibm,configure-connector with the same
entry from the “ibm,drc-indexes” or “ibm,drc-info” property will restart
the configuration of devices which were not completely configured.'
We can use this kernel behavior in our favor. If a DRC connector
reconfiguration for a LMB that we marked as unplug pending happens, this
indicates that the kernel changed its mind about the unplug and is
reasserting that it will keep using all the LMBs of the DIMM. In this
case, it's safe to assume that the whole DIMM device unplug was
cancelled.
This patch hops into rtas_ibm_configure_connector() and, in the scenario
described above, clear the unplug state for the DIMM device. This will
not solve all the problems we still have with memory unplug, but it will
cover this case where the kernel reconfigures LMBs after a failed
unplug. We are a bit more resilient, without using an unreliable
timeout, and we didn't make the remaining error cases any worse.
[1] arch/powerpc/platforms/pseries/hotplug-memory.c
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-6-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
There is a reliable way to make a CPU hotunplug fail in the pseries
machine. Hotplug a CPU A, then offline all other CPUs inside the guest
but A. When trying to hotunplug A the guest kernel will refuse to do it,
because A is now the last online CPU of the guest. PAPR has no 'error
callback' in this situation to report back to the platform, so the guest
kernel will deny the unplug in silent and QEMU will never know what
happened. The unplug pending state of A will remain until the guest is
shutdown or rebooted.
Previous attempts of fixing it (see [1] and [2]) were aimed at trying to
mitigate the effects of the problem. In [1] we were trying to guess
which guest CPUs were online to forbid hotunplug of the last online CPU
in the QEMU layer, avoiding the scenario described above because QEMU is
now failing in behalf of the guest. This is not robust because the last
online CPU of the guest can change while we're in the middle of the
unplug process, and our initial assumptions are now invalid. In [2] we
were accepting that our unplug process is uncertain and the user should
be allowed to spam the IRQ hotunplug queue of the guest in case the CPU
hotunplug fails.
This patch presents another alternative, using the timeout
infrastructure introduced in the previous patch. CPU hotunplugs in the
pSeries machine will now timeout after 15 seconds. This is a long time
for a single CPU unplug to occur, regardless of guest load - although
the user is *strongly* encouraged to *not* hotunplug devices from a
guest under high load - and we can be sure that something went wrong if
it takes longer than that for the guest to release the CPU (the same
can't be said about memory hotunplug - more on that in the next patch).
Timing out the unplug operation will reset the unplug state of the CPU
and allow the user to try it again, regardless of the error situation
that prevented the hotunplug to occur. Of all the not so pretty
fixes/mitigations for CPU hotunplug errors in pSeries, timing out the
operation is an admission that we have no control in the process, and
must assume the worst case if the operation doesn't succeed in a
sensible time frame.
[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html
Reported-by: Xujun Ma <xuma@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-5-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
The LoPAR spec provides no way for the guest kernel to report failure of
hotplug/hotunplug events. This wouldn't be bad if those operations were
granted to always succeed, but that's far for the reality.
What ends up happening is that, in the case of a failed hotunplug,
regardless of whether it was a QEMU error or a guest misbehavior, the
pSeries machine is retaining the unplug state of the device in the
running guest. This state is cleanup in machine reset, where it is
assumed that this state represents a device that is pending unplug, and
the device is hotunpluged from the board. Until the reset occurs, any
hotunplug operation of the same device is forbid because there is a
pending unplug state.
This behavior has at least one undesirable side effect. A long standing
pending unplug state is, more often than not, the result of a hotunplug
error. The user had to dealt with it, since retrying to unplug the
device is noy allowed, and then in the machine reset we're removing the
device from the guest. This means that we're failing the user twice -
failed to hotunplug when asked, then hotunplugged without notice.
Solutions to this problem range between trying to predict when the
hotunplug will fail and forbid the operation from the QEMU layer, from
opening up the IRQ queue to allow for multiple hotunplug attempts, from
telling the users to 'reboot the machine if something goes wrong'. The
first solution is flawed because we can't fully predict guest behavior
from QEMU, the second solution is a trial and error remediation that
counts on a hope that the unplug will eventually succeed, and the third
is ... well.
This patch introduces a crude, but effective solution to hotunplug
errors in the pSeries machine. For each unplug done, we'll timeout after
some time. If a certain amount of time passes, we'll cleanup the
hotunplug state from the machine. During the timeout period, any unplug
operations in the same device will still be blocked. After that, we'll
assume that the guest failed the operation, and allow the user to try
again. If the timeout is too short we'll prevent legitimate hotunplug
situations to occur, so we'll need to overestimate the regular time an
unplug operation takes to succeed to account that.
The true solution for the hotunplug errors in the pSeries machines is a
PAPR change to allow for the guest to warn the platform about it. For
now, the work done in this timeout design can be used for the new PAPR
'abort hcall' in the future, given that for both cases we'll need code
to cleanup the existing unplug states of the DRCs.
At this moment we're adding the basic wiring of the timer into the DRC.
Next patch will use the timer to timeout failed CPU hotunplugs.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-4-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
spapr_drc_detach() is not the best name for what the function does. The
function does not detach the DRC, it makes an uncommited attempt to do
it. It'll mark the DRC as pending unplug, via the 'unplug_request'
flag, and only if the DRC state is drck->empty_state it will detach the
DRC, via spapr_drc_release().
This is a contrast with its pair spapr_drc_attach(), where the function
is indeed creating the DRC QOM object. If you know what
spapr_drc_attach() does, you can be misled into thinking that
spapr_drc_detach() is removing the DRC from QEMU internal state, which
isn't true.
The current role of this function is better described as a request for
detach, since there's no guarantee that we're going to detach the DRC in
the end. Rename the function to spapr_drc_unplug_request to reflect
what is is doing.
The initial idea was to change the name to spapr_drc_detach_request(),
and later on change the unplug_request flag to detach_request. However,
unplug_request is a migratable boolean for a long time now and renaming
it is not worth the trouble. spapr_drc_unplug_request() setting
drc->unplug_request is more natural than spapr_drc_detach_request
setting drc->unplug_request.
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
When moving a physical DRC to "Available", drc_isolate_physical() will
move the DRC state to STATE_PHYSICAL_POWERON and, if the DRC is marked
for unplug, call spapr_drc_detach(). For physical DRCs,
drck->empty_state is STATE_PHYSICAL_POWERON, meaning that we're sure
that spapr_drc_detach() will end up calling spapr_drc_release() in the
end.
Likewise, for logical DRCs, drc_set_unusable will move the DRC to
"Unusable" state, setting drc->state to STATE_LOGICAL_UNUSABLE, which is
the drck->empty_state for logical DRCs. spapr_drc_detach() will call
spapr_drc_release() in this case as well.
In both scenarios, spapr_drc_detach() is being used as a
spapr_drc_release(), wrapper, where we also set unplug_requested (which
is already true, otherwise spapr_drc_detach() wouldn't be called in the
first place) and check if drc->state == drck->empty_state, which we also
know it's guaranteed to be true because we just set it.
Just use spapr_drc_release() in these functions to be clear of our
intentions in both these functions.
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210222194531.62717-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
drc_isolate_logical() is used to move the DRC from the "Configured" to
the "Available" state, erroring out if the DRC is in the unexpected
"Unisolate" state and doing nothing (with RTAS_OUT_SUCCESS) if the DRC
is already in "Available" or in "Unusable" state.
When moving from "Configured" to "Available", the DRC is moved to the
LOGICAL_AVAILABLE state, a drc->unplug_requested check is done and, if
true, spapr_drc_detach() is called.
What spapr_drc_detach() does then is:
- set drc->unplug_requested to true. In fact, this is the only place
where unplug_request is set to true;
- does nothing else if drc->state != drck->empty_state. If the DRC
state is equal to drck->empty_state, spapr_drc_release() is
called. For logical DRCs, drck->empty_state = LOGICAL_UNUSABLE.
In short, calling spapr_drc_detach() in drc_isolate_logical() does
nothing. It'll set unplug_request to true again ('again' since it was
already true - otherwise the function wouldn't be called), and will
return without calling spapr_drc_release() because the DRC is not in
LOGICAL_UNUSABLE, since drc_isolate_logical() just moved it to
LOGICAL_AVAILABLE. The only place where the logical DRC is released is
when called from drc_set_unusable(), when it is moved to the
"Unusable" state. As it should, according to PAPR.
Even though calling spapr_drc_detach() in drc_isolate_logical() is
benign, removing it will avoid further thought about the matter. So
let's go ahead and do that.
As a note, this logic was introduced in commit bbf5c878ab76. Since
then, the DRC handling code was refactored and enhanced, and PAPR
itself went through some changes in the DRC area as well. It is
expected that some assumptions we had back then are now deprecated.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210211225246.17315-2-danielhb413@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
We no longer need to include sm501_template.h multiple times, so
we can simply inline its contents into sm501.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210212180653.27588-4-peter.maydell@linaro.org>
Acked-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
Now that we only include sm501_template.h for the DEPTH==32 case, we
can expand out the uses of the BPP, PIXEL_TYPE and PIXEL_NAME macros
in that header.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210212180653.27588-3-peter.maydell@linaro.org>
Acked-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
For a long time now the UI layer has guaranteed that the console
surface is always 32 bits per pixel RGB. Remove the legacy dead
code from the sm501 display device which was handling the
possibility that the console surface was some other format.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210212180653.27588-2-peter.maydell@linaro.org>
Acked-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
The Milkymist board requires more than the PTIMER. Directly
select the LM32_DEVICES. This fixes:
/usr/bin/ld:
libqemu-lm32-softmmu.fa.p/target_lm32_gdbstub.c.o: in function `lm32_cpu_gdb_read_register':
target/lm32/gdbstub.c:46: undefined reference to `lm32_pic_get_im'
target/lm32/gdbstub.c:48: undefined reference to `lm32_pic_get_ip'
libqemu-lm32-softmmu.fa.p/target_lm32_op_helper.c.o: in function `helper_wcsr_im':
target/lm32/op_helper.c:107: undefined reference to `lm32_pic_set_im'
libqemu-lm32-softmmu.fa.p/target_lm32_op_helper.c.o: in function `helper_wcsr_ip':
target/lm32/op_helper.c:114: undefined reference to `lm32_pic_set_ip'
libqemu-lm32-softmmu.fa.p/target_lm32_op_helper.c.o: in function `helper_wcsr_jtx':
target/lm32/op_helper.c:120: undefined reference to `lm32_juart_set_jtx'
libqemu-lm32-softmmu.fa.p/target_lm32_op_helper.c.o: in function `helper_wcsr_jrx':
target/lm32/op_helper.c:125: undefined reference to `lm32_juart_set_jrx'
libqemu-lm32-softmmu.fa.p/target_lm32_translate.c.o: in function `lm32_cpu_dump_state':
target/lm32/translate.c:1161: undefined reference to `lm32_pic_get_ip'
target/lm32/translate.c:1161: undefined reference to `lm32_pic_get_im'
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210221225626.2589247-4-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
We want to be able to use the 'LM32' config for architecture
specific features. As CONFIG_LM32 is only used to select
peripherals, rename it CONFIG_LM32_DEVICES.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210221225626.2589247-3-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
We want to be able to use the 'LM32' config for architecture
specific features. Introduce CONFIG_LM32_EVR to select the
lm32-evr / lm32-uclinux boards.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210221225626.2589247-2-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
Block layer patches:
- qemu-storage-daemon: add --pidfile option
- qemu-storage-daemon: CLI error messages include the option name now
- vhost-user-blk export: Misc fixes
- docs: Improvements for qemu-storage-daemon documentation
- parallels: load bitmap extension
- backup-top: Don't crash on post-finalize accesses
- Improve error messages related to node-name options
- iotests improvements
# gpg: Signature made Mon 08 Mar 2021 17:01:41 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (30 commits)
blockdev: Clarify error messages pertaining to 'node-name'
block: Clarify error messages pertaining to 'node-name'
docs: qsd: Explain --export nbd,name=... default
MAINTAINERS: update parallels block driver
iotests: add parallels-read-bitmap test
iotests.py: add unarchive_sample_image() helper
parallels: support bitmap extension for read-only mode
block/parallels: BDRVParallelsState: add cluster_size field
parallels.txt: fix bitmap L1 table description
qcow2-bitmap: make bytes_covered_by_bitmap_cluster() public
block/export: port virtio-blk read/write range check
block/export: port virtio-blk discard/write zeroes input validation
block/export: fix vhost-user-blk export sector number calculation
block/export: use VIRTIO_BLK_SECTOR_BITS
block/export: fix blk_size double byteswap
libqtest: add qtest_remove_abrt_handler()
libqtest: add qtest_kill_qemu()
libqtest: add qtest_socket_server()
vhost-user-blk: fix blkcfg->num_queues endianness
docs: replace insecure /tmp examples in qsd docs
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Fix code style. Operator needs align with eight spaces, and delete line space.
Signed-off-by: lijiejun <a_lijiejun@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1615292050-108748-1-git-send-email-a_lijiejun@163.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
We forward-declare Object typedef in "qemu/typedefs.h" since commit
ca27b5eb7cd ("qom/object: Move Object typedef to 'qemu/typedefs.h'").
Use it everywhere to make the code simpler.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210225182003.3629342-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210126124240.2081959-3-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
On Fedora 33, gcc 10.2.1 notes that scsi_cdb_length(buf) can set
len==-1, which in turn overflows g_malloc():
[5/5] Linking target qemu-system-x86_64
In function ‘scsi_disk_new_request_dump’,
inlined from ‘scsi_new_request’ at ../hw/scsi/scsi-disk.c:2608:9:
../hw/scsi/scsi-disk.c:2582:19: warning: argument 1 value ‘18446744073709551612’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
2582 | line_buffer = g_malloc(len * 5 + 1);
| ^
Silence it with a decent assertion, since we only convert a buffer to
bytes when we have a valid cdb length.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210209152350.207958-1-eblake@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
An assorted set of spelling fixes in various places.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210309111510.79495-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
|
|
into staging
qemu-sparc queue
# gpg: Signature made Sun 07 Mar 2021 12:07:13 GMT
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* remotes/mcayland/tags/qemu-sparc-20210307: (42 commits)
esp: add support for unaligned accesses
esp: implement non-DMA transfers in PDMA mode
esp: add trivial implementation of the ESP_RFLAGS register
esp: convert cmdbuf from array to Fifo8
esp: convert ti_buf from array to Fifo8
esp: transition to message out phase after SATN and stop command
esp: add maxlen parameter to get_cmd()
esp: raise interrupt after every non-DMA byte transferred to the FIFO
esp: remove old deferred command completion mechanism
esp: defer command completion interrupt on incoming data transfers
esp: latch individual bits in ESP_RINTR register
esp: implement FIFO flush command
esp: add 4 byte PDMA read and write transfers
esp: remove pdma_origin from ESPState
esp: use FIFO for PDMA transfers between initiator and device
esp: fix PDMA target selection
esp: rename get_cmd_cb() to esp_select()
esp: remove CMD pdma_origin
esp: use in-built TC to determine PDMA transfer length
esp: use ti_wptr/ti_rptr to manage the current FIFO position for PDMA
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Keyboard-Controller-Style devices for IPMI purposes are exposed via LPC
IO cycles from the BMC to the host.
Expose support on the BMC side by implementing the usual MMIO
behaviours, and expose the ability to inspect the KCS registers in
"host" style by accessing QOM properties associated with each register.
The model caters to the IRQ style of both the AST2600 and the earlier
SoCs (AST2400 and AST2500). The AST2600 allocates an IRQ for each LPC
sub-device, while there is a single IRQ shared across all subdevices on
the AST2400 and AST2500.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210302014317.915120-6-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
This is a very minimal framework to access registers which are used to
configure the AHB memory mapping of the flash chips on the LPC HC
Firmware address space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210302014317.915120-5-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
The AST2600 allocates distinct GIC IRQs for the LPC subdevices such as
the iBT device. Previously on the AST2400 and AST2500 the LPC subdevices
shared a single LPC IRQ.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210302014317.915120-4-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
The datasheet says we have 197 IRQs allocated, and we need more than 128
to describe IRQs from LPC devices. Raise the value now to allow
modelling of the LPC devices.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210302014317.915120-3-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
This appears to be a requirement of the GIC model. The AST2600 allocates
197 GIC IRQs, which we will adjust shortly.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210302014317.915120-2-andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
The ast2600 machines do not have PSCI firmware, so this property should
have never been set. Removing this node fixes SMP booting Linux kernels
that have PSCI enabled, as Linux fails to find PSCI in the device tree
and falls back to the soc-specific method for enabling secondary CPUs.
The comment is out of date as Qemu has supported -kernel booting since
9bb6d14081ce ("aspeed: Add boot stub for smp booting"), in v5.1.
Fixes: f25c0ae1079d ("aspeed/soc: Add AST2600 support")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210303010505.635621-1-joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Support Identify command for Namespace attached controller list. This
command handler will traverse the controller instances in the given
subsystem to figure out whether the specified nsid is attached to the
controllers or not.
The 4096bytes Identify data will return with the first entry (16bits)
indicating the number of the controller id entries. So, the data can
hold up to 2047 entries for the controller ids.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
[k.jensen: rebased for dma refactor]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
If namespace inventory is changed due to some reasons (e.g., namespace
attachment/detachment), controller can send out event notifier to the
host to manage namespaces.
This patch sends out the AEN to the host after either attach or detach
namespaces from controllers. To support clear of the event from the
controller, this patch also implemented Get Log Page command for Changed
Namespace List log type. To return namespace id list through the
command, when namespace inventory is updated, id is added to the
per-controller list (changed_ns_list).
To indicate the support of this async event, this patch set
OAES(Optional Asynchronous Events Supported) in Identify Controller data
structure.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
This patch supports Namespace Attachment command for the pre-defined
nvme-ns device nodes. Of course, attach/detach namespace should only be
supported in case 'subsys' is given. This is because if we detach a
namespace from a controller, somebody needs to manage the detached, but
allocated namespace in the NVMe subsystem.
As command effect for the namespace attachment command is registered,
the host will be notified that namespace inventory is changed so that
host will rescan the namespace inventory after this command. For
example, kernel driver manages this command effect via passthru IOCTL.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
[k.jensen: rebased for dma refactor]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
This patch has no functional changes. This patch just refactored
nvme_select_ns_iocs() to iterate the attached namespaces of the
controlller and make it invoke __nvme_select_ns_iocs().
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
From NVMe spec 1.4b "6.1.5. NSID and Namespace Relationships" defines
valid namespace types:
- Unallocated: Not exists in the NVMe subsystem
- Allocated: Exists in the NVMe subsystem
- Inactive: Not attached to the controller
- Active: Attached to the controller
This patch added support for allocated, but not attached namespace type:
!nvme_ns(n, nsid) && nvme_subsys_ns(n->subsys, nsid)
nvme_ns() returns attached namespace instance of the given controller
and nvme_subsys_ns() returns allocated namespace instance in the
subsystem.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
Expand allocated namespace list (subsys->namespaces) to have 256 entries
which is a value lager than at least NVME_MAX_NAMESPACES which is for
attached namespace list in a controller.
Allocated namespace list should at least larger than attached namespace
list.
n->num_namespaces = NVME_MAX_NAMESPACES;
The above line will set the NN field by id->nn so that the subsystem
should also prepare at least this number of namespace list entries.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
subsys->namespaces array used to be sized to NVME_SUBSYS_MAX_NAMESPACES.
But subsys->namespaces are being accessed with 1-based namespace id
which means the very first array entry will always be empty(NULL).
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
Given that now we have nvme-subsys device supported, we can manage
namespace allocated, but not attached: detached. This patch introduced
a parameter for nvme-ns device named 'detached'. This parameter
indicates whether the given namespace device is detached from
a entire NVMe subsystem('subsys' given case, shared namespace) or a
controller('bus' given case, private namespace).
- Allocated namespace
1) Shared ns in the subsystem 'subsys0':
-device nvme-ns,id=ns1,drive=blknvme0,nsid=1,subsys=subsys0,detached=true
2) Private ns for the controller 'nvme0' of the subsystem 'subsys0':
-device nvme-subsys,id=subsys0
-device nvme,serial=foo,id=nvme0,subsys=subsys0
-device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true
3) (Invalid case) Controller 'nvme0' has no subsystem to manage ns:
-device nvme,serial=foo,id=nvme0
-device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
The nvme_dma function doesn't just do DMA (QEMUSGList-based) memory transfers;
it also handles QEMUIOVector copies.
Introduce the NvmeTxDirection enum and rename to nvme_tx. Remove mapping
of PRPs/SGLs from nvme_tx and instead assert that they have been mapped
previously. This allows more fine-grained use in subsequent patches.
Add new (better named) helpers, nvme_{c2h,h2c}, that does both PRP/SGL
mapping and transfer.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
The PRP and SGL mapping functions does not have any particular need for
the entire NvmeRequest as a parameter. Clean it up.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Introduce NvmeSg and try to deal with that pesky qsg/iov duality that
haunts all the memory-related functions.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Fix missing sign inversion.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
A Write Zeroes commands should not be counted in either the 'Data Units
Written' or in 'Host Write Commands' SMART/Health Information Log page.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
The 'len' member of the nvme_compare_ctx struct is redundant since the
same information is available in the 'iov' member.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Dataset Management is not subject to MDTS, but exceeded a certain size
per range causes internal looping. Report this limit (DMRSL) in the NVM
command set specific identify controller data structure.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Add a trace event for the offline zone condition when checking zone
read.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: split commit]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
assert may be compiled to a noop and we could end up returning an
uninitialized status.
Fix this by always returning Internal Device Error as a fallback.
Note that, as pointed out by Philippe, per commit 262a69f4282 ("osdep.h:
Prohibit disabling assert() in supported builds") this shouldn't be
possible. But clean it up so we don't worry about it again.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: split commit]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
Add a trace event for the Identify command.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
|
|
Remove an unnecessary le_to_cpu conversion in Identify.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
|
|
ZASL (Zone Append Size Limit) is defined exactly like MDTS (Maximum Data
Transfer Size), that is, it is a value in units of the minimum memory
page size (CAP.MPSMIN) and is reported as a power of two.
The 'mdts' nvme device parameter is specified as in the spec, but the
'zoned.append_size_limit' parameter is specified in bytes. This is
suboptimal for a number of reasons:
1. It is just plain confusing wrt. the definition of mdts.
2. There is a lot of complexity involved in validating the value; it
must be a power of two, it should be larger than 4k, if it is zero
we set it internally to mdts, but still report it as zero.
3. While "hw/block/nvme: improve invalid zasl value reporting"
slightly improved the handling of the parameter, the validation is
still wrong; it does not depend on CC.MPS, it depends on
CAP.MPSMIN. And we are not even checking that it is actually less
than or equal to MDTS, which is kinda the *one* condition it must
satisfy.
Fix this by defining zasl exactly like mdts and checking the one thing
that it must satisfy (that it is less than or equal to mdts). Also,
change the default value from 128KiB to 0 (aka, whatever mdts is).
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
If mdts is exceeded, trace it from a single place.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Document the 'mdts' nvme device parameter.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
Add support for using the broadcast nsid to issue a flush on all
namespaces through a single command.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|
|
Commit 6eb7a071292a ("hw/block/nvme: change controller pci id") changed
the controller to use a Red Hat assigned PCI Device and Vendor ID, but
did not change the IEEE OUI away from the Intel IEEE OUI.
Fix that and use the locally assigned QEMU IEEE OUI instead if the
`use-intel-id` parameter is not explicitly set. Also reverse the Intel
IEEE OUI bytes.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
|
|
The Zone Append Size Limit (ZASL) must be at least 4096 bytes, so
improve the user experience by adding an early parameter check in
nvme_check_constraints.
When ZASL is still too small due to the host configuring the device for
an even larger page size, convert the trace point in nvme_start_ctrl to
an NVME_GUEST_ERR such that this is logged by QEMU instead of only
traced.
Reported-by: Corne <info@dantalion.nl>
Cc: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
|