aboutsummaryrefslogtreecommitdiff
path: root/hw/block
AgeCommit message (Collapse)Author
2021-02-08hw/block/nvme: move cmb logic to v1.4Padmakar Kalghatgi
Implement v1.4 logic for configuring the Controller Memory Buffer. By default, the v1.4 scheme will be used (CMB must be explicitly enabled by the host), so drivers that only support v1.3 will not be able to use the CMB anymore. To retain the v1.3 behavior, set the boolean 'legacy-cmb' nvme device parameter. Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add PMR RDS/WDS supportNaveen Nagar
Add support for the PMRMSCL and PMRMSCU MMIO registers. This allows adding RDS/WDS support for PMR as well. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Naveen Nagar <naveen.n1@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: disable PMR at boot upKlaus Jensen
The PMR should not be enabled at boot up. Disable the PMR MemoryRegion initially and implement MMIO for PMRCTL, allowing the host to enable the PMR explicitly. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove redundant zeroing of PMR registersKlaus Jensen
The controller registers are initially zero. Remove the redundant zeroing. Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: rename PMR/CMB shift/mask fieldsKlaus Jensen
Use the correct field names. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: allow cmb and pmr to coexistKlaus Jensen
With BAR 4 now free to use, allow PMR and CMB to be enabled simultaneously. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: move msix table and pba to BAR 0Klaus Jensen
In the interest of supporting both CMB and PMR to be enabled on the same device, move the MSI-X table and pending bit array out of BAR 4 and into BAR 0. This is a simplified version of the patch contributed by Andrzej Jakowski (see [1]). Leaving the CMB at offset 0 removes the need for changes to CMB address mapping code. [1]: https://lore.kernel.org/qemu-devel/20200729220107.37758-3-andrzej.jakowski@linux.intel.com/ Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Tested-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: indicate CMB support through controller capabilities registerAndrzej Jakowski
This patch sets CMBS bit in controller capabilities register when user configures NVMe driver with CMB support, so capabilites are correctly reported to guest OS. Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com> Reviewed-by: Maxim Levitsky <mlevitsky@gmail.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: fix 64 bit register hi/lo split writesKlaus Jensen
64 bit registers like ASQ and ACQ should be writable by both a hi/lo 32 bit write combination as well as a plain 64 bit write. The spec does not define ordering on the hi/lo split, but the code currently assumes that the low order bits are written first. Additionally, the code does not consider that another address might already have been written into the register, causing the OR'ing to result in a bad address. Fix this by explicitly overwriting only the low or high order bits for 32 bit writes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add size to mmio read/write trace eventsKlaus Jensen
Add the size of the mmio read/write to the trace event. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: trigger async event during injecting smart warningzhenwei pi
During smart critical warning injection by setting property from QMP command, also try to trigger asynchronous event. Suggested by Keith, if a event has already been raised, there is no need to enqueue the duplicate event any more. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> [k.jensen: fix typo in commit message] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add smart_critical_warning propertyzhenwei pi
There is a very low probability that hitting physical NVMe disk hardware critical warning case, it's hard to write & test a monitor agent service. For debugging purposes, add a new 'smart_critical_warning' property to emulate this situation. The orignal version of this change is implemented by adding a fixed property which could be initialized by QEMU command line. Suggested by Philippe & Klaus, rework like current version. Test with this patch: 1, change smart_critical_warning property for a running VM: #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set", "arguments": { "path": "/machine/peripheral-anon/device[0]", "property": "smart_critical_warning", "value":16 } }' 2, run smartctl in guest #smartctl -H -l error /dev/nvme0n1 === START OF SMART DATA SECTION === SMART overall-health self-assessment test result: FAILED! - volatile memory backup device has failed Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: fix zone write finalizeKlaus Jensen
The zone write pointer is unconditionally advanced, even for write faults. Make sure that the zone is always transitioned to Full if the write pointer reaches zone capacity. Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove unused argument in nvme_ns_setupMinwoo Im
nvme_ns_setup() finally does not have nothing to do with NvmeCtrl instance. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: split setup and register for namespaceMinwoo Im
In NVMe, namespace is being attached to process I/O. We register NVMe namespace to a controller via nvme_register_namespace() during nvme_ns_setup(). This is main reason of receiving NvmeCtrl object instance to this function to map the namespace to a controller. To make namespace instance more independent, it should be split into two parts: setup and register. This patch split them into two differnt parts, and finally nvme_ns_setup() does not have nothing to do with NvmeCtrl instance at all. This patch is a former patch to introduce NVMe subsystem scheme to the existing design especially for multi-path. In that case, it should be split into two to make namespace independent from a controller. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove unused argument in nvme_ns_init_blkMinwoo Im
Removed no longer used aregument NvmeCtrl object in nvme_ns_init_blk(). Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: open code for volatile write cacheMinwoo Im
Volatile Write Cache(VWC) feature is set in nvme_ns_setup() in the initial time. This feature is related to block device backed, but this feature is controlled in controller level via Set/Get Features command. This patch removed dependency between nvme and nvme-ns to manage the VWC flag value. Also, it open coded the Get Features for VWC to check all namespaces attached to the controller, and if false detected, return directly false. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> [k.jensen: report write cache preset if present on ANY namespace] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove unused argument in nvme_ns_init_zonedMinwoo Im
nvme_ns_init_zoned() has no use for given NvmeCtrl object. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Correct error status for unaligned ZADmitry Fomichev
TP 4053 says (in section 2.3.1.1) - ... if a Zone Append command specifies a ZSLBA that is not the lowest logical block address in that zone, then the controller shall abort that command with a status code of Invalid Field In Command. In the code, Zone Invalid Write is returned instead, fix this. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove unnecessary check for appendKlaus Jensen
nvme_io_cmd already checks if the namespace supports the Zone Append command, so the removed check is dead code. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: add missing string representations for commandsKlaus Jensen
Add missing string representations for a couple of new commands. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: zero out zones on resetKlaus Jensen
The zoned command set specification states that "All logical blocks in a zone *shall* be marked as deallocated when [the zone is reset]". Since the device guarantees 0x00 to be read from deallocated blocks we have to issue a pwrite_zeroes since we cannot be sure that a discard will do anything. But typically, this will be achieved with an efficient unmap/discard operation. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: enum style fixKlaus Jensen
Align with existing style and use a typedef for header-file enums. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: merge implicitly/explicitly opened processing masksKlaus Jensen
Implicitly and explicitly opended zones are always bulk processed together, so merge the two processing masks. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: fix shutdown/reset logicKlaus Jensen
A shutdown is only about flushing stuff. It is the host that should delete any queues, so do not perform a reset here. Also, on shutdown, make sure that the PMR is flushed if in use. Fixes: 368f4e752cf9 ("hw/block/nvme: Process controller reset and shutdown differently") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: conditionally enable DULBE for zoned namespacesKlaus Jensen
The device uses the BDRV_BLOCK_ZERO flag to determine the "deallocated" status of logical blocks. Since the zoned namespaces command set specification defines that logical blocks SHALL be marked as deallocated when the zone is in the Empty or Offline states, DULBE can only be supported if the zone size is a multiple of the calculated deallocation granularity (reported in NPDG) which depends on the underlying block device cluster size (if applicable) or the configured discard_granularity. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: fix for non-msix machinesKlaus Jensen
Commit 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar() return value") had the unintended effect of breaking support on several platforms not supporting MSI-X. Still check for errors, but only report that MSI-X is unsupported instead of bailing out. Fixes: 1c0c2163aa08 ("hw/block/nvme: verify msix_init_exclusive_bar() return value") Fixes: fbf2e5375e33 ("hw/block/nvme: Verify msix_vector_use() returned value") Reported-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Document zoned parameters in usage textDmitry Fomichev
Added brief descriptions of the new device properties that are now available to users to configure features of Zoned Namespace Command Set in the emulator. This patch is for documentation only, no functionality change. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support Zone Descriptor ExtensionsDmitry Fomichev
Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zoned.descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Introduce max active and open zone limitsDmitry Fomichev
Add two module properties, "zoned.max_active" and "zoned.max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support Zoned Namespace Command SetDmitry Fomichev
The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support allocated CNS command variantsNiklas Cassel
Many CNS commands have "allocated" command variants. These include a namespace as long as it is allocated, that is a namespace is included regardless if it is active (attached) or not. While these commands are optional (they are mandatory for controllers supporting the namespace attachment command), our QEMU implementation is more complete by actually providing support for these CNS values. However, since our QEMU model currently does not support the namespace attachment command, these new allocated CNS commands will return the same result as the active CNS command variants. The reason for not hooking up this command completely is because the NVMe specification requires the namespace management command to be supported if the namespace attachment command is supported. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add support for Namespace TypesNiklas Cassel
Define the structures and constants required to implement Namespace Types support. Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add Commands Supported and Effects logDmitry Fomichev
This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. All incoming admin and i/o commands are now only processed if their corresponding support bits are set in this log. This provides an easy way to control what commands to support and what not to depending on set CC.CSS. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()Dmitry Fomichev
Move write processing to nvme_do_write() that now handles both WRITE and WRITE ZEROES. Both nvme_write() and nvme_write_zeroes() become inline helper functions. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Separate read and write handlersDmitry Fomichev
The majority of code in nvme_rw() is becoming read- or write-specific. Move these parts to two separate handlers, nvme_read() and nvme_write() to make the code more readable and to remove multiple is_write checks that has been present in the i/o path. This is a refactoring patch, no change in functionality. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Generate namespace UUIDsDmitry Fomichev
In NVMe 1.4, a namespace must report an ID descriptor of UUID type if it doesn't support EUI64 or NGUID. Add a new namespace property, "uuid", that provides the user the option to either specify the UUID explicitly or have a UUID generated automatically every time a namespace is initialized. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Process controller reset and shutdown differentlyDmitry Fomichev
Controller reset ans subsystem shutdown are handled very much the same in the current code, but some of the steps should be different in these two cases. Introduce two new functions, nvme_reset_ctrl() and nvme_shutdown_ctrl(), to separate some portions of the code from nvme_clear_ctrl(). The steps that are made different between reset and shutdown are that BAR.CC is not reset to zero upon the shutdown and namespace data is flushed to backing storage as a part of shutdown handling, but not upon reset. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: fix bad clearing of CAPKlaus Jensen
Commit 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") changed the control flow such that the CAP register is erronously cleared after nvme_init_pmr() has configured it. Since the entire NvmeCtrl structure is zero-filled initially, there is no need for the explicit clearing, so just remove it. Fixes: 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-02-08hw/block/nvme: add compare commandGollu Appalanaidu
Add the Compare command. This implementation uses a bounce buffer to read in the data from storage and then compare with the host supplied buffer. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add the dataset management commandKlaus Jensen
Add support for the Dataset Management command and the Deallocate attribute. Deallocation results in discards being sent to the underlying block device. Whether of not the blocks are actually deallocated is affected by the same factors as Write Zeroes (see previous commit). format | discard | dsm (512B) dsm (4KiB) dsm (64KiB) -------------------------------------------------------- qcow2 ignore n n n qcow2 unmap n n y raw ignore n n n raw unmap n y y Again, a raw format and 4KiB LBAs are preferable. In order to set the Namespace Preferred Deallocate Granularity and Alignment fields (NPDG and NPDA), choose a sane minimum discard granularity of 4KiB. If we are using a passthru device supporting discard at a 512B granularity, user should set the discard_granularity property explicitly. NPDG and NPDA will also account for the cluster_size of the block driver if required (i.e. for QCOW2). See NVM Express 1.3d, Section 6.7 ("Dataset Management command"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add dulbe supportKlaus Jensen
Add support for reporting the Deallocated or Unwritten Logical Block Error (DULBE). Rely on the block status flags reported by the block layer and consider any block with the BDRV_BLOCK_ZERO flag to be deallocated. Multiple factors affect when a Write Zeroes command result in deallocation of blocks. * the underlying file system block size * the blockdev format * the 'discard' and 'logical_block_size' parameters format | discard | wz (512B) wz (4KiB) wz (64KiB) ----------------------------------------------------- qcow2 ignore n n y qcow2 unmap n n y raw ignore n y y raw unmap n y y So, this works best with an image in raw format and 4KiB LBAs, since holes can then be punched on a per-block basis (this assumes a file system with a 4kb block size, YMMV). A qcow2 image, uses a cluster size of 64KiB by default and blocks will only be marked deallocated if a full cluster is zeroed or discarded. However, this *is* consistent with the spec since Write Zeroes "should" deallocate the block if the Deallocate attribute is set and "may" deallocate if the Deallocate attribute is not set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is always set). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: pull aio error handlingKlaus Jensen
Add a new function, nvme_aio_err, to handle errors resulting from AIOs and use this from the callbacks. Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove superfluous NvmeCtrl parameterKlaus Jensen
nvme_check_bounds has no use of the NvmeCtrl parameter; remove it. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-01-27block: Separate blk_is_writable() and blk_supports_write_perm()Kevin Wolf
Currently, blk_is_read_only() tells whether a given BlockBackend can only be used in read-only mode because its root node is read-only. Some callers actually try to answer a slightly different question: Is the BlockBackend configured to be writable, by taking write permissions on the root node? This can differ, for example, for CD-ROM devices which don't take write permissions, but may be backed by a writable image file. scsi-cd allows write requests to the drive if blk_is_read_only() returns false. However, the write request will immediately run into an assertion failure because the write permission is missing. This patch introduces separate functions for both questions. blk_supports_write_perm() answers the question whether the block node/image file can support writable devices, whereas blk_is_writable() tells whether the BlockBackend is currently configured to be writable. All calls of blk_is_read_only() are converted to one of the two new functions. Fixes: https://bugs.launchpad.net/bugs/1906693 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210118123448.307825-2-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-01-21Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2021-01-20' into ↵Peter Maydell
staging nbd patches for 2021-01-20 - minor resource leak fixes in qemu-nbd - ensure proper aio context when nbd server uses iothreads - iotest refactorings in preparation for rewriting ./check to be more flexible, and preparing for more nbd server reconnect features # gpg: Signature made Thu 21 Jan 2021 02:28:19 GMT # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2021-01-20: iotests.py: qemu_io(): reuse qemu_tool_pipe_and_status() iotests.py: fix qemu_tool_pipe_and_status() iotests/264: fix style iotests: define group in each iotest iotests/294: add shebang line iotests: make tests executable iotests: fix some whitespaces in test output files iotests/303: use dot slash for qcow2.py running iotests/277: use dot slash for nbd-fault-injector.py running nbd/server: Quiesce coroutines on context switch block: Honor blk_set_aio_context() context requirements qemu-nbd: Fix a memleak in nbd_client_thread() qemu-nbd: Fix a memleak in qemu_nbd_client_list() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-01-20block: Honor blk_set_aio_context() context requirementsSergio Lopez
The documentation for bdrv_set_aio_context_ignore() states this: * The caller must own the AioContext lock for the old AioContext of bs, but it * must not own the AioContext lock for new_context (unless new_context is the * same as the current context of bs). As blk_set_aio_context() makes use of this function, this rule also applies to it. Fix all occurrences where this rule wasn't honored. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20201214170519.223781-2-slp@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2021-01-20hw/block/nand: Rename PAGE_SIZE to NAND_PAGE_SIZEJiaxun Yang
As per POSIX specification of limits.h [1], OS libc may define PAGE_SIZE in limits.h. To prevent collosion of definition, we rename PAGE_SIZE here. [1]: https://pubs.opengroup.org/onlinepubs/7908799/xsh/limits.h.html Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210118063808.12471-5-jiaxun.yang@flygoat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-01-16hw/block: m25p80: Implement AAI-WP command support for SST flashesXuzhou Cheng
Auto Address Increment (AAI) Word-Program is a special command of SST flashes. AAI-WP allows multiple bytes of data to be programmed without re-issuing the next sequential address location. Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-id: 1608688825-81519-2-git-send-email-bmeng.cn@gmail.com Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2021-01-16hw/block: m25p80: Don't write to flash if write is disabledBin Meng
When write is disabled, the write to flash should be avoided in flash_write8(). Fixes: 82a2499011a7 ("m25p80: Initial implementation of SPI flash device") Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-id: 1608688825-81519-1-git-send-email-bmeng.cn@gmail.com Signed-off-by: Alistair Francis <alistair.francis@wdc.com>