aboutsummaryrefslogtreecommitdiff
path: root/hw/block/nvme.c
AgeCommit message (Collapse)Author
2021-02-08hw/block/nvme: Support Zone Descriptor ExtensionsDmitry Fomichev
Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zoned.descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Introduce max active and open zone limitsDmitry Fomichev
Add two module properties, "zoned.max_active" and "zoned.max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support Zoned Namespace Command SetDmitry Fomichev
The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support allocated CNS command variantsNiklas Cassel
Many CNS commands have "allocated" command variants. These include a namespace as long as it is allocated, that is a namespace is included regardless if it is active (attached) or not. While these commands are optional (they are mandatory for controllers supporting the namespace attachment command), our QEMU implementation is more complete by actually providing support for these CNS values. However, since our QEMU model currently does not support the namespace attachment command, these new allocated CNS commands will return the same result as the active CNS command variants. The reason for not hooking up this command completely is because the NVMe specification requires the namespace management command to be supported if the namespace attachment command is supported. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add support for Namespace TypesNiklas Cassel
Define the structures and constants required to implement Namespace Types support. Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add Commands Supported and Effects logDmitry Fomichev
This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. All incoming admin and i/o commands are now only processed if their corresponding support bits are set in this log. This provides an easy way to control what commands to support and what not to depending on set CC.CSS. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()Dmitry Fomichev
Move write processing to nvme_do_write() that now handles both WRITE and WRITE ZEROES. Both nvme_write() and nvme_write_zeroes() become inline helper functions. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Separate read and write handlersDmitry Fomichev
The majority of code in nvme_rw() is becoming read- or write-specific. Move these parts to two separate handlers, nvme_read() and nvme_write() to make the code more readable and to remove multiple is_write checks that has been present in the i/o path. This is a refactoring patch, no change in functionality. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Generate namespace UUIDsDmitry Fomichev
In NVMe 1.4, a namespace must report an ID descriptor of UUID type if it doesn't support EUI64 or NGUID. Add a new namespace property, "uuid", that provides the user the option to either specify the UUID explicitly or have a UUID generated automatically every time a namespace is initialized. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Process controller reset and shutdown differentlyDmitry Fomichev
Controller reset ans subsystem shutdown are handled very much the same in the current code, but some of the steps should be different in these two cases. Introduce two new functions, nvme_reset_ctrl() and nvme_shutdown_ctrl(), to separate some portions of the code from nvme_clear_ctrl(). The steps that are made different between reset and shutdown are that BAR.CC is not reset to zero upon the shutdown and namespace data is flushed to backing storage as a part of shutdown handling, but not upon reset. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: fix bad clearing of CAPKlaus Jensen
Commit 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") changed the control flow such that the CAP register is erronously cleared after nvme_init_pmr() has configured it. Since the entire NvmeCtrl structure is zero-filled initially, there is no need for the explicit clearing, so just remove it. Fixes: 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-02-08hw/block/nvme: add compare commandGollu Appalanaidu
Add the Compare command. This implementation uses a bounce buffer to read in the data from storage and then compare with the host supplied buffer. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add the dataset management commandKlaus Jensen
Add support for the Dataset Management command and the Deallocate attribute. Deallocation results in discards being sent to the underlying block device. Whether of not the blocks are actually deallocated is affected by the same factors as Write Zeroes (see previous commit). format | discard | dsm (512B) dsm (4KiB) dsm (64KiB) -------------------------------------------------------- qcow2 ignore n n n qcow2 unmap n n y raw ignore n n n raw unmap n y y Again, a raw format and 4KiB LBAs are preferable. In order to set the Namespace Preferred Deallocate Granularity and Alignment fields (NPDG and NPDA), choose a sane minimum discard granularity of 4KiB. If we are using a passthru device supporting discard at a 512B granularity, user should set the discard_granularity property explicitly. NPDG and NPDA will also account for the cluster_size of the block driver if required (i.e. for QCOW2). See NVM Express 1.3d, Section 6.7 ("Dataset Management command"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add dulbe supportKlaus Jensen
Add support for reporting the Deallocated or Unwritten Logical Block Error (DULBE). Rely on the block status flags reported by the block layer and consider any block with the BDRV_BLOCK_ZERO flag to be deallocated. Multiple factors affect when a Write Zeroes command result in deallocation of blocks. * the underlying file system block size * the blockdev format * the 'discard' and 'logical_block_size' parameters format | discard | wz (512B) wz (4KiB) wz (64KiB) ----------------------------------------------------- qcow2 ignore n n y qcow2 unmap n n y raw ignore n y y raw unmap n y y So, this works best with an image in raw format and 4KiB LBAs, since holes can then be punched on a per-block basis (this assumes a file system with a 4kb block size, YMMV). A qcow2 image, uses a cluster size of 64KiB by default and blocks will only be marked deallocated if a full cluster is zeroed or discarded. However, this *is* consistent with the spec since Write Zeroes "should" deallocate the block if the Deallocate attribute is set and "may" deallocate if the Deallocate attribute is not set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is always set). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: pull aio error handlingKlaus Jensen
Add a new function, nvme_aio_err, to handle errors resulting from AIOs and use this from the callbacks. Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove superfluous NvmeCtrl parameterKlaus Jensen
nvme_check_bounds has no use of the NvmeCtrl parameter; remove it. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-01-08Remove superfluous timer_del() callsPeter Maydell
This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
2020-11-09hw/block/nvme: fix free of array-typed valueKlaus Jensen
Since 7f0f1acedf15 ("hw/block/nvme: support multiple namespaces"), the namespaces member of NvmeCtrl is no longer a dynamically allocated array. Remove the free. Fixes: 7f0f1acedf15 ("hw/block/nvme: support multiple namespaces") Reported-by: Coverity (CID 1436131) Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20201104102248.32168-4-its@irrelevant.dk> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-09hw/block/nvme: fix uint16_t use of uint32_t sgls memberKlaus Jensen
nvme_map_sgl_data erroneously uses the sgls member of NvmeIdNs as a uint16_t. Reported-by: Coverity (CID 1436129) Fixes: cba0a8a344fe ("hw/block/nvme: add support for scatter gather lists") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20201104102248.32168-3-its@irrelevant.dk> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-11-09hw/block/nvme: fix null ns in register namespaceKlaus Jensen
Fix dereference after NULL check. Reported-by: Coverity (CID 1436128) Fixes: b20804946bce ("hw/block/nvme: update nsid when registered") Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20201104102248.32168-2-its@irrelevant.dk> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-10-27hw/block/nvme: fix queue identifer validationGollu Appalanaidu
The nvme_check_{sq,cq} functions check if the given queue identifer is valid *and* that the queue exists. Thus, the function return value cannot simply be inverted to check if the identifer is valid and that the queue does *not* exist. Replace the call with an OR'ed version of the checks. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: fix create IO SQ/CQ status codesGollu Appalanaidu
Replace the Invalid Field in Command with the Invalid PRP Offset status code in the nvme_create_{cq,sq} functions. Also, allow PRP1 to be address 0x0. Also replace the Completion Queue Invalid status code returned in nvme_create_cq when the the queue identifier is invalid with the Invalid Queue Identifier. The Completion Queue Invalid status code is exclusively for indicating that the completion queue identifer given when creating a submission queue is invalid. See NVM Express v1.3d, Section 5.3 ("Create I/O Completion Queue command") and 5.4("Create I/O Submission Queue command"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: fix prp mapping status codesGollu Appalanaidu
Address 0 is not an invalid address. Remove those invalikd checks. Unaligned PRP2 and PRP list entries should result in Invalid PRP Offset status code and not Invalid Field. Fix that. See NVMe Express v1.3d, Section 4.3 ("Physical Region Page Entry and List"). Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add trace event for requests with non-zero status codeKlaus Jensen
If a command results in a non-zero status code, trace it. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add nsid to get/setfeat trace eventsKlaus Jensen
Include the namespace id in the pci_nvme_{get,set}feat trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: reject io commands if only admin command set selectedKlaus Jensen
If the host sets CC.CSS to 111b, all commands submitted to I/O queues should be completed with status Invalid Command Opcode. Note that this is technically a v1.4 feature, but it does not hurt to implement before we finally bump the reported version implemented. Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: support for admin-only command setKeith Busch
Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: validate command set selectedKeith Busch
Fail to start the controller if the user requests a command set that the controller does not support. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: support per-namespace smart logKeith Busch
Let the user specify a specific namespace if they want to get access stats for a specific namespace. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: fix log page offset checkKeith Busch
Return error if the requested offset starts after the size of the log being returned. Also, move the check for earlier in the function so we're not doing unnecessary calculations. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed- by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: remove pointless rw indirectionKeith Busch
The code switches on the opcode to invoke a function specific to that opcode. There's no point in consolidating back to a common function that just switches on that same opcode without any actual common code. Restore the opcode specific behavior without going back through another level of switches. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: update nsid when registeredKlaus Jensen
If the user does not specify an nsid parameter on the nvme-ns device, nvme_register_namespace will find the first free namespace id and assign that. This fix makes sure the assigned id is saved. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2020-10-27hw/block/nvme: change controller pci idKlaus Jensen
There are two reasons for changing this: 1. The nvme device currently uses an internal Intel device id. 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: support multiple namespaces" the controller device no longer has the quirks that the Linux kernel think it has. As the quirks are applied based on pci vendor and device id, change them to get rid of the quirks. To keep backward compatibility, add a new 'use-intel-id' parameter to the nvme device to force use of the Intel vendor and device id. This is off by default but add a compat property to set this for 5.1 machines and older. If a 5.1 machine is booted (or the use-intel-id parameter is explicitly set to true), the Linux kernel will just apply these unnecessary quirks: 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support anything else than values 0x0 and 0x1 for CNS (Identify Namespace and Identify Namespace). With multiple namespace support, this just means that the kernel will "scan" namespaces instead of using "Active Namespace ID list" (CNS 0x2). 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a broken Write Zeroes implementation which has since been fixed in commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-10-27hw/block/nvme: support multiple namespacesKlaus Jensen
This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2020-10-27hw/block/nvme: refactor identify active namespace id listKlaus Jensen
Prepare to support inactive namespaces. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add support for sgl bit bucket descriptorGollu Appalanaidu
This adds support for SGL descriptor type 0x1 (bit bucket descriptor). See the NVM Express v1.3d specification, Section 4.4 ("Scatter Gather List (SGL)"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add support for scatter gather listsKlaus Jensen
For now, support the Data Block, Segment and Last Segment descriptor types. See NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: harden cmb accessKlaus Jensen
Since the controller has only supported PRPs so far it has not been required to check the ending address (addr + len - 1) of the CMB access for validity since it has been guaranteed to be in range of the CMB. This changes when the controller adds support for SGLs (next patch), so add that check. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: default request status to successKlaus Jensen
Make the default request status NVME_SUCCESS so only error status codes have to be set. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: refactor aio submissionKlaus Jensen
This pulls block layer aio submission/completion to common functions. For completions, additionally map an AIO error to the Unrecovered Read and Write Fault status codes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add symbolic command name to trace eventsKlaus Jensen
Add the symbolic command name to the pci_nvme_{io,admin}_cmd and pci_nvme_rw trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: fix endian conversionKlaus Jensen
The raw NLB field is a 16 bit value, so use le16_to_cpu instead of le32_to_cpu and cast to uint32_t before incrementing the value to not wrap around. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-10-27hw/block/nvme: add a lba to bytes helperKlaus Jensen
Add the nvme_l2b helper and use it for converting NLB and SLBA to byte counts and offsets. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: alignment style fixesKlaus Jensen
Style fixes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: commonize nvme_rw error handlingKlaus Jensen
Move common error handling to a label. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: handle dma errorsKlaus Jensen
Handling DMA errors gracefully is required for the device to pass the block/011 test ("disable PCI device while doing I/O") in the blktests suite. With this patch the device sets the Controller Fatal Status bit in the CSTS register when failing to read from a submission queue or writing to a completion queue; expecting the host to reset the controller. If DMA errors occur at any other point in the execution of the command (say, while mapping the PRPs), the command is aborted with a Data Transfer Error status code. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-13hw/block/nvme: Simplify timestamp sumPhilippe Mathieu-Daudé
As the 'timestamp' variable is declared as a 48-bit bitfield, we do not need to wrap the sum result. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20201002075716.1657849-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-09-02hw/block/nvme: remove explicit qsg/iov parametersKlaus Jensen
Since nvme_map_prp always operate on the request-scoped qsg/iovs, just pass a single pointer to the NvmeRequest instead of two for each of the qsg and iov. Suggested-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-09-02hw/block/nvme: use preallocated qsg/iov in nvme_dma_prpKlaus Jensen
Since clean up of the request qsg/iov is now always done post-use, there is no need to use a stack-allocated qsg/iov in nvme_dma_prp. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2020-09-02hw/block/nvme: consolidate qsg/iov clearingKlaus Jensen
Always destroy the request qsg/iov at the end of request use. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>