aboutsummaryrefslogtreecommitdiff
path: root/hw/block/nvme-ns.h
AgeCommit message (Collapse)Author
2021-04-07hw/block/nvme: fix handling of private namespacesKlaus Jensen
Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem". Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller with which it is initially wired up to. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time". Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter. Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parameter) or implicitly attaching to. Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in subsystem") Cc: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-04-06hw/block/nvme: fix the nsid 'invalid' valueKlaus Jensen
The `nvme_nsid()` function returns '-1' (FFFFFFFFh) when the given namespace is NULL. Since FFFFFFFFh is actually a valid namespace identifier (the "broadcast" value), change this to be '0' since that actually *is* the invalid value. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add support for the format nvm commandMinwoo Im
Format NVM admin command can make a namespace or namespaces to be with different LBA size and metadata size with protection information types. This patch introduces Format NVM command with LBA format, Metadata, and Protection Information for the device. The secure erase operation things and support for formatting zoned namespaces are yet to be added. The parameter checks inside of this patch has been referred from Keith's old branch. Signed-off-by: Minwoo Im <minwoo.im@samsung.com> [anaidu.gollu: rebased on e2e] Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased for reworked aio tracking] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: prefer runtime helpers instead of device parametersKlaus Jensen
In preparation for Format NVM support, use runtime helpers instead of the constant device parameters when getting lba size information etc. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-03-18hw/block/nvme: end-to-end data protectionKlaus Jensen
Add support for namespaces formatted with protection information. The type of end-to-end data protection (i.e. Type 1, Type 2 or Type 3) is selected with the `pi` nvme-ns device parameter. If the number of metadata bytes is larger than 8, the `pil` nvme-ns device parameter may be used to control the location of the 8-byte DIF tuple. The default `pil` value of '0', causes the DIF tuple to be transferred as the last 8 bytes of the metadata. Set to 1 to store this in the first eight bytes instead. Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add metadata supportKlaus Jensen
Add support for metadata in the form of extended logical blocks as well as a separate buffer of data. The new `ms` nvme-ns device parameter specifies the size of metadata per logical block in bytes. The `mset` nvme-ns device parameter controls whether metadata is transfered as part of an extended lba (set to '1') or in a separate buffer (set to '0', the default). Regardsless of the scheme chosen with `mset`, metadata is stored at the end of the namespace backing block device. This requires the user provided PRP/SGLs to be walked and "split" into data and metadata scatter/gather lists if the extended logical block scheme is used, but has the advantage of not breaking the deallocated blocks support. Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: support changed namespace asynchronous eventMinwoo Im
If namespace inventory is changed due to some reasons (e.g., namespace attachment/detachment), controller can send out event notifier to the host to manage namespaces. This patch sends out the AEN to the host after either attach or detach namespaces from controllers. To support clear of the event from the controller, this patch also implemented Get Log Page command for Changed Namespace List log type. To return namespace id list through the command, when namespace inventory is updated, id is added to the per-controller list (changed_ns_list). To indicate the support of this async event, this patch set OAES(Optional Asynchronous Events Supported) in Identify Controller data structure. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: support namespace detachMinwoo Im
Given that now we have nvme-subsys device supported, we can manage namespace allocated, but not attached: detached. This patch introduced a parameter for nvme-ns device named 'detached'. This parameter indicates whether the given namespace device is detached from a entire NVMe subsystem('subsys' given case, shared namespace) or a controller('bus' given case, private namespace). - Allocated namespace 1) Shared ns in the subsystem 'subsys0': -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,subsys=subsys0,detached=true 2) Private ns for the controller 'nvme0' of the subsystem 'subsys0': -device nvme-subsys,id=subsys0 -device nvme,serial=foo,id=nvme0,subsys=subsys0 -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true 3) (Invalid case) Controller 'nvme0' has no subsystem to manage ns: -device nvme,serial=foo,id=nvme0 -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: add simple copy commandKlaus Jensen
Add support for TP 4065a ("Simple Copy Command"), v2020.05.04 ("Ratified"). The implementation uses a bounce buffer to first read in the source logical blocks, then issue a write of that bounce buffer. The default maximum number of source logical blocks is 128, translating to 512 KiB for 4k logical blocks which aligns with the default value of MDTS. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: support for shared namespace in subsystemMinwoo Im
nvme-ns device is registered to a nvme controller device during the initialization in nvme_register_namespace() in case that 'bus' property is given which means it's mapped to a single controller. This patch introduced a new property 'subsys' just like the controller device instance did to map a namespace to a NVMe subsystem. If 'subsys' property is given to the nvme-ns device, it will belong to the specified subsystem and will be attached to all controllers in that subsystem by enabling shared namespace capability in NMIC(Namespace Multi-path I/O and Namespace Capabilities) in Identify Namespace. Usage: -device nvme-subsys,id=subsys0 -device nvme,serial=foo,id=nvme0,subsys=subsys0 -device nvme,serial=bar,id=nvme1,subsys=subsys0 -device nvme,serial=baz,id=nvme2,subsys=subsys0 -device nvme-ns,id=ns1,drive=<drv>,nsid=1,subsys=subsys0 # Shared -device nvme-ns,id=ns2,drive=<drv>,nsid=2,bus=nvme2 # Non-shared In the above example, 'ns1' will be shared to 'nvme0' and 'nvme1' in the same subsystem. On the other hand, 'ns2' will be attached to the 'nvme2' only as a private namespace in that subsystem. All the namespace with 'subsys' parameter will attach all controllers in the subsystem to the namespace by default. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: remove unused argument in nvme_ns_setupMinwoo Im
nvme_ns_setup() finally does not have nothing to do with NvmeCtrl instance. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: enum style fixKlaus Jensen
Align with existing style and use a typedef for header-file enums. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: Support Zone Descriptor ExtensionsDmitry Fomichev
Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zoned.descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Introduce max active and open zone limitsDmitry Fomichev
Add two module properties, "zoned.max_active" and "zoned.max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support Zoned Namespace Command SetDmitry Fomichev
The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add support for Namespace TypesNiklas Cassel
Define the structures and constants required to implement Namespace Types support. Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add Commands Supported and Effects logDmitry Fomichev
This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. All incoming admin and i/o commands are now only processed if their corresponding support bits are set in this log. This provides an easy way to control what commands to support and what not to depending on set CC.CSS. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Generate namespace UUIDsDmitry Fomichev
In NVMe 1.4, a namespace must report an ID descriptor of UUID type if it doesn't support EUI64 or NGUID. Add a new namespace property, "uuid", that provides the user the option to either specify the UUID explicitly or have a UUID generated automatically every time a namespace is initialized. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Process controller reset and shutdown differentlyDmitry Fomichev
Controller reset ans subsystem shutdown are handled very much the same in the current code, but some of the steps should be different in these two cases. Introduce two new functions, nvme_reset_ctrl() and nvme_shutdown_ctrl(), to separate some portions of the code from nvme_clear_ctrl(). The steps that are made different between reset and shutdown are that BAR.CC is not reset to zero upon the shutdown and namespace data is flushed to backing storage as a part of shutdown handling, but not upon reset. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add dulbe supportKlaus Jensen
Add support for reporting the Deallocated or Unwritten Logical Block Error (DULBE). Rely on the block status flags reported by the block layer and consider any block with the BDRV_BLOCK_ZERO flag to be deallocated. Multiple factors affect when a Write Zeroes command result in deallocation of blocks. * the underlying file system block size * the blockdev format * the 'discard' and 'logical_block_size' parameters format | discard | wz (512B) wz (4KiB) wz (64KiB) ----------------------------------------------------- qcow2 ignore n n y qcow2 unmap n n y raw ignore n y y raw unmap n y y So, this works best with an image in raw format and 4KiB LBAs, since holes can then be punched on a per-block basis (this assumes a file system with a 4kb block size, YMMV). A qcow2 image, uses a cluster size of 64KiB by default and blocks will only be marked deallocated if a full cluster is zeroed or discarded. However, this *is* consistent with the spec since Write Zeroes "should" deallocate the block if the Deallocate attribute is set and "may" deallocate if the Deallocate attribute is not set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is always set). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: support multiple namespacesKlaus Jensen
This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>