aboutsummaryrefslogtreecommitdiff
path: root/hw/block/nvme.h
AgeCommit message (Collapse)Author
2021-05-17hw/nvme: move nvme emulation out of hw/blockKlaus Jensen
With the introduction of the nvme-subsystem device we are really cluttering up the hw/block directory. As suggested by Philippe previously, move the nvme emulation to hw/nvme. Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: remove num_namespaces memberKlaus Jensen
The NvmeCtrl num_namespaces member is just an indirection for the NVME_MAX_NAMESPACES constant. Remove the indirection. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: streamline namespace array indexingKlaus Jensen
Streamline namespace array indexing such that both the subsystem and controller namespaces arrays are 1-indexed. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: add metadata offset helperKlaus Jensen
Add an nvme_moff() helper. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: cache lba and ms sizesKlaus Jensen
There is no need to look up the lba size and metadata size in the LBA Format structure everytime we want to use it. And we use it a lot. Cache the values in the NvmeNamespace and update them if the namespace is formatted. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: replace nvme_ns_statusKlaus Jensen
The inline nvme_ns_status() helper only has a single call site. Remove it from the header file and inline it for real. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: remove non-shared defines from header fileKlaus Jensen
Remove non-shared defines from the shared header. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-05-17hw/block/nvme: consolidate header filesKlaus Jensen
In preparation for moving the nvme device into its own subtree, merge the header files into one. Also add missing copyright notice and add list of authors with substantial contributions. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-04-07hw/block/nvme: fix handling of private namespacesKlaus Jensen
Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem". Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller with which it is initially wired up to. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time". Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter. Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parameter) or implicitly attaching to. Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in subsystem") Cc: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-04-07hw/block/nvme: fix warning about legacy namespace configurationKlaus Jensen
Remove the unused BlockConf from the controller structure and remove the noop constraint checking. Device works just fine with both legacy drive parameter namespace and nvme-ns namespace definitions. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
2021-04-06hw/block/nvme: fix missing string representation for ns attachmentKlaus Jensen
Add the missing nvme_adm_opc_str entry for the Namespace Attachment command. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add support for the format nvm commandMinwoo Im
Format NVM admin command can make a namespace or namespaces to be with different LBA size and metadata size with protection information types. This patch introduces Format NVM command with LBA format, Metadata, and Protection Information for the device. The secure erase operation things and support for formatting zoned namespaces are yet to be added. The parameter checks inside of this patch has been referred from Keith's old branch. Signed-off-by: Minwoo Im <minwoo.im@samsung.com> [anaidu.gollu: rebased on e2e] Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased for reworked aio tracking] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add non-mdts command size limit for verifyKlaus Jensen
Verify is not subject to MDTS, so a single Verify command may result in excessive amounts of allocated memory. Impose a limit on the data size by adding support for TP 4040 ("Non-MDTS Command Size Limits"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add verify commandGollu Appalanaidu
See NVM Express 1.4, section 6.14 ("Verify Command"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased, refactored for e2e] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: end-to-end data protectionKlaus Jensen
Add support for namespaces formatted with protection information. The type of end-to-end data protection (i.e. Type 1, Type 2 or Type 3) is selected with the `pi` nvme-ns device parameter. If the number of metadata bytes is larger than 8, the `pil` nvme-ns device parameter may be used to control the location of the 8-byte DIF tuple. The default `pil` value of '0', causes the DIF tuple to be transferred as the last 8 bytes of the metadata. Set to 1 to store this in the first eight bytes instead. Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: assert namespaces array indicesKlaus Jensen
Coverity complains about a possible memory corruption in the nvme_ns_attach and _detach functions. While we should not (famous last words) be able to reach this function without nsid having previously been validated, this is still an open door for future misuse. Make Coverity and maintainers happy by asserting that the index into the array is valid. Also, while not detected by Coverity (yet), add an assert in nvme_subsys_ns and nvme_subsys_register_ns as well since a similar issue is exists there. Fixes: 037953b5b299 ("hw/block/nvme: support namespace detach") Fixes: CID 1450757 Fixes: CID 1450758 Cc: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-03-09hw/block/nvme: support changed namespace asynchronous eventMinwoo Im
If namespace inventory is changed due to some reasons (e.g., namespace attachment/detachment), controller can send out event notifier to the host to manage namespaces. This patch sends out the AEN to the host after either attach or detach namespaces from controllers. To support clear of the event from the controller, this patch also implemented Get Log Page command for Changed Namespace List log type. To return namespace id list through the command, when namespace inventory is updated, id is added to the per-controller list (changed_ns_list). To indicate the support of this async event, this patch set OAES(Optional Asynchronous Events Supported) in Identify Controller data structure. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: support namespace attachment commandMinwoo Im
This patch supports Namespace Attachment command for the pre-defined nvme-ns device nodes. Of course, attach/detach namespace should only be supported in case 'subsys' is given. This is because if we detach a namespace from a controller, somebody needs to manage the detached, but allocated namespace in the NVMe subsystem. As command effect for the namespace attachment command is registered, the host will be notified that namespace inventory is changed so that host will rescan the namespace inventory after this command. For example, kernel driver manages this command effect via passthru IOCTL. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> [k.jensen: rebased for dma refactor] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: fix allocated namespace list to 256Minwoo Im
Expand allocated namespace list (subsys->namespaces) to have 256 entries which is a value lager than at least NVME_MAX_NAMESPACES which is for attached namespace list in a controller. Allocated namespace list should at least larger than attached namespace list. n->num_namespaces = NVME_MAX_NAMESPACES; The above line will set the NN field by id->nn so that the subsystem should also prepare at least this number of namespace list entries. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: support namespace detachMinwoo Im
Given that now we have nvme-subsys device supported, we can manage namespace allocated, but not attached: detached. This patch introduced a parameter for nvme-ns device named 'detached'. This parameter indicates whether the given namespace device is detached from a entire NVMe subsystem('subsys' given case, shared namespace) or a controller('bus' given case, private namespace). - Allocated namespace 1) Shared ns in the subsystem 'subsys0': -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,subsys=subsys0,detached=true 2) Private ns for the controller 'nvme0' of the subsystem 'subsys0': -device nvme-subsys,id=subsys0 -device nvme,serial=foo,id=nvme0,subsys=subsys0 -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true 3) (Invalid case) Controller 'nvme0' has no subsystem to manage ns: -device nvme,serial=foo,id=nvme0 -device nvme-ns,id=ns1,drive=blknvme0,nsid=1,bus=nvme0,detached=true Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: try to deal with the iov/qsg dualityKlaus Jensen
Introduce NvmeSg and try to deal with that pesky qsg/iov duality that haunts all the memory-related functions. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: report non-mdts command size limit for dsmGollu Appalanaidu
Dataset Management is not subject to MDTS, but exceeded a certain size per range causes internal looping. Report this limit (DMRSL) in the NVM command set specific identify controller data structure. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: align zoned.zasl with mdtsKlaus Jensen
ZASL (Zone Append Size Limit) is defined exactly like MDTS (Maximum Data Transfer Size), that is, it is a value in units of the minimum memory page size (CAP.MPSMIN) and is reported as a power of two. The 'mdts' nvme device parameter is specified as in the spec, but the 'zoned.append_size_limit' parameter is specified in bytes. This is suboptimal for a number of reasons: 1. It is just plain confusing wrt. the definition of mdts. 2. There is a lot of complexity involved in validating the value; it must be a power of two, it should be larger than 4k, if it is zero we set it internally to mdts, but still report it as zero. 3. While "hw/block/nvme: improve invalid zasl value reporting" slightly improved the handling of the parameter, the validation is still wrong; it does not depend on CC.MPS, it depends on CAP.MPSMIN. And we are not even checking that it is actually less than or equal to MDTS, which is kinda the *one* condition it must satisfy. Fix this by defining zasl exactly like mdts and checking the one thing that it must satisfy (that it is less than or equal to mdts). Also, change the default value from 128KiB to 0 (aka, whatever mdts is). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: add simple copy commandKlaus Jensen
Add support for TP 4065a ("Simple Copy Command"), v2020.05.04 ("Ratified"). The implementation uses a bounce buffer to first read in the source logical blocks, then issue a write of that bounce buffer. The default maximum number of source logical blocks is 128, translating to 512 KiB for 4k logical blocks which aligns with the default value of MDTS. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: support for multi-controller in subsystemMinwoo Im
We have nvme-subsys and nvme devices mapped together. To support multi-controller scheme to this setup, controller identifier(id) has to be managed. Earlier, cntlid(controller id) used to be always 0 because we didn't have any subsystem scheme that controller id matters. This patch introduced 'cntlid' attribute to the nvme controller instance(NvmeCtrl) and make it allocated by the nvme-subsys device mapped to the controller. If nvme-subsys is not given to the controller, then it will always be 0 as it was. Added 'ctrls' array in the nvme-subsys instance to manage attached controllers to the subsystem with a limit(32). This patch didn't take list for the controllers to make it seamless with nvme-ns device. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: support to map controller to a subsystemMinwoo Im
nvme controller(nvme) can be mapped to a NVMe subsystem(nvme-subsys). This patch maps a controller to a subsystem by adding a parameter 'subsys' to the nvme device. To map a controller to a subsystem, we need to put nvme-subsys first and then maps the subsystem to the controller: -device nvme-subsys,id=subsys0 -device nvme,serial=foo,id=nvme0,subsys=subsys0 If 'subsys' property is not given to the nvme controller, then subsystem NQN will be created with serial (e.g., 'foo' in above example), Otherwise, it will be based on subsys id (e.g., 'subsys0' in above example). Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: move cmb logic to v1.4Padmakar Kalghatgi
Implement v1.4 logic for configuring the Controller Memory Buffer. By default, the v1.4 scheme will be used (CMB must be explicitly enabled by the host), so drivers that only support v1.3 will not be able to use the CMB anymore. To retain the v1.3 behavior, set the boolean 'legacy-cmb' nvme device parameter. Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add PMR RDS/WDS supportNaveen Nagar
Add support for the PMRMSCL and PMRMSCU MMIO registers. This allows adding RDS/WDS support for PMR as well. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Naveen Nagar <naveen.n1@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: move msix table and pba to BAR 0Klaus Jensen
In the interest of supporting both CMB and PMR to be enabled on the same device, move the MSI-X table and pending bit array out of BAR 4 and into BAR 0. This is a simplified version of the patch contributed by Andrzej Jakowski (see [1]). Leaving the CMB at offset 0 removes the need for changes to CMB address mapping code. [1]: https://lore.kernel.org/qemu-devel/20200729220107.37758-3-andrzej.jakowski@linux.intel.com/ Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Tested-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add smart_critical_warning propertyzhenwei pi
There is a very low probability that hitting physical NVMe disk hardware critical warning case, it's hard to write & test a monitor agent service. For debugging purposes, add a new 'smart_critical_warning' property to emulate this situation. The orignal version of this change is implemented by adding a fixed property which could be initialized by QEMU command line. Suggested by Philippe & Klaus, rework like current version. Test with this patch: 1, change smart_critical_warning property for a running VM: #virsh qemu-monitor-command nvme-upstream '{ "execute": "qom-set", "arguments": { "path": "/machine/peripheral-anon/device[0]", "property": "smart_critical_warning", "value":16 } }' 2, run smartctl in guest #smartctl -H -l error /dev/nvme0n1 === START OF SMART DATA SECTION === SMART overall-health self-assessment test result: FAILED! - volatile memory backup device has failed Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: open code for volatile write cacheMinwoo Im
Volatile Write Cache(VWC) feature is set in nvme_ns_setup() in the initial time. This feature is related to block device backed, but this feature is controlled in controller level via Set/Get Features command. This patch removed dependency between nvme and nvme-ns to manage the VWC flag value. Also, it open coded the Get Features for VWC to check all namespaces attached to the controller, and if false detected, return directly false. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> [k.jensen: report write cache preset if present on ANY namespace] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add missing string representations for commandsKlaus Jensen
Add missing string representations for a couple of new commands. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: Support Zoned Namespace Command SetDmitry Fomichev
The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add the dataset management commandKlaus Jensen
Add support for the Dataset Management command and the Deallocate attribute. Deallocation results in discards being sent to the underlying block device. Whether of not the blocks are actually deallocated is affected by the same factors as Write Zeroes (see previous commit). format | discard | dsm (512B) dsm (4KiB) dsm (64KiB) -------------------------------------------------------- qcow2 ignore n n n qcow2 unmap n n y raw ignore n n n raw unmap n y y Again, a raw format and 4KiB LBAs are preferable. In order to set the Namespace Preferred Deallocate Granularity and Alignment fields (NPDG and NPDA), choose a sane minimum discard granularity of 4KiB. If we are using a passthru device supporting discard at a 512B granularity, user should set the discard_granularity property explicitly. NPDG and NPDA will also account for the cluster_size of the block driver if required (i.e. for QCOW2). See NVM Express 1.3d, Section 6.7 ("Dataset Management command"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: change controller pci idKlaus Jensen
There are two reasons for changing this: 1. The nvme device currently uses an internal Intel device id. 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: support multiple namespaces" the controller device no longer has the quirks that the Linux kernel think it has. As the quirks are applied based on pci vendor and device id, change them to get rid of the quirks. To keep backward compatibility, add a new 'use-intel-id' parameter to the nvme device to force use of the Intel vendor and device id. This is off by default but add a compat property to set this for 5.1 machines and older. If a 5.1 machine is booted (or the use-intel-id parameter is explicitly set to true), the Linux kernel will just apply these unnecessary quirks: 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support anything else than values 0x0 and 0x1 for CNS (Identify Namespace and Identify Namespace). With multiple namespace support, this just means that the kernel will "scan" namespaces instead of using "Active Namespace ID list" (CNS 0x2). 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a broken Write Zeroes implementation which has since been fixed in commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-10-27hw/block/nvme: support multiple namespacesKlaus Jensen
This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2020-10-27hw/block/nvme: refactor aio submissionKlaus Jensen
This pulls block layer aio submission/completion to common functions. For completions, additionally map an AIO error to the Unrecovered Read and Write Fault status codes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add symbolic command name to trace eventsKlaus Jensen
Add the symbolic command name to the pci_nvme_{io,admin}_cmd and pci_nvme_rw trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add a lba to bytes helperKlaus Jensen
Add the nvme_l2b helper and use it for converting NLB and SLBA to byte counts and offsets. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-09-02hw/block/nvme: add ns/cmd references in NvmeRequestKlaus Jensen
Instead of passing around the NvmeNamespace and the NvmeCmd, add them as members in the NvmeRequest structure. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-09-02hw/block/nvme: add check for mdtsKlaus Jensen
Add 'mdts' device parameter to control the Maximum Data Transfer Size of the controller and check that it is respected. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2020-09-02hw/block/nvme: remove redundant has_sg memberKlaus Jensen
Remove the has_sg member from NvmeRequest since it's redundant. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2020-09-02hw/block/nvme: enforce valid queue creation sequenceKlaus Jensen
Support returning Command Sequence Error if Set Features on Number of Queues is called after queues have been created. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Message-Id: <20200706061303.246057-17-its@irrelevant.dk>
2020-09-02hw/block/nvme: move NvmeFeatureVal into hw/block/nvme.hKlaus Jensen
The NvmeFeatureVal does not belong with the spec-related data structures in include/block/nvme.h that is shared between the block-level nvme driver and the emulated nvme device. Move it into the nvme device specific header file as it is the only user of the structure. Also, remove the unused members. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200706061303.246057-10-its@irrelevant.dk>
2020-09-02hw/block/nvme: add support for the asynchronous event request commandKlaus Jensen
Add support for the Asynchronous Event Request command. Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.2 ("Asynchronous Event Request command"). Mostly imported from Keith's qemu-nvme tree. Modified with a max number of queued events (controllable with the aer_max_queued device parameter). The spec states that the controller *should* retain events, so we do best effort here. Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Message-Id: <20200706061303.246057-9-its@irrelevant.dk>
2020-09-02hw/block/nvme: add support for the get log page commandKlaus Jensen
Add support for the Get Log Page command and basic implementations of the mandatory Error Information, SMART / Health Information and Firmware Slot Information log pages. In violation of the specification, the SMART / Health Information log page does not persist information over the lifetime of the controller because the device has no place to store such persistent state. Note that the LPA field in the Identify Controller data structure intentionally has bit 0 cleared because there is no namespace specific information in the SMART / Health information log page. Required for compliance with NVMe revision 1.3d. See NVM Express 1.3d, Section 5.14 ("Get Log Page command"). Signed-off-by: Klaus Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200706061303.246057-8-its@irrelevant.dk>
2020-09-02hw/block/nvme: add temperature threshold featureKlaus Jensen
It might seem weird to implement this feature for an emulated device, but it is mandatory to support and the feature is useful for testing asynchronous event request support, which will be added in a later patch. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Message-Id: <20200706061303.246057-6-its@irrelevant.dk>
2020-06-17hw/block/nvme: add msix_qsize parameterKlaus Jensen
Decouple the requested maximum number of ioqpairs (param max_ioqpairs) from the number of MSI-X interrupt vectors by introducing a new msix_qsize parameter and initialize MSI-X with that. This allows emulating a device that has fewer vectors than I/O queue pairs and also allows more than 2048 queue pairs. To keep the device behaving as previously, use a msix_qsize default of 65 (default max_ioqpairs + 1). This decoupling was actually suggested by Maxim some time ago in a slightly different context, so adding a Suggested-by. Suggested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Message-Id: <20200609190333.59390-22-its@irrelevant.dk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17hw/block/nvme: add namespace helpersKlaus Jensen
Introduce some small helpers to make the next patches easier on the eye. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Message-Id: <20200609190333.59390-14-its@irrelevant.dk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-06-17hw/block/nvme: remove redundant cmbloc/cmbsz membersKlaus Jensen
Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Message-Id: <20200609190333.59390-10-its@irrelevant.dk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>