aboutsummaryrefslogtreecommitdiff
path: root/hw/block/trace-events
AgeCommit message (Collapse)Author
2024-01-20hw/pflash: implement update buffer for block writesGerd Hoffmann
Add an update buffer where all block updates are staged. Flush or discard updates properly, so we should never see half-completed block writes in pflash storage. Drop a bunch of FIXME comments ;) Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240108160900.104835-4-kraxel@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit 284a7ee2e290e0c9b8cd3ea6164d92386933054f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: drop const in hw/block/pflash_cfi01.c for before v8.2.0-220-g7d5dc0a367 "hw/block: Constify VMState")
2023-10-06swim: update IWM/ISM register block decodingMark Cave-Ayland
Update the IWM/ISM register block decoding to match the description given in the "SWIM Chip Users Reference". This allows us to validate the device response to the guest OS which currently only does just enough to indicate that the floppy drive is unavailable. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-ID: <20231004083806.757242-14-mark.cave-ayland@ilande.co.uk> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06swim: split into separate IWM and ISM register blocksMark Cave-Ayland
The swim chip provides an implementation of both Apple's IWM and ISM floppy disk controllers. Split the existing implementation into separate register banks for each controller, whilst also switching the IWM registers from 16-bit to 8-bit as implemented in real hardware. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-ID: <20231004083806.757242-13-mark.cave-ayland@ilande.co.uk> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06swim: add trace events for IWM and ISM registersMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-ID: <20231004083806.757242-12-mark.cave-ayland@ilande.co.uk> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-05-15virtio-blk: add some trace events for zoned emulationSam Li
Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508051916.178322-4-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-24m25p80: Add basic support for the SFDP commandCédric Le Goater
JEDEC STANDARD JESD216 for Serial Flash Discovery Parameters (SFDP) provides a mean to describe the features of a serial flash device using a set of internal parameter tables. This is the initial framework for the RDSFDP command giving access to a private SFDP area under the flash. This area now needs to be populated with the flash device characteristics, using a new 'sfdp_read' handler under FlashPartInfo. Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-Id: <20220722063602.128144-2-clg@kaod.org> Message-Id: <20221013161241.2805140-2-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-06-25hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.cPhilippe Mathieu-Daudé
Some machines use floppy controllers via the SysBus interface, and don't need to pull in all the SysBus code. Extract the SysBus specific code to a new unit: fdc-sysbus.c, and add a new Kconfig symbol: "FDC_SYSBUS". Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210614193220.2007159-6-philmd@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
2021-06-25hw/block/fdc: Replace disabled fprintf() by trace eventPhilippe Mathieu-Daudé
Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210614193220.2007159-3-philmd@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
2021-06-02docs: fix references to docs/devel/tracing.rstStefano Garzarella
Commit e50caf4a5c ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-17hw/nvme: move nvme emulation out of hw/blockKlaus Jensen
With the introduction of the nvme-subsystem device we are really cluttering up the hw/block directory. As suggested by Philippe previously, move the nvme emulation to hw/nvme. Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-04-07hw/block/nvme: fix handling of private namespacesKlaus Jensen
Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem". Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller with which it is initially wired up to. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time". Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter. Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parameter) or implicitly attaching to. Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in subsystem") Cc: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-03-18Merge remote-tracking branch 'remotes/philmd/tags/pflash-20210318' into stagingPeter Maydell
Parallel NOR Flash patches queue - Code movement to ease maintainability - Tracing improvements # gpg: Signature made Thu 18 Mar 2021 15:44:12 GMT # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full] # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * remotes/philmd/tags/pflash-20210318: hw/block/pflash_cfi: Replace DPRINTF with trace events hw/block/pflash_cfi01: Correct the type of PFlashCFI01.ro hw/block/pflash_cfi01: Clarify trace events hw/block/pflash_cfi02: Add DeviceReset method hw/block/pflash_cfi02: Factor out pflash_reset_state_machine() hw/block/pflash_cfi02: Rename register_memory(true) as mode_read_array hw/block/pflash_cfi02: Open-code pflash_register_memory(rom=false) hw/block/pflash_cfi02: Set rom_mode to true in pflash_setup_mappings() hw/block/pflash_cfi02: Extract pflash_cfi02_fill_cfi_table() hw/block/pflash_cfi01: Extract pflash_cfi01_fill_cfi_table() hw/block/pflash_cfi: Fix code style for checkpatch.pl Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-18hw/block/nvme: add support for the format nvm commandMinwoo Im
Format NVM admin command can make a namespace or namespaces to be with different LBA size and metadata size with protection information types. This patch introduces Format NVM command with LBA format, Metadata, and Protection Information for the device. The secure erase operation things and support for formatting zoned namespaces are yet to be added. The parameter checks inside of this patch has been referred from Keith's old branch. Signed-off-by: Minwoo Im <minwoo.im@samsung.com> [anaidu.gollu: rebased on e2e] Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased for reworked aio tracking] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add verify commandGollu Appalanaidu
See NVM Express 1.4, section 6.14 ("Verify Command"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased, refactored for e2e] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: end-to-end data protectionKlaus Jensen
Add support for namespaces formatted with protection information. The type of end-to-end data protection (i.e. Type 1, Type 2 or Type 3) is selected with the `pi` nvme-ns device parameter. If the number of metadata bytes is larger than 8, the `pil` nvme-ns device parameter may be used to control the location of the 8-byte DIF tuple. The default `pil` value of '0', causes the DIF tuple to be transferred as the last 8 bytes of the metadata. Set to 1 to store this in the first eight bytes instead. Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/nvme: add metadata supportKlaus Jensen
Add support for metadata in the form of extended logical blocks as well as a separate buffer of data. The new `ms` nvme-ns device parameter specifies the size of metadata per logical block in bytes. The `mset` nvme-ns device parameter controls whether metadata is transfered as part of an extended lba (set to '1') or in a separate buffer (set to '0', the default). Regardsless of the scheme chosen with `mset`, metadata is stored at the end of the namespace backing block device. This requires the user provided PRP/SGLs to be walked and "split" into data and metadata scatter/gather lists if the extended logical block scheme is used, but has the advantage of not breaking the deallocated blocks support. Co-authored-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-18hw/block/pflash_cfi: Replace DPRINTF with trace eventsDavid Edmondson
Rather than having a device specific debug implementation in pflash_cfi01.c and pflash_cfi02.c, use the standard tracing facility. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210216142721.1985543-2-david.edmondson@oracle.com> [PMD: Rebased, fixed pflash_write_block_erase trace event format] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
2021-03-18hw/block/pflash_cfi02: Rename register_memory(true) as mode_read_arrayPhilippe Mathieu-Daudé
The same pattern is used when setting the flash in READ_ARRAY mode: - Set the state machine command to READ_ARRAY - Reset the write_cycle counter - Reset the memory region in ROMD Refactor the current code by extracting this pattern. It is used three times: - When the timer expires and not in bypass mode - On a read access (on invalid command). - When the device is initialized. Here the ROMD mode is hidden by the memory_region_init_rom_device() call. pflash_register_memory(rom_mode=true) already sets the ROM device in "read array" mode (from I/O device to ROM one). Explicit that by renaming the function as pflash_mode_read_array(), adding a trace event and resetting wcycle. Reviewed-by: Bin Meng <bmeng.cn@gmail.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210310170528.1184868-7-philmd@redhat.com>
2021-03-09hw/block/nvme: support Identify NS Attached Controller ListMinwoo Im
Support Identify command for Namespace attached controller list. This command handler will traverse the controller instances in the given subsystem to figure out whether the specified nsid is attached to the controllers or not. The 4096bytes Identify data will return with the first entry (16bits) indicating the number of the controller id entries. So, the data can hold up to 2047 entries for the controller ids. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> [k.jensen: rebased for dma refactor] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: support namespace attachment commandMinwoo Im
This patch supports Namespace Attachment command for the pre-defined nvme-ns device nodes. Of course, attach/detach namespace should only be supported in case 'subsys' is given. This is because if we detach a namespace from a controller, somebody needs to manage the detached, but allocated namespace in the NVMe subsystem. As command effect for the namespace attachment command is registered, the host will be notified that namespace inventory is changed so that host will rescan the namespace inventory after this command. For example, kernel driver manages this command effect via passthru IOCTL. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Klaus Jensen <k.jensen@samsung.com> [k.jensen: rebased for dma refactor] Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: remove the req dependency in map functionsKlaus Jensen
The PRP and SGL mapping functions does not have any particular need for the entire NvmeRequest as a parameter. Clean it up. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: report non-mdts command size limit for dsmGollu Appalanaidu
Dataset Management is not subject to MDTS, but exceeded a certain size per range causes internal looping. Report this limit (DMRSL) in the NVM command set specific identify controller data structure. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: add identify trace eventGollu Appalanaidu
Add a trace event for the Identify command. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-03-09hw/block/nvme: align zoned.zasl with mdtsKlaus Jensen
ZASL (Zone Append Size Limit) is defined exactly like MDTS (Maximum Data Transfer Size), that is, it is a value in units of the minimum memory page size (CAP.MPSMIN) and is reported as a power of two. The 'mdts' nvme device parameter is specified as in the spec, but the 'zoned.append_size_limit' parameter is specified in bytes. This is suboptimal for a number of reasons: 1. It is just plain confusing wrt. the definition of mdts. 2. There is a lot of complexity involved in validating the value; it must be a power of two, it should be larger than 4k, if it is zero we set it internally to mdts, but still report it as zero. 3. While "hw/block/nvme: improve invalid zasl value reporting" slightly improved the handling of the parameter, the validation is still wrong; it does not depend on CC.MPS, it depends on CAP.MPSMIN. And we are not even checking that it is actually less than or equal to MDTS, which is kinda the *one* condition it must satisfy. Fix this by defining zasl exactly like mdts and checking the one thing that it must satisfy (that it is less than or equal to mdts). Also, change the default value from 128KiB to 0 (aka, whatever mdts is). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: deduplicate bad mdts trace eventKlaus Jensen
If mdts is exceeded, trace it from a single place. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-03-09hw/block/nvme: add broadcast nsid support flush commandGollu Appalanaidu
Add support for using the broadcast nsid to issue a flush on all namespaces through a single command. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-03-09hw/block/nvme: add simple copy commandKlaus Jensen
Add support for TP 4065a ("Simple Copy Command"), v2020.05.04 ("Ratified"). The implementation uses a bounce buffer to first read in the source logical blocks, then issue a write of that bounce buffer. The default maximum number of source logical blocks is 128, translating to 512 KiB for 4k logical blocks which aligns with the default value of MDTS. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: refactor the logic for zone write checksKlaus Jensen
Refactor the zone write check logic such that the most "meaningful" error is returned first. That is, first, if the zone is not writable, return an appropriate status code for that. Then, make sure we are actually writing at the write pointer and finally check that we do not cross the zone write boundary. This aligns with the "priority" of status codes for zone read checks. Also add a couple of additional descriptive trace events and remove an always true assert. Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: move cmb logic to v1.4Padmakar Kalghatgi
Implement v1.4 logic for configuring the Controller Memory Buffer. By default, the v1.4 scheme will be used (CMB must be explicitly enabled by the host), so drivers that only support v1.3 will not be able to use the CMB anymore. To retain the v1.3 behavior, set the boolean 'legacy-cmb' nvme device parameter. Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add size to mmio read/write trace eventsKlaus Jensen
Add the size of the mmio read/write to the trace event. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: zero out zones on resetKlaus Jensen
The zoned command set specification states that "All logical blocks in a zone *shall* be marked as deallocated when [the zone is reset]". Since the device guarantees 0x00 to be read from deallocated blocks we have to issue a pwrite_zeroes since we cannot be sure that a discard will do anything. But typically, this will be achieved with an efficient unmap/discard operation. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
2021-02-08hw/block/nvme: Support Zone Descriptor ExtensionsDmitry Fomichev
Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zoned.descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Introduce max active and open zone limitsDmitry Fomichev
Add two module properties, "zoned.max_active" and "zoned.max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Support Zoned Namespace Command SetDmitry Fomichev
The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add support for Namespace TypesNiklas Cassel
Define the structures and constants required to implement Namespace Types support. Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Add Commands Supported and Effects logDmitry Fomichev
This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. All incoming admin and i/o commands are now only processed if their corresponding support bits are set in this log. This provides an easy way to control what commands to support and what not to depending on set CC.CSS. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()Dmitry Fomichev
Move write processing to nvme_do_write() that now handles both WRITE and WRITE ZEROES. Both nvme_write() and nvme_write_zeroes() become inline helper functions. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: Separate read and write handlersDmitry Fomichev
The majority of code in nvme_rw() is becoming read- or write-specific. Move these parts to two separate handlers, nvme_read() and nvme_write() to make the code more readable and to remove multiple is_write checks that has been present in the i/o path. This is a refactoring patch, no change in functionality. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08hw/block/nvme: add compare commandGollu Appalanaidu
Add the Compare command. This implementation uses a bounce buffer to read in the data from storage and then compare with the host supplied buffer. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> [k.jensen: rebased] Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08hw/block/nvme: add dulbe supportKlaus Jensen
Add support for reporting the Deallocated or Unwritten Logical Block Error (DULBE). Rely on the block status flags reported by the block layer and consider any block with the BDRV_BLOCK_ZERO flag to be deallocated. Multiple factors affect when a Write Zeroes command result in deallocation of blocks. * the underlying file system block size * the blockdev format * the 'discard' and 'logical_block_size' parameters format | discard | wz (512B) wz (4KiB) wz (64KiB) ----------------------------------------------------- qcow2 ignore n n y qcow2 unmap n n y raw ignore n y y raw unmap n y y So, this works best with an image in raw format and 4KiB LBAs, since holes can then be punched on a per-block basis (this assumes a file system with a 4kb block size, YMMV). A qcow2 image, uses a cluster size of 64KiB by default and blocks will only be marked deallocated if a full cluster is zeroed or discarded. However, this *is* consistent with the spec since Write Zeroes "should" deallocate the block if the Deallocate attribute is set and "may" deallocate if the Deallocate attribute is not set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is always set). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: fix prp mapping status codesGollu Appalanaidu
Address 0 is not an invalid address. Remove those invalikd checks. Unaligned PRP2 and PRP list entries should result in Invalid PRP Offset status code and not Invalid Field. Fix that. See NVMe Express v1.3d, Section 4.3 ("Physical Region Page Entry and List"). Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add trace event for requests with non-zero status codeKlaus Jensen
If a command results in a non-zero status code, trace it. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add nsid to get/setfeat trace eventsKlaus Jensen
Include the namespace id in the pci_nvme_{get,set}feat trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: validate command set selectedKeith Busch
Fail to start the controller if the user requests a command set that the controller does not support. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: remove pointless rw indirectionKeith Busch
The code switches on the opcode to invoke a function specific to that opcode. There's no point in consolidating back to a common function that just switches on that same opcode without any actual common code. Restore the opcode specific behavior without going back through another level of switches. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2020-10-27hw/block/nvme: support multiple namespacesKlaus Jensen
This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2020-10-27hw/block/nvme: add support for scatter gather listsKlaus Jensen
For now, support the Data Block, Segment and Last Segment descriptor types. See NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: refactor aio submissionKlaus Jensen
This pulls block layer aio submission/completion to common functions. For completions, additionally map an AIO error to the Unrecovered Read and Write Fault status codes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: add symbolic command name to trace eventsKlaus Jensen
Add the symbolic command name to the pci_nvme_{io,admin}_cmd and pci_nvme_rw trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
2020-10-27hw/block/nvme: handle dma errorsKlaus Jensen
Handling DMA errors gracefully is required for the device to pass the block/011 test ("disable PCI device while doing I/O") in the blktests suite. With this patch the device sets the Controller Fatal Status bit in the CSTS register when failing to read from a submission queue or writing to a completion queue; expecting the host to reset the controller. If DMA errors occur at any other point in the execution of the command (say, while mapping the PRPs), the command is aborted with a Data Transfer Error status code. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>