aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2015-07-07spapr: Support ibm, lrdr-capacity device tree propertyBharata B Rao
Add support for ibm,lrdr-capacity since this is needed by the guest kernel to know about the possible hot-pluggable CPUs and Memory. With this, pseries kernels will start reporting correct maxcpus in /sys/devices/system/cpu/possible. Also define the minimum hotpluggable memory size as 256MB. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [agraf: Fix compile error on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Consider max_cpus during xics initializationBharata B Rao
Use max_cpus instead of smp_cpus when intializating xics system. Also report max_cpus in ibm,interrupt-server-ranges device tree property of interrupt controller node. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07Revert "hw/ppc/spapr_pci.c: Avoid functions not in glib 2.12 ↵Markus Armbruster
(g_hash_table_iter_*)" Since we now require GLib 2.22+ (commit f40685c), we don't have to work around lack of g_hash_table_iter_init() & friends anymore. This reverts commit f8833a37c0c6b22ddd57b45e48cfb0f97dbd5af4. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_iommu: translate sPAPRTCEAccess to IOMMUAccessFlagsGreg Kurz
The fact that these enums have matching values is pure coincidence. We actually need to translate from the PAPR definition to the QEMU one. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_iommu: drop erroneous check in h_put_tce_indirect()Greg Kurz
The tce_list variable is not a TCE but the address to a TCE: we shouldn't clear permission bits as we do now. And this is dead code anyway since we check tce_list is 4K aligned a few lines above. This patch doesn't fix any bug, it is only code cleanup. Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_pci: set device node unit address as hexNikunj A Dadhania
Device node names should encode the unit address as hex, while the code was encodind it as integers. Also, use FDT_NAME_MAX macro for allocating and composing the name. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_pci: encode class code including Prog IF registerNikunj A Dadhania
Current code missed the Prog IF register. All Class Code, Subclass, and Prog IF registers are needed to identify the accurate device type. For example: USB controllers use the PROG IF for denoting: USB FullSpeed, HighSpeed or SuperSpeed. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr_pci: encode missing 64-bit memory address spaceNikunj A Dadhania
The properties reg/assigned-resources need to encode 64-bit memory address space as part of phys.hi dword. 00 if configuration space 01 if IO region, 10 if 32-bit MEM region 11 if 64-bit MEM region Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Add sPAPRMachineClassDavid Gibson
Currently although we have an sPAPRMachineState descended from MachineState we don't have an sPAPRMAchineClass descended from MachineClass. So far it hasn't been needed, but several upcoming features are going to want it, so this patch creates a stub implementation. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Remove obsolete entry_point field from sPAPRMachineStateDavid Gibson
The sPAPRMachineState structure includes an entry_point field containing the initial PC value for starting the machine, even though this always has the value 0x100. I think this is a hangover from very early versions which bypassed the firmware when using -kernel. In any case it has no function now, so remove it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Remove obsolete ram_limit field from sPAPRMachineStateDavid Gibson
The ram_limit field was imported from sPAPREnvironment where it predates the machine's ram size being available generically from machine->ram_size. Worse, the existing code was inconsistent about where it got the ram size from. Sometimes it used spapr->ram_limit, sometimes the global 'ram_size' and sometimes a local 'ram_size' masking the global. This cleans up the code to consistently use machine->ram_size, eliminating spapr->ram_limit in the process. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: Merge sPAPREnvironment into sPAPRMachineStateDavid Gibson
The code for -machine pseries maintains a global sPAPREnvironment structure which keeps track of general state information about the guest platform. This predates the existence of the MachineState structure, but performs basically the same function. Now that we have the generic MachineState, fold sPAPREnvironment into sPAPRMachineState, the pseries specific subclass of MachineState. This is mostly a matter of search and replace, although a few places which relied on the global spapr variable are changed to find the structure via qdev_get_machine(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07spapr: ensure we have at least one XICS serverGreg Kurz
XICS needs to know the upper value for cpu_index as it is used to compute the number of servers: smp_cpus * kvmppc_smt_threads() / smp_threads When passing -smp cpus=1,threads=9 on a POWER8 host, we end up with: 1 * 8 / 9 = 0 ... which leads to an assertion in both emulated: Number of servers needs to be greater 0 Aborted (core dumped) ... and in-kernel XICS: xics_kvm_realize: Assertion `icp->nr_servers' failed. Aborted (core dumped) With this patch, we are sure that nr_servers > 0. Passing the same bogus -smp option then leads to: qemu-system-ppc64: Cannot support more than 8 threads on PPC with KVM ... which is a lot more explicit than the XICS errors. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07macio: remove nonexistent interrupt on pin 1Cormac O'Brien
The current macio implementation declares an interrupt that doesn't appear to exist in the hardware or any other emulator implementation. OpenBIOS detects this interrupt and generates an 'interrupts' property in the macio device tree entry. Mac OS 9 halts boot when it detects this interrupt, so it has been removed to permit further progress in the boot process. Signed-off-by: Cormac O'Brien <i.am.cormac.obrien@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150706.0' ↵Peter Maydell
into staging VFIO updates for 2.4-rc0 - "real" host page size API (Peter Crosthwaite) - platform device irqfd support (Eric Auger) - spapr container disconnect fix (Alexey Kardashevskiy) - quirk for broken Chelsio hardware (Gabriel Laupre) - coverity fix (Paolo Bonzini) # gpg: Signature made Mon Jul 6 19:23:49 2015 BST using RSA key ID 3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" # gpg: aka "Alex Williamson <alex@shazbot.org>" # gpg: aka "Alex Williamson <alwillia@redhat.com>" # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" * remotes/awilliam/tags/vfio-update-20150706.0: vfio/pci : Add pba_offset PCI quirk for Chelsio T5 devices vfio: Unregister IOMMU notifiers when container is destroyed hw/vfio/platform: add irqfd support kvm: some fixes to kvm_resamplefds_allowed sysbus: add irq_routing_notifier intc: arm_gic_kvm: set the qemu_irq/gsi mapping kvm-all.c: add qemu_irq/gsi hash table and utility routines kvm: rename kvm_irqchip_[add,remove]_irqfd_notifier with gsi suffix vfio: cpu: Use "real" page size API cpu-all: complete "real" host page size API vfio: fix return type of pread Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: kvm-all.c
2015-07-06vfio/pci : Add pba_offset PCI quirk for Chelsio T5 devicesGabriel Laupre
Fix pba_offset initialization value for Chelsio T5 Virtual Function device. The T5 hardware has a bug in it where it reports a Pending Interrupt Bit Array Offset of 0x8000 for its SR-IOV Virtual Functions instead of the 0x1000 that the hardware actually uses internally. As the hardware doesn't return the correct pba_offset value, add a quirk to instead return a hardcoded value of 0x1000 when a Chelsio T5 VF device is detected. This bug has been fixed in the Chelsio's next chip series T6 but there are no plans to respin the T5 ASIC for this bug. It is just documented in the T5 Errata and left it at that. Signed-off-by: Gabriel Laupre <glaupre@chelsio.com> Reviewed-by: Bandan Das <bsd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06vfio: Unregister IOMMU notifiers when container is destroyedAlexey Kardashevskiy
On systems with guest visible IOMMU, adding a new memory region onto PCI bus calls vfio_listener_region_add() for every DMA window. This installs a notifier for IOMMU memory regions. The notifier is supposed to be removed vfio_listener_region_del(), however in the case of mixed PHB (emulated + VFIO devices) when last VFIO device is unplugged and container gets destroyed, all existing DMA windows stay alive altogether with the notifiers which are on the linked list which head was in the destroyed container. This unregisters IOMMU memory region notifier when a container is destroyed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06hw/vfio/platform: add irqfd supportEric Auger
This patch aims at optimizing IRQ handling using irqfd framework. Instead of handling the eventfds on user-side they are handled on kernel side using - the KVM irqfd framework, - the VFIO driver virqfd framework. the virtual IRQ completion is trapped at interrupt controller This removes the need for fast/slow path swap. Overall this brings significant performance improvements. Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com> Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Acked-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06sysbus: add irq_routing_notifierEric Auger
Add a new connect_irq_notifier notifier in the SysBusDeviceClass. This notifier, if populated, is called after sysbus_connect_irq. This mechanism is used to setup VFIO signaling once VFIO platform devices get attached to their platform bus, on a machine init done notifier. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06intc: arm_gic_kvm: set the qemu_irq/gsi mappingEric Auger
The arm_gic_kvm now calls kvm_irqchip_set_qemuirq_gsi to build the hash table storing qemu_irq/gsi mappings. From that point on irqfd can be setup directly from the qemu_irq using kvm_irqchip_add_irqfd_notifier. Signed-off-by: Eric Auger <eric.auger@linaro.org> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06kvm: rename kvm_irqchip_[add,remove]_irqfd_notifier with gsi suffixEric Auger
Anticipating for the introduction of new add/remove functions taking a qemu_irq parameter, let's rename existing ones with a gsi suffix. Signed-off-by: Eric Auger <eric.auger@linaro.org> Tested-by: Vikram Sethi <vikrams@codeaurora.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06vfio: cpu: Use "real" page size APIPeter Crosthwaite
This is system level code, and should only depend on the host page size, not the target page size. Note that HOST_PAGE_SIZE is misleadingly lead and is really aligning to both host and target page size. Hence it's replacement with REAL_HOST_PAGE_SIZE. Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06vfio: fix return type of preadPaolo Bonzini
size_t is an unsigned type, thus the error case is never reached in the below call to pread. If bytes is negative, it will be seen as a very high positive value. Spotted by Coverity. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06pc: add SMM propertyPaolo Bonzini
The property can take values on, off or auto. The default is "off" for KVM and pre-2.4 machines, otherwise "auto" (which makes it available on TCG or on new-enough kernels). Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06ich9: add smm_enabled field and argumentsPaolo Bonzini
Q35's ACPI device is hard-coding SMM availability to KVM. Place the logic where the board is created instead, so that it will be possible to override it. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06pc_piix: rename kvm_enabled to smm_enabledPaolo Bonzini
We will enable SMM even if KVM is in use. Rename the field and arguments. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06piix4/ich9: do not raise SMI on ACPI enable/disable commandsPaolo Bonzini
These commands are handled entirely by QEMU. Do not raise an SMI when they happen, because Windows (at least 2008r2) expects these commands to work and (depending on the value of APMC_EN at startup) the firmware might not have installed an SMI handler. When this happens (e.g. the kernel supports SMIs, or you are using TCG, but you have used "-machine smm=off") RIP is moved to 0x38000 where there is no code to execute. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
* more of Peter Crosthwaite's multiarch preparation patches * unlocked MMIO support in KVM * support for compilation with ICC # gpg: Signature made Mon Jul 6 13:59:20 2015 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: exec: skip MMIO regions correctly in cpu_physical_memory_write_rom_internal Stop including qemu-common.h in memory.h kvm: Switch to unlocked MMIO acpi: mark PMTIMER as unlocked kvm: Switch to unlocked PIO kvm: First step to push iothread lock out of inner run loop memory: let address_space_rw/ld*/st* run outside the BQL exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* memory: Add global-locking property to memory regions main-loop: introduce qemu_mutex_iothread_locked main-loop: use qemu_mutex_lock_iothread consistently Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES cpu-defs: Move out TB_JMP defines include/exec: Move tb hash functions out include/exec: Move standard exceptions to cpu-all.h cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg memory_mapping: Rework cpu related includes cutils: allow compilation with icc qemu-common: add VEC_OR macro Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-06arm_mptimer: Respect IT bit stateDmitry Osipenko
The timer should fire the interrupt only if the IT (interrupt enable) bit state of the control register is enabled. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-06arm_mptimer: Fix timer shutdown and mode changeDmitry Osipenko
The running timer can't be stopped because timer control code just doesn't handle disabling the timer. Fix it by deleting the timer if the enable bit is cleared. The timer won't start periodic ticking if a ONE-SHOT -> PERIODIC mode change happens after a one-shot tick was completed. Fix it by re-starting ticking if the timer isn't ticking right now. To avoid code churning, these two fixes are squashed in one commit. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-06hw/intc/arm_gic_common.c: Reset all registersPeter Maydell
The arm_gic_common reset function was missing reset code for several of the GIC's state fields: * bpr[] * abpr[] * priority1[] * priority2[] * sgi_pending[] * irq_target[] (SMP configurations only) These probably went unnoticed because most guests will either never touch them, or will write to them in the process of configuring the GIC before enabling interrupts. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1435602345-32210-1-git-send-email-peter.maydell@linaro.org Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2015-07-06Fix interval interrupt of cadence ttc when timer is in decrement modeJohannes Schlatow
The interval interrupt is not set if the timer is in decrement mode. This is because x >=0 and x < interval after leaving the while-loop. Signed-off-by: Johannes Schlatow <schlatow@ida.ing.tu-bs.de> Message-id: 20150630135821.51f3b4fd@johanness-latitude Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-05Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into stagingPeter Maydell
# gpg: Signature made Sat Jul 4 07:06:08 2015 BST using RSA key ID AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/ide-pull-request: (35 commits) ahci: fix sdb fis semantics qtest/ahci: halted ncq migration test ahci: Do not map cmd_fis to generate response ahci: ncq migration ahci: add get_cmd_header helper ahci: add cmd header to ncq transfer state qtest/ahci: halted NCQ test ahci: correct ncq sector count ahci: correct types in NCQTransferState ahci: add rwerror=stop support for ncq ahci: factor ncq_finish out of ncq_cb ahci: refactor process_ncq_command ahci: assert is_ncq for process_ncq ahci: stash ncq command ide: add limit to .prepare_buf() qtest/ahci: ncq migration test qtest/ahci: simple ncq data test libqos/ahci: Force all NCQ commands to be LBA48 libqos/ahci: set the NCQ tag on command_commit libqos/ahci: adjust expected NCQ interrupts ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-04ahci: fix sdb fis semanticsJohn Snow
There are two things to fix here: The first one is subtle: the PxSACT register in the AHCI HBA has different semantics from the field it is shadowing, the ACT field in the Set Device Bits FIS. In the HBA register, PxSACT acts as a bitfield indicating outstanding NCQ commands where a set bit indicates a pending NCQ operation. The FIS field however operates as an RWC register update to PxSACT, where a set bit indicates a *successfully* completed command. Correct the FIS semantics. At the same time, move the "clear finished" action to the SDB FIS generation instead of the register read to mimick how the other shadow registers work, which always just report the last reported value from a FIS, and not the most current values which may not have been reported by a FIS yet. Lastly and more simply, SATA 3.2 section 13.6.4.2 (and later sections) all specify that the Interrupt bit for the SDB FIS should always be set to one for NCQ commands. That's currently the only time we generate this FIS, so set it on all the time. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-16-git-send-email-jsnow@redhat.com
2015-07-04ahci: Do not map cmd_fis to generate responseJohn Snow
The Register D2H FIS should copy the current values of the registers instead of just parroting back the same values the guest sent back to it. In this case, the SECTOR COUNT variables are actually not generally meaningful in terms of standard commands (See ATA8-AC3 Section 9.2 Normal Outputs), so it actually probably doesn't matter what we put in here. Meanwhile, we do need to use the Register update FIS from the NCQ pathways (in error cases), so getting rid of references to cur_cmd here is a win for AHCI concurrency. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-14-git-send-email-jsnow@redhat.com
2015-07-04ahci: ncq migrationJohn Snow
Migrate the NCQ queue. This is solely for the benefit of halted commands, since anything else should have completed and had any relevant status flushed to the HBA registers already. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-13-git-send-email-jsnow@redhat.com
2015-07-04ahci: add get_cmd_header helperJohn Snow
cur_cmd is an internal bookmark that points to the current AHCI Command Header being processed by the AHCI state machine. With NCQ needing to occasionally rely on some of the same AHCI helpers, we cannot use cur_cmd and will need to grab explicit pointers instead. In an attempt to begin relying on the cur_cmd pointer less, add a helper to let us specifically get the pointer to the command header of particular interest. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-12-git-send-email-jsnow@redhat.com
2015-07-04ahci: add cmd header to ncq transfer stateJohn Snow
While the rest of the AHCI device can rely on a single bookmarked pointer for the AHCI Command Header currently being processed, NCQ is asynchronous and may have many commands in flight simultaneously. Add a cmdh pointer to the ncq_tfs object and make the sglist prepare function take an AHCICmdHeader pointer so we can be explicit about where we'd like to build SGlists from. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-11-git-send-email-jsnow@redhat.com
2015-07-04ahci: correct ncq sector countJohn Snow
uint16_t isn't enough to hold the real sector count, since a value of zero implies a full 64K sectors, so we need a uint32_t here. We *could* cheat and pretend that this value is 0-based and fit it in a uint16_t, but I'd rather waste 2 bytes instead of a future dev's 10 minutes when they forget to +1/-1 accordingly somewhere. See SATA 3.2, section 13.6.4.1 "READ FPDMA QUEUED". Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-9-git-send-email-jsnow@redhat.com
2015-07-04ahci: correct types in NCQTransferStateJohn Snow
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-8-git-send-email-jsnow@redhat.com
2015-07-04ahci: add rwerror=stop support for ncqJohn Snow
Handle NCQ failures for cases where we want to halt the VM on IO errors. Upon a VM state change, retry the halted NCQ commands. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-7-git-send-email-jsnow@redhat.com
2015-07-04ahci: factor ncq_finish out of ncq_cbJohn Snow
When we add werror=stop or rerror=stop support to NCQ, we'll want to take a codepath where we don't actually complete the command, so factor that out into a new routine. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-6-git-send-email-jsnow@redhat.com
2015-07-04ahci: refactor process_ncq_commandJohn Snow
Split off execute_ncq_command so that we can call it separately later if we desire. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-5-git-send-email-jsnow@redhat.com
2015-07-04ahci: assert is_ncq for process_ncqJohn Snow
We already checked this in the handle_cmd phase, so just change this to an assertion and simplify the error logic. (Also, fix the switch indent, because checkpatch.pl yelled.) ((Sorry for churn.)) Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-4-git-send-email-jsnow@redhat.com
2015-07-04ahci: stash ncq commandJohn Snow
For migration and werror=stop/rerror=stop resume purposes, it will be convenient to have the command handy inside of ncq_tfs. Eventually, we'd like to avoid reading from the FIS entirely after the initial read, so this is a byte (hah!) sized step in that direction. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-3-git-send-email-jsnow@redhat.com
2015-07-04ide: add limit to .prepare_buf()John Snow
prepare_buf should not always grab as many descriptors as it can, sometimes it should self-limit. For example, an NCQ transfer of 1 sector with a PRDT that describes 4GiB of data should not copy 4GiB of data, it should just transfer that first 512 bytes. PIO is not affected, because the dma_buf_rw dma helpers already have a byte limit built-in to them, but DMA/NCQ will exhaust the entire list regardless of requested size. AHCI 1.3 specifies in section 6.1.6 Command List Underflow that NCQ is not required to detect underflow conditions. Non-NCQ pathways signal underflow by writing to the PRDBC field, which will already occur by writing the actual transferred byte count to the PRDBC, signaling the underflow. Our NCQ pathways aren't required to detect underflow, but since our DMA backend uses the size of the PRDT to determine the size of the transer, if our PRDT is bigger than the transaction (the underflow condition) it doesn't cost us anything to detect it and truncate the PRDT. This is a recoverable error and is not signaled to the guest, in either NCQ or normal DMA cases. For BMDMA, the existing pathways should see no guest-visible difference, but any bytes described in the overage will no longer be transferred before indicating to the guest that there was an underflow. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435767578-32743-2-git-send-email-jsnow@redhat.com
2015-07-04ahci: ncq sector count correctionJohn Snow
This value should not be size-corrected, 0 sectors does not imply 1 sector(s). This is just debug information, but it's misleading! Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435016308-6150-8-git-send-email-jsnow@redhat.com
2015-07-04ahci: add ncq debug checksJohn Snow
Most of the time, these bits can be safely ignored. For the purposes of debugging however, it's nice to know that they're not being used. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435016308-6150-7-git-send-email-jsnow@redhat.com
2015-07-04ahci: separate prdtl from optsJohn Snow
There's no real reason to have it bundled together, and this way is a little nicer to follow if you have the AHCI spec pulled up. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435016308-6150-6-git-send-email-jsnow@redhat.com
2015-07-04ahci: check for ncq prdtl overflowJohn Snow
Don't attempt the NCQ transfer if the PRDT we were given is not big enough to perform the entire transfer. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435016308-6150-5-git-send-email-jsnow@redhat.com