aboutsummaryrefslogtreecommitdiff
path: root/hw
AgeCommit message (Collapse)Author
2016-02-06ipmi: add get and set SENSOR_TYPE commandsCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06ipmi: introduce a struct ipmi_sdr_compactCédric Le Goater
Currently, sdr attributes are identified using byte offsets and this can be a bit confusing. This patch adds a struct ipmi_sdr_compact conforming to the IPMI specs and replaces byte offsets with names. It also introduces and uses a struct ipmi_sdr_header in sections of the code where no assumption is made on the type of SDR. This leave rooms to potential usage of other types in the future. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06ipmi: fix SDR length valueCédric Le Goater
The IPMI BMC simulator populates the SDR table with a set of initial SDRs. The length of each SDR is taken from the record itself (byte 4) which does not include the size of the header. But, the full length (header + data) is required by the sdr_add_entry() routine. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06ipmi: cleanup error_report messagesCédric Le Goater
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06ipmi: replace *_MAXCMD definesCédric Le Goater
ARRAY_SIZE() is simple to use and removes the need to pre-define the size of the command arrays. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06ipmi: replace goto by a return statementCédric Le Goater
Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to handle hidden errors. Using directly a return statement has the same effect and it removes the fact that 'out' needs to be defined. The code exits in ipmi_sim_handle_command() are a little different from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is handled before making use of it. This might be a bit excessive as a minimum response len is currently 300 bytes and the patch checks that at least 3 are available. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Corey Minyard <cminyard@mvista.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06hw/pci: ensure that only PCI/PCIe bridges can be attached to pxb/pxb-pcie ↵Marcel Apfelbaum
devices PCI devices can't be plugged directly into PCI extra root bridges because their resources can't be computed by firmware before the ACPI tables are loaded. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06hw/pxb: add pxb devices to the bridge categoryMarcel Apfelbaum
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: combine write of an entry into used ringVincenzo Maffione
Fill in an element of the used ring with a single combined access to the guest physical memory, rather than using two separated accesses. This reduces the overhead due to expensive address translation. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <e4a89a767a4a92cbb6bcc551e151487eb36e1722.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: read avail_idx from VQ only when necessaryVincenzo Maffione
The virtqueue_pop() implementation needs to check if the avail ring contains some pending buffers. To perform this check, it is not always necessary to fetch the avail_idx in the VQ memory, which is expensive. This patch introduces a shadow variable tracking avail_idx and modifies virtio_queue_empty() to access avail_idx in physical memory only when necessary. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <b617d6459902773d9f4ab843bfaca764f5af8eda.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: cache used_idx in a VirtQueue fieldVincenzo Maffione
Accessing used_idx in the VQ requires an expensive access to guest physical memory. Before this patch, 3 accesses are normally done for each pop/push/notify call. However, since the used_idx is only written by us, we can track it in our internal data structure. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Message-Id: <3d062ec54e9a7bf9fb325c1fd693564951f2b319.1450218353.git.v.maffione@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: combine the read of a descriptorPaolo Bonzini
Compared to vring, virtio has a performance penalty of 10%. Fix it by combining all the reads for a descriptor in a single address_space_read call. This also simplifies the code nicely. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06vring: slim down allocation of VirtQueueElementsPaolo Bonzini
Build the addresses and s/g lists on the stack, and then copy them to a VirtQueueElement that is just as big as required to contain this particular s/g list. The cost of the copy is minimal compared to that of a large malloc. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: slim down allocation of VirtQueueElementsPaolo Bonzini
Build the addresses and s/g lists on the stack, and then copy them to a VirtQueueElement that is just as big as required to contain this particular s/g list. The cost of the copy is minimal compared to that of a large malloc. When virtqueue_map is used on the destination side of migration or on loadvm, the iovecs have already been split at memory region boundary, so we can just reuse the out_num/in_num we find in the file. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: introduce virtqueue_alloc_elementPaolo Bonzini
Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the VirtQueueElement. For now, virtqueue_pop and vring_pop keep allocating a very large VirtQueueElement. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: introduce qemu_get/put_virtqueue_elementPaolo Bonzini
Move allocation to virtio functions also when loading/saving a VirtQueueElement. This will also let the load/save functions keep backwards compatibility when the VirtQueueElement layout is changed. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06virtio: move allocation to virtqueue_pop/vring_popPaolo Bonzini
The return code of virtqueue_pop/vring_pop is unused except to check for errors or 0. We can thus easily move allocation inside the functions and just return a pointer to the VirtQueueElement. The advantage is that we will be able to allocate only the space that is needed for the actual size of the s/g list instead of the full VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K of memory, and this kind of allocation puts a lot of stress on malloc. By cutting the size by two or three orders of magnitude, malloc can use much more efficient algorithms. The patch is pretty large, but changes to each device are testable more or less independently. Splitting it would mostly add churn. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-04virtio: move VirtQueueElement at the beginning of the structsPaolo Bonzini
The next patch will make virtqueue_pop/vring_pop allocate memory for the VirtQueueElement. In some cases (blk, scsi, gpu) the device wants to extend VirtQueueElement with device-specific fields and, until now, the place of the VirtQueueElement within the containing struct didn't matter. When allocating the entire block in virtqueue_pop/vring_pop, however, the containing struct must basically be a "subclass" of VirtQueueElement, with the VirtQueueElement as the first field. Make that the case for blk and scsi; gpu is already doing it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-04pc: acpi: merge SSDT into DSDTIgor Mammedov
Since both tables are built dynamically now, there is no point in keeping ASL in them in separate tables. So do the same as we do for ARM where we have only DSDT table, i.e. move SSDT ASL into DSDT and drop SSDT altogether. This patch doesn't change moved SSDT ASL in any way, but it opens a way to relatively independently simplify generated ASL on per device/subsystem basis in followup series. It also simplifies bios-tables-test where expected SSDT blobs could be dropped and only DSDT ones have to be maintained. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-04Fix virtio migrationDr. David Alan Gilbert
I misunderstood the vmstate macro definition when I reworked the virtio .get/.put. The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a variable length array (i.e. _type *_field) but we know the length". However it actually specified operation for arrays embedded in the struct (i.e. _type _field[]) since it lacked the VMS_POINTER flag. This caused offset calculation to be completely off, examining and potentially sending random data instead of the VirtQueue content. Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag (so now actually doing what it advertises) and use it in the virtio migration code. Fixes and description as per Sascha's suggestions/debug. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Tested-By: Sascha Silbe <silbe@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com> Fixes: 50e5ae4dc3e4f21e874512f9e87b93b5472d26e0 Fixes: 2cf0148674430b6693c60d42b7eef721bfa9509f Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-03Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' ↵Peter Maydell
into staging # gpg: Signature made Wed 03 Feb 2016 15:47:34 GMT using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/tracing-pull-request: log: add "-d trace:PATTERN" trace: switch default backend to "log" trace: convert stderr backend to log log: move qemu-log.c into util/ directory log: do not unnecessarily include qom/cpu.h trace: add "-trace help" trace: add "-trace enable=..." trace: no need to call trace_backend_init in different branches now trace: split trace_init_file out of trace_init_backends trace: split trace_init_events out of trace_init_backends trace: fix documentation trace: track enabled events in a separate array trace: count number of enabled events Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03virtio-gpu: block any rendering until client (ui) is doneGerd Hoffmann
Wire up gl_block callback, so ui code can request to stop virtio-gpu rendering. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-03virtio-gpu: add support to enable/disable command processingGerd Hoffmann
So we can stop rendering for a while in case we have to. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03virtio-gpu: maintain command queueGerd Hoffmann
We'll go take out the commands we receive out of the virt queue and put them into a linked list, to decouple virtio queue handling from actual command processing. Also move cmd processing to new virtio_gpu_handle_ctrl func, so we can easily kick it from different places. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-03virtio-gpu: fix memory leak in error pathGerd Hoffmann
Found by Coverity Scan, buf not freed on error. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03log: do not unnecessarily include qom/cpu.hPaolo Bonzini
Split the bits that require it to exec/log.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1452174932-28657-8-git-send-email-den@openvz.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-02Merge remote-tracking branch ↵Peter Maydell
'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging Block patches # gpg: Signature made Tue 02 Feb 2016 17:23:44 GMT using RSA key ID E838ACAD # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" * remotes/maxreitz/tags/pull-block-for-peter-2016-02-02: (50 commits) block: qemu-iotests - add test for snapshot, commit, snapshot bug block: set device_list.tqe_prev to NULL on BDS removal iotests: Add "qemu-img map" test for VMDK extents qemu-img: Make MapEntry a QAPI struct qemu-img: In "map", use the returned "file" from bdrv_get_block_status block: Use returned *file in bdrv_co_get_block_status vmdk: Return extent's file in bdrv_get_block_status vmdk: Fix calculation of block status's offset vpc: Assign bs->file->bs to file in vpc_co_get_block_status vdi: Assign bs->file->bs to file in vdi_co_get_block_status sheepdog: Assign bs to file in sd_co_get_block_status qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status parallels: Assign bs->file->bs to file in parallels_co_get_block_status iscsi: Assign bs to file in iscsi_co_get_block_status raw: Assign bs to file in raw_co_get_block_status qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status qcow: Assign bs->file->bs to file in qcow_co_get_block_status block: Add "file" output parameter to block status query functions block: acquire in bdrv_query_image_info iotests: Add test for block jobs and BDS ejection ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160202-1' into ↵Peter Maydell
staging usb: two ehci fixes. # gpg: Signature made Tue 02 Feb 2016 13:12:00 GMT using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-usb-20160202-1: ehci: update irq on reset usb: check page select value while processing iTD Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02virtio-scsi: Catch BDS-BB removal/insertionMax Reitz
Make use of the BDS-BB removal and insertion notifiers to remove or set up, respectively, virtio-scsi's op blockers. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02virtio-blk: Functions for op blocker managementMax Reitz
Put the code for setting up and removing op blockers into an own function, respectively. Then, we can invoke those functions whenever a BDS is removed from an virtio-blk BB or inserted into it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02Revert "hw/block/fdc: Implement tray status"Max Reitz
This reverts the changes that commit 2e1280e8ff95b3145bc6262accc9d447718e5318 applied to hw/block/fdc.c; also, an additional case of drv->media_inserted use has crept in since, which is replaced by a call to blk_is_inserted(). That commit changed tests/fdc-test.c, too, because after it, one less TRAY_MOVED event would be emitted when executing 'change' on an empty drive. However, now, no TRAY_MOVED events will be emitted at all, and the tray_open status returned by query-block will always be false, necessitating (different) changes to tests/fdc-test.c and iotest 118, which is why this patch is not a pure revert of said commit. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 1454096953-31773-4-git-send-email-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-02Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20160202-1' ↵Peter Maydell
into staging audio: Clean up includes # gpg: Signature made Tue 02 Feb 2016 12:58:06 GMT using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-audio-20160202-1: audio: Clean up includes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02Merge remote-tracking branch 'remotes/kraxel/tags/pull-fwcfg-20160202-1' ↵Peter Maydell
into staging nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file # gpg: Signature made Tue 02 Feb 2016 12:54:04 GMT using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-fwcfg-20160202-1: nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02ehci: update irq on resetGerd Hoffmann
After clearing the status register we also have to update the irq line status. Otherwise a irq which happends to be pending at reset time causes a interrupt storm. And the guest can't stop as the status register doesn't indicate any pending interrupt. Both NetBSD and FreeBSD hang on shutdown because of that. Cc: qemu-stable@nongnu.org Reported-by: Andrey Korolyov <andrey@xdel.ru> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 1453203884-4125-1-git-send-email-kraxel@redhat.com
2016-02-02usb: check page select value while processing iTDPrasad J Pandit
While processing isochronous transfer descriptors(iTD), the page select(PG) field value could lead to an OOB read access. Add check to avoid it. Reported-by: Qinghao Tang <luodalongde@gmail.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-id: 1453233406-12165-1-git-send-email-ppandit@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02audio: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453138432-8324-1-git-send-email-peter.maydell@linaro.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02ivshmem: use a single eventfd callback, get rid of CharDriverMarc-André Lureau
Simplify the interrupt handling by having a single callback on irq&msi cases. Remove usage of CharDriver, replace it with qemu_set_fd_handler(). Use event_notifier_test_and_clear() to read the eventfd. Before this patch, ivshmem writes the first byte received to s->intrstatus. But ivshmem_device_spec.txt says "The status register is set to 1 when an interrupt occurs." Fortunately, the byte usually comes from another ivshmem device, and those always write 1. After this commit, follows the specification, set to 1 when an interrupt occurs. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com>
2016-02-02ivshmem: generalize ivshmem_setup_interruptsMarc-André Lureau
Call ivshmem_setup_interrupts() with or without MSI, always allocate msi_vectors that is going to be used in all case in the following patch. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02ivshmem: remove redundant assignment, fix crash with msi=offMarc-André Lureau
Fix crash when msi=false introduced in 660c97ee (msi_vectors is NULL in this case) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02ivshmem: no need for opaque argumentMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02nvme: generate OpenFirmware device path in the "bootorder" fw_cfg fileLaszlo Ersek
Background on QEMU boot indices ------------------------------- Normally, the "bootindex" property is configured for bootable devices with: DEVICE_instance_init() device_add_bootindex_property(..., "bootindex", ...) object_property_add(..., device_get_bootindex, device_set_bootindex, ...) and when the bootindex is set on the QEMU command line, with -device DEVICE,...,bootindex=N the setter that was configured above is invoked: device_set_bootindex() /* parse boot index */ visit_type_int32() /* verify unicity */ check_boot_index() /* store parsed boot index */ ... /* insert device path to boot order */ add_boot_device_path() In the last step, add_boot_device_path() ensures that an OpenFirmware device path will show up in the "bootorder" fw_cfg file, at a position corresponding to the device's boot index. Thus guest firmware (SeaBIOS and OVMF) can try to boot off the device with the right priority. NVMe boot index --------------- In QEMU commit 33739c712982, nvma: ide: add bootindex to qom property the following generic setters / getters: - device_set_bootindex() - device_get_bootindex() were open-coded for NVMe, under the names - nvme_set_bootindex() - nvme_get_bootindex() Plus nvme_instance_init() was added to configure the "bootindex" property manually, designating the open-coded getter & setter, rather than calling device_add_bootindex_property(). Crucially, nvme_set_bootindex() avoided the final add_boot_device_path() call. This fact is spelled out in the message of commit 33739c712982, and it was presumably the entire reason for all of the code duplication. Now, Vladislav filed an RFE for OVMF <https://github.com/tianocore/edk2/issues/48>; OVMF should boot off NVMe devices. It is simple to build edk2's existent NvmExpressDxe driver into OVMF, but the boot order matching logic in OVMF can only handle NVMe if the "bootorder" fw_cfg file includes such devices. Therefore this patch converts the NVMe device model to device_set_bootindex() all the way. Device paths ------------ device_set_bootindex() accepts an optional parameter called "suffix". When present, it is expected to take the form of an OpenFirmware device path node, and it gets appended as last node to the otherwise auto-generated OFW path. For NVMe, the auto-generated part is /pci@i0cf8/pci8086,5845@6[,1] ^ ^ ^ ^ | | PCI slot and (present when nonzero) | | function of the NVMe controller, both hex | "driver name" component, built from PCI vendor & device IDs PCI root at system bus port, PIO to which here we append the suffix /namespace@1,0 ^ ^ | big endian (MSB at lowest address) numeric interpretation | of the 64-bit IEEE Extended Unique Identifier, aka EUI-64, | hex 32-bit NVMe namespace identifier, aka NSID, hex resulting in the OFW device path /pci@i0cf8/pci8086,5845@6[,1]/namespace@1,0 The reason for including the NSID and the EUI-64 is that an NVMe device can in theory produce several different namespaces (distinguished by NSID). Additionally, each of those may (optionally) have an EUI-64 value. For now, QEMU only provides namespace 1. Furthermore, QEMU doesn't even represent the EUI-64 as a standalone field; it is embedded (and left unused) inside the "NvmeIdNs.res30" array, at the last eight bytes. (Which is fine, since EUI-64 can be left zero-filled if unsupported by the device.) Based on the above, we set the "unit address" part of the last ("namespace") node to fixed "1,0". OVMF will then map the above OFW device path to the following UEFI device path fragment, for boot order processing: PciRoot(0x0)/Pci(0x6,0x1)/NVMe(0x1,00-00-00-00-00-00-00-00) ^ ^ ^ ^ ^ ^ | | | | | octets of the EUI-64 in address order | | | | NSID | | | NVMe namespace messaging device path node | PCI slot and function PCI root bridge Cc: Keith Busch <keith.busch@intel.com> (supporter:nvme) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:nvme) Cc: Gonglei <arei.gonglei@huawei.com> Cc: Vladislav Vovchenko <vladislav.vovchenko@sk.com> Cc: Feng Tian <feng.tian@intel.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Kevin O'Connor <kevin@koconnor.net> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Gonglei <arei.gonglei@huawei.com> Acked-by: Keith Busch <keith.busch@intel.com> Tested-by: Vladislav Vovchenko <vladislav.vovchenko@sk.com> Message-id: 1453850483-27511-1-git-send-email-lersek@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-01-30target-ppc: Helper to determine page size information from hpte aloneDavid Gibson
h_enter() in the spapr code needs to know the page size of the HPTE it's about to insert. Unlike other paths that do this, it doesn't have access to the SLB, so at the moment it determines this with some open-coded tests which assume POWER7 or POWER8 page size encodings. To make this more flexible add ppc_hash64_hpte_page_shift_noslb() to determine both the "base" page size per segment, and the individual effective page size from an HPTE alone. This means that the spapr code should now be able to handle any page size listed in the env->sps table. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30target-ppc: Add new TLB invalidate by HPTE call for hash64 MMUsDavid Gibson
When HPTEs are removed or modified by hypercalls on spapr, we need to invalidate the relevant pages in the qemu TLB. Currently we do that by doing some complicated calculations to work out the right encoding for the tlbie instruction, then passing that to ppc_tlb_invalidate_one()... which totally ignores the argument and flushes the whole tlb. Avoid that by adding a new flush-by-hpte helper in mmu-hash64.c. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30target-ppc: Convert mmu-hash{32,64}.[ch] from CPUPPCState to PowerPCCPUDavid Gibson
Like a lot of places these files include a mixture of functions taking both the older CPUPPCState *env and newer PowerPCCPU *cpu. Move a step closer to cleaning this up by standardizing on PowerPCCPU, except for the helper_* functions which are called with the CPUPPCState * from tcg. Callers and some related functions are updated as well, the boundaries of what's changed here are a bit arbitrary. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30uninorth.c: add support for UniNorth kMacRISCPCIAddressSelect (0x48) registerProgrammingkid
Darwin/OS X use the undocumented kMacRISCPCIAddressSelect (0x48) to configure PCI memory space size for mac99 machines. Without this register, warnings similar to below are emitted to the console during boot: AppleMacRiscPCI: bad range 2(80000000:01000000) AppleMacRiscPCI: bad range 2(81000000:00001000) AppleMacRiscPCI: bad range 2(81080000:00080000) Based upon the algorithm in Darwin's AppleMacRiscPCI.cpp driver, set the kMacRISCPCIAddressSelect register so that Darwin considers the PCI memory space to be at 0x80000000 (size 0x10000000) which matches that currently used by QEMU and OpenBIOS. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [commit message and comment revised as suggested by Mark Cave-Ayland] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-30cuda.c: return error for unknown commandsAlyssa Milburn
This avoids MacsBug hanging at startup in the absence of ADB mouse input, by replying with an error (which is also what MOL does) when it sends an unknown command (0x1c). Signed-off-by: Alyssa Milburn <fuzzie@fuzzie.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-30pseries: Allow TCG h_enter to work with hotplugged memoryDavid Gibson
The implementation of the H_ENTER hypercall for PAPR guests needs to enforce correct access attributes on the inserted HPTE. This means determining if the HPTE's real address is a regular RAM address (which requires attributes for coherent access) or an IO address (which requires attributes for cache-inhibited access). At the moment this check is implemented with (raddr < machine->ram_size), but that only handles addresses in the base RAM area, not any hotplugged RAM. This patch corrects the problem with a new helper. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30pseries: Clean up error reporting in htab migration functionsDavid Gibson
The functions for migrating the hash page table on pseries machine type (htab_save_setup() and htab_load()) can report some errors with an explicit fprintf() before returning an appropriate error code. Change some of these to use error_report() instead. htab_save_setup() is omitted for now to avoid conflicts with some other in-progress work. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30pseries: Clean up error reporting in ppc_spapr_init()David Gibson
This function includes a number of explicit fprintf()s for errors. Change these to use error_report() instead. Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since the latter is the more usual idiom in qemu by a large margin. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30pseries: Clean up error handling in xics_system_init()David Gibson
Use the error handling infrastructure to pass an error out from try_create_xics() instead of assuming &error_abort - the caller is in a better position to decide on error handling policy. Also change the error handling from an &error_abort to &error_fatal, since this occurs during the initial machine construction and could be triggered by bad configuration rather than a program error. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Markus Armbruster <armbru@redhat.com>