aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-06-15Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer core and image format patches # gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (25 commits) block: Fix reopen flag inheritance block: Add BlockDriverState.inherits_from block: Add list of children to BlockDriverState queue.h: Add QLIST_FIX_HEAD_PTR() block: Drain requests before swapping nodes in bdrv_swap() block: Move flag inheritance to bdrv_open_inherit() block: Use QemuOpts in bdrv_open_common() block: Use macro for cache option names vmdk: Use bdrv_open_image() quorum: Use bdrv_open_image() check-qdict: Test cases for new functions qdict: Add qdict_{set,copy}_default() qdict: Add qdict_array_entries() iotests: Add tests for overriding BDRV_O_PROTOCOL block: driver should override flags in bdrv_open() block: Change bitmap truncate conditional to assertion block: record new size in bdrv_dirty_bitmap_truncate raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size vmdk: Use vmdk_find_index_in_cluster everywhere vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Fri Jun 12 15:57:47 2015 BST using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/block-pull-request: qemu-iotests: expand test 093 to support group throttling throttle: Update throttle infrastructure copyright throttle: add the name of the ThrottleGroup to BlockDeviceInfo throttle: acquire the ThrottleGroup lock in bdrv_swap() throttle: Add throttle group support throttle: Add throttle group infrastructure tests throttle: Add throttle group infrastructure throttle: Extract timers from ThrottleState into a separate structure raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Revert "iothread: release iothread around aio_poll" Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12block: Fix reopen flag inheritanceKevin Wolf
When reopening an image, the block layer already takes care to reopen bs->file as well with recalculated inherited flags. The same must happen for any other child (most notably missing before this patch: backing files). If bs->file (or any other child) didn't originally inherit from bs, e.g. because it was created separately and then only referenced, it must not inherit flags on reopen either, so check the inherited_from field before propagation the reopen down. VMDK already reopened its extents manually; this code can now be dropped. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12block: Add BlockDriverState.inherits_fromKevin Wolf
Currently, the block layer assumes that any block node can have only one parent, and if it has a parent, that it inherits some options/flags from this parent. This is not true any more: With references used in block device creation, a single node can be used by multiple parents, or it can be created separately and not inherit flags from any parent. To handle reopens correctly, a node must know from which parent it inherited options. This patch adds the information to BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Add list of children to BlockDriverStateKevin Wolf
This allows iterating over all children of a given BDS, not only including bs->file and bs->backing_hd, but also driver-specific ones like VMDK extents or Quorum children. For bdrv_swap(), the list of children of the swapped BDS stays at that BDS (because that's where the pointers stay as well). The list head moves and pointers to it must be fixed up therefore. The list of children in the parent of the swapped BDS is not affected by the swap. The contents of the BDS objects is swapped, so the existing pointer in the parent automatically points to the newly swapped in BDS. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12queue.h: Add QLIST_FIX_HEAD_PTR()Kevin Wolf
If the head of a list has been moved to a different memory location, the le_prev link in the first list entry has to be fixed up. Provide a macro that implements this fixup. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Drain requests before swapping nodes in bdrv_swap()Kevin Wolf
bdrv_swap() requires that there are no requests in flight on either of the two devices. The request coroutine would work on the wrong BlockDriverState object (with bs->opaque even being interpreted as a different type potentially) and all sorts of bad things would result from this. The currently existing callers mostly ensure that there is no I/O pending on nodes that are swapped. In detail, this is: 1. Live snapshots. This goes through qmp_transaction(), which calls bdrv_drain_all() before doing anything. The command is executed synchronously, so no new I/O can be issued concurrently. 2. snapshot=on in bdrv_open(). We're in the middle of opening the image (both the original image and its temporary overlay), so there can't be any I/O in flight yet. 3. Mirroring. bdrv_drain() is already used on the source device so that the mirror doesn't miss anything. However, the main loop runs between that and the bdrv_swap() (which is actually a bug, being addressed in another series), so there is a small window in which new I/O might be issued that would be in flight during bdrv_swap(). It is safer to just drain the request queue of both devices in bdrv_swap() instead of relying on callers to do the right thing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Move flag inheritance to bdrv_open_inherit()Kevin Wolf
Instead of letting every caller of bdrv_open() determine the right flags for its child node manually and pass them to the function, pass the parent node and the role of the newly opened child (like backing file, protocol layer, etc.). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Use QemuOpts in bdrv_open_common()Kevin Wolf
Instead of manually parsing options and then deleting them from the options QDict, just use QemuOpts like most other places that deal with block device options. More options will be added there and then QemuOpts is a lot more manageable than open-coding everything. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Use macro for cache option namesKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-06-12vmdk: Use bdrv_open_image()Kevin Wolf
Besides standardising on a single interface for opening child nodes, this patch allows the user to specify options to individual extent nodes. Overriding file names isn't possible with this yet, so it's of limited usefulness, but still a step forward. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>
2015-06-12quorum: Use bdrv_open_image()Kevin Wolf
Besides standardising on a single interface for opening child nodes, this simplifies the .bdrv_open() implementation of the quorum block driver by using block layer functionality for handling BlockdevRefs. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-06-12check-qdict: Test cases for new functionsKevin Wolf
This adds test cases for the following new QDict functions: * qdict_array_entries() * qdict_set_default_str() * qdict_copy_default() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12qdict: Add qdict_{set,copy}_default()Kevin Wolf
In the block layer functions that determine options for a child block device, it's a common pattern to either copy options from the parent's options or to set a default string if the option isn't explicitly set yet for the child. Provide convenience functions so that it becomes a one-liner for each option. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12qdict: Add qdict_array_entries()Kevin Wolf
This counts the entries in a flattened array in a QDict without actually splitting the QDict into a QList. bdrv_open_image() doesn't take a QList, but rather a QDict and a key prefix string, so this is more convenient for block drivers which have a dynamically sized list of child nodes (e.g. Quorum) and are to be converted to using bdrv_open_image() as the standard interface for opening child nodes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Fri Jun 12 13:57:20 2015 BST using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/net-pull-request: qmp/hmp: add rocker device support rocker: bring link up/down on PHY enable/disable rocker: update tests using hw-derived interface names rocker: Add support for phys name iohandler: Change return type of qemu_set_fd_handler to "void" event-notifier: Always return 0 for posix implementation xen_backend: Remove unused error handling of qemu_set_fd_handler oss: Remove unused error handling of qemu_set_fd_handler alsaaudio: Remove unused error handling of qemu_set_fd_handler main-loop: Drop qemu_set_fd_handler2 Change qemu_set_fd_handler2(..., NULL, ...) to qemu_set_fd_handler tap: Drop tap_can_send net/socket: Drop net_socket_can_send netmap: Drop netmap_can_send l2tpv3: Drop l2tpv3_can_send stubs: Add qemu_set_fd_handler Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12iotests: Add tests for overriding BDRV_O_PROTOCOLMax Reitz
This adds tests for overriding the qemu-internal BDRV_O_PROTOCOL flag by explicitly specifying a block driver. As one test must be run over the NBD protocol while the other must not, this patch adds two separate iotests. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12block: driver should override flags in bdrv_open()Max Reitz
The BDRV_O_PROTOCOL flag should have an impact only if no driver is specified explicitly. Therefore, if bdrv_open() is called with an explicit block driver argument (either through the options QDict or through the drv parameter) and that block driver is a protocol block driver, BDRV_O_PROTOCOL should be set; if it is a format block driver, BDRV_O_PROTOCOL should be unset. While there was code to unset the flag in case a format block driver has been selected, it only followed the bdrv_fill_options() function call whereas the flag in fact needs to be adjusted before it is used there. With that change, BDRV_O_PROTOCOL will always be set if the BDS should be a protocol driver; if the driver has been specified explicitly, the new code will set it; and bdrv_fill_options() will only "probe" a protocol driver if BDRV_O_PROTOCOL is set. The probing after bdrv_fill_options() cannot select a protocol driver. Thus, bdrv_open_image() to open BDS.file is never called if a protocol BDS is about to be created. With that change in turn it is impossible to call bdrv_open_common() with a protocol drv and file != NULL, which allows us to remove the bdrv_swap() call. This change breaks a test case in qemu-iotest 051: "-drive file=t.qcow2,file.driver=qcow2" now works because the explicitly specified "qcow2" overrides the BDRV_O_PROTOCOL which is automatically set for the "file" BDS (and the filename is just passed down). Therefore, this patch removes that test case. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12block: Change bitmap truncate conditional to assertionJohn Snow
This is an artifact of an older version that had both all-bitmap and single-bitmap truncate functions, and some info got lost in the shuffle. Bitmaps can only be frozen during a backup operation, and a backup operation should prevent a resize operation, so just assert that this cannot happen. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12block: record new size in bdrv_dirty_bitmap_truncateJohn Snow
ce1ffea8 neglected to update the BdrvDirtyBitmap structure itself for internal consistency. It's currently not an issue, but for migration and persistence series this will cause headaches. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12raw-posix: Fix .bdrv_co_get_block_status() for unaligned image sizeKevin Wolf
Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Cole Robinson <crobinso@redhat.com>
2015-06-12vmdk: Use vmdk_find_index_in_cluster everywhereFam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_statusFam Zheng
It has the similar issue with b1649fae49a8. Since the calculation is repeated for a few times already, introduce a function so it can be reused. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12qcow2: Add DEFAULT_L2_CACHE_CLUSTERSMax Reitz
If a relatively large cluster size is chosen, the default of 1 MB L2 cache is not really appropriate. In this case, unless overridden by the user, the default cache size should not be determined by its size in bytes but by the number of L2 tables (clusters) it is supposed to contain. Note that without this patch, MIN_L2_CACHE_SIZE will effectively take over the same role. However, providing space for just two L2 tables is not enough to be the default. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12iotests: qcow2 COW with minimal L2 cache sizeMax Reitz
This adds a test case to test 103 for performing a COW operation in a qcow2 image using an L2 cache with minimal size (which should be at least two clusters so the COW can access both source and destination simultaneously). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12qcow2: Set MIN_L2_CACHE_SIZE to 2Max Reitz
The L2 cache must cover at least two L2 tables, because during COW two L2 tables are accessed simultaneously. Reported-by: Alexander Graf <agraf@suse.de> Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Tested-by: Alexander Graf <agraf@suse.de> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12qemu-iotests: Fix 128 if sudo requiredFam Zheng
If passwordless "sudo" works, use it in the qemu-io cmd. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12iotests: remove assertIsNotNone callJohn Snow
RHEL6 doesn't have Python 2.7, so replace this call with assertNotEqual(x, None) which will work just as well. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12Merge remote-tracking branch 'remotes/aurel/tags/pull-sh4-next-20150612' ↵Peter Maydell
into staging sh4 linux-user cpu and hwcap misc optimizations and cleanup convert r2d to new MMIO accessor style # gpg: Signature made Fri Jun 12 11:28:43 2015 BST using RSA key ID 1DDD8C9B # gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>" # gpg: aka "Aurelien Jarno <aurelien@jarno.fr>" # gpg: aka "Aurelien Jarno <aurel32@debian.org>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 7746 2642 A9EF 94FD 0F77 196D BA9C 7806 1DDD 8C9B * remotes/aurel/tags/pull-sh4-next-20150612: target-sh4: remove dead code target-sh4: factorize fmov implementation target-sh4: split out Q and M from of SR and optimize div1 target-sh4: optimize negc using add2 and sub2 target-sh4: optimize subc using sub2 target-sh4: optimize addc using add2 target-sh4: Split out T from SR target-sh4: use bit number for SR constants sh4/r2d: convert to new MMIO accessor style linux-user: Add HWCAP for SH4 linux-user: Default sh4 to sh7785 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12qemu-iotests: expand test 093 to support group throttlingAlberto Garcia
This patch improves the test by attaching a different number of drives to the VM and putting them in the same throttling group. The test verifies that the I/O is evenly distributed among all members of the group, and that the limits are enforced. By default the test is repeated 3 times with 1, 2 and 3 drives, but the maximum number of simultaneous drives is configurable. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 513df1da5c658878191b579ebcddd985adcd4122.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Update throttle infrastructure copyrightAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 07dcd4ed02f0110b13b3140f477b761b8bb8e270.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: add the name of the ThrottleGroup to BlockDeviceInfoAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 172df91f09c69c6f0440a697bbd1b3f95b077ee4.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: acquire the ThrottleGroup lock in bdrv_swap()Alberto Garcia
bdrv_swap() touches the fields of a BlockDriverState that are protected by the ThrottleGroup lock. Although those fields end up in their original place, they are temporarily swapped in the process, so there's a chance that an operation on a member of the same group happening on a different thread can try to use them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: d92dc40d7c4f1fc5cda5cbbf4ffb7a4670b79d17.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group supportAlberto Garcia
The throttle group support use a cooperative round robin scheduling algorithm. The principles of the algorithm are simple: - Each BDS of the group is used as a token in a circular way. - The active BDS computes if a wait must be done and arms the right timer. - If a wait must be done the token timer will be armed so the token will become the next active BDS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: f0082a86f3ac01c46170f7eafe2101a92e8fde39.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group infrastructure testsAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: ba7b9dc7fca43efbb31d5f3aad91a8dbdbea635b.1433779731.git.berto@igalia.com Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group infrastructureAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Extract timers from ThrottleState into a separate structureBenoît Canet
Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple aio context. Timers are tied to their aio context so they must go out of the ThrottleState structure. This commit paves the way for each bs of a common ThrottleState to have its own timer. Signed-off-by: Benoit Canet <benoit.canet@nodalink.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12raw-posix: Fix .bdrv_co_get_block_status() for unaligned image sizeKevin Wolf
Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1433840108-9996-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12Revert "iothread: release iothread around aio_poll"Stefan Hajnoczi
This reverts commit a0710f7995f914e3044e5899bd8ff6c43c62f916. In qemu-devel email message <556DBF87.2020908@de.ibm.com>, Christian Borntraeger writes: Having many guests all with a kernel/ramdisk (via -kernel) and several null block devices will result in hangs. All hanging guests are in partition detection code waiting for an I/O to return so very early maybe even the first I/O. Reverting that commit "fixes" the hangs. Reverting this commit for the 2.4 release. More time is needed to investigate and correct this patch. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12qmp/hmp: add rocker device supportScott Feldman
Add QMP/HMP support for rocker devices. This is mostly for debugging purposes to see inside the device's tables and port configurations. Some examples: (qemu) info rocker sw1 name: sw1 id: 0x0000013512005452 ports: 4 (qemu) info rocker-ports sw1 ena/ speed/ auto port link duplex neg? sw1.1 up 10G FD No sw1.2 up 10G FD No sw1.3 !ena 10G FD No sw1.4 !ena 10G FD No (qemu) info rocker-of-dpa-flows sw1 prio tbl hits key(mask) --> actions 2 60 pport 1 vlan 1 LLDP src 00:02:00:00:02:00 dst 01:80:c2:00:00:0e 2 60 pport 1 vlan 1 ARP src 00:02:00:00:02:00 dst 00:02:00:00:03:00 2 60 pport 2 vlan 2 IPv6 src 00:02:00:00:03:00 dst 33:33:ff:00:00:02 proto 58 3 50 vlan 2 dst 33:33:ff:00:00:02 --> write group 0x32000001 goto tbl 60 2 60 pport 2 vlan 2 IPv6 src 00:02:00:00:03:00 dst 33:33:ff:00:03:00 proto 58 3 50 1 vlan 2 dst 33:33:ff:00:03:00 --> write group 0x32000001 goto tbl 60 2 60 pport 2 vlan 2 ARP src 00:02:00:00:03:00 dst 00:02:00:00:02:00 3 50 2 vlan 2 dst 00:02:00:00:02:00 --> write group 0x02000001 goto tbl 60 2 60 1 pport 2 vlan 2 IP src 00:02:00:00:03:00 dst 00:02:00:00:02:00 proto 1 3 50 2 vlan 1 dst 00:02:00:00:03:00 --> write group 0x01000002 goto tbl 60 2 60 1 pport 1 vlan 1 IP src 00:02:00:00:02:00 dst 00:02:00:00:03:00 proto 1 2 60 pport 1 vlan 1 IPv6 src 00:02:00:00:02:00 dst 33:33:ff:00:00:01 proto 58 3 50 vlan 1 dst 33:33:ff:00:00:01 --> write group 0x31000000 goto tbl 60 2 60 pport 1 vlan 1 IPv6 src 00:02:00:00:02:00 dst 33:33:ff:00:02:00 proto 58 3 50 1 vlan 1 dst 33:33:ff:00:02:00 --> write group 0x31000000 goto tbl 60 1 60 173 pport 2 vlan 2 LLDP src <any> dst 01:80:c2:00:00:0e --> write group 0x02000000 1 60 6 pport 2 vlan 2 IPv6 src <any> dst <any> --> write group 0x02000000 1 60 174 pport 1 vlan 1 LLDP src <any> dst 01:80:c2:00:00:0e --> write group 0x01000000 1 60 174 pport 2 vlan 2 IP src <any> dst <any> --> write group 0x02000000 1 60 6 pport 1 vlan 1 IPv6 src <any> dst <any> --> write group 0x01000000 1 60 181 pport 2 vlan 2 ARP src <any> dst <any> --> write group 0x02000000 1 10 715 pport 2 --> apply new vlan 2 goto tbl 20 1 60 177 pport 1 vlan 1 ARP src <any> dst <any> --> write group 0x01000000 1 60 174 pport 1 vlan 1 IP src <any> dst <any> --> write group 0x01000000 1 10 717 pport 1 --> apply new vlan 1 goto tbl 20 1 0 1432 pport 0(0xffff) --> goto tbl 10 (qemu) info rocker-of-dpa-groups sw1 id (decode) --> buckets 0x32000001 (type L2 multicast vlan 2 index 1) --> groups [0x02000001,0x02000000] 0x02000001 (type L2 interface vlan 2 pport 1) --> pop vlan out pport 1 0x01000002 (type L2 interface vlan 1 pport 2) --> pop vlan out pport 2 0x02000000 (type L2 interface vlan 2 pport 0) --> pop vlan out pport 0 0x01000000 (type L2 interface vlan 1 pport 0) --> pop vlan out pport 0 0x31000000 (type L2 multicast vlan 1 index 0) --> groups [0x01000002,0x01000000] [Added "query-" prefixes to rocker.json commands as suggested by Eric Blake <eblake@redhat.com>. --Stefan] Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Message-id: 1433985681-56138-5-git-send-email-sfeldma@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12rocker: bring link up/down on PHY enable/disableScott Feldman
When the OS driver enables/disables the port, go ahead and set the port's link status to up/down in response to the change. This more closely emulates real hardware when the PHY for the port is brought up/down and the PHY negotiates carrier (link status) with link partner. In the case of qemu, the virtual rocker device can't really do link negotiation with the link partner as that requires signally over a physical medium (the wire), so just pretend the negotiation was successful and bring the link up when the port is enabled. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1433985681-56138-4-git-send-email-sfeldma@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12rocker: update tests using hw-derived interface namesScott Feldman
With previous patch to support phy name attribute for each port, the OS can name port interfaces using the hw-derived name. So update rocker tests to use the new hw-derived interface names. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1433985681-56138-3-git-send-email-sfeldma@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12rocker: Add support for phys nameDavid Ahern
Add ROCKER_TLV_CMD_PORT_SETTINGS_PHYS_NAME to port settings. This attribute exports the port name to the guest OS allowing it to name interfaces with sensible defaults. Mostly done by Scott for phys_id support; adapted to phys_name by David. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Message-id: 1433985681-56138-2-git-send-email-sfeldma@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12iohandler: Change return type of qemu_set_fd_handler to "void"Fam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-14-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12event-notifier: Always return 0 for posix implementationFam Zheng
qemu_set_fd_handler cannot fail, let's always return 0. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-13-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12xen_backend: Remove unused error handling of qemu_set_fd_handlerFam Zheng
The function cannot fail, so the check is superfluous. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-12-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12oss: Remove unused error handling of qemu_set_fd_handlerFam Zheng
The function cannot fail, so the check is superfluous. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-11-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12alsaaudio: Remove unused error handling of qemu_set_fd_handlerFam Zheng
The function cannot fail, so the check is superfluous. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-10-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12main-loop: Drop qemu_set_fd_handler2Fam Zheng
All users are converted to qemu_set_fd_handler now, drop qemu_set_fd_handler2 and IOHandlerRecord.fd_read_poll. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-9-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12Change qemu_set_fd_handler2(..., NULL, ...) to qemu_set_fd_handlerFam Zheng
Done with following Coccinelle semantic patch, plus manual cosmetic changes in net/*.c. @@ expression E1, E2, E3, E4; @@ - qemu_set_fd_handler2(E1, NULL, E2, E3, E4); + qemu_set_fd_handler(E1, E2, E3, E4); Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-8-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>