aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-05-12tcg: Rename tb_jmp_remove() to tb_remove_from_jmp_list()Sergey Fedorov
tb_jmp_remove() was only used to remove the TB from a list of all TBs jumping to the same TB which is n-th jump destination of the given TB. Put a comment briefly describing the function behavior and rename it to better reflect its purpose. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Clarify thread safety check in tb_add_jump()Sergey Fedorov
The check is to make sure that another thread hasn't already done the same while we were outside of tb_lock. Mention this in a comment. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Init TB's direct jumps before making it visibleSergey Fedorov
Initialize TB's direct jump list data fields and reset the jumps before tb_link_page() puts it into the physical hash table and the physical page list. So TB is completely initialized before it becomes visible. This is pure rearrangement of code to a more suitable place, though it could be a preparation for relaxing the locking scheme in future. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Rearrange tb_link_page() to avoid forward declarationSergey Fedorov
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Use uintptr_t type for jmp_list_{next|first} fields of TBSergey Fedorov
These fields do not contain pure pointers to a TranslationBlock structure. So uintptr_t is the most appropriate type for them. Also put some asserts to assure that the two least significant bits of the pointer are always zero before assigning it to jmp_list_first. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Clean up direct block chaining data fieldsSergey Fedorov
Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12translate-all: Adjust 256mb testing for mips64Richard Henderson
Make sure we preserve the high 32-bits when masking for mips64. Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12translate-all: add missing munmap of the code_gen guard page for MIPSEmilio G. Cota
Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1461283314-2353-2-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12translate-all: remove redundant setting of tcg_ctx.code_gen_buffer_sizeEmilio G. Cota
The setting of tcg_ctx.code_gen_buffer_size is done by the only caller of size_code_gen_buffer(), which is code_gen_alloc(): $ git grep size_code_gen_buffer translate-all.c:static inline size_t size_code_gen_buffer(size_t tb_size) translate-all.c: tcg_ctx.code_gen_buffer_size = size_code_gen_buffer(tb_size); Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1461283314-2353-1-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg: Note requirement on atomic direct jump patchingSergey Fedorov
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/mips: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in MIPS is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-11-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net> [rth: Merged the deposit32 followup.] [rth: Merged the following followup.] Message-Id: <1462210518-26522-1-git-send-email-sergey.fedorov@linaro.org>
2016-05-12tcg/sparc: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in SPARC is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-10-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/aarch64: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in AArch64 is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-9-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/arm: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in ARM is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/s390: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in s390 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/i386: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in i386 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. tcg_out_nopn() implementation: Suggested-by: Richard Henderson <rth@twiddle.net>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tcg/ppc: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in PPC is atomic by: * limiting translation buffer size in 32-bit mode to be addressable by Branch I-form instruction; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-5-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tci: Make direct jump patching thread-safeSergey Fedorov
Ensure direct jump patching in TCI is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() to load/store the address. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12include/qemu/osdep.h: Add macros for pointer alignmentSergey Fedorov
These macros provide a convenient way to n-byte align pointers up and down and check if a pointer is n-byte aligned. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-3-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12include/qemu/osdep.h: Add a macro to check for alignmentSergey Fedorov
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-2-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12tb: consistently use uint32_t for tb->flagsEmilio G. Cota
We are inconsistent with the type of tb->flags: usage varies loosely between int and uint64_t. Settle to uint32_t everywhere, which is superior to both: at least one target (aarch64) uses the most significant bit in the u32, and uint64_t is wasteful. Compile-tested for all targets. Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <rth@twiddle.net> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches # gpg: Signature made Thu 12 May 2016 14:37:05 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (69 commits) qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 block: Inactivate all children block: Drop superfluous invalidating bs->file from drivers block: Invalidate all children nbd: Simplify client FUA handling block: Honor BDRV_REQ_FUA during write_zeroes ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20160512' into staging target-arm queue: * blizzard, omap_lcdc: code cleanup to remove DEPTH != 32 dead code * QOMify various ARM devices * bcm2835_property: use cached values when querying framebuffer * hw/arm/nseries: don't allocate large sized array on the stack * fix LPAE descriptor address masking (only visible for EL2) * fix stage 2 exec permission handling for AArch32 * first part of supporting syndrome info for data aborts to EL2 * virt: NUMA support * work towards i.MX6 support * avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes # gpg: Signature made Thu 12 May 2016 14:29:14 BST using RSA key ID 14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" * remotes/pmaydell/tags/pull-target-arm-20160512: (43 commits) hw/arm: QOM'ify versatilepb.c hw/arm: QOM'ify strongarm.c hw/arm: QOM'ify stellaris.c hw/arm: QOM'ify spitz.c hw/arm: QOM'ify pxa2xx_pic.c hw/arm: QOM'ify pxa2xx.c hw/arm: QOM'ify integratorcp.c hw/arm: QOM'ify highbank.c hw/arm: QOM'ify armv7m.c target-arm: Avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes hw/display/blizzard: Remove blizzard_template.h hw/display/blizzard: Expand out macros i.MX: Add sabrelite i.MX6 emulation. i.MX: Add i.MX6 SOC implementation. i.MX: Add the Freescale SPI Controller FIFO: Add a FIFO32 implementation i.MX: Add i.MX6 System Reset Controller device. ARM: Factor out ARM on/off PSCI control functions ACPI: Virt: Generate SRAT table ACPI: move acpi_build_srat_memory to common place ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-05-12' into ↵Peter Maydell
staging QAPI patches for 2016-05-12 # gpg: Signature made Thu 12 May 2016 08:49:04 BST using RSA key ID EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" * remotes/armbru/tags/pull-qapi-2016-05-12: (23 commits) qapi: Change visit_type_FOO() to no longer return partial objects qapi: Simplify semantics of visit_next_list() qapi: Fix string input visitor handling of invalid list tests/string-input-visitor: Add negative integer tests qapi: Split visit_end_struct() into pieces qmp: Tighten output visitor rules qmp: Don't reuse qmp visitor after grabbing output spapr_drc: Expose 'null' in qom-get when there is no fdt qmp: Support explicit null during visits qapi: Add visit_type_null() visitor tests: Add check-qnull qapi: Document visitor interfaces, add assertions qmp-input: Refactor when list is advanced qmp-input: Require struct push to visit members of top dict qom: Wrap prop visit in visit_start_struct qapi-commands: Wrap argument visit in visit_start_struct qmp-input: Don't consume input when checking has_member qapi: Use strict QMP input visitor in more places qapi: Consolidate QMP input visitor creation qmp-input: Clean up stack handling ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-05-12' ↵Kevin Wolf
into queue-block Block patches for 2.7 # gpg: Signature made Thu May 12 15:34:13 2016 CEST using RSA key ID E838ACAD # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" * mreitz/tags/pull-block-for-kevin-2016-05-12: qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160511-1' into ↵Peter Maydell
staging usb: misc fixes # gpg: Signature made Wed 11 May 2016 12:18:25 BST using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-usb-20160511-1: usb: Support compilation without poll.h usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain usb:xhci: no DMA on HC reset Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12qemu-iotests: iotests: fail hard if not run via "check"Sascha Silbe
Running an iotests-based Python test directly might appear to work, but may fail in subtle ways and is insecure: - It creates files with predictable file names in a world-writable location (/var/tmp). - Tests expect the environment to be set up by check. E.g. 041 and 055 may take the wrong code paths if QEMU_DEFAULT_MACHINE is not set. This can lead to false negatives. Instead fail hard and tell the user we want to be run via "check". The actual environment expected by the tests is currently only defined by the implementation of "check". We use two of the environment variables set by "check" as indication of whether we're being run via "check". Anyone writing their own test runner (replacing "check") will need to replicate the full environment (in a broader sense, not just environment variables) provided by "check" anyway, including setting the two environment variables we check. Whereas a regular developer just trying to invoke the tests usually won't have both of these defined in their environment so we can catch their mistake and give out useful advice. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Reviewed-by: Bo Tu <tubo@linux.vnet.ibm.com> Message-id: 1461094442-16014-1-git-send-email-silbe@linux.vnet.ibm.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: enable testing of LUKS driver with block I/O testsDaniel P. Berrange
This adds support for testing the LUKS driver with the block I/O test framework. cd tests/qemu-io-tests ./check -luks A handful of test cases are modified to work with luks - 004 - whitelist luks format - 012 - use TEST_IMG_FILE instead of TEST_IMG for file ops - 048 - use TEST_IMG_FILE instead of TEST_IMG for file ops. don't assume extended image contents is all zeros, explicitly initialize with zeros Make file size smaller to avoid having to decrypt 1 GB of data. - 052 - don't assume initial image contents is all zeros, explicitly initialize with zeros - 100 - don't assume initial image contents is all zeros, explicitly initialize with zeros With this patch applied, the results are as follows: Passed: 001 002 003 004 005 008 009 010 011 012 021 032 043 047 048 049 052 087 100 134 143 Failed: 033 120 140 145 Skipped: 007 013 014 015 017 018 019 020 022 023 024 025 026 027 028 029 030 031 034 035 036 037 038 039 040 041 042 043 044 045 046 047 049 050 051 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 101 102 103 104 105 107 108 109 110 111 112 113 114 115 116 117 118 119 121 122 123 124 128 129 130 131 132 133 134 135 136 137 138 139 141 142 144 146 148 150 152 The reasons for the failed tests are: - 033 - needs adapting to use image opts syntax with blkdebug and test image in order to correctly set align property - 120 - needs adapting to use correct -drive syntax for luks - 140 - needs adapting to use correct -drive syntax for luks - 145 - needs adapting to use correct -drive syntax for luks The vast majority of skipped tests are exercising code that is qcow2 specific, though a couple could probably be usefully enabled for luks too. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-4-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: add support for encryption secrets in block I/O testsDaniel P. Berrange
The LUKS block driver tests will require the ability to specify encryption secrets with block devices. This requires using the --object argument to qemu-img/qemu-io to create a 'secret' object. When the IMGKEYSECRET env variable is set, it provides the password to be associated with a secret called 'keysec0' The _qemu_img_wrapper function isn't modified as that needs to cope with differing syntax for subcommands, so can't be made to use the image opts syntax unconditionally. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-3-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: add support for --image-opts in block I/O testsDaniel P. Berrange
Currently all block tests use the traditional syntax for images just specifying a filename. To support the LUKS driver without resorting to JSON, the tests need to be able to use the new --image-opts argument to qemu-img and qemu-io. This introduces a new env variable IMGOPTSSYNTAX. If this is set to 'true', then qemu-img/qemu-io should use --image-opts. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-2-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Add 'write -z -u' to test MAY_UNMAP flagEric Blake
Make it easier to control whether the BDRV_REQ_MAY_UNMAP flag can be passed through a write_zeroes command, by adding the '-u' flag to qemu-io 'write -z' and 'aio_write -z'. To be useful, the device has to be opened with BDRV_O_UNMAP (done by default in qemu-io, but can be made explicit with '-d unmap'). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1462677405-4752-7-git-send-email-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Add 'write -f' to test FUA flagEric Blake
Make it easier to test block drivers with BDRV_REQ_FUA in .supported_write_flags, by adding the '-f' flag to qemu-io to conditionally pass the flag through to specific writes ('write', 'write -z', 'writev', 'aio_write', 'aio_write -z'). You'll want to use 'qemu-io -t none' to actually make -f useful (as otherwise, the default writethrough mode automatically sets the FUA bit on every write). Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-6-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Allow unaligned access by defaultEric Blake
There's no reason to require the user to specify a flag just so they can pass in unaligned numbers. Keep 'read -p' and 'write -p' as no-ops so that I don't have to hunt down and update all users of qemu-io, but otherwise make their behavior default as 'read' and 'write'. Also fix 'write -z', 'readv', 'writev', 'writev', 'aio_read', 'aio_write', and 'aio_write -z'. For now, 'read -b', 'write -b', and 'write -c' still require alignment (and 'multiwrite', but that's slated to die soon). qemu-iotest 23 is updated to match, as the only test that was previously explicitly expecting an error on an unaligned request. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-5-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Use bool for command line flagsEric Blake
We require a C99 compiler; let's use it to express what we really mean. (Yes, we now have an instance of 'if (bool + bool + bool > 1)', which, although semantically valid C, looks ugly; it gets cleaned up later.) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1462677405-4752-4-git-send-email-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Make 'open' subcommand more like command lineEric Blake
The command line defaults to BDRV_O_UNMAP, but can use -d to reset it. Meanwhile, the 'open' subcommand was defaulting to no discards, with no way to set it. The command line has both -n and -tMODE to set a variety of cache modes, but the 'open' subcommand had only -n. The 'open' subcommand had no way to set BDRV_O_NATIVE_AIO. Note that the 'reopen' subcommand uses '-c' where the command line and 'open' use -t. Making that consistent would be a separate patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-3-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-io: Add missing option documentationEric Blake
The Usage: summary is missing several options, but rather than having to maintain it, it's simpler to just state [OPTIONS], since the options are spelled out below. Commit 499afa2 added --image-opts, but forgot to document it in --help. Likewise for commit 9e8f183 and -d/--discard. Commit e3aff4f6 put "-o/--offset" in the long opts, but it has never been honored. Add a note that '-n' is short for '-t none'. Commit 9a2d77ad killed the -C option, but forgot to undocument it for the 'open' subcommand. Finally, commit 10d9d75 removed -g/--growable, but forgot to cull it from the valid short options. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-2-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qmp: add monitor command to add/remove a childWen Congyang
The new QMP command name is x-blockdev-change. It's just for adding/removing quorum's child now, and doesn't support all kinds of children, all kinds of operations, nor all block drivers. So it is experimental now. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1462865799-19402-4-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12quorum: implement bdrv_add_child() and bdrv_del_child()Wen Congyang
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Message-id: 1462865799-19402-3-git-send-email-xiecl.fnst@cn.fujitsu.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12Add new block driver interface to add/delete a BDS's childWen Congyang
In some cases, we want to take a quorum child offline, and take another child online. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1462865799-19402-2-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12qemu-img: check block status of backing file when converting.Ren Kimura
When converting images, check the block status of its backing file chain to avoid needlessly reading zeros. Signed-off-by: Ren Kimura <rkx1209dev@gmail.com> Message-id: 1461773098-20356-1-git-send-email-rkx1209dev@gmail.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12iotests: fix the redirection order in 083Wei Jiangang
It should redirect stdout to /dev/null first, then redirect stderr to whatever stdout currently points at. Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com> Message-id: 1461665601-14908-1-git-send-email-weijg.fnst@cn.fujitsu.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: Inactivate all childrenFam Zheng
Currently we only inactivate the top BDS. Actually bdrv_inactivate should be the opposite of bdrv_invalidate_cache. Recurse into the whole subtree instead. Because a node may have multiple parents, and because once BDRV_O_INACTIVE is set for a node, further writes are not allowed, we cannot interleave flag settings and .bdrv_inactivate calls (that may submit write to other nodes in a graph) within a single pass. Therefore two passes are used here. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Drop superfluous invalidating bs->file from driversFam Zheng
Now they are invalidated by the block layer, so it's not necessary to do this in block drivers' implementations of .bdrv_invalidate_cache. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Invalidate all childrenFam Zheng
Currently we only recurse to bs->file, which will miss the children in quorum and VMDK. Recurse into the whole subtree to avoid that. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12nbd: Simplify client FUA handlingEric Blake
Now that the block layer honors per-bds FUA support, we don't have to duplicate the fallback flush at the NBD layer. The static function nbd_co_writev_flags() is no longer needed, and the driver can just directly use nbd_client_co_writev(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Honor BDRV_REQ_FUA during write_zeroesEric Blake
The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but back then, it did not matter because there was no FUA flag. It became observable when commit 93f5e6d8 paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit 9eeb6dd, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targetted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is do to a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing). Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Make supported_write_flags a per-bds propertyEric Blake
Pre-patch, .supported_write_flags lives at the driver level, which means we are blindly declaring that all block devices using a given driver will either equally support FUA, or that we need a fallback at the block layer. But there are drivers where FUA support is a per-block decision: the NBD block driver is dependent on the remote server advertising NBD_FLAG_SEND_FUA (and has fallback code to duplicate the flush that the block layer would do if NBD had not set .supported_write_flags); and the iscsi block driver is dependent on the mode sense bits advertised by the underlying device (and is currently silently ignoring FUA requests if the underlying device does not support FUA). The fix is to make supported flags as a per-BDS option, set during .bdrv_open(). This patch moves the variable and fixes NBD and iscsi to set it only conditionally; later patches will then further simplify the NBD driver to quit duplicating work done at the block layer, as well as tackle the fact that SCSI does not support FUA semantics on WRITESAME(10/16) but only on WRITE(10/16). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12qcow2: improve qcow2_co_write_zeroes()Denis V. Lunev
There is a possibility that qcow2_co_write_zeroes() will be called with the partial block. This could be synthetically triggered with qemu-io -c "write -z 32k 4k" and can happen in the real life in qemu-nbd. The latter happens under the following conditions: (1) qemu-nbd is started with --detect-zeroes=on and is connected to the kernel NBD client (2) third party program opens kernel NBD device with O_DIRECT (3) third party program performs write operation with memory buffer not aligned to the page In this case qcow2_co_write_zeroes() is unable to perform the operation and mark entire cluster as zeroed and returns ENOTSUP. Thus the caller switches to non-optimized version and writes real zeroes to the disk. The patch creates a shortcut. If the block is read as zeroes, f.e. if it is unallocated, the request is extended to cover full block. User-visible situation with this block is not changed. Before the patch the block is filled in the image with real zeroes. After that patch the block is marked as zeroed in metadata. Thus any subsequent changes in backing store chain are not affected. Kevin, thank you for a cool suggestion. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Kill unused sector-based blk_* functionsEric Blake
Now that there are no remaining clients, we can drop the sector-based blk_read(), blk_write(), blk_aio_readv(), and blk_aio_writev(). Sadly, there are still remaining sector-based interfaces, such as blk_*discard(), or blk_write_compressed(); those will have to wait for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12qemu-io: Switch to byte-based block accessEric Blake
qemu-io is the last user of several sector-based interfaces. This patch upgrades to the new interfaces under the hood, then deletes the resulting dead code. Note that for maximum back-compat, while the -p option is no longer required to get blk_pread(), it is still needed to allow for unaligned access; this is because qemu-iotest 23 relies on qemu-io rejecting unaligned accesses without -p. A later patch may clean up the interface to be more user-friendly, but it's better to separate what's done under the hood from what the user sees. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>