author | Kevin Wolf <kwolf@redhat.com> | 2017-03-22 22:00:05 +0100 |
---|---|---|
committer | Max Reitz <mreitz@redhat.com> | 2017-03-27 16:53:42 +0200 |
commit | e5bcf967fb80ff9cf4d0c0d643e985ec5ff94e91 (patch) | |
tree | 8c4ebc20c2a2a3a646822f0742e7c1944b1734bb | |
parent | a12a712a7dfbd2e2f4882ef2c90a9b2162166dd7 (diff) |
file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170322210005.16533-1-kwolf@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
-rw-r--r-- | block/file-posix.c | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/block/file-posix.c b/block/file-posix.c
index 53febd3767..beb7a4f728 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -144,6 +144,7 @@ typedef struct BDRVRawState {
     bool has_write_zeroes:1;
     bool discard_zeroes:1;
     bool use_linux_aio:1;
+    bool page_cache_inconsistent:1;
     bool has_fallocate;
     bool needs_alignment;
 } BDRVRawState;
@@ -824,10 +825,31 @@ static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
 
 static ssize_t handle_aiocb_flush(RawPosixAIOData *aiocb)
 {
+    BDRVRawState *s = aiocb->bs->opaque;
     int ret;
 
+    if (s->page_cache_inconsistent) {
+        return -EIO;
+    }
+
     ret = qemu_fdatasync(aiocb->aio_fildes);
     if (ret == -1) {
+        /* There is no clear definition of the semantics of a failing fsync(),
+         * so we may have to assume the worst. The sad truth is that this
+         * assumption is correct for Linux. Some pages are now probably marked
+         * clean in the page cache even though they are inconsistent with the
+         * on-disk contents. The next fdatasync() call would succeed, but no
+         * further writeback attempt will be made. We can't get back to a state
+         * in which we know what is on disk (we would have to rewrite
+         * everything that was touched since the last fdatasync() at least), so
+         * make bdrv_flush() fail permanently. Given that the behaviour isn't
+         * really defined, I have little hope that other OSes are doing better.
+         *
+         * Obviously, this doesn't affect O_DIRECT, which bypasses the page
+         * cache. */
+        if ((s->open_flags & O_DIRECT) == 0) {
+            s->page_cache_inconsistent = true;
+        }
         return -errno;
     }
     return 0;
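
For readers who want to try the "sticky failure" idea outside of QEMU, here is a minimal standalone sketch of the same pattern. It is not QEMU code: struct sticky_file, sticky_flush() and the bypasses_page_cache field are hypothetical names invented for this illustration, with bypasses_page_cache standing in for the (s->open_flags & O_DIRECT) check in the patch. It assumes a Linux-like system where fdatasync() is declared in <unistd.h>.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the relevant fields of BDRVRawState. */
struct sticky_file {
    int fd;
    bool bypasses_page_cache;      /* stands in for the O_DIRECT check */
    bool page_cache_inconsistent;  /* set once, never cleared */
};

/*
 * Flush the file. Returns 0 on success or a negative errno value.
 * Once a flush has failed on a file that goes through the page cache,
 * every later call fails with -EIO without calling fdatasync() again.
 */
static int sticky_flush(struct sticky_file *f)
{
    if (f->page_cache_inconsistent) {
        return -EIO;
    }

    if (fdatasync(f->fd) == -1) {
        int err = errno;  /* save errno before doing anything else */

        /* A later fdatasync() might report success even though the
         * failed pages were never written, so remember the failure. */
        if (!f->bypasses_page_cache) {
            f->page_cache_inconsistent = true;
        }
        return -err;
    }
    return 0;
}
```

Calling sticky_flush() on a descriptor for which fdatasync() fails returns the original error once and -EIO on every later call, unless bypasses_page_cache is set, in which case each call reaches fdatasync() again.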