qed: do not evict in-use L2 table cache entries

The L2 table cache reduces QED metadata reads that would be required when translating LBAs to offsets into the image file. Since requests execute in parallel it is possible to share an L2 table between multiple requests. There is a potential data corruption issue when an in-use L2 table is evicted from the cache because the following situation occurs: 1. An allocating write performs an update to L2 table "A". 2. Another request needs L2 table "B" and causes table "A" to be evicted. 3. A new read request needs L2 table "A" but it is not cached. As a result the L2 update from #1 can overlap with the L2 fetch from #3. We must avoid doing overlapping I/O requests here since the worst case outcome is that the L2 fetch completes before the L2 update and yields stale data. In that case we would effectively discard the L2 update and lose data clusters! Thanks to Benoît Canet <benoit.canet@gmail.com> for extensive testing and debugging which lead to discovery of this bug. Reported-by: Benoît Canet <benoit.canet@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Tested-by: Benoît Canet <benoit.canet@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> 2012-02-27 13:16:01 +0000
committer: Kevin Wolf <kwolf@redhat.com> 2012-03-12 15:14:06 +0100
commit: 14fe292d86da90b79e2fb56a4986d27346339a00 (patch)
tree: d7a6d4fb4a921d9c8dcf2e680d90d5d72474ae53 /block
parent: d0895d6e390d59d3dc6edab771335adfea263029 (diff)
1 files changed, 18 insertions, 4 deletions
diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
index 02b81a2e33..e9b2aae44d 100644
--- a/block/qed-l2-cache.c
+++ b/block/qed-l2-cache.c
@@ -161,11 +161,25 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
         return;
     }
 
+    /* Evict an unused cache entry so we have space.  If all entries are in use
+     * we can grow the cache temporarily and we try to shrink back down later.
+     */
     if (l2_cache->n_entries >= MAX_L2_CACHE_SIZE) {
-        entry = QTAILQ_FIRST(&l2_cache->entries);
-        QTAILQ_REMOVE(&l2_cache->entries, entry, node);
-        l2_cache->n_entries--;
-        qed_unref_l2_cache_entry(entry);
+        CachedL2Table *next;
+        QTAILQ_FOREACH_SAFE(entry, &l2_cache->entries, node, next) {
+            if (entry->ref > 1) {
+                continue;
+            }
+
+            QTAILQ_REMOVE(&l2_cache->entries, entry, node);
+            l2_cache->n_entries--;
+            qed_unref_l2_cache_entry(entry);
+
+            /* Stop evicting when we've shrunk back to max size */
+            if (l2_cache->n_entries < MAX_L2_CACHE_SIZE) {
+                break;
+            }
+        }
     }
 
     l2_cache->n_entries++;
author	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-02-27 13:16:01 +0000
committer	Kevin Wolf <kwolf@redhat.com>	2012-03-12 15:14:06 +0100
commit	14fe292d86da90b79e2fb56a4986d27346339a00 (patch)
tree	d7a6d4fb4a921d9c8dcf2e680d90d5d72474ae53 /block
parent	d0895d6e390d59d3dc6edab771335adfea263029 (diff)