utils: Improve qemu_strtosz() to have 64 bits of precision

We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, the keyval visitor), and it gets annoying that edge-case testing is impacted by implicit rounding to 53 bits of precision due to parsing with strtod(). As an example posted by Rich Jones: $ nbdkit memory $(( 2**63 - 2**30 )) --run \ 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' write failed: Input/output error because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is out of bounds. It is also worth noting that our existing parser, by virtue of using strtod(), accepts decimal AND hex numbers, even though test-cutils previously lacked any coverage of the latter until the previous patch. We do have existing clients that expect a hex parse to work (for example, iotest 33 using qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 rather than as an invalid octal number, so we know there are no clients that depend on octal. Our use of strtod() also means that "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather than 1843 (if the fraction were 8/10); but as this was not covered in the testsuite, I have no qualms forbidding hex fractions as invalid, so this patch declares that the use of fractions is only supported with decimal input, and enhances the testsuite to document that. Our previous use of strtod() meant that -1 parsed as a negative; now that we parse with strtoull(), negative values can wrap around modulo 2^64, so we have to explicitly check whether the user passed in a '-'; and make it consistent to also reject '-0'. This has the minor effect of treating negative values as EINVAL (with no change to endptr) rather than ERANGE (with endptr advanced to what was parsed), visible in the updated iotest output. We also had no testsuite coverage of "1.1e0k", which happened to parse under strtod() but is unlikely to occur in practice; as long as we are making things more robust, it is easy enough to reject the use of exponents in a strtod parse. The fix is done by breaking the parse into an integer prefix (no loss in precision), rejecting negative values (since we can no longer rely on strtod() to do that), determining if a decimal or hexadecimal parse was intended (with the new restriction that a fractional hex parse is not allowed), and where appropriate, using a floating point fractional parse (where we also scan to reject use of exponents in the fraction). The bulk of the patch is then updates to the testsuite to match our new precision, as well as adding new cases we reject (whether they were rejected or inadvertently accepted before). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210211204438.1184395-3-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
author: Eric Blake <eblake@redhat.com> 2021-02-11 14:44:36 -0600
committer: Eric Blake <eblake@redhat.com> 2021-03-08 13:36:12 -0600
commit: cf923b783efd565787e9ab006fb5608bb2a7297b (patch)
tree: 43d9ed85c389fe427b4663831afddc5d1af9508b /util
parent: 1657ba44b449471c665bc5d358ca33af411710f3 (diff)
1 files changed, 69 insertions, 21 deletions
diff --git a/util/cutils.c b/util/cutils.c
index 70c7d6efbd..189a184859 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -241,52 +241,100 @@ static int64_t suffix_mul(char suffix, int64_t unit)
 }
 
 /*
- * Convert string to bytes, allowing either B/b for bytes, K/k for KB,
- * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned
- * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on
- * other error.
+ * Convert size string to bytes.
+ *
+ * The size parsing supports the following syntaxes
+ * - 12345 - decimal, scale determined by @default_suffix and @unit
+ * - 12345{bBkKmMgGtTpPeE} - decimal, scale determined by suffix and @unit
+ * - 12345.678{kKmMgGtTpPeE} - decimal, scale determined by suffix, and
+ *   fractional portion is truncated to byte
+ * - 0x7fEE - hexadecimal, unit determined by @default_suffix
+ *
+ * The following are intentionally not supported
+ * - octal, such as 08
+ * - fractional hex, such as 0x1.8
+ * - floating point exponents, such as 1e3
+ *
+ * The end pointer will be returned in *end, if not NULL.  If there is
+ * no fraction, the input can be decimal or hexadecimal; if there is a
+ * fraction, then the input must be decimal and there must be a suffix
+ * (possibly by @default_suffix) larger than Byte, and the fractional
+ * portion may suffer from precision loss or rounding.  The input must
+ * be positive.
+ *
+ * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on
+ * other error (with *@end left unchanged).
  */
 static int do_strtosz(const char *nptr, const char **end,
                       const char default_suffix, int64_t unit,
                       uint64_t *result)
 {
     int retval;
-    const char *endptr;
+    const char *endptr, *f;
     unsigned char c;
-    int mul_required = 0;
-    double val, mul, integral, fraction;
+    bool mul_required = false;
+    uint64_t val;
+    int64_t mul;
+    double fraction = 0.0;
 
-    retval = qemu_strtod_finite(nptr, &endptr, &val);
+    /* Parse integral portion as decimal. */
+    retval = qemu_strtou64(nptr, &endptr, 10, &val);
     if (retval) {
         goto out;
     }
-    fraction = modf(val, &integral);
-    if (fraction != 0) {
-        mul_required = 1;
+    if (memchr(nptr, '-', endptr - nptr) != NULL) {
+        endptr = nptr;
+        retval = -EINVAL;
+        goto out;
+    }
+    if (val == 0 && (*endptr == 'x' || *endptr == 'X')) {
+        /* Input looks like hex, reparse, and insist on no fraction. */
+        retval = qemu_strtou64(nptr, &endptr, 16, &val);
+        if (retval) {
+            goto out;
+        }
+        if (*endptr == '.') {
+            endptr = nptr;
+            retval = -EINVAL;
+            goto out;
+        }
+    } else if (*endptr == '.') {
+        /*
+         * Input looks like a fraction.  Make sure even 1.k works
+         * without fractional digits.  If we see an exponent, treat
+         * the entire input as invalid instead.
+         */
+        f = endptr;
+        retval = qemu_strtod_finite(f, &endptr, &fraction);
+        if (retval) {
+            fraction = 0.0;
+            endptr++;
+        } else if (memchr(f, 'e', endptr - f) || memchr(f, 'E', endptr - f)) {
+            endptr = nptr;
+            retval = -EINVAL;
+            goto out;
+        } else if (fraction != 0) {
+            mul_required = true;
+        }
     }
     c = *endptr;
     mul = suffix_mul(c, unit);
-    if (mul >= 0) {
+    if (mul > 0) {
         endptr++;
     } else {
         mul = suffix_mul(default_suffix, unit);
-        assert(mul >= 0);
+        assert(mul > 0);
     }
     if (mul == 1 && mul_required) {
+        endptr = nptr;
         retval = -EINVAL;
         goto out;
     }
-    /*
-     * Values near UINT64_MAX overflow to 2**64 when converting to double
-     * precision.  Compare against the maximum representable double precision
-     * value below 2**64, computed as "the next value after 2**64 (0x1p64) in
-     * the direction of 0".
-     */
-    if ((val * mul > nextafter(0x1p64, 0)) || val < 0) {
+    if (val > (UINT64_MAX - ((uint64_t) (fraction * mul))) / mul) {
         retval = -ERANGE;
         goto out;
     }
-    *result = val * mul;
+    *result = val * mul + (uint64_t) (fraction * mul);
     retval = 0;
 
 out:
author	Eric Blake <eblake@redhat.com>	2021-02-11 14:44:36 -0600
committer	Eric Blake <eblake@redhat.com>	2021-03-08 13:36:12 -0600
commit	cf923b783efd565787e9ab006fb5608bb2a7297b (patch)
tree	43d9ed85c389fe427b4663831afddc5d1af9508b /util
parent	1657ba44b449471c665bc5d358ca33af411710f3 (diff)