Merge bitcoin/bitcoin#30526: doc: Correct uint256 hex string endianness

73e3fa10b4dd63b7767d6b6f270df66971067341 doc + test: Correct uint256 hex string endianness (Hodlinator) Pull request description: This PR is a follow-up to #30436. Only changes test-code and modifies/adds comments. Byte order of hex string representation was wrongfully documented as little-endian, but are in fact closer to "big-endian" (endianness is a memory-order concept rather than a numeric concept). `[arith_]uint256` both store their data in arrays with little-endian byte order (`arith_uint256` has host byte order within each `uint32_t` element). **uint256_tests.cpp** - Avoid using variable from the left side of the condition in the right side. Credits to @maflcko: https://github.com/bitcoin/bitcoin/pull/30436#discussion_r1688273553 **setup_common.cpp** - Skip needless ArithToUint256-conversion. Credits to @stickies-v: https://github.com/bitcoin/bitcoin/pull/30436#discussion_r1688621638 --- <details> <summary> ## Logical reasoning for endianness </summary> 1. Comparing an `arith_uint256` (`base_uint<256>`) to a `uint64_t` compares the beginning of the array, and verifies the remaining elements are zero. ```C++ template <unsigned int BITS> bool base_uint<BITS>::EqualTo(uint64_t b) const { for (int i = WIDTH - 1; i >= 2; i--) { if (pn[i]) return false; } if (pn[1] != (b >> 32)) return false; if (pn[0] != (b & 0xfffffffful)) return false; return true; } ``` ...that is consistent with little endian ordering of the array. 2. They have the same endianness (but `arith_*` has host-ordering of each `uint32_t` element): ```C++ arith_uint256 UintToArith256(const uint256 &a) { arith_uint256 b; for(int x=0; x<b.WIDTH; ++x) b.pn[x] = ReadLE32(a.begin() + x*4); return b; } ``` ### String conversions The reversal of order which happens when converting hex-strings <=> uint256 means strings are actually closer to big-endian, see the end of `base_blob<BITS>::SetHexDeprecated`: ```C++ unsigned char* p1 = m_data.data(); unsigned char* pend = p1 + WIDTH; while (digits > 0 && p1 < pend) { *p1 = ::HexDigit(trimmed[--digits]); if (digits > 0) { *p1 |= ((unsigned char)::HexDigit(trimmed[--digits]) << 4); p1++; } } ``` Same reversal here: ```C++ template <unsigned int BITS> std::string base_blob<BITS>::GetHex() const { uint8_t m_data_rev[WIDTH]; for (int i = 0; i < WIDTH; ++i) { m_data_rev[i] = m_data[WIDTH - 1 - i]; } return HexStr(m_data_rev); } ``` It now makes sense to me that `SetHexDeprecated`, upon receiving a shorter hex string that requires zero-padding, would pad as if the missing hex chars where towards the end of the little-endian byte array, as they are the most significant bytes. "Big-endian" string representation is also consistent with the case where `SetHexDeprecated` receives too many hex digits and discards the leftmost ones, as a form of integer narrowing takes place. ### How I got it wrong in #30436 Previously I used the less than (`<`) comparison to prove endianness, but for `uint256` it uses `memcmp` and thereby gives priority to the *lower* bytes at the beginning of the array. ```C++ constexpr int Compare(const base_blob& other) const { return std::memcmp(m_data.data(), other.m_data.data(), WIDTH); } ``` `arith_uint256` is different in that it begins by comparing the bytes from the end, as it is using little endian representation, where the bytes toward the end are more significant. ```C++ template <unsigned int BITS> int base_uint<BITS>::CompareTo(const base_uint<BITS>& b) const { for (int i = WIDTH - 1; i >= 0; i--) { if (pn[i] < b.pn[i]) return -1; if (pn[i] > b.pn[i]) return 1; } return 0; } ``` (The commit documents that `base_blob::Compare()` is doing lexicographic ordering unlike the `arith_*`-variant which is doing numeric ordering). </details> ACKs for top commit: paplorinc: ACK 73e3fa10b4dd63b7767d6b6f270df66971067341 ryanofsky: Code review ACK 73e3fa10b4dd63b7767d6b6f270df66971067341 Tree-SHA512: 121630c37ab01aa7f7097f10322ab37da3cbc0696a6bbdbf2bbd6db180dc5938c7ed91003aaa2df7cf4a4106f973f5118ba541b5e077cf3588aa641bbd528f4e
author: Ryan Ofsky <ryan@ofsky.org> 2024-08-04 21:20:39 -0400
committer: Ryan Ofsky <ryan@ofsky.org> 2024-08-04 22:27:10 -0400
commit: 1a7d20509f46f0cd38302da78dba9a0c9bb24226 (patch)
tree: f281bc377a68b2a6ed4a7d1709a4c65776af62e8 /src/test/util
parent: 55d19945efbb58200c37fb8322ed3494bf67f9b1 (diff)
parent: 73e3fa10b4dd63b7767d6b6f270df66971067341 (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/src/test/util/setup_common.cpp b/src/test/util/setup_common.cpp
index abe3d6a505..3f48ea4375 100644
--- a/src/test/util/setup_common.cpp
+++ b/src/test/util/setup_common.cpp
@@ -81,7 +81,7 @@ static FastRandomContext g_insecure_rand_ctx_temp_path;
 
 std::ostream& operator<<(std::ostream& os, const arith_uint256& num)
 {
-    os << ArithToUint256(num).ToString();
+    os << num.ToString();
     return os;
 }
author	Ryan Ofsky <ryan@ofsky.org>	2024-08-04 21:20:39 -0400
committer	Ryan Ofsky <ryan@ofsky.org>	2024-08-04 22:27:10 -0400
commit	1a7d20509f46f0cd38302da78dba9a0c9bb24226 (patch)
tree	f281bc377a68b2a6ed4a7d1709a4c65776af62e8 /src/test/util
parent	55d19945efbb58200c37fb8322ed3494bf67f9b1 (diff)
parent	73e3fa10b4dd63b7767d6b6f270df66971067341 (diff)