aboutsummaryrefslogtreecommitdiff
path: root/src/util/strencodings.cpp
AgeCommit message (Collapse)Author
2024-08-23util: remove unused IsHexNumberstickies-v
The relevant unit tests have been incorporated in uint256_tests/from_user_hex in a previous commit.
2024-05-16util: move HexStr and HexDigit from util to cryptoTheCharlatan
Move HexStr and HexDigit functions from util to crypto. The crypto library does not actually use these functions, but the consensus library does. The consensus and util libraries not allowed to depend on each other, but are allowed to depend on the cryto library, so the crypto library is a reasonable put these. The consensus library uses HexStr and HexDigit in script.cpp, transaction.cpp, and uint256.cpp. The util library does not use HexStr but does use HexDigit in strencodings.cpp to parse integers.
2024-03-13Merge bitcoin/bitcoin#29606: refactor: Reserve memory for ToLower/ToUpper ↵Ava Chow
conversions 6f2f4a4d096a3b261258c8cdd96cca532988d1d3 Reserve memory for ToLower/ToUpper conversions (Lőrinc) Pull request description: Similarly to https://github.com/bitcoin/bitcoin/pull/29458, we're preallocating the result string based on the input string's length. The methods were already [covered by tests](https://github.com/bitcoin/bitcoin/blob/master/src/test/util_tests.cpp#L1250-L1276). ACKs for top commit: tdb3: ACK for 6f2f4a4d096a3b261258c8cdd96cca532988d1d3 maflcko: lgtm ACK 6f2f4a4d096a3b261258c8cdd96cca532988d1d3 achow101: ACK 6f2f4a4d096a3b261258c8cdd96cca532988d1d3 Empact: Code Review ACK https://github.com/bitcoin/bitcoin/pull/29606/commits/6f2f4a4d096a3b261258c8cdd96cca532988d1d3 stickies-v: ACK 6f2f4a4d096a3b261258c8cdd96cca532988d1d3 Tree-SHA512: e3ba7af77decdc73272d804c94fef0b11028a85f3c0ea1ed6386672611b1c35fce151f02e64f5bb5acb5ba506aaa54577719b07925b9cc745143cf5c7e5eb262
2024-03-08Reserve memory for ToLower/ToUpper conversionsLőrinc
2024-02-28Preallocate result in `TryParseHex` to avoid resizingLőrinc
Running `make && ./src/bench/bench_bitcoin -filter=HexParse` a few times results in: ``` | ns/base16 | base16/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 0.68 | 1,465,555,976.27 | 0.8% | 0.01 | `HexParse` | 0.68 | 1,472,962,920.18 | 0.3% | 0.01 | `HexParse` | 0.68 | 1,476,159,423.00 | 0.3% | 0.01 | `HexParse` ```
2023-03-07util: Work around ParseHex gcc cross compiler bugMarcoFalke
2023-02-27util: Return empty vector on invalid hex encodingMarcoFalke
2022-12-24scripted-diff: Bump copyright headersHennadii Stepanov
-BEGIN VERIFY SCRIPT- ./contrib/devtools/copyright_header.py update ./ -END VERIFY SCRIPT- Commits of previous years: - 2021: f47dda2c58b5d8d623e0e7ff4e74bc352dfa83d7 - 2020: fa0074e2d82928016a43ca408717154a1c70a4db - 2019: aaaaad6ac95b402fe18d019d67897ced6b316ee0
2022-10-05Validate port value in `SplitHostPort`amadeuszpawlik
Forward the validation of the port from `ParseUInt16(...)`. Consider port 0 as invalid. Add suitable test for the `SplitHostPort` function. Add doxygen description to the `SplitHostPort` function.
2022-08-20Fix iwyuMacroFake
2022-07-08refactor: add most of src/util to iwyufanquake
These files change infrequently, and not much header shuffling is required. We don't add everything in src/util/ yet, because IWYU makes some dubious suggestions, which I'm going to follow up with upstream.
2022-05-20Merge bitcoin/bitcoin#23595: util: Add ParseHex<std::byte>() helperlaanwj
facd1fb911abfc595a3484ee53397eff515d4c40 refactor: Use Span of std::byte in CExtKey::SetSeed (MarcoFalke) fae1006019188700e0c497a63fc1550fe00ca8bb util: Add ParseHex<std::byte>() helper (MarcoFalke) fabdf81983e2542d60542b80fb94ccb1acdd204a test: Add test for embedded null in hex string (MarcoFalke) Pull request description: This adds the hex->`std::byte` helper after the `std::byte`->hex helper was added in commit 9394964f6b9d1cf1220a4eca17ba18dc49ae876d ACKs for top commit: pk-b2: ACK https://github.com/bitcoin/bitcoin/pull/23595/commits/facd1fb911abfc595a3484ee53397eff515d4c40 laanwj: Code review ACK facd1fb911abfc595a3484ee53397eff515d4c40 Tree-SHA512: e2329fbdea2e580bd1618caab31f5d0e59c245a028e1236662858e621929818870b76ab6834f7ac6a46d7874dfec63f498380ad99da6efe4218f720a60e859be
2022-05-04Merge bitcoin/bitcoin#24852: util: optimize HexStrlaanwj
5e61532e72c1021fda9c7b213bd9cf397cb3a802 util: optimizes HexStr (Martin Leitner-Ankerl) 4e2b99f72a90b956f3050095abed4949aff9b516 bench: Adds a benchmark for HexStr (Martin Leitner-Ankerl) 67c8411c37b483caa2fe3f7f4f40b68ed2a9bcf7 test: Adds a test for HexStr that checks all 256 bytes (Martin Leitner-Ankerl) Pull request description: In my benchmark, this rewrite improves runtime 27% (g++) to 46% (clang++) for the benchmark `HexStrBench`: g++ 11.2.0 | ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- | 0.94 | 1,061,381,310.36 | 0.7% | 12.00 | 3.01 | 3.990 | 1.00 | 0.0% | 0.01 | `HexStrBench` master | 0.68 | 1,465,366,544.25 | 1.7% | 6.00 | 2.16 | 2.778 | 1.00 | 0.0% | 0.01 | `HexStrBench` branch clang++ 13.0.1 | ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- | 0.80 | 1,244,713,415.92 | 0.9% | 10.00 | 2.56 | 3.913 | 0.50 | 0.0% | 0.01 | `HexStrBench` master | 0.43 | 2,324,188,940.72 | 0.2% | 4.00 | 1.37 | 2.914 | 0.25 | 0.0% | 0.01 | `HexStrBench` branch Note that the idea for this change comes from denis2342 in #23364. This is a rewrite so no unaligned accesses occur. Also, the lookup table is now calculated at compile time, which hopefully makes the code a bit easier to review. ACKs for top commit: laanwj: Code review ACK 5e61532e72c1021fda9c7b213bd9cf397cb3a802 aureleoules: tACK 5e61532e72c1021fda9c7b213bd9cf397cb3a802. theStack: ACK 5e61532e72c1021fda9c7b213bd9cf397cb3a802 🚤 Tree-SHA512: 40b53d5908332473ef24918d3a80ad1292b60566c02585fa548eb4c3189754971be5a70325f4968fce6d714df898b52d9357aba14d4753a8c70e6ffd273a2319
2022-04-27util: Add ParseHex<std::byte>() helperMarcoFalke
2022-04-27test: Add test for embedded null in hex stringMarcoFalke
Also, fix style in the corresponding function. The style change can be reviewed with "--word-diff-regex=."
2022-04-27Use std::string_view throughout util strencodings/stringPieter Wuille
2022-04-27Make DecodeBase{32,64} take string_view argumentsPieter Wuille
2022-04-27Make DecodeBase{32,64} return optional instead of taking bool*Pieter Wuille
2022-04-27Make DecodeBase{32,64} always return vector, not stringPieter Wuille
Base32/base64 are mechanisms for encoding binary data. That they'd decode to a string is just bizarre. The fact that they'd do that based on the type of input arguments even more so.
2022-04-27Reject incorrect base64 in HTTP authPieter Wuille
In addition, to make sure that no call site ignores the invalid decoding status, make the pf_invalid argument mandatory.
2022-04-27Make SanitizeString use string_viewPieter Wuille
2022-04-27Make IsHexNumber use string_viewPieter Wuille
2022-04-27Make IsHex use string_viewPieter Wuille
2022-04-27Make ParseHex use string_viewPieter Wuille
2022-04-17util: optimizes HexStrMartin Leitner-Ankerl
In my benchmark, this rewrite improves runtime 27% (g++) to 46% (clang++) for the benchmark `HexStrBench`: g++ 11.2.0 | ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- | 0.94 | 1,061,381,310.36 | 0.7% | 12.00 | 3.01 | 3.990 | 1.00 | 0.0% | 0.01 | `HexStrBench` master | 0.68 | 1,465,366,544.25 | 1.7% | 6.00 | 2.16 | 2.778 | 1.00 | 0.0% | 0.01 | `HexStrBench` branch clang++ 13.0.1 | ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- | 0.80 | 1,244,713,415.92 | 0.9% | 10.00 | 2.56 | 3.913 | 0.50 | 0.0% | 0.01 | `HexStrBench` master | 0.43 | 2,324,188,940.72 | 0.2% | 4.00 | 1.37 | 2.914 | 0.25 | 0.0% | 0.01 | `HexStrBench` branch Note that the idea for this change comes from denis2342 in PR 23364. This is a rewrite so no unaligned accesses occur. Also, the lookup table is now calculated at compile time, which hopefully makes the code a bit easier to review.
2022-02-10Merge bitcoin/bitcoin#24297: Fix unintended unsigned integer overflow in ↵fanquake
strencodings fac9fe5d051264fcd16e8e36d30f28c05c999837 Fix unintended unsigned integer overflow in strencodings (MarcoFalke) Pull request description: This fixes two issues for strings that start with a colon and only have one colon: * `fMultiColon` is incorrectly set to `true` * There is an unsigned integer overflow `colon - 1` (`0 - 1`) Neither issue matters, as the result is discarded. Though, it makes sense to still fix the issue for clarity and to avoid sanitizer issues in the function. ACKs for top commit: laanwj: Code review ACK fac9fe5d051264fcd16e8e36d30f28c05c999837 shaavan: Code Review ACK fac9fe5d051264fcd16e8e36d30f28c05c999837 Tree-SHA512: e71c21a0b617abf241e561ce6b90b963e2d5e2f77bd9547ce47209a1a94b454384391f86ef5d35fedd4f4df19add3896bb3d61fed396ebba8e864e3eeb75ed59
2022-02-09fuzz: Avoid unsigned integer overflow in FormatParagraphMarcoFalke
2022-02-09Fix unintended unsigned integer overflow in strencodingsMarcoFalke
2021-12-30scripted-diff: Bump copyright headersHennadii Stepanov
-BEGIN VERIFY SCRIPT- ./contrib/devtools/copyright_header.py update ./ -END VERIFY SCRIPT- Commits of previous years: * 2020: fa0074e2d82928016a43ca408717154a1c70a4db * 2019: aaaaad6ac95b402fe18d019d67897ced6b316ee0
2021-12-13Reduce size of strencodings decode tablesMarcoFalke
2021-12-13Fix implicit integer sign changes in strencodingsMarcoFalke
2021-11-24Merge bitcoin/bitcoin#23451: span: Add std::byte helpersMarcoFalke
faa3ec2304051be7cfbe301cfbfbda3faf7514fc span: Add std::byte helpers (MarcoFalke) fa18038f519db76befb9a7bd0b1540143bfeb12b refactor: Use ignore helper when unserializing an invalid pubkey (MarcoFalke) fabe18d0b39b4b918bf60e3a313eaa36fb4067f2 Use value_type in CDataStream where possible (MarcoFalke) Pull request description: This adds (currently unused) span std::byte helpers, so that they can be used in new code. The refactors are also required for https://github.com/bitcoin/bitcoin/pull/23438, but they are split up because the other pull doesn't compile with msvc right now. The third commit is not needed for the other pull, but still nice. ACKs for top commit: klementtan: reACK faa3ec2. Verified that all the new `std::byte` helper functions are tested. laanwj: Code review ACK faa3ec2304051be7cfbe301cfbfbda3faf7514fc Tree-SHA512: b1f6af39f03ea4dfebf20d4a8538fa993a6104e7fc92ddf0c4606a7efc3ca9a8c1a4741d98a1418569c11bb9ce9258bf0c0c06d93d85ed7e208902a2db04e407
2021-11-17util: ParseByteUnits - Parse a string with suffix unit [k|K|m|M|g|G|t|T]Douglas Chimento
A convenience utility for human readable arguments/config e.g. -maxuploadtarget=500g
2021-11-09span: Add std::byte helpersMarcoFalke
Also, add Span<std::byte> interface to strencondings.
2021-10-04Merge bitcoin/bitcoin#23156: refactor: Remove unused ParsePrechecks and ↵MarcoFalke
ParseDouble fa9d72a7947d2cff541794e21e0040c3c1d43b32 Remove unused ParseDouble and ParsePrechecks (MarcoFalke) fa3cd2853530c86c261ac7266ffe4f1726fe9ce6 refactor: Remove unused ParsePrechecks from ParseIntegral (MarcoFalke) Pull request description: All of the `ParsePrechecks` are already done by `ToIntegral`, so remove them from `ParseIntegral`. Also: * Remove redundant `{}`. See https://github.com/bitcoin/bitcoin/pull/20457#discussion_r720116866 * Add missing failing c-string test case * Add missing failing test cases for non-int32_t integral types ACKs for top commit: laanwj: Code review ACK fa9d72a7947d2cff541794e21e0040c3c1d43b32, good find on ParseDouble not being used at all, and testing for behavior of embedded NULL characters is always a good thing. practicalswift: cr ACK fa9d72a7947d2cff541794e21e0040c3c1d43b32 Tree-SHA512: 3d654dcaebbf312dd57e54241f9aa6d35b1d1d213c37e4c6b8b9a69bcbe8267a397474a8b86b57740fbdd8e3d03b4cdb6a189a9eb8e05cd38035dab195410aa7
2021-10-04Remove unused ParseDouble and ParsePrechecksMarcoFalke
2021-10-01refactor: Remove unused ParsePrechecks from ParseIntegralMarcoFalke
Also: * Remove redundant {} from return statement * Add missing failing c-string test case and "-" and "+" strings * Add missing failing test cases for non-int32_t integral types
2021-09-30Replace use of locale dependent atoi(…) with locale-independent ↵practicalswift
std::from_chars(…) (C++17) test: Add test cases for LocaleIndependentAtoi fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s) fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
2021-09-18util: Introduce ToIntegral<T>(const std::string&) for locale independent ↵practicalswift
parsing using std::from_chars(…) (C++17) util: Avoid locale dependent functions strtol/strtoll/strtoul/strtoull in ParseInt32/ParseInt64/ParseUInt32/ParseUInt64 fuzz: Assert equivalence between new and old Parse{Int,Uint}{8,32,64} functions test: Add unit tests for ToIntegral<T>(const std::string&)
2021-05-19Merge bitcoin/bitcoin#21173: util: faster HexStr => 13% faster blockToJSONW. J. van der Laan
74bf850ac47735f2ef4306059d3e664d40cac85e faster HexStr => 13% faster blockToJSON (Martin Ankerl) Pull request description: `std::string`'s push_back is rather slow because it needs to check & update the string size. For `HexStr` the output string size is already easily know, so we can initially create the string with the correct size and then just assign the data. `HexStr` is heavily usd in `blockToJSON`, so this change is a noticeable benefit. Benchmark on an i7-8700 @3.2GHz: * 71,315,461.00 ns/op master * 62,842,490.00 ns/op this commit So this little change makes `blockToJSON` about ~13% faster. ACKs for top commit: laanwj: Code review ACK 74bf850ac47735f2ef4306059d3e664d40cac85e theStack: re-ACK 74bf850ac47735f2ef4306059d3e664d40cac85e Tree-SHA512: fc99105123edc11f4e40ed77aea80cf7f32e49c53369aa364b38395dcb48575e15040b0489ed30d0fe857c032a04e225c33e9d95cdfa109a3cb5a6ec9a972415
2021-03-16util: add missing braces and apply clang format to SplitHostPort()Jon Atack
2021-03-16util: add ParseUInt16(), use it in SplitHostPort()Jon Atack
2021-03-16p2p, refactor: pass and use uint16_t CService::port as uint16_tJon Atack
2021-02-16faster HexStr => 13% faster blockToJSONMartin Ankerl
`std::string`'s push_back is rather slow because it needs to check & update the string size. For `HexStr` the output string size is already easily know, so we can initially create the string with the correct size and then just assign the data. `HexStr` is heavily usd in `blockToJSON`, so this change is a noticeable benefit. Benchmark on an i7-8700 @3.2GHz: * 71,315,461.00 ns/op master * 62,842,490.00 ns/op this commit So this little change makes `blockToJSON` about ~13% faster.
2020-11-26scripted-diff: Use [[nodiscard]] (C++17) instead of NODISCARDpracticalswift
-BEGIN VERIFY SCRIPT- sed -i "s/NODISCARD/[[nodiscard]]/g" $(git grep -l "NODISCARD" ":(exclude)src/bench/nanobench.h" ":(exclude)src/attributes.h") -END VERIFY SCRIPT-
2020-09-21net: recognize TORv3/I2P/CJDNS networksVasil Dimov
Recognizing addresses from those networks allows us to accept and gossip them, even though we don't know how to connect to them (yet). Co-authored-by: eriknylund <erik@daychanged.com>
2020-08-25util: make EncodeBase64 consume SpansSebastian Falbesoner
2020-08-25util: make EncodeBase32 consume SpansSebastian Falbesoner
2020-08-25Merge #19628: net: change CNetAddr::ip to have flexible sizeMarcoFalke
102867c587f5f7954232fb8ed8e85cda78bb4d32 net: change CNetAddr::ip to have flexible size (Vasil Dimov) 1ea57ad67406b3aaaef5254bc2fa7e4134f3a6df net: don't accept non-left-contiguous netmasks (Vasil Dimov) Pull request description: (chopped off from #19031 to ease review) Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes, not being able to store larger addresses (e.g. TORv3) and encoded smaller ones as 16-byte IPv6 addresses. Change its type to `prevector`, so that it can hold larger addresses and do not disguise non-IPv6 addresses as IPv6. So the IPv4 address `1.2.3.4` is now encoded as `01020304` instead of `00000000000000000000FFFF01020304`. Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or "IP address" (TOR addresses are not IP addresses). In order to preserve backward compatibility with serialization (where e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`) introduce `CNetAddr` dedicated legacy serialize/unserialize methods. Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16 bytes, but use the first 4 for IPv4 (not the last 4). Do not accept invalid netmasks that have 0-bits followed by 1-bits and only allow subnetting for IPv4 and IPv6. Co-authored-by: Carl Dong <contact@carldong.me> ACKs for top commit: sipa: utACK 102867c587f5f7954232fb8ed8e85cda78bb4d32 MarcoFalke: Concept ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32 ryanofsky: Code review ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32. Just many suggested updates since last review. Thanks for following up on everything! jonatack: re-ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32 diff review, code review, build/tests/running bitcoind with ipv4/ipv6/onion peers kallewoof: ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32 Tree-SHA512: d60bf716cecf8d3e8146d2f90f897ebe956befb16f711a24cfe680024c5afc758fb9e4a0a22066b42f7630d52cf916318bedbcbc069ae07092d5250a11e8f762
2020-08-24net: change CNetAddr::ip to have flexible sizeVasil Dimov
Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes, not being able to store larger addresses (e.g. TORv3) and encoded smaller ones as 16-byte IPv6 addresses. Change its type to `prevector`, so that it can hold larger addresses and do not disguise non-IPv6 addresses as IPv6. So the IPv4 address `1.2.3.4` is now encoded as `01020304` instead of `00000000000000000000FFFF01020304`. Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or "IP address" (TOR addresses are not IP addresses). In order to preserve backward compatibility with serialization (where e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`) introduce `CNetAddr` dedicated legacy serialize/unserialize methods. Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16 bytes, but use the first 4 for IPv4 (not the last 4). Only allow subnetting for IPv4 and IPv6. Co-authored-by: Carl Dong <contact@carldong.me>