From f0dd2d58ab920647a8e288c4ab93129686b20be2 Mon Sep 17 00:00:00 2001
From: bip39jp
Date: Sat, 20 Dec 2014 21:07:05 +0900
Subject: Clarify necessity for ideographic spaces.

I left it unclear / open to interpretation on whether to use ideographic
spaces, but realized that without being specific on its necessity,
developers may implement something that would cause trouble with the
Japanese user. (two words looking like one word, or phrase verification
failing because it can't handle ideographic spaces, etc.)
---
 bip-0039/bip-0039-wordlists.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/bip-0039/bip-0039-wordlists.md b/bip-0039/bip-0039-wordlists.md
index 4a68e88..1eaf585 100644
--- a/bip-0039/bip-0039-wordlists.md
+++ b/bip-0039/bip-0039-wordlists.md
@@ -10,10 +10,12 @@
 
 ###Japanese
 
-1. Users will most likely separate the words with UTF-8 ideographic space.
-(UTF-8 bytes: 0xE38080) When generating the seed, normalization as per the spec will
-automatically change these into normal ASCII spaces. Depending on the font, displaying the
-words should use the UTF-8 ideographic space if it looks like the symbols are too close.
+1. **Developers implementing phrase generation or checksum verification must separate words using ideographic spaces / accommodate users inputting ideographic spaces.**
+(UTF-8 bytes: **0xE38080**; C/C++/Java: **"\u3000"**; Python: **u"\u3000"**)
+However, code that only accepts Japanese phrases but does not generate or verify them should be fine as is.
+This is because when generating the seed, normalization as per the spec will
+automatically change the ideographic spaces into normal ASCII spaces, so as long as your code never shows the user an ASCII space
+separated phrase or tries to split the phrase input by the user, dealing with ASCII or Ideographic space is the same.
 
 2. Word-wrapping doesn't work well, so making sure that words only word-wrap at one of the
 ideographic spaces may be a necessary step. As a long word split in two could be mistaken easily
--
cgit v1.2.3
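
The behaviour this patch describes can be illustrated with a minimal Python 3 sketch. It assumes the normalization "as per the spec" is Unicode NFKD, which maps the ideographic space (U+3000) to an ASCII space. The helper names `split_words` and `display_phrase` are hypothetical and are not part of any BIP-0039 library, and the two sample words are illustrative only.

```python
import unicodedata

# Ideographic space, U+3000 (UTF-8 bytes E3 80 80).
IDEOGRAPHIC_SPACE = "\u3000"


def split_words(phrase):
    """Split a user-entered phrase into words (hypothetical helper).

    NFKD turns U+3000 into U+0020, so after normalization the phrase can be
    split on ordinary whitespace whichever space the user typed.
    Note: NFKD also decomposes voiced kana, so compare the resulting words
    against an NFKD-normalized wordlist.
    """
    return unicodedata.normalize("NFKD", phrase).split()


def display_phrase(words):
    """Join words with ideographic spaces for display (hypothetical helper)."""
    return IDEOGRAPHIC_SPACE.join(words)


# Illustrative sample words only.
typed = "あいこくしん" + IDEOGRAPHIC_SPACE + "あいさつ"
words = split_words(typed)
shown = display_phrase(words)
```

Splitting after normalization lets input with ASCII spaces and input with ideographic spaces follow the same path, while `display_phrase` keeps the separator that Japanese users expect on screen.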