author: bip39jp <dabura667@live.jp> 2014-12-20 21:07:05 +0900
committer: bip39jp <dabura667@live.jp> 2015-03-12 14:06:22 +0900
commit: f0dd2d58ab920647a8e288c4ab93129686b20be2
tree: ab1b86e36ed093664b8b3ea84776dae3d9ccc188 /bip-0039/bip-0039-wordlists.md
parent: 2ea19daaa0380fed7a2b053fd1f488fadba28bda
Clarify necessity for ideographic spaces.
I had left it unclear, and open to interpretation, whether ideographic spaces should be used. I realized that without an explicit statement of their necessity, developers may implement something that causes trouble for Japanese users (two words looking like one word, phrase verification failing because it cannot handle ideographic spaces, etc.).
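The behaviour this commit message describes can be illustrated with a minimal Python sketch. It assumes the normalization in question is Unicode NFKD, as BIP-0039 specifies for mnemonic sentences. The helper names (`normalize_phrase`, `split_words`, `join_for_display`) and the sample words are hypothetical, chosen only for illustration, and are not part of any reference implementation.

```python
import unicodedata

# U+3000 IDEOGRAPHIC SPACE; its UTF-8 encoding is the bytes E3 80 80.
IDEOGRAPHIC_SPACE = "\u3000"


def normalize_phrase(phrase):
    """Apply NFKD, which maps U+3000 to an ASCII space (U+0020)."""
    return unicodedata.normalize("NFKD", phrase)


def split_words(phrase):
    """Split on any Unicode whitespace, so both ASCII and ideographic
    spaces work as separators (hypothetical helper)."""
    return phrase.split()


def join_for_display(words):
    """Join words with ideographic spaces for display to the user
    (hypothetical helper)."""
    return IDEOGRAPHIC_SPACE.join(words)


if __name__ == "__main__":
    # Sample words for illustration only.
    user_input = "あいこくしん\u3000あいさつ"

    # A naive split on ASCII space treats the whole input as one word.
    print(len(user_input.split(" ")))

    # A whitespace-aware split recovers both words.
    print(len(split_words(user_input)))

    # After NFKD the separator is an ordinary ASCII space.
    print(IDEOGRAPHIC_SPACE in normalize_phrase(user_input))
```

The point of the sketch is that seed generation is unaffected by which space is used, because normalization erases the difference. Only code that splits or displays the phrase needs to handle U+3000 explicitly.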
Diffstat (limited to 'bip-0039/bip-0039-wordlists.md')
-rw-r--r-- bip-0039/bip-0039-wordlists.md | 10
1 files changed, 6 insertions, 4 deletions
diff --git a/bip-0039/bip-0039-wordlists.md b/bip-0039/bip-0039-wordlists.md
index 4a68e88..1eaf585 100644
--- a/bip-0039/bip-0039-wordlists.md
+++ b/bip-0039/bip-0039-wordlists.md
@@ -10,10 +10,12 @@
###Japanese
-1. Users will most likely separate the words with UTF-8 ideographic space.
-(UTF-8 bytes: 0xE38080) When generating the seed, normalization as per the spec will
-automatically change these into normal ASCII spaces. Depending on the font, displaying the
-words should use the UTF-8 ideographic space if it looks like the symbols are too close.
+1. **Developers implementing phrase generation or checksum verification must separate words using ideographic spaces / accommodate users inputting ideographic spaces.**
+(UTF-8 bytes: **0xE38080**; C/C+/Java: **"\u3000"**; Python: **u"\u3000"**)
+However, code that only accepts Japanese phrases but does not generate or verify them should be fine as is.
+This is because when generating the seed, normalization as per the spec will
+automatically change the ideographic spaces into normal ASCII spaces, so as long as your code never shows the user an ASCII space
+separated phrase or tries to split the phrase input by the user, dealing with ASCII or Ideographic space is the same.
2. Word-wrapping doesn't work well, so making sure that words only word-wrap at one of the
ideographic spaces may be a necessary step. As a long word split in two could be mistaken easily