#============================================================================
# Enca v1.12 (2009-10-29)  guess and convert encoding of text files
# Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
# Copyright (C) 2009 Michal Cihar <michal@cihar.com>
#============================================================================

List of user-visible changes in Enca
A more detailed log can be obtained from older changelogs or git log.

Legend: + new feature
        * change of behaviour (including removal of a feature)
        - bugfix

enca-1.12 2009-10-29
  - Fixes some minor memory leaks.
  - Fixes little problems in autoconf scripts.

enca-1.11 2009-09-25
  - Dropped scanf configure test which is not used at all.
  - Fixes some wrong format strings.

enca-1.10 2009-08-25
  + Enca is back alive or at least in maintenance mode.
  * Enca now lives in git repository, see <http://gitorious.org/enca>.
  - Add missing charset koi8u to belarussian language.
  - Fixed some typos in program and documentation.

enca-1.9 2005-12-18
  + support for HZ encoding
  * Big5 and GBK detection improved
  - enca.spec no longer installs docs to world-unreadable directory

enca-1.8 2005-11-24
  + Chinese (Big5 and GBK) support (thanks to Zuxy)
  * deb/ subdirectory is gone as there is finally an Enca package in Debian
    (thanks to Michal Cihar)
  - manual page clean-up (thanks to Michal Cihar)

enca-1.7 2005-02-27
  + new name type: preferred MIME name (option -m)
  - broken iconv detection on some systems was fixed

enca-1.6 2004-09-01
  * English language names (--list=languages, enca_language_english_name())
    were changed to lowercase to match common locale aliases
  - Win32, i.e. MinGW and Cygwin, build problems were fixed

enca-1.5 2004-05-30
  - crash on impossible recovery after iconv failure in pipe was fixed
  - rpm building problems on Mandrake Linux were fixed

enca-1.4 2004-05-12
  - dependency of guessing API on locales (via ctype functions) was fixed
  - --help text generation failure on some systems was fixed

enca-1.3 2003-12-24
  + [libenca] it's possible to get analyser option values, not just set them
  * a good BOM (byte order mark) increases the chance of being recognized for
    UCS-4 and UTF-8 too
  * external converter wrappers were moved from bin to libexec and the b-
    prefix was removed (though it still works)
  * external converters are no longer searched for in PATH; nonstandard ones
    have to be specified with a full path

enca-1.2 2003-11-26
  - fixed segfault in language detection for some locale setups

enca-1.1 2003-11-17
  - fixed losing data at the end of file when using external converters in a
    pipe (and maybe in other situations)
  - [libenca] enca_analyser_free() not freeing analyser completely was fixed

enca-1.0 2003-11-06
  * deprecated options -T, -R, -S, -u, -U, -m, and -M were finally removed
  * default HTML API docs installation path changed to the new gtk-doc style
    (DATADIR/gtk-doc/html/enca)
  * debian/ subdir moved to deb/ to allow official deb creation w/o too much
    hassle

enca-0.99.4 2003-07-15
  - several race conditions in librecode and iconv interfaces were fixed
  - temporary file names are much less predictable now

enca-0.99.3 2003-06-30
  * Debian package is back from death
  * failure to find external converter is now fatal
  - fixed build problems on FreeBSD (and probably other Unices)
  - libiconv is not used for `conversion to ASCII' since it never does the
    Right Thing, whatever it is
  - when conversion with libiconv fails, the file should now survive intact
  - fixed build problems on systems w/o libiconv (hopefully)
  - fixed distclean and uninstall targets to really clean and uninstall
    everything
  - fixed builds with separate source (read-only) and build directories
  - fixed builds with --without-libiconv and --without-librecode on GNU/Linux
  - external converter is not checked when it's not going to be used

enca-0.99.2 2003-06-25
  + EOL type is used to decide ambiguous cases, e.g. CP1250 is reported
    instead of ISO-8859-2/CRLF
  * --list languages by default prints English names, instead of ISO-639a
    codes, use -e or -r to get the old listing
  * if LC_CTYPE is something like en_US, more locale categories are examined
    to detect the language
  * cork charset was modified to contain \n, \r and \t in the same places as
    ASCII
  * some heuristics tuning

enca-0.99.1 2003-06-22
  + libenca pkg-config support
  * all libenca tuning parameters (-T, -R, -S, -u, -U, -m, and -M) were
    marked deprecated and are noop, Enca should DWIM
  * ambiguity is now always OK when the sample has the same meaning in all the
    charsets
  * deprecated `built-in-encodings' and `encodings' lists were removed
  * PAGER feature was removed
  - exchanged `latvian' and `lithuanian' language names were fixed (`lv' and
    `lt' were always OK)
  - missing tests for the new languages were added to the test suite

enca-0.99.0 2003-06-14
  + added some support for: Bulgarian, Croatian, Estonian, Hungarian, Latvian,
    Lithuanian, Slovene
  + a new algorithm for 8bit-dense languages (cyrillics), the old one is used
    as a fallback
  * removed support for non-transitive iconv (such a thing should not exist)
  * auxiliary tools in data are no longer built in regular builds,
    use --enable-maintainer-mode to rebuild them, create dists, etc.
  - fixed iconv interface surface check pickier than iconv itself inhibiting
    some otherwise possible conversions
  - fixed u+x permissions on temporary files (from 0.10.7)
  - fixed not deleting temporary files in iconv interface
  - fixed broken iconv interface behaviour in pipes
  - fixed iconvcap misdetecting Latin5 as ISO-8859-5
  - fixed occasional `make distclean' failures

enca-0.10.7 2003-01-28
  - fixed interchanged iconv and cstocs encoding names
  - corrected(?) librecode surface interaction
  - fixed a temporary file creation race condition
  * added tex and utf8 to cstocs (names and b-cstocs)

enca-0.10.6 2002-10-22
  + enconv uses DEFAULT_CHARSET variable, exactly as recode
  - ENCAOPT works everywhere, albeit imperfectly
  - options -P and -p no longer imply -M too
  - ambiguous mode (-M) works again
  - pager is run so that help text doesn't disappear
  - standard input is printed as STDIN with -d, not as null
  - make check works again
  - it compiles without recode again

enca-0.10.5 2002-10-13
  + UTF-8 recognition in binary and otherwise messy files
  + detection of double-encoding from some 8bit charset to UTF-8
  + Cork encoding conversion
  * librecode interaction was (hopefully) improved
  - fixed some build-time problems

enca-0.10.4 2002-10-10
  + added Cork encoding support for Czech, Slovak and Polish
  - empty files are now considered convertible to any encoding
  - removed the so-called faster (in fact slower) I/O
  - fixed some more compile-time search path issues

enca-0.10.3 2002-09-22
  * added support for perl umap as external converter
  - fixed external converter wrappers to work with standard sh
  - fixed some compile-time library search path issues

enca-0.10.2 2002-09-15
  + target charset is automatically obtained from locales when called as
    enconv, new options --guess, --auto-convert
  + English language names can be used instead of ISO-639 codes everywhere
  - cs_SK and ru_UA locales are properly recognised as Slovak and Ukrainian

enca-0.10.1 2002-08-29
  + faster I/O
  * external converters can be disabled at build time
  - `-' is accepted for standard input
  - fixed broken built-in converter
  - fixed crashing on an unknown language
  - trivial (identity) conversions are not performed any more
  - help is now printed when input is a terminal and no argument specified
  - changed braindamaged <STDIN>, <STDOUT> to STDIN, STDOUT in messages
  - various small fixes and build-time improvements

enca-0.10.0 2002-08-26
  + added support for Ukrainian (CP1251, IBM855, ISO-8859-5, KOI8-U, maccyr,
    CP1125), Belarussian (CP1251, IBM866, ISO-8859-5, KOI8-UNI, maccyr,
    IBM855) and Polish (ISO-8859-2, ISO-8859-13, ISO-8859-16, Baltic, macce,
    IBM852, CP1250)
  + Enca library introduced
  * dropped native Debian package
  * --details no longer prints guessing details (now is mostly like --human)
  * --list=encodings, --list=built-in-encodings corrected to --list=charsets,
    --list=built-in-charsets (old names supported with a warning)
  * improved Czech and Slovak charsets detection

enca-0.9.4: 2002-03-03
  - built-in converter didn't convert more than first 64kB of a file

enca-0.9.3: 2001-07-16
  + a native Debian package
  - fixed random reporting of nonsense results
  - fixed self-contradictory --details output when file was quoted-printable
    encoded
  - fixed poor performance on non-GNU/Linux
  - made pager less intrusive (instead of intrusive `less' ;-)
  - --list=encodings prints only `known' encodings
  - fixed several compile-time/portability problems

enca-0.9.2: 2001-07-13
  * --help and --license are displayed through pager (when possible)
  - fixed broken language hooks--they were never activated (from 0.9.1)
  - fixed reporting ASCII when a 7bit encoding was detected
  - fixed boundary-case behaviour when recovering from librecode failures

enca-0.9.1: 2001-06-25
  + support for Macintosh Cyrillic, including conversion
  + support for unusual UCS-4 byte orders (3412 and 2143)
  + new option --license printing full enca license
  * exit codes now make sense (0, 1, 2; where 2 means serious trouble)
  - temporary files are no longer world-readable

enca-0.9.0: 2001-03-26
  Serious incompatibilities:
  * -E and -C option letters exchanged (much better mnemonics)
  * converter wrappers renamed to b-cstocs and b-recode
  * finding only 7bit ASCII is no longer considered failure
  * need to use --language to set language (sometimes)
  * dull converter behaviour no longer supported, -x syntax changed
  * option -g removed (try --name=aliases)
  * option -c changed to --list=converters, listing format changed
  * option -l changed to --list=encodings, listing format changed
  * converter names are no longer case insensitive
  * no longer uses cstocs names as canonical
  * external converters are called with Enca's names, not cstocs's

  Other changes:
  + support for Slovak and Russian (and `none') language
  + support for CP1251, IBM866, ISO-8859-5 and KOI8-R, including conversion
  + UCS-2, UCS-4, UTF-8, UTF-7 and LaTeX encoding recognition
  + many more encoding aliases accepted
  + long `GNU style' command line options
  + new output types: --enca-name, --iconv-name
  + output type --name=WORD allowing the output type to be selected by name
  + ENCAOPT environment variable
  + language detection from locales
  + support for surfaces (experimental)
  + new option --list printing various listings
  + new converter wrapper b-map (for perl `map')
  + new option -m to reset -M back
  + new language filters
  + new options -u and -U to control multibyte encoding checks
  + included [generated] enca.spec into the tarball to allow `rpm -tb'
  * -d output improved
  * read limit changed to 16MB
  * librecode now run with flags diacritics_only and ascii_graphics
  - fixed broken -P options
  - fixed several build problems on non-GNU/Linux systems
  - fixed some missing and wrong characters in Unicode data
  - temporary copy of damaged original file is not deleted when rescue fails

enca-0.8.x: Since features planned for 0.8 and 0.9 happened to be developed
  simultaneously, this version number has been skipped.

enca-0.7.7: 2001-01-01
  + ability to use UNIX98 iconv conversion functions
  + the word `none' can be used as -E parameter causing clearing of converter
    list
  - fixed disarranged help text, misspelled word `European' in macce long
    name, obsolete statements in manual page and other stuff of this kind

enca-0.7.6: 2000-11-20
  + any converter combination/order can be now specified with -E, old -E
    meaning is no longer valid
  + new option -c (list all valid converter names)
  * cork encoding not supported anymore
  * better verbosity
  * `/' is added to recode recoding requests thus partially solving the
    surface problem---surface never changes
  * some errors like specifying invalid value of threshold are no longer fatal,
    the bad values are ignored instead
  * handling of some exotic characters in built-in converter slightly changed
  - fixed several fatal bugs regarding stdin to stdout conversion
  - stdin is copied to stdout in case of failure whenever possible/applicable

enca-0.7.5: 2000-10-25
  * license changed to GNU GPL Version 2 (i.e. license version is explicitly
    specified)
  * prints error message when conversion is impossible
  * binary data filter improved/changed
  - falls back to external converter when GNU recode library cannot convert
    due to erroneous request
  - '' no longer causes enca to read from stdin
  - tries to restore files damaged by GNU recode library

enca-0.7.4: 2000-10-12
  + box-drawing characters are (carefully) filtered out when guessing
  - fixed intermixed behaviour in SMS/nonSMS modes

enca-0.7.3: 2000-10-09
  + blocks of probably binary data are filtered out when guessing
  * standard input is copied to standard output when its encoding is unknown
  - fixed reading only 4096 bytes from pipe (from 0.7.1)

enca-0.7.2: was never released
  + GNU recode recoding chains made possible by starting -x (convert) parameter
    with `..'
  + second best guess is marked with `-' in -d (print details) output

enca-0.7.1: 2000-10-02
  * in case of nonfatal i/o failure enca continues processing remaining files

enca-0.7.0: 2000-09-26
  + standard input to standard output conversion
  + short message mode -M
  + ability to use GNU recode library
  + new output type -r (encoding name after RFC1345)
  + ability to convert cork internally
  + new external converter brecode (recode wrapper)
  + new output type -g (list of aliases)
  + new option -V (verbose)
  * -x (convert) parameter syntax changed to in_enc..out_enc (old syntax still
    supported, will be removed in 0.8.x)
  * option -e (disable external) no longer supported, empty string as -C
    (external converter) parameter can be used instead
  * encoding names specified as -x (convert) parameters are case insensitive
  * ascii is not considered unknown encoding (i.e. failure) so enca returns 0
  * -d (print details) output improved/changed/updated
  * -p (prefix result with file name) no longer prints conversion details
  * by default result is prefixed by file name when enca is run on more than
    one file

enca-0.6.2: 2000-08-17
  + help texts (-h and -v) made usable (thanx to Halef)

enca-0.6.1: 2000-08-15
  - tarball bugfix

enca-0.6.0: 2000-07-20
  + built-in converter
  + -x (convert) can now take form -x in_enc,out_enc causing enca to behave
    like a dull converter
  + new options -e and -E (disable internal/external converter)
  + new option -l (print internally-convertible encodings)

enca-0.5.0: 2000-07-17
  * -p (prefix result with file name) causes enca to print what is converted
    and how
  * iso8859-2/cp1250 recognition improved
  - doesn't spawn external converters as fast as is possible, but waits for
    them to return
  - fixed `Unrecognized encoding' when winner is 1250 (from 0.4.3)
  - corrected -d (print details) table alignment

enca-0.4.3: 2000-07-14
  * -d (print details) prints encodings alphabetically sorted
  - corrected short encoding name t1 -> cork
  - division-by-zero bugfixes

enca-0.4.2: was never released
  * options -m/-M ([don't] use iso8859-2/cp1250 hack) no longer supported
  - fixed showing standard input as empty string (<STDIN> is printed now)

enca-0.4.1: 2000-07-12
  * default of 60 significant characters changed to 10

enca-0.4.0: 2000-07-10
  + first public release