| 2011-02-16 Németh László <nemeth at numbertext dot org>: |
| * src/*/Makefile.am: fix library versioning, the problem reported by |
| Rene Engelhard and Simon Brouwer. |
| |
| * man/hunspell.4: new version based on the revised version of Ruud Baars |
| |
| 2011-02-02 Németh László <nemeth at OOo>: |
| * suggestmgr.cxx: fix ngram PHONE suggestion for input words with |
| diacritics using UTF-8 encoded dictionaries (add byte length to the |
| 8-bit phonet() argument instead of character length) |
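The difference between byte length and character length mentioned in this entry can be illustrated with a small self-contained sketch. It is not taken from suggestmgr.cxx; the helper utf8_char_count and the constant kWord are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Hunspell): count the code points of a
// UTF-8 string by skipping continuation bytes (bit pattern 10xxxxxx).
std::size_t utf8_char_count(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// "vadász" in UTF-8: the letter 'á' is encoded as the two bytes 0xC3 0xA1.
const std::string kWord = "vad\xC3\xA1sz";
```

Here kWord.size() is 7 while utf8_char_count(kWord) is 6, so a routine that works on 8-bit data has to be given the byte length to see the whole word.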
| |
| * suggestmgr.cxx: fix missing csconv problem with UTF-8 encoded |
| dictionaries, when the input contains non-BMP characters |
| - tests/utf8_nonbmp.sug: test file |
| |
| * suggestmgr.cxx: mixed and keyboard based character suggestions |
| don't forbid ngram suggestion search (optimized tests/suggestiontest) |
| |
| * affixmgr.cxx: fix hun#2999225: interfering compounding mechanisms, |
| tested on Dutch word list and reported by Ruud Baars |
| |
| * affixmgr.cxx: allomorph fix for hun#2970240 (Hungarian |
| compound "vadász+gép" was analyzed as vad+ász+gép, and rejected |
| by the ss->s rep rule (verb "vadássz"), but the analysis |
| didn't continue for the longer word parts (vadász+gép)). |
| |
| * csutil.cxx: add lang codes "az_AZ", "hu_HU", "tr_TR" for backward |
| compatibility (fixing Azeri and Turkish casing conversion, also |
| Hungarian compound handling) |
| |
| * affixmgr.cxx: fix morphological analysis |
| |
| 2011-01-26 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: fix for moz#626195 (memcheck problem with FULLSTRIP). |
| |
| * affixmgr.*, suggestmgr.cxx: FORBIDWARN parameter (see manual) |
| |
| 2011-01-24 Németh László <nemeth at OOo>: |
| * suggestmgr.cxx: fix bad suggestion of forbidden compound words, eg. |
| "termijndoel" with the Dutch dictionary. Reported by Ruud Baars. |
| |
| * latexparser.cxx: fix double apostrophe TeX quotation mark tokenization |
| (hun#3119776), reported by Wybodekker at SF.net. |
| |
| * tests/suggestiontest/*: multilanguage and single Hunspell version, see README |
| * tests/suggestiontest/prepare2: for make -f Makefile.orig single |
| |
| 2011-01-22 Németh László <nemeth at OOo>: |
| * affixmgr.*, suggestmgr.*: new features |
| ONLYMAXDIFF: remove all bad ngram suggestions (default mode keeps one) |
| NONGRAMSUGGEST: similar to NOSUGGEST, but it forbids using the word |
| in ngram based (more than 1-character distance) suggestions. |
| |
| 2011-01-21 Németh László <nemeth at OOo>: |
| * suggestmgr.*: limit wild suggestions (hun#2970237 by Ruud Baars) |
| - limited compound word suggestions |
| - improved and limited ngram based suggestions |
| * tests/*.sug: modified test files |
| - feature MAXCPDSUGS: |
| MAXCPDSUGS 0 : no compound suggestion, suggested by |
| Finn Gruwier Larsen in hunfeat#2836033 |
| MAXCPDSUGS n : max. ~n compound suggestions |
| - feature MAXDIFF: difference limit for ngram suggestions: 0-10 |
| eg. MAXDIFF 5: normal (default) limit |
| MAXDIFF 0: only one ngram suggestion |
| MAXDIFF 10: ~maxngramsugs ngram suggestions |
| |
| * affixmgr.*, hunspell.*: add flag FORCEUCASE (hun#2999228), force |
| capitalization of compound words (see Hunspell 4 manual), |
| suggested by Ruud Baars |
| tests/forceucase.*: test files |
| |
| * affixmgr.*, hunspell.*: add flag WARN (hun#1808861), optional warning feature |
| for rare words, suggested by Ruud Baars |
| tests/warn: test files |
| * tools/hunspell.cxx: add option -r for optional filtering of rare words |
| |
| * affixmgr.cxx: fix hun#3161359 (gcc warnings) reported by Ryan VanderMeulen. |
| |
| 2011-01-17 Németh László <nemeth at OOo>: |
| * suggestmgr.cxx: fix hun#3158994 and hun#3159027 (missing csconv table |
| using awkward 8bit capitalization of UTF-8 encoded dictionary words with PHONE |
| suggestion, reported by benjarobin and dicollecte at SF.net). |
| |
| 2011-01-13 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: ONLYINCOMPOUND fix for hun#2999224 (fogemorpheme |
| was allowed in end position of compounds). Reported by Ruud Baars. |
| * tests/onlyincompound2.*: test files |
| |
| 2011-01-10 Ingo H. de Boer <idb_winshell at SF.net>: |
| * win_api/{hunspell,libhunspell, testparser}.vcproj: updated project |
| files for the library and the executables. Compiling problem |
| also reported by Don Walker. |
| |
| 2011-01-06 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: fix freedesktop#32850 (program halt during Hungarian |
| spell checking of the word "6csillagocska6", reported by András Tímár) |
| |
| * tools/hunspell.cxx: add Mac OS X Hunspell dictionary paths, asked by |
| Vidar Gundersen in hunfeat#3142010 |
| |
| 2011-01-05 Caolán McNamara <cmc at OOo>: |
| * moz#620626 NS_UNICHARUTIL_CID doesn't support |
| case conversion |
| |
| 2011-01-03 Németh László <nemeth at OOo>: |
| * NEWS and THANKS: update for release 1.2.13 |
| |
| 2010-12-20 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: hun#3140784 |
| |
| 2010-12-16 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: |
| - improved fix of hun#2970242 (supporting |
| zero affixes), reported by Ruud Baars |
| - tests/opentaal_cpdpat{,2}: test files |
| |
| - switching off default BREAK parameters by BREAK 0, |
| reported by Ruud Baars |
| |
| - hun#2999225: interfering compounding mechanisms, reported by Ruud Baars |
| |
| 2010-12-11 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: fix hun#2970242 (CHECKCOMPOUNDPATTERN only with flags), |
| the bug reported by Ruud Baars |
| * tests/2970242.*: test files |
| |
| * tests/2970240.*: test files for CHECKCOMPOUNDPATTERN fix (check all |
| boundaries in compound words, fixed by the previous CHECKCOMPOUNDREP |
| fix), the bug reported by Ruud Baars |
| |
| * win_api/Makefile.cygwin: update |
| |
| 2010-12-09 Caolán McNamara <cmc at OOo>: |
| * moz#617953 fix leak |
| |
| 2010-11-08 Caolán McNamara <cmc at OOo>: |
| * rhbz#650503 crash in arabic dictionary |
| |
| 2010-11-05 Caolán McNamara <cmc at OOo>: |
| * rhbz#648740 don't warn on empty flagvector |
| |
| 2010-11-03 Caolán McNamara <cmc at OOo>: |
| * logically we shouldn't need a csconv table in utf-8 mode |
| |
| 2010-10-27 Németh László <nemeth at OOo>: |
| * hun#3000055 (requested by Ruud Baars) add REP boundary specification: |
| REP ^word$ xxxx |
| REP ^wordstarting xxxx |
| REP wordending$ xxxx |
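The anchored REP forms listed above can be read as follows: "^" ties the pattern to the beginning of the word and "$" to its end. The sketch below only illustrates that reading; apply_rep is a hypothetical helper, not a Hunspell function, and it ignores everything else the suggestion code does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of anchored replacement (not Hunspell's code).
// A leading '^' ties the pattern to the start of the word, a trailing '$'
// to its end; without anchors the pattern may match anywhere.
std::vector<std::string> apply_rep(const std::string& word,
                                   std::string pattern,
                                   const std::string& replacement) {
    bool at_start = false;
    bool at_end = false;
    if (!pattern.empty() && pattern.front() == '^') {
        at_start = true;
        pattern.erase(0, 1);
    }
    if (!pattern.empty() && pattern.back() == '$') {
        at_end = true;
        pattern.pop_back();
    }
    std::vector<std::string> out;
    if (pattern.empty() || pattern.size() > word.size()) return out;
    for (std::size_t pos = word.find(pattern); pos != std::string::npos;
         pos = word.find(pattern, pos + 1)) {
        if (at_start && pos != 0) continue;
        if (at_end && pos + pattern.size() != word.size()) continue;
        std::string candidate = word;
        candidate.replace(pos, pattern.size(), replacement);
        out.push_back(candidate);
    }
    return out;
}
```

With this helper, apply_rep("alot", "^alot$", "a lot") yields the single candidate "a lot", while the same rule produces nothing for "alots" because the end anchor does not match.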
| |
| * hun#3008434 (requested by Adrián Chaves Fernández) and |
| hun#3018929 (requested by Ruud Baars): REP with more than 2 words: |
| REP morethantwo more_than_two |
| |
| * suggestmgr.cxx: fix incomplete suggestion list for capitalized words, |
| eg. missing Machtstrijd->Machtsstrijd in the Dutch dictionary |
| (reported by Ruud Baars) |
| |
| * tests, man: related updates |
| |
| 2010-10-12 Caolán McNamara <cmc at OOo>: |
| * moz#603311 HashMgr::load_tables leaks dict when decode_flags fails |
| * fix mem leak found with new tests |
| * hun#3084340 allow underscores in html entity names |
| |
| 2010-10-07 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: |
| - hun#2970239 fix bad suggestion of forbidden compound words |
| - hun#2999224 fix keepcase feature on compound words (only partial |
| fix for COMPOUNDRULE based compounding) |
| - fix checkcompoundrep feature in compound words (check all boundaries, |
| not only the last one) |
| Problems reported by Ruud Baars. |
| |
| * tests/opentaal_forbiddenword[12]*, tests/opentaal_keepcase*: |
| new test files for the previous fixes |
| * tests/checkcompoundrep: extended test file. |
| |
| 2010-09-05 Caolán McNamara <cmc at OOo>: |
| * moz#583582 fix double buffer gcc fortify issue |
| |
| 2010-08-13 Caolán McNamara <cmc at OOo>: |
| * moz#586671 AffixMgr::parse_convtable leaks pattern/pattern2 if it |
| can't create both |
| * moz#586686 tidy up get_xml_list and friends |
| |
| 2010-08-10 Caolán McNamara <cmc at OOo>: |
| * hun#3022860 fix remove duplicate code |
| |
| 2010-07-17 Caolán McNamara <cmc at OOo>: |
| * remove unused get_default_enc and avoid potential misrecognition of |
| three letter language ids |
| * normalize encoding names before lookup |
| |
| 2010-07-05 Caolán McNamara <cmc at OOo>: |
| * hun#2286060 add Hangul syllables to unicode tables |
| |
| 2010-06-26 Caolán McNamara <cmc at OOo>: |
| * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz |
| case |
| |
| 2010-06-13 Caolán McNamara <cmc at OOo>: |
| * moz#571728 keep new[]/delete[] wrappers in sync for embedded in moz |
| case |
| |
| 2010-06-02 Caolán McNamara <cmc at OOo>: |
| * moz#569611 compile cleanly under win64 |
| |
| 2010-05-22 Caolán McNamara <cmc at OOo>: |
| * moz#525581 apply mozilla's current preferred get_current_cs impl |
| |
| 2010-05-17 Németh László <nemeth at OOo>: |
| * affixmgr.cxx: fix bad limitation of parenthesized flags at |
| COMPOUNDRULEs. Windows crash reported by Ruud Baars and Simon Brouwer. |
| |
| 2010-05-05 Caolán McNamara <cmc at OOo>: |
| * rhbz#589326 malloc of int that should have been of char** |
| * hun#2997388 fix ironic misspellings |
| |
| 2010-04-28 Caolán McNamara <cmc at OOo>: |
| * moz#550942 get_xml_list doesn't handle failure from get_xml_par |
| |
| 2010-04-27 Caolán McNamara <cmc at OOo>: |
| * moz#465612 mozilla-specific code leaks |
| * moz#430900 phone is dereferenced before oom check |
| * moz#418348 ckey_utf alloc is used unchecked in SuggestMgr::badcharkey_utf |
| * CID#1487 pointer "rl" dereferenced before NULL check |
| * CID#1464 Returned without freeing storage "ptr" |
| * CID#1459 Avoid duplicate strchr |
| * CID#1443 Avoid any chance of dereferencing *slst |
| * CID#1442 Unsafe to have a null morph |
| * CID#1440 Avoid null filenames |
| * CID#1302 Dereferencing NULL value "apostrophe" |
| * CID#1441 Avoid dereferencing null ppfx |
| |
| 2010-04-16 Caolán McNamara <cmc at OOo>: |
| * hun#2344123 fix U)ncap in utf-8 locale |
| * fix up hunspell text UI and lines wider than terminal |
| |
| 2010-04-15 Caolán McNamara <cmc at OOo>: |
| * hun#2613701 fix small leak in FileMgr::FileMgr |
| * fix small leak in tools/hunspell |
| * hun#2871300 avoid crash if def and words are NULL |
| * hun#2904479 fix length of hzip file |
| * hun#2986756 mingw build fix |
| * hun#2986756 fix double-free |
| * hun#2059896 fix crash in interactive mode without nls |
| * hun#2917914 add some extra words to the latexparser |
| * make some structs static |
| * C-api has duped symbol names |
| * regenerate gettext/intl with recent version |
| * hun#2796772 build a .dll under MinGW |
| * rhbz#502387 allow cross-compiling for MinGW target |
| * hun#2467643 update .vcproj files to include replist.?xx |
| * unify visibility/dll_export support across platforms |
| * hun#2831289 sizeof(short) typo |
| * hun#2986756 add -u3 gcc style output |
| |
| 2010-04-14 Caolán McNamara <cmc at OOo>: |
| * hun#2813804 fix segfault on hu_HU stemming |
| |
| 2010-04-13 Caolán McNamara <cmc at OOo>: |
| * hun#2806689 fix ironic misspellings |
| * hun#2836240 add Italian translations |
| |
| 2010-04-09 Caolán McNamara <cmc at OOo>: |
| * fix titchy possible leak in command-line spellchecker |
| |
| 2010-04-07 Caolán McNamara <cmc at OOo>: |
| * hun#2973827 apply win64 patch |
| * hun#2005643 fix broken mystrdup |
| |
| 2010-03-04 Caolán McNamara <cmc at OOo>: |
| * ooo#107768 fix crash in long strings in spellml mode |
| * hun#1999737 add some malloc checks |
| * hun#1999769 drop old buffer on realloc failure |
| * hun#2005643 tidy string functions |
| * hun#2005643 micro-opt |
| * hun#2006077 free strings on failed dict parse |
| * hun#2110783 ispell-alike verbose mode implementation |
| |
| 2010-03-03 Németh László <nemeth at OOo>: |
| * hunspell/(affixmgr, suggestmgr).cxx: add character sequence |
| support for MAP suggestion, using parenthesized character groups |
| in the syntax, eg. MAP ß(ss). |
| * man/hunspell.4, tests/map*: documentation and test files |
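The parenthesized syntax of the MAP entry above (eg. MAP ß(ss)) treats a group in parentheses as one unit next to single characters. A minimal sketch of such a split, assuming UTF-8 input, is given here; parse_map_units is a hypothetical name and the code is not the parser used in affixmgr.cxx.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split the argument of a MAP line into units.
// A parenthesized group is one unit; otherwise one UTF-8 code point is.
std::vector<std::string> parse_map_units(const std::string& spec) {
    std::vector<std::string> units;
    std::size_t i = 0;
    while (i < spec.size()) {
        if (spec[i] == '(') {
            std::size_t close = spec.find(')', i + 1);
            if (close == std::string::npos) break;  // unbalanced group: stop
            units.push_back(spec.substr(i + 1, close - i - 1));
            i = close + 1;
        } else {
            // Take the lead byte and any following continuation bytes.
            std::size_t len = 1;
            while (i + len < spec.size() &&
                   (static_cast<unsigned char>(spec[i + len]) & 0xC0) == 0x80) {
                ++len;
            }
            units.push_back(spec.substr(i, len));
            i += len;
        }
    }
    return units;
}
```

For the UTF-8 encoded argument "ß(ss)" this returns the two units "ß" and "ss", which a MAP based suggestion could then treat as interchangeable.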
| |
| 2010-02-25 Németh László <nemeth at OOo>: |
| * hunspell/hunspell.cxx: add recursion limit for BREAK (fix OOo Issue 106267) |
| |
| * hunspell/hunspell.cxx: fix crash in morphological analysis of |
| capitalized words with ending dashes |
| |
| * affixmgr.cxx: fix morphological analysis of long numbers combined with dash, |
| eg. 45-00000045 (reported by a@freeblog.hu). |
| |
| 2010-02-23 Caolán McNamara <cmc at OOo>: |
| * hun#2314461 improve ispell-alike mode |
| * hun#2784983 improve default language detection |
| * hun#2812045 fix some compiler warnings |
| * hun#2910695 survive missing HOME dir |
| * hun#2934195 fix suggestmgr crash |
| * hun#2921129 remove unused variables |
| * hun#2826164 make sure make check uses the in-tree libhunspell |
| * bump toolchain to support --disable-rpath |
| * hun#2843984 fix coverity warning |
| * hun#2843986 fix coverity warning |
| * hun#2077630 add iconv lib |
| * make gcc strict-aliasing warning free |
| * make cppcheck warning free |
| |
| 2008-11-01 Németh László <nemeth at OOo>: |
| * replist.*, hunspell.cxx, affixmgr.cxx: new input and output |
| conversion support, see ICONV and OCONV keywords in the Hunspell(4) |
| manual page and the test examples. The input/output conversion |
| problem of syllabic languages reported by Daniel Yacob and |
| Shewangizaw Gulilat. |
| - tests/{iconv,oconv}.*: test examples |
| |
| * tools/wordforms: word generation script for dictionary developers |
| (Hunspell version of the unmunch program) |
| |
| * hunspell/hunspell.cxx: extended BREAK feature: ^ and $ mean in break |
| patterns the beginning and end of the word. |
| - tests/BREAK.*: modified examples. |
| |
| * hunspell/hunspell.cxx: set default break at hyphen characters. |
| The associated problem reported by S Page in Hunspell Bug 2174061. |
| See Mozilla Bug ID 355178 and OOo Issue 64400, too. |
| - tests/breakdefault.*: test data |
| The following definition is equivalent to the default word break: |
| |
| BREAK 3 |
| BREAK - |
| BREAK ^- |
| BREAK -$ |
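The effect of the three default break patterns can be sketched as a recursive check: a word is accepted if it is in the dictionary, if dropping a leading or trailing hyphen gives an accepted word, or if it can be split at an inner hyphen into accepted parts. This is only an illustration under those assumptions; accepts and kMaxDepth are hypothetical names, and the depth limit of 10 is an arbitrary value for the example.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical sketch (not Hunspell's implementation) of the default
// word break "BREAK -", "BREAK ^-", "BREAK -$".
// kMaxDepth is an arbitrary recursion limit chosen for this example.
const int kMaxDepth = 10;

bool accepts(const std::set<std::string>& dict, const std::string& word,
             int depth = 0) {
    if (dict.count(word) > 0) return true;
    if (depth >= kMaxDepth || word.size() < 2) return false;
    // "^-": a hyphen at the beginning of the word may be dropped.
    if (word.front() == '-' && accepts(dict, word.substr(1), depth + 1)) {
        return true;
    }
    // "-$": a hyphen at the end of the word may be dropped.
    if (word.back() == '-' &&
        accepts(dict, word.substr(0, word.size() - 1), depth + 1)) {
        return true;
    }
    // "-": split at an inner hyphen; both parts must be accepted.
    for (std::size_t pos = word.find('-', 1);
         pos != std::string::npos && pos + 1 < word.size();
         pos = word.find('-', pos + 1)) {
        if (accepts(dict, word.substr(0, pos), depth + 1) &&
            accepts(dict, word.substr(pos + 1), depth + 1)) {
            return true;
        }
    }
    return false;
}
```

With a dictionary containing "foo" and "bar", the sketch accepts "foo-bar", "-foo" and "foo-", and rejects "foo-baz" and "foobar".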
| |
| * affixmgr.cxx: SIMPLIFIEDTRIPLE is a new affix file keyword to allow |
| simplified forms of the compound words with triple repeating letters. |
| It is useful for Swedish and Norwegian languages. |
| |
| * affixmgr.cxx: extend CHECKCOMPOUNDPATTERN to support |
| alternations at compound boundaries, for example the sandhi |
| feature of Indian and other languages. The problem was reported |
| by Kiran Chittella in connection with the Telugu writing system |
| (see Telugu example in tests/checkcompoundpattern4.test). |
| The new optional field of CHECKCOMPOUNDPATTERN definition is the |
| replacement of the compound boundary defined by the previous fields: |
| CHECKCOMPOUNDPATTERN ff f ff |
| means ff|f compound boundary has been replaced by "ff", like in |
| the (prereform) German Schiffahrt (Schiff+fahrt). |
| - CHECKCOMPOUNDPATTERN supports also optional flag conditions now: |
| CHECKCOMPOUNDPATTERN ff/A f/B ff |
| means that the first word of the compound needs flag "A" and |
| the second word of the compound needs flag "B" for the operation. |
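The replacement field can be illustrated by undoing it: wherever the replacement occurs in the written compound, put the original boundary back and look up both parts. The sketch below does only that, without flag conditions; BoundaryPattern and split_compound are hypothetical names and do not reflect the data structures of affixmgr.cxx.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for one "CHECKCOMPOUNDPATTERN end begin replacement" line.
struct BoundaryPattern {
    std::string end;          // end of the first word, e.g. "ff"
    std::string begin;        // beginning of the second word, e.g. "f"
    std::string replacement;  // form of the boundary in the compound, e.g. "ff"
};

// Try every occurrence of the replacement as the compound boundary and
// keep the splits whose restored parts are both dictionary words.
std::vector<std::pair<std::string, std::string>> split_compound(
    const std::string& surface, const BoundaryPattern& pat,
    const std::set<std::string>& dict) {
    std::vector<std::pair<std::string, std::string>> result;
    if (pat.replacement.empty()) return result;
    for (std::size_t pos = surface.find(pat.replacement);
         pos != std::string::npos;
         pos = surface.find(pat.replacement, pos + 1)) {
        std::string first = surface.substr(0, pos) + pat.end;
        std::string second =
            pat.begin + surface.substr(pos + pat.replacement.size());
        if (dict.count(first) > 0 && dict.count(second) > 0) {
            result.emplace_back(first, second);
        }
    }
    return result;
}
```

For the pattern ff f ff and a dictionary with "Schiff" and "fahrt", the written form "Schiffahrt" is split back into Schiff + fahrt.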
| |
| * tools/hunspell.cxx: add empty lines as separators to the output of |
| the stemming and morphological analysis. |
| |
| * affixmgr.cxx: fix condition checking algorithm. Bad suggestion |
| generation reported by Mehmet Akin in SF.net Bug 2124186 with help of |
| Eleonora Goldman. |
| |
| * affixmgr.cxx: fix COMPOUNDWORDMAX feature. The problem and its |
| code details reported by Göran Andersson under SF.net Bug ID 2138001. |
| |
| * csutil.cxx: fix bad conditional code for Mozilla compilation. |
| Patch by Serge Gautherie. The problem reported by Ryan VanderMeulen. |
| |
| * hunspell/hunspell.cxx: add missing ngram suggestion for HUHINITCAP |
| (capitalized mixed case) words. |
| |
| * w_char.hxx: use GCC conditions for GCC related code. Patch by |
| Ryan VanderMeulen. |
| |
| * affixmgr.cxx: check morphological description in morphgen() |
| (fix potential program fault by incomplete morphological |
| description of affix rules) |
| |
| * src/win_api: config.h: switch on warning messages on Windows |
| |
| * tools/affixcompress: extended help for -h (use LC_ALL=C sort |
| for input word list) |
| |
| * man/hunspell.4: updated manual: |
| - new and modified features (SIMPLIFIEDTRIPLE, ICONV, OCONV, |
| BREAK, CHECKCOMPOUNDPATTERN). |
| - note about costs of zero affixes, suggested by Olivier Ronez. |
| |
| * hunspell/hunspell.cxx: remove deprecated word breaking codes. |
| |
| 2008-08-15 Németh László <nemeth at OOo>: |
| * affentry.cxx: add FULLSTRIP option. With FULLSTRIP, affix rules can |
| strip full words, not only all but one of their characters. Suggested by |
| Davide Prina and other developers in OOo Issue 80145. |
| * tests/fullstrip.*: Test data based on Davide Prina's example. |
| * tools/unmunch.cxx: modified for FULLSTRIP. |
| |
| * affixmgr.cxx: COMPOUNDRULE now works with long and numerical flag |
| types by parenthesized flags. Syntax: (flag)*, (flag)(flag)?(flag)*. |
| * tests/compoundrule[78].*: tests with parenthesized COMPOUNDRULE |
| definitions. |
| |
| * suggestmgr.cxx: modified badchar*(), forgotchar*() and extrachar*() |
| 1-character distance suggestion algorithms: search a TRY character |
| in all positions instead of all TRY characters in a character position |
| (it can give more readable suggestion order, also better suggestions |
| in the first positions, when TRY characters are sorted by frequency.) |
| For example, suggestions for "moze": |
| ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6), |
| maze, more, mote, ooze, mole etc. (Hunspell 1.2.7). |
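The change of loop order described in this entry can be shown with a reduced sketch of a one-character replacement search. It is not the code of suggestmgr.cxx: by_position and by_try_char are hypothetical names, the words are treated as ASCII, and positions are visited from left to right, which is an assumption of the example.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Older order: for each position, try every TRY character.
std::vector<std::string> by_position(const std::string& word,
                                     const std::string& try_chars,
                                     const std::set<std::string>& dict) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        for (char c : try_chars) {
            if (c == word[i]) continue;  // same letter: nothing to replace
            std::string cand = word;
            cand[i] = c;
            if (dict.count(cand) > 0) out.push_back(cand);
        }
    }
    return out;
}

// Newer order: for each TRY character, try every position.
std::vector<std::string> by_try_char(const std::string& word,
                                     const std::string& try_chars,
                                     const std::set<std::string>& dict) {
    std::vector<std::string> out;
    for (char c : try_chars) {
        for (std::size_t i = 0; i < word.size(); ++i) {
            if (c == word[i]) continue;  // same letter: nothing to replace
            std::string cand = word;
            cand[i] = c;
            if (dict.count(cand) > 0) out.push_back(cand);
        }
    }
    return out;
}
```

With the TRY string "ard" and the dictionary words "maze", "more" and "doze", the input "moze" gives doze, maze, more in the older order and maze, more, doze in the newer one, so suggestions built from the most frequent TRY characters come first.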
| |
| * suggestmgr.cxx: extended compound word checking for better COMPOUNDRULE |
| related suggestions, for example English ordinal numbers: 121323th -> |
| 121323rd (it needs also a th->rd REP definition). |
| |
| * phonet.cxx: cast unsigned char parameter of isdigit() and fix |
| isalpha by myisalpha() (potential problems in Windows environment). |
| Reported by Thomas Lange in OOo Issue 92736. |
| |
| * hunspell/csutil.*,hunspell/{affentry,affixmgr,hunspell,suggestmgr}.cxx: |
| fix potential buffer overflow during morphological analysis by the |
| new mystrcat() function. Reported by Molnár Andor (dolhpy at true |
| dot hu) in SF.net Bug 2026203. |
| |
| * affixmgr.cxx: add recursion limit to defcpd(). Fix OOo Issue 76067: |
| crash-like slowdown when checking hexadecimal numbers with long FFF |
| sequence (combinatorial explosion by the en_US words "f" and "ff"). |
| Missing fix reported by Mathias Bauer. |
| |
| * affixmgr.cxx: fix the difference in the Unicode and non-Unicode |
| parts of cpdcase_check(). Bug report by Brett Wilson. |
| |
| * filemgr.*, affixmgr.cxx, csutil.*, hashmgr.*: warning messages now |
| contain line numbers (use --with-warnings configure option for |
| warning messages). |
| |
| * hunspell.cxx: analyze(): fix case conversion of stemming and |
| morphological analysis of UTF-8 encoded input. Reported by Ferenc Godó. |
| |
| * tools/hunspell.cxx: fix LaTeX Unicode support in filter mode. |
| Reported by Jan Seeger in SF.net Bug 2039990. |
| |
| * affixmgr.hxx: save 0.5 MB (in 64 bit environment, 1 MB) of (virtual) |
| memory by using only the requested size for sFlag and pFlag arrays. |
| Bug report by Brett Wilson. |
| |
| * affixmgr.cxx,tools/hunspell.cxx: get_version() returns the full |
| VERSION affix parameter instead of its first word. Fixes for |
| Hunspell's header. Some problems with Hunspell header reported in |
| SF.net Bug 2043080. |
| |
| 2008-07-15 Németh László <nemeth at OOo>: |
| * affentry.cxx: fixes of the affix rule matching algorithm (affected |
| only the sk_SK dictionary from all OpenOffice.org dictionaries): |
| - fix dot pattern + accented letters matching (in non Unicode encoding) |
| - word-length conditions work again |
| * tests/condition.*: extended test for the fix. |
| |
| * hashmgr.cxx: load multiword expressions: spaces may be parts |
| of the dictionary words again (but spaces also work as morphological |
| field separators: word word2 -> "word word2", word po:noun -> "word"). |
| * man/hunspell.4: updated manual |
| |
| * tools/hunspell.cxx: add iconv character conversion support to |
| stemming and morphological analysis |
| |
| * tools/hunspell.cxx: add /usr/share/myspell/dicts search path for |
| Ubuntu support |
| |
| 2008-07-09 Németh László <nemeth at OOo>: |
| * affentry.cxx: fixes of the affix rule matching algorithm: |
| - right ASCII character handling in bracket expression; |
| - fault-tolerant nextchar() for bad rules. |
| Problem with the en_GB dictionary and nextchar() with a detailed |
| code analysis reported by John Winters in SF.net Bug ID 2012753. |
| * tests/condition.*: extended test for the fix. |
| |
| * hunspell/hunspell.*, parsers/*, tools/hunspell.cxx: fix compiler |
| warnings (deprecated const-free char consts) |
| |
| * win_api/hunspelldll.*: add hunspell_free_list(), the problem |
| reported by Laurie Mercer. |
| |
| 2008-06-30 Török László <torok_laszlo at users dot SF dot net>: |
| * tests/affixmgr.cxx: fix morphological analysis: strcat() on |
| an uninitialized char array in suffix_check_morph(). |
| |
| 2008-06-18 Németh László <nemeth at OOo>: |
| * src/hunspell/affixmgr.cxx: fix GCC compiler warnings |
| (comparisons with string literal results in unspecified behaviour). |
| The problem reported by Ladislav Michnovič. |
| |
| 2008-06-17 Németh László <nemeth at OOo>: |
| * src/hunspell/{hunspell.cxx,hunspell.h}: add free_list() to the C and |
| C++ interface to deallocate suggestion lists. The problem |
| reported by Laurie Mercer and Christophe Paris. |
| * csutil.cxx: fix freelist() to deallocate non-NULL list, when n = 0. |
| * tools/{analyze,example,chmorph,hunspell}.cxx: use free_list(). |
| |
| * tools/hunspell.cxx: fix compiling problem with only --with-readline. |
| Reported by Volkov Peter in SF.net Bug 1995842. |
| |
| * man/hunspell.3,hunspell.hxx: fix analyze and generate examples in |
| the manual and comments (using char*** parameter instead of char**). |
| |
| * tools/example.cxx: fix suggestion example. |
| |
| 2008-06-17 Németh László <nemeth at OOo>: |
| * affentry.cxx: fix the new affix rule matching algorithm of |
| Hunspell 1.2. Arabic dictionary problem reported by Khaled Hosny |
| in SF.net Bug ID 1975530. Mohamed Kebdani also sent |
| prepared test data. |
| * tests/{1975530,condition*}: tests for the fix |
| |
| 2008-06-13 Ingo H. de Boer <idb_winshell at SF.net>: |
| * src/hunspell/{affixmgr.cxx,hunspell.cxx}: add missing type |
| cast to strstr() calls for VC8 compatibility. |
| |
| 2008-06-13 Németh László <nemeth at OOo>: |
| * suggestmgr.cxx: add also part1-part2 suggestion with dash |
| for bad part1part2 word forms, suggested by Ruud Baars. |
| For example, the suggestions for "parttime" are now "part time" |
| and "part-time". |
| NOTE: this feature will work only when the TRY definition |
| contains "-" or the letter "a". |
| |
| * hunspell.cxx: new XML API in spell() and suggest() (see hunspell(3)). |
| |
| * src/hunspell/*: fixes for OpenOffice.org build environment. |
| |
| * man/{hunspell.3,hzip.1,hunzip.1}: add new manual pages for |
| Hunspell programming API and dictionary compression and |
| encryption utilities. |
| |
| * src/hunspell/*: handle failed mystrdup() calls and other potential |
| insufficient memory problems. The problem reported by Elio Voci |
| in OpenOffice.org Issue 90604 and others. |
| |
| * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars |
| without conditional code. Problem reported by Ingo H. de Boer |
| in SF.net Bug 1763105. |
| |
| * win_api/hunspelldll.h: put_word() renamed to add() in the (old) |
| Windows DLL API, bug reported in SF.net Bug 1943236. Also reported |
| by Bartkó Zoltán. |
| |
| * tools/hunspell.cxx: fix chench() for environments without |
| native language support (ENABLE_NLS 0 in config.h), |
| PHP system_exec() bug reported by Michel Weimerskirch in |
| SF.net Bug 1951087. |
| |
| * hunspell.cxx, affixmgr.cxx: remove "result" from the |
| (result && *result) conditions, when "result" is a static variable. |
| The problem and a possible solution reported by Ladislav Michnovič. |
| |
| * affixmgr.cxx: parse_affix(): print line instead of NULL in |
| the warning message, when affix class header is bad. |
| The problem reported by Ladislav Michnovič. |
| |
| 2008-06-01 Christian Lohmaier <cloph at OOo> |
| * configure.ac: patch to fix --with-readline, --with-ui logic. |
| Reported in the SF.net Bug 981395. |
| |
| 2008-05-04: Volkov Peter <volkov_peter at users sourceforge net> |
| * configure.ac: fix LibTool 2.22 incompatibility by removing |
| unused LT_* macros. Report and patch in SF.net Bug 1957383. |
| The problem reported and fixed by Ladislav Michnovič, too. |
| |
| 2008-04-23: Ladislav Michnovič <lmichnovic at suse cz> |
| * hunspell.pc.in: fix wrongly set directories. |
| |
| 2008-04-12 Németh László <nemeth at OOo>: |
| * src/tools/hunspell.cxx: |
| - Multilingual spell checking and special dictionary support with -d. |
| Multilingual spell checking suggested by Khaled Hosny (SF.net |
| Bug 1834280). Example for the new syntax: |
| |
| -d en_US,en_geo,en_med,de_DE,de_med |
| |
| en_US and de_DE are base dictionaries, and en_geo, en_med, de_med |
| are special dictionaries (dictionaries without affix file). |
| Special dictionaries are optional extension of the base dictionaries. |
| There is no explicit naming convention for special dictionaries, |
| only the ".dic" extension: dictionaries without affix file will |
| be an extension of the preceding base dictionary. First dictionary |
| in -d parameter must have an affix file (it must be a base |
| dictionary). |
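The grouping rule described above (a dictionary without an affix file extends the preceding base dictionary) can be sketched as follows. The sketch is not the code of tools/hunspell.cxx: DictGroup and group_dictionaries are hypothetical names, and the set has_affix stands in for a real check whether an affix file exists.

```cpp
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: one base dictionary and its extensions.
struct DictGroup {
    std::string base;
    std::vector<std::string> extensions;
};

// Split the argument of -d at commas and attach every dictionary without
// an affix file to the preceding base dictionary.
// has_affix replaces a file system lookup (assumption of this sketch).
std::vector<DictGroup> group_dictionaries(
    const std::string& arg, const std::set<std::string>& has_affix) {
    std::vector<DictGroup> groups;
    std::istringstream in(arg);
    std::string name;
    while (std::getline(in, name, ',')) {
        if (name.empty()) continue;
        if (has_affix.count(name) > 0) {
            groups.push_back(DictGroup{name, {}});
        } else if (groups.empty()) {
            throw std::invalid_argument(
                "first dictionary must have an affix file");
        } else {
            groups.back().extensions.push_back(name);
        }
    }
    return groups;
}
```

For "en_US,en_geo,en_med,de_DE,de_med" with affix files for en_US and de_DE, this yields two groups: en_US with en_geo and en_med, and de_DE with de_med.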
| |
| - new options for debugging, morphological analysis and stemming: |
| -m: morphological analysis or flag debug mode (without affix |
| rule data it shows the flags of the affix rules) |
| -s: stemming mode |
| -D: show also available dictionaries and search path |
| (suggested by Aaron Digulla in SF.net Bug 1902133) |
| |
| - add missing refresh() to print bad words before the slower suggestion |
| search in UI (better user experience) |
| |
| - fix tabulator problems (reported by ugli-kid-joe AT sf DOT net) |
| |
| - fix different encoding of dic and input, and suggestions |
| |
| - add per mille sign to LANG hu_HU section. |
| |
| - rewrite program messages. Concatenating multiple printfs for |
| easier translation suggested by András Tímár and Gábor Kelemen. |
| |
| * src/hunspell/csutil.cxx: set static encds variable. Patch by |
| Rene Engelhard. SF.net Bug 1896207 and 1939988. |
| |
| * src/hunspell/w_char.hxx,csutil.hxx: reorganizing |
| w_char typedef and HENTRY_DATA, HENTRY_FIND consts |
| |
| * src/hunspell/hunzip.cxx: fopen(): using rb options instead of r (fix |
| for Windows) |
| |
| * src/tools/affixmgr.cxx: restore original behaviour of get_wordchars |
| in an #ifdef WINSHELL section. Problem reported by Ingo H. de Boer |
| in SF.net Bug 1763105. |
| |
| * src/tools/chmorph.cxx: remove the experimental modifications |
| |
| * src/tools/hzip.c: fopen(): using wb options instead of w (fix |
| for Windows) |
| |
| * src/tools/hunzip.cxx: add missing MOZILLA_CLIENT. Reported |
| by Ryan VanderMeulen. |
| |
| * man/*, man/hu/*: updated manual |
| |
| * man/hunspell.4: fix formatting problem (missing header) |
| |
| * tools/makealias: now works with the extra data fields. |
| |
| * phonet.cxx: use HASHSIZE const |
| |
| * tests/rep.aff: fix REP count |
| |
| * src/win_api/Makefile.cygwin, README: native Windows compilation |
| in Cygwin environment without cygwin1.dll dependency (see README |
| for compiling instructions). |
| |
| 2008-04-08 Roland Smith <rsmith AT xs4all DOT nl>: |
| * src/parsers/latexparser.cxx: fix PATTERN_LEN for AMD64 and |
| other platforms with different struct padding (SF.net Bug 1937995). |
| |
| 2008-04-03 Kelemen Gábor <kelemeng AT gnome DOT hu>: |
| * po/POTFILES.in: fix path of the source file |
| |
| * po/Makevars: add --from-code=UTF-8 gettext option |
| |
| * hunspell.cxx: add comments for shortkey translation |
| |
| 2008-02-04 Flemming Frandsen <flfr AT stibo DOT com> |
| * src/hunspell.h: fix Windows DLL support |
| - this patch also reported by Zoltán Bartkó. |
| |
| 2008-01-30 Mark McClain <marc_mcclain AT users DOT sf DOT net> |
| * src/hunspell.cxx: stem(): fix function call side effect |
| for PPC platform (SF.net Bug 1882105). |
| |
| 2008-01-30 Németh László <nemeth at OOo>: |
| * hunspell.cxx, csutil.cxx, hunspelldll.c: fix |
| SF.net Bug 1851246, patch also by Ingo H. de Boer. |
| |
| * hunspell.h: fix SF.net Bug 1856572 (C prototype problem), |
| patch by Mark de Does. |
| |
| * hunspell.pc.in: fix SF.net Bug 1857450 wrong prefix, reported |
| by Mark de Does. |
| |
| * hunspell.pc.in: reset numbering scheme: libhunspell-1.2. |
| Fix SF.net Bug 1857512 reported by Mark de Does, |
| also by Rene Engelhard. |
| |
| * csutil.cxx: patches for ARM platform, signed_chars.dpatch |
| by Rene Engelhard and arm_structure_alignment.dpatch by |
| Steinar H. Gunderson <sesse@debian.org> |
| |
| * hunzip.*, hzip.c: new hzip compression format |
| |
| * tools/affixcompressor: affix compressor utility (similar to |
| munch, but it generates affix table automatically), works |
| with million-words dictionaries of agglutinative languages. |
| |
| * README: fix problems reported by Pham Ngoc Khanh. |
| |
| * csutil.cxx, suggestmgr: Warning-free in OOo builds. |
| |
| * hashmgr.*, csutil.*: fix protected memory problems with |
| stored pointers on several non-x86 platforms by |
| store_pointer(), get_stored_pointer(). |
| |
| * src/tools/hunspell.cxx: fix iconv support on Solaris platform. |
| |
| * tests/IJ.good: add missing test file |
| |
| * csutil.cxx: fix const char* related errors. Compiling bug |
| with Visual C++ reported by Ryan VanderMeulen and Ingo H. de Boer. |
| |
| 2008-01-03 Caolan McNamara <cmc at OO.o>: |
| * csutil.cxx: SF.net Bug 1863239, notrailingcomma patch and |
| optimization of get_current_cs(). |
| |
| 2007-11-01 Németh László <nemeth at OOo>: |
| * hunspell/*: new feature: morphological generation, |
| also fix experimental morphological analysis and stemming. |
| - new API functions and improved API: |
| - analyze(word): (instead of morph()) morphological analysis |
| - stem(word): stemming |
| - stem(list): stemming based on the result of an analysis |
| - generate(word, word2): morphological generation |
| - generate(word, list): morphological generation |
| - add(word): add word to the run-time dictionary (renamed put_word()) |
| - add_with_affix(word, word2): (renamed put_word_pattern()): |
| add word to the run-time dictionary with affix flags of the |
| second parameter: all affixed forms of the user words will be |
| recognised by the spell checker. Especially useful for |
| agglutinative languages. |
| - remove(word): remove word from the run-time dictionary (not |
| implemented) |
| - see manual and hunspell/hunspell.hxx header and tests/morph.* |
| * tests/morph.*: test data, example for morphological analysis, |
| stemming and generation |
| |
| * tools/analyze, tools/chmorph: extended and new demo applications: |
| - analyze (originally hunmorph): analyses and stems input words, |
| generates word forms from input word pairs. |
| - chmorph: morphological transformation filter |
| |
| * configure.ac, hunspell/makefile.am: set library version number. |
| Bug reported by Rene Engelhard. |
| |
| * affentry.cxx, affixmgr.cxx: new pattern matching algorithm in |
| condition checking of affix rules instead of the Dömölki-algorithm: |
| - Unlimited condition length (instead of max. 8 characters). |
| - Less memory consumption, especially useful for affix rich languages: |
| 5.4 MB memory savings with hu_HU dictionary. |
| - Speed change depends on dictionaries and CPU caches: English spell |
| checking is 4% faster on Linux words with en_US dictionary, Hungarian |
| spell checking is 25% slower on most frequent words of Hungarian |
| Webcorpus. |
| |
| * tests/sug.*, sugutf.*: updated test data (use "a" and "lot" |
| dictionary items instead of "a lot".) |
| |
| * src/hunspell/hunspell.cxx: free(csconv) instead of delete csconv. |
| Report and patch by Sylvain Paschein in Mozilla Issue 398268. |
| |
| * suggestmgr.cxx, tools/hunspell.cxx: bad spelling of "misspelled". |
| Ubuntu Bug #134792, patch by Malcolm Parsons. |
| |
| * tests/base_utf.*: use Unicode apostrophe instead of 8-bit one. |
| |
| * hunspell.cxx, hashmgr.cxx: add(): use HashMgr::add() |
| |
| 2007-10-25 Pavel Janík <pjanik at OOo>: |
| * hunspell/csutil.cxx: Fix type cast warnings on 64bit Linux in |
| printing of character positions in u8_u16(). OOo issue 82984. |
| |
| 2007-09-05 Németh László <nemeth at OOo>: |
| * win_api/Hunspell.vproj, parsers/testparser.cxx,textparser.hxx: |
| warning fixes and removing unnecessary Windows project file. |
| Reported by Ingo H. de Boer. |
| |
| * hashmgr.*, {affixmgr,suggestmgr}.cxx: optimized data structure |
| for variable-count fields (only "ph" transliteration field in |
| this version, see next item). Also less memory consumption: |
| -13% (0.75 MB) with en_US dictionary, -6% (1 MB) with hu_HU. |
| |
| * suggestmgr.cxx: dictionary based phonetic suggestion for special |
| or foreign pronunciation (see also rule-based PHONE in manual). |
| Usage: tab separated field in dictionary lines, started with "ph:". |
| The field contains a phonetic transliteration of the word: |
| |
| Marseille ph:maarsayl |
| * tests/phone.*: test data for dictionary and rule based phonetic |
| suggestion. |
| |
| * hunspell.cxx: fix potential bad memory access in allcap word |
| capitalization in suggest() (bug of previous version). |
| |
| * hunspell.cxx, atypes.hxx: set correct limit for UTF-8 encoded |
| input words (256 byte). |
| |
| * suggestmgr.cxx: improved REP suggestions with spaces: it works |
| without dictionary modification. |
| OOo issue 80147, reported by Davide Prina. |
| * tests/rep.*: new test data: higher priority for "alot" -> "a lot", |
| and Italian suggestion "un'alunno" -> "un alunno". |
| |
| * affixmgr.cxx: fix Unicode ngram suggestions in expand_rootword(). |
| (Suggestions with bad affixes.) |
| Bug reported by Vitaly Piryatinksy <piv dot v dot vitaly at gmail>. |
| * tests/ngram_utf_fix.*: test based on Vitaly Piryatinksy's data. |
| |
| * suggestmgr.cxx: fix twowords() for last UTF-8 multibyte character. |
| (conditional jump or move depended on uninitialised value). |
| |
| 2007-08-29 Ingo H. de Boer <idb_winshell at SF.net>: |
| * win_api/{hunspell,libhunspell,testparser}.vcproj: new project |
| files for the library and the executables. |
| |
| * Hunspell.rc, Hunspell.sln, config.h: updated versions. |
| Version number problem also reported by András Tímár. |
| |
| 2007-08-27 Németh László <nemeth at OOo>: |
| * suggestmgr.hxx: put fixed version. Bug report by Ingo H. de Boer. |
| |
| * suggestmgr.cxx: remove variable-length local character array |
| reported by Ingo H. de Boer. |
| |
| 2007-08-27 Németh László <nemeth at OOo>: |
| * suggestmgr.hxx: change bad time_t to clock_t in header, too. |
| Bug reports or patches by Ingo H. de Boer under SF.net |
| Bug ID 1781951, János Mohácsi and Gábor Zahemszky, András Tímár, |
| OMax3 at SF.net under SF.net Bug ID 1781592. |
| |
| * phonet.*: change variable-length local character array to |
| portable fixed size character array. Problem reported by |
| Ingo H. de Boer under SF.net Bug ID 1781951 and |
| Ryan VanderMeulen. |
| |
| * suggestmgr.cxx: remove debug message (also by |
| Ingo H. de Boer). |
| |
| 2007-08-26 Ingo H. de Boer <idb_winshell at SF.net>: |
| * win_api/Hunspell.vcproj: updated version (with phonet.*) |
| |
| 2007-08-23 Németh László <nemeth at OOo>: |
| * phonet.{c,h}xx, suggestmgr.cxx: PHONE parameter: |
| pronunciation based suggestion using Björn Jacke's original Aspell |
| phonetic transcription algorithm (http://aspell.net), relicensed |
| under GPL/LGPL/MPL tri-license with the permission of the author. |
| Usage: see manual. |
| |
| * affixmgr,suggestmgr.cxx: add KEY parameter for keyboard and |
| input method error related suggestions. |
| Example: KEY qwertyuiop|asdfghjkl|zxcvbnm |
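
The KEY string lists keyboard rows separated by '|'. A minimal sketch of how
such a string can drive suggestions, assuming that each character is replaced
by its left or right neighbour in the same row. The helper name
key_neighbor_candidates is hypothetical and this is not the Hunspell
implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for each character of `word`, try its left and right
// neighbours in the KEY string (rows are separated by '|').
std::vector<std::string> key_neighbor_candidates(const std::string& word,
                                                 const std::string& key) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < word.size(); ++i) {
    if (word[i] == '|') continue;  // the separator is not a key
    // A character may occur more than once in a KEY string; visit each hit.
    for (std::size_t pos = key.find(word[i]); pos != std::string::npos;
         pos = key.find(word[i], pos + 1)) {
      if (pos > 0 && key[pos - 1] != '|') {
        std::string cand = word;
        cand[i] = key[pos - 1];  // left neighbour on the same row
        out.push_back(cand);
      }
      if (pos + 1 < key.size() && key[pos + 1] != '|') {
        std::string cand = word;
        cand[i] = key[pos + 1];  // right neighbour on the same row
        out.push_back(cand);
      }
    }
  }
  return out;
}
```

With the example KEY line, the candidates for "hsve" include "have", because
"a" is the left neighbour of "s". A real checker would still filter the
candidates against the dictionary.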
| |
| * man/hunspell.4: description about PHONE and KEY suggestion parameters. |
| |
| * suggestmgr.cxx: enhancements for better suggestions: |
| - Set ngram suggestions for badchar-type errors |
| and only two word and compound word suggestions, too. |
| - Separate not compound and compound word |
| suggestions for MAP suggestion, too. |
| - Double swap suggestions for short words. |
| For example: ahev -> have, hwihc -> which. |
| - Better time limits using clock() instead of time() |
| (tenths of a second resolution instead of second ones). |
| - leftcommonsubstring() weight function. |
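
The double swap examples above (ahev -> have, hwihc -> which) can be
reproduced with a small sketch. It assumes a toy dictionary held in a std::set
and tries every pair of non-overlapping adjacent swaps; the function name is
hypothetical and the real code may restrict the positions it tries:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: apply two non-overlapping adjacent swaps and keep the
// candidates that the (toy) dictionary accepts.
std::vector<std::string> double_swap_suggestions(
    const std::string& word, const std::set<std::string>& dict) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i + 3 < word.size(); ++i) {
    for (std::size_t j = i + 2; j + 1 < word.size(); ++j) {
      std::string cand = word;
      std::swap(cand[i], cand[i + 1]);  // first swap
      std::swap(cand[j], cand[j + 1]);  // second swap, right of the first
      if (dict.count(cand)) out.push_back(cand);
    }
  }
  return out;
}
```

The number of candidates grows quadratically with the word length, which is
one plausible reason to limit this method to short words.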
| |
| * htype.hxx, hashmgr.cxx: blen (byte length) and clen (character |
| length) fields instead of wlen |
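
Byte length and character length differ for UTF-8 encoded words. A minimal
sketch of the distinction, assuming well-formed UTF-8 input; the helper name
utf8_char_length is hypothetical and not taken from the Hunspell sources:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count UTF-8 characters by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes well-formed UTF-8 input.
std::size_t utf8_char_length(const std::string& s) {
  std::size_t n = 0;
  for (unsigned char c : s) {
    if ((c & 0xC0) != 0x80) ++n;  // count only lead bytes and ASCII
  }
  return n;
}
```

For the UTF-8 encoded word "vadász" the byte length (std::string::size()) is
7, while the character length is 6, because "á" takes two bytes.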
| |
| * affixmgr.cxx: fix get_syllable() for bad Unicode inputs. |
| |
| * tests/suggestiontest/*: test environment for suggestions |
| |
| 2007-08-07 Martijn Wargers: |
| * csutil.cxx: fix Mingw build error associated with ToUpper() call. |
| Report and patch in Mozilla Issue 391447. |
| |
| 2007-08-07 Robert Longson: |
| * atypes.cxx: use empty inline function HUNSPELL_WARNING instead of |
| variadic macros to switch off Hunspell warnings. |
| Reported by Gavin Sharp in Mozilla Issue 391147. |
| |
| 2007-08-05 Ginn Chen: |
| * hashmgr.cxx: Hunspell failed to compile on OpenSolaris (use stdio |
| instead of cstdio). Report and patch in Mozilla Issue 391040. |
| |
| 2007-07-25 Németh László <nemeth at OOo>: |
| * parsers/*.cxx: Hunspell executable recognises and accepts URLs, |
| e-mail addresses, directory paths, reported by Jeppe Bundsgaard. |
| * src/tools/hunspell.cxx: --check-url: new option of Hunspell program. |
| Use --check-url, if you want to check URLs, e-mail addresses and paths. |
| |
| * parsers/textparser.cxx: strip colon at end of words for Finnish |
| and Swedish (colon may be in words in Finnish and Swedish). |
| Problem reported by Lars Aronsson. |
| * tests/colons_in_words.*: test data |
| |
| * tests/digits_in_words.*: example for using digits in words |
| (eg. 1-jährig, 112-jährig etc. in German), reported by Lars Aronsson. |
| |
| * hashmgr.cxx: Hunspell accepts allcaps forms of mixed case |
| words of personal dictionaries (+allcaps custom dictionary words with |
| allcaps affixes). |
| Sf.net Bug ID 1755272, reported by Ellis Miller. |
| |
| * hashmgr.cxx: fix small memory leaks with alias compressed |
| dictionaries (free flag vectors of affixed personal dictionary words |
| and flag vectors of hidden capitalized forms of mixed case and |
| allcaps words). |
| |
| * affixmgr.cxx: fix COMPOUNDRULE checking with affixed compounds. |
| Sf.net Bug ID 1706659, reported by Björn Jacke. Also fixing for |
| OOo Issue 76067 (crash-like deceleration for hexadecimal numbers |
| with long FFFFFF sequence using en_US dictionary). |
| |
| * tools/hunspell.cxx: add missing return to save_privdic(). |
| |
| * man/hunspell.4: add information about affixation of personal words: |
| "Personal dictionaries are simple word lists, but with optional |
| word patterns for affixation, separated by a slash: |
| |
| foo |
| Foo/Simpson |
| |
| In this example, "foo" and "Foo" are personal words, plus Foo |
| will be recognised with affixes of Simpson (Foo's etc.)." |
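
The personal dictionary format quoted above can be parsed with a very small
sketch. It assumes one entry per line and a single optional slash; the helper
name parse_personal_entry is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: split a personal dictionary line into the word and
// the optional model word after the slash (empty if there is none).
std::pair<std::string, std::string> parse_personal_entry(
    const std::string& line) {
  std::size_t slash = line.find('/');
  if (slash == std::string::npos) return {line, ""};
  return {line.substr(0, slash), line.substr(slash + 1)};
}
```

For "Foo/Simpson" this yields the word "Foo" and the model word "Simpson";
for "foo" the model word is empty.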
| |
| 2007-07-18 Németh László <nemeth at OOo>: |
| * src/win_api/: add missing resource files, reported by Ingo H. de Boer. |
| |
| 2007-07-16 Németh László <nemeth at OOo>: |
| * hunspell.cxx: fix dot removing from UTF-8 encoded words in cleanword2() |
| (Capitalised words with dots, as "Something." were not recognised |
| using Unicode encoded dictionaries.) |
| * tests/{base.*,base_utf.*}: extended and new test files for |
| dot removing and Unicode support. |
| |
| * tools/hunspell.cxx: fix Cygwin, OS X compatibility using platform |
| specifics iconv() header by ICONV_CONST macro of Autoconf. |
| Sf.net Bug ID 1746030, reported by Mike Tian-Jian Jiang. |
| Sf.net Bug ID 1753939, reported by Jean-Christophe Helary. |
| |
| * tools/hunspell.cxx: fix missing global path setting with -d option. |
| |
| * tests/test.sh: fix broken Valgrind checking (missing warnings |
| with VALGRIND=memcheck make check). |
| |
| * csutil.cxx: fix condition in u8_u16() to avoid invalid read |
| of not null-terminated character arrays (detected by Valgrind |
| in Hunspell executable: associated with 8-bit character table |
| conversion in tools/hunspell.cxx). |
| |
| * csutil.cxx: free_utf_tbl(): use utf_tbl_count-- instead of utf_tbl--. |
| Memory leak in Hunspell executable detected by Valgrind. |
| |
| * hashmgr.cxx: add missing free_utf_tbl(), memory leak in Hunspell |
| executable detected by Valgrind. |
| |
| * hashmgr.cxx: load_tables(): fix memory error in spec. capitalization. |
| Use sizeof(unsigned short) instead of bad sizeof(unsigned short*). |
| Invalid memory read detected by Valgrind. |
| |
| * hashmgr.cxx: add_word(): fix memory error in spec. capitalization. |
| Update also affix array length of capitalized homonyms. Invalid |
| memory read detected by Valgrind. |
| |
| * hunspell.cxx: suggest(): fix invalid memory write and leak. |
| Bad realloc() and missing free() detected by Valgrind associated |
| with suggestions for "something.The" type spelling errors. |
| |
| * {dictmgr,csutil,hashmgr,suggestmgr}.cxx: check memory allocation. |
| Sf.net Bug ID 1747507, based on the patch by Jose da Silva. |
| |
| 2007-07-13 Ingo H. de Boer <idb_winshell at SF.net>: |
| * atypes.cxx: fix Visual C compatibility: Using |
| "HUNSPELL_WARNING(a,b,...) {}" macro instead of empty "X(a,b...)". |
| |
| * hunspell.cxx: changes for Windows API. |
| * win_api/Hunspell.*: new resource files |
| * win_api/hunspelldll.*: set optional Hunspell and Borland spec. codes |
| Sf.net Bug ID 1753802, patch by Ingo H. de Boer. |
| See also Sf.net Bug ID 1751406, patch by Mike Tian-Jian Jiang. |
| |
| 2007-07-09 Caolan McNamara <cmc at OO.o>: |
| * {hunspell,hashmgr,affentry}.cxx: fix warnings of Coverity program |
| analyzer. Sf.net Bug ID 1750219. |
| |
| 2007-07-06 Németh László <nemeth at OOo>: |
| * atypes.cxx: warning-free swallowing of conditional warning messages |
| and their parameters using empty HUNSPELL_WARNING(a,b...) macro. |
| * {affixmgr,atypes,csutil}.cxx: fix unused variable warnings |
| using WARNVAR macro for conditionally named variables. |
| * hashmgr.cxx: fix unused variable warning in add_word() by cond. name |
| * hunspell.cxx: fix shadowed declaration of captype var. in suggest() |
| |
| 2006-06-29 Caolan McNamara <cmc at OO.o>: |
| * hunspell.cxx: patch to fix possible memory leak in analyze() of |
| experimental morphological analyzer code. Sf.net Bug ID 1745263. |
| |
| 2007-06-29 Németh László <nemeth at OOo>: |
| improvements: |
| * src/hunspell/hunspell.cxx: check bad capitalisation of Dutch letter IJ. |
| - Sf.net Feature Request ID 1640985, reported by Frank Fesevur. |
| - Solution: FORBIDDENWORD for capitalised word forms (need |
| an improved Dutch dictionary with forbidden words: Ijs/*, etc.). |
| * tests/IJ.*: test data and example. |
| |
| * hashmgr.cxx, hunspell.cxx: check capitalization of special word forms |
| - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG |
| Sf.net Bug ID 1398550, reported by Dmitri Gabinski. |
| - allcap words and suffixes: UNICEF's - UNICEF'S |
| - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA |
| For Catalan, French and Italian languages. |
| Reported by Davide Prina in OOo Issue 68568. |
| * tests/allcaps*: tests for OPENOFFICE.ORG, UNICEF'S capitalization. |
| * tests/i68568*: tests for SANT'ELIA capitalization. |
| |
| * hunspell/hunspell.cxx: suggestion for missing sentence spacing: |
| something.The -> something. The |
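
A minimal sketch of the missing sentence spacing suggestion, assuming ASCII
input and a single dot in the word. The helper name split_after_dot is
hypothetical, and the real suggest() code also has to validate both parts
against the dictionary:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: if `word` contains a dot directly followed by an
// upper-case ASCII letter, return the word with a space inserted after the
// dot; otherwise return an empty string.
std::string split_after_dot(const std::string& word) {
  std::size_t dot = word.find('.');
  if (dot == std::string::npos || dot == 0 || dot + 1 >= word.size()) return "";
  if (!std::isupper(static_cast<unsigned char>(word[dot + 1]))) return "";
  return word.substr(0, dot + 1) + " " + word.substr(dot + 1);
}
```

The upper-case test keeps forms such as "OpenOffice.org" from being split.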
| |
| * tools/hunspell.cxx: multiple character encoding support |
| - -i option: custom input encoding |
| Sf.net Bug ID 1610866, reported by Thobias Schlemmer. |
| Sf.net Bug ID 1633413, reported by Dan Kenigsberg. |
| See also hunspell-1.1.5-encoding.patch of Fedora from Caolan McNamara. |
| * tests/*.test: add input encodings |
| |
| * tools/hunspell.cxx: use locale data for default dictionary names. |
| Sf.net Bug ID 1731630, report and patch from Bernhard Rosenkraenzer, |
| See also hunspell-1.1.4-defaultdictfromlang.patch of Fedora Linux |
| from Caolan McNamara. |
| |
| * tools/hunspell.cxx: fix 8-bit tokenization (letters without |
| casing, like ß or Hebrew characters now are handled well) |
| |
| * tools/hunspell.cxx: dictionary search path |
| - DICPATH environment variable |
| - -D option: show directory path of loaded dictionary |
| - automatic detection of OpenOffice.org directories |
| |
| fixes: |
| * affixmgr.cxx: fault-tolerant patch for REP and other affix |
| table data problems. Problem with Hunspell and en_GB dictionary |
| reported by Thomas Lange in OOo Issue 76098 and |
| Stephan Bergmann in OOo Issue 76100. |
| Sf.net Bug ID 1698240, reported by Ingo H. de Boer. |
| |
| * csutil.cxx: fix mkallcap_utf() for allcaps suggestion in UTF-8. |
| |
| * suggestmgr.cxx: fix bad movechar_utf() (missing strlen()). |
| |
| * hunspell.cxx: fix bad degree sign detection in Unicode |
| hu_HU environment. |
| |
| * hunspell/hunspell.cxx: free allocated memory of csconv in |
| ported Mozilla code. |
| - Mozilla Bugzilla Bug 383564, report and Mozilla MySpell patch |
| by Andrew Geul. Reported by Ryan VanderMeulen for Hunspell. |
| |
| * suggestmgr.cxx: fix minor difference in Unicode suggestion |
| (ngram suggestion of allcaps words in Unicode). |
| |
| * hashmgr.cxx: close file handle after errors. |
| Sf.net Bug ID 1736286, reported by John Nisly. |
| |
| * configure.ac: syntax error (shell variable with spaces). |
| Sf.net Bug ID 1731625, reported by Bernhard Rosenkraenzer. |
| |
| * hunspell.cxx: check_word(): fix bad usage of info pointer. |
| |
| * hashmgr.cxx: fix de_DE related bug (accept words with leading dash). |
| Sf.net Bug ID 1696134, reported by Björn Jacke. |
| |
| * suggestmgr.cxx, tests/1695964.*: fix NEEDAFFIX homonym suggestion. |
| Sf.net Bug ID 1695964, reported by Björn Jacke. |
| |
| * tests/1463589*: capitalized ngram suggestion test data for |
| Sf.net Bug ID 1463589, reported by Frederik Fouvry. |
| |
| * csutil.cxx, affixmgr.cxx: fix possible heap error with |
| multiple instances of utf_tbl. |
| Sf.net Bug ID 1693875, reported by Ingo H. de Boer. |
| |
| * affixmgr.cxx, suggestmgr.cxx, license.hunspell: convert to ASCII. |
| Locale dependent compiling problems. Sf.net Bug ID 1694379, reported |
| by Mike Tian-Jian Jiang. OOo Issue 78018 reported by Thomas Lange. |
| |
| * tests/test.sh: compatibility issues |
| - fix Valgrind support (check shared library instead of shell wrapper) |
| - remove deprecated "tail +2" syntax |
| - set 8-bit locale for testing (LC_ALL=C) |
| |
| * hunspell.hxx: remove license.* and config.h dependencies. |
| - hunspell-1.1.5-badheader.patch from Caolan McNamara <cmc at OO.o> |
| |
| 2007-03-21 Németh László <nemeth at OOo>: |
| * tools/Makefile.am, munch.h, unmunch.h: add missing munch.h and unmunch.h |
| Reported by Björn Jacke and Khaled Hosny (sf.net Bug ID 1684144) |
| * hunspell/hunspell.cxx, hunspell.hxx: fix --with-ui compiling error (add get_csconv()) |
| Reported by Khaled Hosny (sf.net Bug ID 1685010) |
| |
| 2007-03-19 Németh László <nemeth at OOo>: |
| * csutil.cxx, hunspell/hunspell.cxx: Unicode non BMP area (>65K character range) support |
| (except conditional patterns and strip characters of affix rules) |
| * tests/utf8_nonbmp*: test data |
| |
| * src/hunspell/*: add Mozilla patches from David Einstein |
| - run-time generated 8-bit character tables |
| - other Mozilla related changes (see Mozilla Bugzilla Bug 319778) |
| |
| * csutil.cxx, affixmgr.cxx, hashmgr.cxx: optimized version of IGNORE feature |
| - IGNORE works with affixes (except strip characters and affix conditions) |
| * tests/ignore*: test data with latin characters |
| * tests/ignoreutf*: Unicode test data with Arabic diacritics (Harakat) |
| |
| * src/hunspell/suggestmgr.cxx: new edit distance suggestion methods |
| - capitalization: nasa -> NASA |
| - long swap: permenant -> permanent |
| - long move: Ghandi -> Gandhi |
| - double two characters: vacacation -> vacation |
| * tests/sug.*: test data |
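
Two of the edit distance methods listed above can be sketched as follows,
assuming a toy dictionary in a std::set. Function names are hypothetical and
the real suggestmgr.cxx code differs in detail:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Dict = std::set<std::string>;

// "Double two characters": drop one copy of an accidentally repeated
// two-character sequence (vacacation -> vacation).
std::vector<std::string> doubled_pair_suggestions(const std::string& w,
                                                  const Dict& dict) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i + 3 < w.size(); ++i) {
    if (w[i] == w[i + 2] && w[i + 1] == w[i + 3]) {
      std::string cand = w.substr(0, i + 2) + w.substr(i + 4);
      // Neighbouring positions can yield the same candidate; keep it once.
      if (dict.count(cand) && (out.empty() || out.back() != cand))
        out.push_back(cand);
    }
  }
  return out;
}

// "Long swap": exchange two characters that are not adjacent
// (permenant -> permanent).
std::vector<std::string> long_swap_suggestions(const std::string& w,
                                               const Dict& dict) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < w.size(); ++i) {
    for (std::size_t j = i + 2; j < w.size(); ++j) {
      if (w[i] == w[j]) continue;  // swapping equal letters changes nothing
      std::string cand = w;
      std::swap(cand[i], cand[j]);
      if (dict.count(cand)) out.push_back(cand);
    }
  }
  return out;
}
```

In "permenant" the swapped letters are the second "e" and the "a", two
positions apart, which an adjacent swap method cannot reach.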
| |
| * src/hunspell/affixmgr.cxx: space in REP strings (alot -> a lot) |
| Note: The underscore character marks the space in REP strings: REP alot a_lot, and |
| put the expression with space ("a lot") into the dic file (see tests/sug). |
| |
| * hashmgr.cxx, affixmgr.cxx: ignore Unicode byte order mark (BOM sequence) |
| * tests/utf8_bom*: test data |
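
Ignoring a byte order mark amounts to dropping the three bytes EF BB BF at the
start of the first line. A minimal sketch, with a hypothetical helper name
that is not taken from the Hunspell sources:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop a leading UTF-8 byte order mark (EF BB BF)
// from the first line of a dictionary or affix file.
std::string strip_utf8_bom(const std::string& line) {
  static const std::string bom = "\xEF\xBB\xBF";
  if (line.compare(0, bom.size(), bom) == 0) return line.substr(bom.size());
  return line;
}
```

Lines without a BOM, including empty lines, are returned unchanged.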
| |
| * hunspell/*.cxx: OOo Issue 68903 - Make lingucomponent warning-free on wntmsci10 |
| - fix Hunspell related warning messages on Windows platform (except some assignment |
| within conditional expressions). Reported and started by Stephan Bergmann. |
| |
| * hunspell/affixmgr.cxx: fix OOo Issue 66683 - hunspell dmake debug=x fails |
| - Reported by Stephan Bergmann. |
| |
| * src/hunspell/hunspell.[ch]xx: thread safe API for Hunspell executable |
| (removing prev*() functions, new spell(word, info, root) function) |
| |
| * configure.ac, src/hunspell/*: HUNSPELL_EXPERIMENTAL code |
| --with-experimental configure option (conditional compiling of morphological analyser |
| and stemmer tools) |
| |
| * configure.ac, src/hunspell/*: conditional Hunspell warning messages |
| --with-warnings configure option |
| |
| * affixmgr.cxx: new, optimized parsing functions |
| |
| * affixmgr.cxx: fix homonym handling for German dictionary project, |
| reported by Björn Jacke (sf.net Bug ID 1592880). |
| * tests/1592880.*: test data by Björn Jacke |
| |
| * src/hunspell/affixmgr.cxx: fix CIRCUMFIX suggestion |
| Bug reported by Erdal Ronahi. |
| |
| * hunspell.cxx: reverse root word output (complex prefixes) |
| Bug reported by Munzir Taha. |
| |
| * tools/hunspell.cxx: fix Emacs compatibility, patch by marot at sf.net |
| - no % command in PIPE mode (SourceForge BugTracker 1595607) |
| - fix HUNSPELL_VERSION string |
| |
| * suggestmgr.[hc]xx: rename check() functions to checkword() (OOo Issue 68296) |
| adopt MySpell patch by Bryan Petty (tierra at ooo) for Hunspell source |
| |
| * csutil.cxx, munch.c, unmunch.c: adopt relevant parts of the MinGW patch |
| (OOo Issue 42504) by tonal at ooo |
| |
| * affixmgr.cxx: remove double candidate_check() call, reported by Bram Moolenaar |
| |
| * tests/test.sh: add LC_ALL="C" environment. Locale dependency of make check |
| reported by Gentoo project. |
| |
| * src/tools/hunspell.cxx: UTF-8 highlighting fix for console UI |
| (not solved: breaking long UTF-8 lines) |
| |
| * src/tools/unmunch.c: fix bad generation if strip is shorter than condition, |
| reported by Davide Prina |
| * src/tools/unmunch.h: increase 5000 -> 500000 |
| |
| * src/tools/hunspell.cxx: fix memory error in suggestion (uninitialized parameter), |
| Bug also reported by Björn Jacke in SourceForge Bug 1469957 |
| |
| * csutil.cxx, affixmgr.cxx: fix Caolan McNamara's patch for non OOo environment |
| |
| 2006-11-11 Caolan McNamara <cmc at OO.o>: |
| * csutil.cxx, affixmgr.cxx: UTF-8 table patch (OOo Issue 71449) |
| Description: memory optimization (OOo doesn't use the large UTF-8 table). |
| |
| * Makefile.am: shared library patch (Sourceforge ID 1610756) |
| |
| * hunspell.h, hunspell.cxx: C API patch (Sourceforge ID 1616353) |
| |
| * hunspell.pc: pkgconfig patch (Sourceforge ID 1639128) |
| |
| 2006-10-17 Ryan Jones <at Mozilla Bugzilla>: |
| * affixmgr.cxx: missing fclose(affixlst) calls |
| Reported by <gavins at ooo> in OOo Issue 70408 |
| |
| 2007-07-11 Taha Zerrouki <taha at gawab>: |
| * affixmgr.cxx, hunspell.cxx, hashmgr.cxx, csutil.cxx: IGNORE feature to remove |
| optional Arabic and other characters from input and dictionary words. |
| * src/hunspell/langnum.hxx: add Arabic language number, lang_ar=96 |
| * tests/ignore.*: test data |
| |
| 2006-05-28 Miha Vrhovnik <mvrhov at users.sourceforge>: |
| * src/win_api/*: C API for Windows DLLs |
| - also Delphi text editor example (see on Hunspell Sourceforge page) |
| |
| 2006-05-18 Kevin F. Quinn <kevquinn at gentoo>: |
| * utf_info.cxx: struct -> static struct |
| Shared library patch also developed by Gentoo developers (Hanno Meyer-Thurow, |
| Diego Pettenò, Kevin F. Quinn) |
| |
| 2006-02-02 Németh László <nemethl@gyorsposta.hu>: |
| * src/hunspell/hunspell.cxx: suggest(): replace "fooBar" -> "foo bar" suggestions |
| with "fooBar" -> "foo Bar" (missing spaces are typical OCR bugs). |
| Bug reported by stowrob at OOo in Issue 58202. |
| * src/hunspell/suggestmgr.cxx: twowords(): permit 1-character words. |
| (restore MySpell's original behavior). Here: "aNew" -> "a New". |
| * tests/i58202.*: test data |
| |
| * src/parsers/textparser.cxx: fix Unicode tokenization in is_wordchar() |
| (extra word characters (WORDCHARS) didn't work on big-endian platforms). |
| |
| * src/hunspell/{csutil,affixmgr}.cxx: inline isSubset(), isRevSubset(): |
| little speed optimization for languages with rich morphology. |
| |
| * src/tools/hunspell.cxx: fix bad --with-ui and --with-readline compiling |
| when (N)curses is missing. Reported by Daniel Naber. |
| |
| 2006-01-19 Tor Lillqvist <tml@novell.com> |
| * src/hunspell/csutil.cxx: mystrsep(): fix locale-dependent isspace() tokenization |
| |
| 2006-01-06 András Tímár <timar@fsf.hu> |
| * src/hunspell/{hashmgr.hxx,hunspell.cxx}: fix Visual C++ compiling errors |
| |
| 2006-01-05 Németh László <nemethl@gyorsposta.hu>: |
| * COPYING: set GPL/LGPL/MPL tri-license for Mozilla integration. |
| Rationale: Mozilla source code contains an old MySpell version |
| with GPL/LGPL/MPL tri-license. (MPL license is a copyleft license, similar |
| to the LGPL, but it acts on file level.) |
| * COPYING.LGPL: GNU Lesser General Public License 2.1 (LGPL) |
| * COPYING.MPL: Mozilla Public License 1.1 (MPL) |
| * license.hunspell, src/hunspell/license.hunspell: GPL/LGPL/MPL tri-license |
| |
| * src/hunspell/{affixmgr,hashmgr}.*: AF, AM alias definitions in affix file: |
| compression of flag sets and morphological descriptions (see manual, |
| and tests/alias* test files). |
| Rationale: Alias compression is also good for loading time and memory |
| efficiency, not only smaller resources. |
| * src/tools/makealias: alias compression utility |
| (usage: ./makealias file.dic file.aff) |
| * tests/alias{,2,3}: AF, AM tests |
| * man/hunspell.4: add AF, AM documentation |
| * src/hunspell/affentry.cxx, atypes.hxx: add new opts bits (aeALIASM, aeALIASF) |
| |
| * tools/hunspell, src/parser/*, src/hunspell/*: Hunspell program |
| tokenizes Unicode texts (only with UTF-8 encoded dictionaries). |
| Missing Unicode tokenization reported by Björn Jacke, Egmont Koblinger, |
| Jess Body and others. |
| Note: Curses interactive interface doesn't work perfectly yet. |
| * tests/*.tests: remove -1 parameters of Hunspell |
| * tests/*.{good,wrong}: remove tabulators |
| |
| * src/hunspell/{hunspell,affixmgr}.cxx: BREAK option: break words at |
| specified break points and checking word parts separately (see manual). |
| Note: COMPOUNDRULE is better (or will be better) for handling dashes and |
| other compound joining characters or character strings. Use BREAK, if you |
| want to check words with dashes or other joining characters and there is no time |
| or possibility to describe precise compound rules with COMPOUNDRULE. |
| * tests/break.*: BREAK example. |
| |
| * src/hunspell/{affixmgr,hunspell}.cxx: add CHECKSHARPS declaration instead |
| of LANG de_DE definitions to handle German sharp s in both spelling and |
| suggestion. |
| * src/hunspell/hunspell.cxx: With CHECKSHARPS, uppercase words are valid |
| with both lower sharp s (it is optional for names in German legal texts) |
| and SS (MÜßIG, MÜSSIG). Missing lower sharp s form reported by Björn Jacke. |
| * src/hunspell/hunspell.cxx: KEEPCASE flag on a sharp s word has a special |
| meaning with CHECKSHARPS declaration: KEEPCASE permits capitalisation and SS upper |
| casing of a sharp s word (Müßig and MÜSSIG), but forbids the upper cased form |
| with lower sharp s character(s): *MÜßIG. |
| * tests/germancompounding*: add CHECKSHARPS, remove LANG |
| * tests/checksharps*: add CHECKSHARPS and KEEPCASE, remove LANG |
| |
| * src/hunspell/hunspell.cxx: improved suggestions: |
| - suggestions for pressed Caps Lock problems: macARONI -> macaroni |
| - suggestions for long shift problems: MAcaroni -> Macaroni, macaroni |
| - suggestions for KEEPCASE words: KG -> kg |
| * src/hunspell/csutil.cxx: fix mystrrep() function: |
| - suggestions for lower sharp s in uppercased words: MÜßIG -> MÜSSIG |
| * tests/checksharps{,utf}.sug: add tests for mystrrep() fix |
| |
| * src/hunspell/hashmgr.cxx: Now dictionary words can contain slashes |
| with the "\/" syntax. Problem reported by Frederik Fouvry. |
| |
| * src/hunspell/hunspell.cxx: fix bad duplicate filter in suggest(). |
| (Suggesting some capitalised compound words caused program crash |
| with Hungarian dictionary, OOo Issue 59055). |
| |
| * src/hunspell/affixmgr.cxx: fix bad defcpd_check() call in compound_check(). |
| (Overlapping new COMPOUNDRULE and old compounding methods caused program |
| crash at suggestion.) |
| |
| * src/hunspell/affixmgr.{cxx,hxx}: check affix flag duplication at affix classes. |
| Suggested by Daniel Naber. |
| |
| * src/hunspell/affentry.cxx: remove unused variable declarations (OOo i58338). |
| Compiler warnings reported by András Tímár and Martin Hollmichel. |
| |
| * src/hunspell/hunspell.cxx: morph(): do not analyse bad mixed uppercased forms |
| (fix Arabic morphological analysis with Buckwalter's Arabic transliteration) |
| |
| * src/hunspell/affentry.{cxx,hxx}, atypes.hxx: little memory optimization |
| in affentry: |
| - using unsigned char fields instead of short (stripl, appndl, numconds) |
| - rename xpflg field to opts |
| - removing utf8 field, use aeUTF8 bit of opts field |
| |
| * configure.ac: set tests/maputf.test to XFAILED on ARM platform. |
| Fail reported by Rene Engelhard. |
| |
| * configure.ac: link Ncursesw library, if exists. |
| |
| * BUGS: add BUGS file |
| |
| * tests/complexprefixes2.*: test for morphological analysis with COMPLEXPREFIXES |
| |
| * src/hunspell/affixmgr.cxx: use "COMPOUNDRULE" instead of |
| "COMPOUND". The new name suggested by Bram Moolenaar. |
| * tests/compoundrule*: modified and renamed compound.* test files |
| |
| * man/hunspell.4: AF, AM, BREAK, CHECKSHARPS, COMPOUNDRULE, KEEPCASE. |
| - also new addition to the documentation: |
| Header of the dictionary file defines approximate dictionary size: |
| ``A dictionary file (*.dic) contains a list of words, one per line. |
| The first line of the dictionaries (except personal dictionaries) |
| contains the _approximate_ word count (for optimal hash memory size).'' |
| Asked by Frederik Fouvry. |
| |
| One-character replacements in REP definitions: ``It's very useful to |
| define replacements for the most typical one-character mistakes, too: |
| with REP you can add higher priority to a subset of the TRY suggestions |
| (suggestion list begins with the REP suggestions).'' |
| |
| 2005-11-11 Németh László <nemethl@gyorsposta.hu>: |
| * src/hunspell/affixmgr.*: fix Unicode MAP errors (sorted only n-1 |
| characters instead of n ones in UTF-16 MAP character lists). |
| Bug reported by Rene Engelhard. |
| |
| * src/hunspell/affixmgr.*: fix infinite COMPOUND matching (default char |
| type is unsigned on PowerPC, s390 and ARM platforms and it will never |
| be negative). Bug reported by Rene Engelhard. |
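
The class of bug described above can be illustrated with a short sketch. It
assumes a loop that counts down until its index becomes negative; the changelog
does not show the actual fix, so the function below is only a hypothetical
illustration of the portable form:

```cpp
#include <cassert>

// Hypothetical illustration: counting down to "below zero" is only portable
// with an explicitly signed type. With plain `char` the loop condition
// `i >= 0` is always true on platforms where char is unsigned, so the loop
// never ends. Assumes 0 <= start <= 127.
int count_down_steps(int start) {
  int steps = 0;
  for (signed char i = static_cast<signed char>(start); i >= 0; --i) {
    ++steps;  // visits start, start - 1, ..., 0
  }
  return steps;
}
```

Using `signed char` (or `int`) instead of plain `char` makes the behaviour the
same on x86, PowerPC, s390 and ARM.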
| |
| * src/hunspell/{affixmgr,suggestmgr}.cxx: fix bad ONLYINCOMPOUND |
| word suggestions. |
| * tests/onlyincompound.sug: empty test file to check this fix. |
| Bug reported by Björn Jacke. |
| |
| * src/hunspell/affixmgr.cxx: fix backtracking in COMPOUND pattern matching. |
| * tests/compound6.*: test files to check this fix. |
| |
| * csutil.cxx: set bigger range types in flag_qsort() and flag_bsearch(). |
| |
| * affixmgr.hxx: set better type for cont_classes[] Boolean data (short -> char) |
| |
| * configure.ac, tests/automake.am: set platform specific XFAIL test |
| (flagutf8.test on ARM platform) |
| |
| 2005-11-09 Németh László <nemethl@gyorsposta.hu>: |
| improvements: |
| * src/hunspell/affixmgr.*: new and improved affix file parameters: |
| |
| - COMPOUND definitions: compound patterns with regexp-like matching. |
| See manual and test files: tests/compound*.* |
| Suggested by Bram Moolenaar. |
| Also useful for simple word-level lexical scanning, for example |
| analysing numbers or words with numbers (OOo Issue #53643): |
| http://qa.openoffice.org/issues/show_bug.cgi?id=53643 |
| Examples: tests/compound{4,5}.*. |
| |
| - NOSUGGEST flag: words marked with NOSUGGEST flag are not suggested. |
| Proposed flag for vulgar and obscene words (OOo Issue #55498). |
| Example: tests/nosuggest.*. |
| Problem reported by bobharvey at OOo: |
| http://qa.openoffice.org/issues/show_bug.cgi?id=55498 |
| |
| - KEEPCASE flag: Forbid capitalized and uppercased forms of words |
| marked with KEEPCASE flags. Useful for special orthographies |
| (measurements and currency often keep their case in uppercased |
| texts) and other writing systems (eg. keeping lower case of IPA |
| characters). |
| |
| - CHECKCOMPOUNDCASE: Forbid upper case characters at word boundaries in compounds. |
| Examples: tests/checkcompoundcase* and tests/germancompounding.* |
| |
| - FLAG UTF-8: New flag type: Unicode character encoded with UTF-8. |
| Example: tests/flagutf8.*. |
| Rationale: Unicode character type can be more readable |
| (in a Unicode text editor) than `long' or `num' flag type. |
| |
| bug fixes: |
| * src/hunspell/hunspell.cxx: accept numbers and numbers with separators (i53643) |
| Bug reported by skelet at OOo: |
| http://qa.openoffice.org/issues/show_bug.cgi?id=53643 |
| |
| * src/hunspell/csutil.cxx: fix casing data in ISO 8859-13 character table. |
| |
| * src/hunspell/csutil.cxx: add ISO-8859-15 character encoding (i54980) |
| Rationale: ISO-8859-15 is the default encoding of the French OpenOffice.org |
| dictionary. ISO-8859-15 is a modified version of ISO-8859-1 |
| (latin-1) character encoding with French œ ligatures and euro |
| symbol. Problem reported by cbrunet at OOo in OOo Issue 54980: |
| http://qa.openoffice.org/issues/show_bug.cgi?id=54980 |
| |
| * src/hunspell/affixmgr.cxx: fix zero-byte malloc after a bad affix header. |
| Patch by Harri Pitkänen. |
| |
| * src/hunspell/suggestmgr.cxx: fix bad NEEDAFFIX word suggestion |
| in ngram suggestions. Reported by Daniel Naber and Friedel Wolff. |
| |
| * src/hunspell/hashmgr.cxx: fix bad white space checking in affix files. |
| src/hunspell/{csutil,affixmgr}.cxx: add other white space separators. |
| Problems with tabulators reported by Frederik Fouvry. |
| |
| * src/hunspell/*: replace system-dependent <license.*> #include |
| parameters with quoted ones. Problem reported by Dafydd Jones. |
| |
| * src/hunspell/hunspell.cxx: fix missing morphological analysis of dot(s) |
| Reported by Trón Viktor. |
| |
| changes: |
| * src/hunspell/affixmgr.cxx: rename PSEUDOROOT to NEEDAFFIX. |
| Suggested by Bram Moolenaar. |
| |
| * src/hunspell/suggestmgr.hxx: Increase default maximum of |
| ngram suggestions (3->5). Suggested by Kevin Hendricks. |
| |
| * src/hunspell/htypes.hxx: Increase MAXDELEN for long affix flags. |
| |
| * src/hunspell/suggestmgr.cxx: modify (perhaps fix) Unicode map suggestion. |
| tests/maputf test fail on ARM platform reported by Rene Engelhard. |
| |
| * src/hunspell/{affentry.cxx,atypes.hxx}: remove [PREFIX] and |
| MISSING_DESCRIPTION messages from morphological analysis. |
| Problems reported by Trón Viktor. |
| |
| * tests/germancompounding.{aff,good}: Add "Computer-Arbeit" test word. |
| Suggested by Daniel Naber. |
| |
| * doc/man/hunspell.4: Proof-reading patch by Goldman Eleonóra. |
| |
| * doc/man/hunspell.4: Fix bad affix example (replace `move' with `work'). |
| Bug reported by Frederik Fouvry. |
| |
| * tests/*: new test files: |
| affixes.*: simple affix compression example from Hunspell 4 manual page |
| checkcompoundcase.*, checkcompoundcase2.*, checkcompoundcaseutf.* |
| compound.*, compound2.*, compound3.*, compound4.*, compound5.* |
| compoundflag.* (former compound.*) |
| flagutf8.*: test for FLAG UTF-8 |
| germancompounding.*: simplification with CHECKCOMPOUNDCASE. |
| germancompoundingold.* (former germancompounding.*) |
| i53643.*: check numbers with separators |
| i54980.*: ISO8859-15 test |
| keepcase.*: test for KEEPCASE |
| needaffix*.* (former pseudoroot*.* tests) |
| nosuggest.*: test for NOSUGGEST |
| |
| 2005-09-19 Németh László <nemethl@gyorsposta.hu>: |
| * src/hunspell/suggestmgr.cxx: improved ngram suggestion: |
| - detect non-neighboring swap characters (pernament -> permanent) |
| Rationale: the ngram method has a significant error with non-neighboring |
| swap characters, especially when the swap is in the middle of the word. |
| - suggest uppercase forms (unesco -> UNESCO, siggraph's -> SIGGRAPH's) |
| - suggest only ngram swap character and uppercase form, if they exist. |
| Rationale: swap character and casing equivalence give much better |
| suggestions than any other (weighted) ngram suggestions. |
| - add uppercase suggestion (PERMENANT -> PERMANENT) |
| |
| * src/hunspell/*: complete comparison with MySpell 3.2 (in OOo beta 2): |
| - affixmgr.cxx: add missing numrep initialization |
| - hashmgr.cxx: add_word(): don't allocate temporary records |
| - hunspell.cxx: in suggest(): |
| - check capitalized words first (better sug. order for proper names), |
| - check pSMgr->suggest() return value |
| - make the pSMgr->suggest() call non-optional in HUHCAP |
| - csutil.cxx: fix bad KOI8-U -> koi8r_tbl reference in enc_entry encds |
| - csutil.cxx: fix casing data in ISO 8859-2, Windows 1251 and KOI8-U |
| encoding tables. Bug reported by Dmitri Gabinski. |
| |
| * src/hunspell/affixmgr.*: improved compound word and other features |
| - generalize hu_HU specific compound word features with new affix file |
| parameters, suggested by Bram Moolenaar: |
| - CHECKCOMPOUNDDUP: forbid word duplication in compounds (eg. foo|foo) |
| - CHECKCOMPOUNDTRIPLE: forbid triple letters in compounds (eg. foo|obar) |
| - CHECKCOMPOUNDPATTERN: forbid patterns at word bounds in compounds |
| - CHECKCOMPOUNDREP: using REP replacement table, forbid presumably bad |
| compounds (useful for languages with unlimited number of compounds) |
| - ONLYINCOMPOUND flag works also with words (see tests/onlyincompound.*) |
| Suggested by Daniel Naber, Björn Jacke, Trón Viktor & Bram Moolenaar. |
| - PSEUDOROOT works also with prefixes and prefix + suffix combinations |
| (see tests/pseudoroot5.*). Suggested by Trón Viktor. |
| - man/hunspell.4: updated man page |
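The two simplest checks in the list above (CHECKCOMPOUNDDUP and CHECKCOMPOUNDTRIPLE) can be illustrated with a short C++ sketch. The helper names and the explicit `split` parameter are assumptions made for illustration; they are not the functions used in affixmgr.cxx, and the code compares bytes, so it assumes an 8-bit encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if both compound parts are the same word,
// the case CHECKCOMPOUNDDUP forbids (e.g. foo|foo).
bool is_duplicated_compound(const std::string& word, std::size_t split) {
    return word.compare(0, split, word, split, std::string::npos) == 0;
}

// Hypothetical helper: true if three identical letters meet at the word
// boundary, the case CHECKCOMPOUNDTRIPLE forbids (e.g. foo|obar).
bool has_triple_at_boundary(const std::string& word, std::size_t split) {
    if (split == 0 || split >= word.size()) return false;
    // The triple is either xx|x or x|xx.
    bool two_before = split >= 2 && word[split - 2] == word[split - 1] &&
                      word[split - 1] == word[split];
    bool two_after = split + 1 < word.size() &&
                     word[split - 1] == word[split] &&
                     word[split] == word[split + 1];
    return two_before || two_after;
}
```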
| |
| * src/hunspell/affixmgr.*: fix incomplete prefix handling with twofold |
| suffixes (delete unnecessary contclasses[] conditions in |
| prefix_check_twosfx() and prefix_check_twosfx_morph()). |
| Bug reported by Trón Viktor. |
| |
| * src/hunspell/affixmgr.*: complete also *_morph() functions with |
| conditions of new Hunspell features (circumfix, pseudoroot etc.). |
| |
| * src/hunspell/suggestmgr.cxx: |
| - fix missing suggestions for words with crossed prefix and suffix |
| - fix redundant non compound word checking |
| - fix losing suggestions problem. Bug reported by Dmitri Gabinski. |
| |
| * src/hunspell/dictmgr.*: |
| - add new dictionary manager for Hunspell UNO module |
| Problems with eo_ANY Esperanto locale reported by Dmitri Gabinski. |
| |
| * src/hunspell/*: use precise constant sizes for 8-bit and 16-bit character |
| arrays with MAXWORDUTF8LEN and MAXSWUTF8L macros. |
| |
| * src/hunspell/affixmgr.cxx: fix bad MAXNGRAMSUGS parameter handling |
| |
| * src/hunspell/affixmgr.cxx, src/tools/{un}munch.*: fix GCC 4.0 warnings |
| on fgets(), reported by Dvornik László |
| |
| * po/hu.po: improved translation by Dvornik László |
| |
| * tests/test.sh: improved test environment |
| - add suggestion testing (see tests/*.sug) |
| - add memory debugging environment, based on the excellent Valgrind debugger. |
| Usage on Linux and experimental platforms of Valgrind: |
| VALGRIND=memcheck make check |
| - rename test_hunmorph to test.sh |
| |
| * tests/*: new tests: |
| - base.*: base example based on MySpell's checkme.lst. |
| - map{,utf}.*, rep{,utf}: MAP and REP suggestion examples |
| - tests on new CHECKCOMPOUND, ONLYINCOMPOUND and PSEUDOROOT features |
| - i54633.*: capitalized suggestion test for Issue 54633 from OOo's Issuezilla |
| - i35725.*: improved ngram suggestion test for Issue 35725 |
| |
| 2005-08-26 Németh László <nemethl@gyorsposta.hu>: |
| improvements: |
| |
| * src/hunspell/suggestmgr.cxx: |
| Unicode support in related character map suggestion |
| |
| * src/hunspell/suggestmgr.cxx: Unicode support in ngram suggestion |
| |
| * src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion. |
| Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725. See release |
| notes for examples. This problem reported by beccablain at OOo. |
| - ngram suggestions now are case insensitive (see `Permenant' bug in Issuezilla) |
| - weight ngram suggestions (with the longest common subsequence algorithm, |
| also considering lengths of bad word and suggestion, identical first |
| letters and almost completely identical character positions) |
| - set strict affix congruency in expand_rootword(). Now ngram suggestions |
| are good for languages with rich morphology and also better for English. |
| Rationale: affixed forms of the first ngram suggestion |
| very often suppress the second and subsequent root word suggestions. But |
| faults in affixes are more uncommon, and can be fixed without suggestions. |
| We must prefer the more informative second and subsequent root word |
| suggestions instead of the suggestions for bad affixes. |
| - a better suggestion may not be substring of a less good suggestion |
| Rationale: Suggesting affixed forms of a root word is |
| unnecessary, when root word has got better weighted ngram value. |
| (Checking substrings is a good approximation for this refinement.) |
| - fewer ngram suggestions (default 3 maximum instead of 10) |
| Rationale: Users need a big extra effort to check a lot of bad ngram |
| suggestions, nine times out of ten unnecessarily. It is very |
| distracting, because ngram suggestions could be very different. |
| Usually MySpell and Hunspell give one or two suggestions with |
| the old suggestion algorithms (maximum is 15), but the ngram algorithm |
| often gives the maximum number of suggestions. With strict affix congruency |
| and other refinements, the good suggestion is usually among the |
| first three elements. |
| - new affix parameter: MAXNGRAMSUG |
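The weighting item above refers to the longest common subsequence between the bad word and a candidate. As an illustration only, the following C++ sketch computes the standard LCS length with dynamic programming; the name `lcs_length` is hypothetical, and the other factors mentioned in the entry (word lengths, identical first letters, matching character positions) are not modeled here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of a and b, computed with the
// usual dynamic-programming table reduced to two rows. Works on bytes.
std::size_t lcs_length(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1])
                         ? prev[j - 1] + 1
                         : std::max(prev[j], cur[j - 1]);
        }
        std::swap(prev, cur);  // the finished row becomes the previous row
    }
    return prev[b.size()];
}
```

A higher value relative to the word lengths indicates a closer candidate; how Hunspell combines this value with the other criteria is not shown in this sketch.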
| |
| * src/hunspell/*: support agglutinative languages with rich prefix |
| morphology or with right-to-left writing system (for example, Turkic |
| and Austronesian languages with (modified) Arabic scripts). |
| - new affix parameter: COMPLEXPREFIXES |
| Set twofold prefix stripping (but single suffix stripping) |
| * src/hunspell/affixmgr.cxx: |
| - speed up prefix loading with tree sorting algorithm. |
| * tests/complexprefixes.*, tests/complexprefixesutf.*: |
| Coptic example posted by Moheb Mekhaiel |
| |
| * src/hunspell/hashmgr.cxx: check size attribute in dic file |
| suggested by Daniel Naber |
| Rationale: With a missing size attribute, Hunspell allocates a too small and |
| slower hash memory, and Hunspell can lose the first dictionary word. |
| |
| * src/hunspell/affixmgr.cxx: check stripping characters and condition |
| compatibility in affix rules (bugs detected in cs_CZ, es_ES, es_NEW, |
| es_MX, lt_LT, nn_NO, pt_PT, ro_RO and sk_SK dictionaries). See release |
| notes of Hunspell 1.0.9 in NEWS. |
| |
| * src/hunspell/affixmgr.cxx: check unnecessary fields in affix rules |
| (bugs detected in ro_RO and sv_SE dictionaries). See release notes. |
| |
| * src/hunspell/affixmgr.cxx: remove redundant condition checking |
| in affix rules with stripping characters (redundancy in OpenOffice.org |
| dictionaries reported by Eleonóra Goldman) |
| Rationale: this is a small optimization, but it was excellent for |
| detecting bad ngram affixation with bad or weak affix conditions. |
| |
| * tests/germancompounding.aff: improve compound definition |
| - use dash prefix instead of language specific tokenizer |
| Rationale: Using a uniform approach is the right way to check and analyze |
| compound words. Language specific word breaking is deprecated; it needs |
| sophisticated grammar checking for word-like word pairs |
| (for example in Hungarian there is a substandard, but accepted |
| syntax with dash for word pairs: cats, dogs -> kutyák-macskák, like |
| cats/dogs in English). |
| |
| * test Hunspell with 54 OpenOffice.org dictionaries: see release notes |
| |
| bug fixes: |
| |
| * src/hunspell/suggestmgr.*: add time limit to exponential |
| algorithm of the related character map suggestion |
| Rationale: a long word in agglutinative languages or a special pattern |
| (for example a horizontal rule) made of map characters can `crash' the |
| spell checker. |
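A minimal C++ sketch of bounding an exponential enumeration with a time limit, as the entry above describes. Everything here is an assumption for illustration: `map_variants`, its parameters, and the use of `std::chrono::steady_clock` are not taken from suggestmgr.cxx, and the map is reduced to a single set of interchangeable single-byte characters.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical sketch: enumerate every variant of `word` in which characters
// found in `related` are replaced by any character of `related`, and stop
// when the deadline passes. Returns false if the search was cut short.
bool map_variants(std::string& word, std::size_t pos, const std::string& related,
                  Clock::time_point deadline, std::vector<std::string>& out) {
    if (Clock::now() >= deadline) return false;  // time limit reached
    if (pos == word.size()) {
        out.push_back(word);
        return true;
    }
    if (related.find(word[pos]) == std::string::npos)
        return map_variants(word, pos + 1, related, deadline, out);
    const char original = word[pos];
    bool complete = true;
    for (char c : related) {
        word[pos] = c;
        if (!map_variants(word, pos + 1, related, deadline, out)) {
            complete = false;
            break;
        }
    }
    word[pos] = original;  // leave the caller's word unchanged
    return complete;
}
```

The number of variants grows as the size of the map set raised to the number of map characters in the word, which is why a long run of such characters needs a cutoff. Reading the clock on every call keeps the sketch short; a real implementation might check it less often.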
| |
| * src/hunspell/affentry.cxx: add() functions: fix bad word generation |
| checking stripping characters (see similar bug in unmunch) |
| |
| * src/hunspell/affixmgr.cxx: parse_file(): fix unconditional getNext() |
| call for ~AffixMgr() when affix file is corrupt. |
| |
| * src/hunspell/affixmgr.*: AffixMgr(), parse_cpdsyllable(): fix missing |
| string duplications for ~AffixMgr() when affix file is corrupt. |
| |
| * src/hunspell/affixmgr.*: parse_affix(): fix fprintf() call when affix |
| file is corrupt. Bug reported by Daniel Naber. |
| |
| * suggestmgr.cxx: replace single usage of 'strdup' with 'mystrdup' |
| patch by Chris Halls (debian.org) |
| |
| * src/hunspell/makefile.mk: add makefile.mk for compiling in OpenOffice.org |
| See README in Hunspell UNO module. |
| Problems with separated compiling reported by Rene Engelhard |
| |
| * src/hunspell/hunspell.cxx: fix pseudoroot support |
| - search a not pseudoroot homonym in check() |
| * tests/pseudoroot4.*: test this fix |
| |
| * src/tools/unmunch.c: fix bad word generation when conditions |
| are shorter or incompatible with stripping characters in affix rules |
| |
| * src/tools/unmunch.c: fix mychomp() for de_AT.dic and other dic files |
| without last new line character. |
| |
| other changes: |
| * src/hunspell/suggestmgr.*: erase ACCENT suggestion |
| Rationale: ACCENT suggestion was the same as Kevin Hendricks's map |
| suggestion algorithm, but with a less good interface in affix file. |
| |
| * src/hunspell/suggestmgr.*: combine cycle number limit |
| in badchar(), and forgotchar() with a time limit. |
| |
| * src/hunspell/affixmgr.*: remove NOMAPSUGS affix parameter |
| |
| * src/hunspell/{suggestmgr,hunspell}.*: strip periods from |
| suggestions (restore MySpell's original behaviour) |
| Rationale: OpenOffice.org has an automatic period handling mechanism |
| and suggestions look better without periods. |
| - new affix file parameter: SUGSWITHDOTS |
| Add period(s) to suggestions, if input word terminates in period(s). |
| (No need for OpenOffice.org dictionaries.) |
| |
| * tests/germancompounding.aff: improve bad German affix in affix example |
| (computeren->computern). Suggested by Daniel Naber. |
| |
| * src/tools/example.cxx: add Myspell's example |
| |
| * src/tools/munch.cxx: add Myspell's munch |
| |
| * man{,/hu}/hunspell.4: refresh manual pages |
| |
| 2005-08-01 Németh László <nemethl@gyorsposta.hu>: |
| * add missing MySpell files and features: |
| - add MySpell license.readme, README and CONTRIBUTORS ({license,README,AUTHORS}.myspell) |
| - add MySpell unmunch program (src/tools/unmunch.c) |
| - add licenses to source (src/hunspell/license.{myspell,hunspell}) |
| - port MAP suggestion (with imperfect UTF-8 support) |
| - add NOSPLITSUGS affix parameter |
| - add NOMAPSUGS affix parameter |
| |
| * src/man/man.4: MAP, COMPOUNDPERMITFLAG, NOSPLITSUGS, NOMAPSUGS |
| |
| * src/hunspell/aff{entry,ixmgr}.cxx: |
| - improve compound word support |
| - new affix parameter: COMPOUNDPERMITFLAG (see manual) |
| * src/tests/compoundaffix{,2}.*: examples for COMPOUNDPERMITFLAG |
| * src/tests/germancompounding.*: new solution for German compounding |
| Problems with German compounding reported by Daniel Naber |
| |
| * src/hunspell/hunspell.cxx: fix German uppercase word spelling |
| with the spellsharps() recursive algorithm. |
| Default recursive depth is 5 (MAXSHARPS). |
| * src/tests/germansharps*: extended German sharp s tests |
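The recursive idea behind the sharp s handling above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the code in hunspell.cxx: the name `spell_sharps`, the `in_dictionary` callback, and UTF-8 input are hypothetical choices, and only the depth constant mirrors the MAXSHARPS default of 5 mentioned in the entry.

```cpp
#include <cstddef>
#include <functional>
#include <string>

const int kMaxSharps = 5;                // mirrors the MAXSHARPS default
const std::string kSharpS = "\xC3\x9F";  // sharp s in UTF-8

// Hypothetical sketch: try the ways of replacing "ss" with sharp s from
// position `from` onward, with at most kMaxSharps replacements, and return
// the first variant accepted by `in_dictionary` (empty string if none).
std::string spell_sharps(const std::string& word, std::size_t from, int depth,
                         const std::function<bool(const std::string&)>& in_dictionary) {
    std::size_t pos = word.find("ss", from);
    if (pos != std::string::npos && depth < kMaxSharps) {
        // Branch 1: replace this "ss" with sharp s and continue after it.
        std::string replaced = word;
        replaced.replace(pos, 2, kSharpS);
        std::string hit =
            spell_sharps(replaced, pos + kSharpS.size(), depth + 1, in_dictionary);
        if (!hit.empty()) return hit;
        // Branch 2: keep this "ss" and look for the next occurrence.
        return spell_sharps(word, pos + 1, depth, in_dictionary);
    }
    return in_dictionary(word) ? word : std::string();
}
```

A caller would lowercase the uppercase input first (for example "STRASSE" to "strasse") and pass a lookup function for the dictionary; both steps are left out of the sketch.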
| |
| * src/tools/hunspell.cxx: fix fatal memory bug in non-interactive |
| subshells without HOME environmental variable |
| Bug detected with PHP by András Izsók. |
| |
| 2005-07-22 Németh László <nemethl@gyorsposta.hu>: |
| * src/hunspell/csutil.hxx: utf16_u8() |
| - fix 3-byte UTF-8 character conversion |
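For reference, the byte layout involved in the 3-byte case can be sketched in C++ as below. The function `u16_to_u8` is a hypothetical stand-in, not the `utf16_u8()` from csutil; it encodes a single UTF-16 code unit from the Basic Multilingual Plane and does not handle surrogate pairs.

```cpp
#include <string>

// Encode one UTF-16 code unit (BMP only) as UTF-8.
// Surrogate pairs (non-BMP characters) are outside the scope of this sketch.
std::string u16_to_u8(char16_t c) {
    std::string out;
    if (c < 0x80) {                 // 1 byte: 0xxxxxxx
        out += static_cast<char>(c);
    } else if (c < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {                        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
    return out;
}
```

As a check of the 3-byte branch, U+20AC (the euro sign) encodes to the bytes E2 82 AC.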
| |
| 2005-07-21 Németh László <nemethl@gyorsposta.hu>: |
| * src/hunspell/csutil.hxx: hunspell_version() for OOo UNO module |
| |
| 2005-07-19 Németh László <nemethl@gyorsposta.hu>: |
| * renaming: |
| - src/morphbase -> src/hunspell |
| - src/hunspell, src/hunmorph -> src/tools |
| - src/huntokens -> src/parsers |
| |
| * src/tools/hunstem.cxx: add stemmer example |
| |
| 2005-07-18 Németh László <nemethl@gyorsposta.hu>: |
| * configure.ac: --with-ui, --with-readline configure options |
| * src/hunspell/hunspell.cxx: fix conditional compiling |
| |
| * src/hunspell/hunspell.cxx: set HunSPELL.bak temporary file |
| in the same directory as the checked file. |
| |
| * src/morphbase/morphbase.cxx: |
| |
| - handling German sharp s (ß) |
| |
| - fix (temporarily) analyize() |
| |
| * tests: a lot of new tests |
| |
| * po/, intl/, m4/: add gettext from GNU hello |
| |
| * po/hu.po: add Hungarian translation |
| |
| * doc/, man/: rename doc to man |
| |
| 2005-07-04 Németh László <nemethl@gyorsposta.hu>: |
| * src/morphbase/hashmgr.cxx: set FLAG attribute instead of FLAG_NUM and FLAG_LONG |
| |
| * doc/hunspell.4: manual in English |
| |
| 2005-06-30 Németh László <nemethl@gyorsposta.hu>: |
| * src/morphbase/csutil.cxx: add character tables from csutil.cxx of OOo 1.1.4 |
| |
| * src/morphbase/affentry.cxx: fix Unicode condition checking |
| |
| * tests/{,utf}compound.*: tests compounding |
| |
| 2005-06-27 Németh László <nemethl@gyorsposta.hu>: |
| * src/morphbase/*: fix Unicode compound handling |
| |
| 2005-06-23 Halácsy Péter: |
| * src/hunmorph/hunmorph.cxx: delete spelling error message and suggest_auto() call |
| |
| 2005-06-21 Németh László <nemethl@gyorsposta.hu>: |
| * src/morphbase: Unicode support |
| * tests/utf8.*: SET UTF-8 test |
| |
| * src/morphbase: checking and fixing with Valgrind |
| Memory handling error reported by Ferenc Szidarovszky |
| |
| 2005-05-26 Németh László <nemethl@gyorsposta.hu>: |
| * suggestmgr.cxx: fix stemming |
| * AUTHORS, COPYING, ChangeLog: set CC-LGPL free software license |
| |
| 2004-05-25 Varga Dániel <daniel@all.hu> |
| * src/stemtool: new subproject |
| |
| 2005-05-25 Halácsy Péter <peter@halacsy.com> |
| * AUTHORS, COPYING: set CC Attribution license |
| |
| 2004-05-23 Varga Dániel <daniel@all.hu> |
| * src: - modifications for compiling with Visual C++ |
| |
| * src/hunmorph/csutil.cxx: correcting header of flag_qsort(), |
| * src/hunmorph/*: correct csutil include |
| |
| 2005-05-19 Németh László <nemethl@gyorsposta.hu> |
| * csutil.cxx: fix loop condition in lineuniq() |
| bug reported by Viktor Nagy (nagyv nyelvtud hu). |
| |
| * morphbase.cxx: handle PSEUDOROOT with zero affixes |
| bug reported by Viktor Nagy (nagyv nyelvtud hu). |
| * tests/zeroaffix.*: add zeroaffix tests |
| |
| 2005-04-09 Németh László <nemethl@gyorsposta.hu> |
| * config.h.in: reset with autoheader |
| |
| * src/hunspell/hunspell.cxx: set version |
| |
| 2005-04-06 Németh László <nemethl@gyorsposta.hu> |
| * tests: tests |
| |
| * src/morphbase: |
| New optional parameters in affix file: |
| - PSEUDOROOT: for forbidding root with not forbidden suffixed forms. |
| - COMPOUNDWORDMAX: max. words in compounds (default is no limit) |
| - COMPOUNDROOT: signs compounds in dictionary for handling special compound rules |
| - remove COMPOUNDWORD, ONLYROOT |
| |
| 2005-03-21 Németh László <nemethl@gyorsposta.hu> |
| * src/morphbase/*: |
| - 2-byte flags, FLAG_NUM, FLAG_LONG |
| - CIRCUMFIX: signed suffixes and prefixes can only occur together |
| - ONLYINCOMPOUND for fogemorphemes (Swedish, Danish) or Fuge-elements (German) |
| - COMPOUNDBEGIN: allow signed roots, and roots with signed suffix in begin of compounds |
| - COMPOUNDMIDDLE: like before, but middle of compounds |
| - COMPOUNDEND: like before, but end of compounds |
| - remove COMPOUNDFIRST, COMPOUNDLAST |