 .TH hunspell 3 "2011-02-01"
 .LO 1
 .hy 0
 .SH NAME
 \fBhunspell\fR - spell checking, stemming, morphological generation and analysis
 .SH SYNOPSIS
 \fB#include <hunspell/hunspell.hxx> /* or */\fR
 .br
 \fB#include <hunspell/hunspell.h>\fR
 .br
 .sp
 .BI "Hunspell(const char *" affpath ", const char *" dpath );
 .sp
 .BI "Hunspell(const char *" affpath ", const char *" dpath ", const char * " key );
 .sp
 .BI "~Hunspell(" );
 .sp
 .BI "int add_dic(const char *" dpath );
 .sp
 .BI "int add_dic(const char *" dpath ", const char *" key );
 .sp
 .BI "int spell(const char *" word );
 .sp
 .BI "int spell(const char *" word ", int *" info ", char **" root );
 .sp
 .BI "int suggest(char***" slst ", const char *" word);
 .sp
 .BI "int analyze(char***" slst ", const char *" word);
 .sp
 .BI "int stem(char***" slst ", const char *" word);
 .sp
 .BI "int stem(char***" slst ", char **" morph ", int " n);
 .sp
 .BI "int generate(char***" slst ", const char *" word ", const char *" word2);
 .sp
 .BI "int generate(char***" slst ", const char *" word ", char **" desc ", int " n);
 .sp
 .BI "void free_list(char ***" slst ", int " n);
 .sp
 .BI "int add(const char *" word);
 .sp
 .BI "int add_with_affix(const char *" word ", const char *" example);
 .sp
 .BI "int remove(const char *" word);
 .sp
 .BI "char * get_dic_encoding(" );
 .sp
 .BI "const char * get_wordchars(" );
 .sp
 .BI "unsigned short * get_wordchars_utf16(int *" len);
 .sp
 .BI "struct cs_info * get_csconv(" );
 .sp
 .BI "const char * get_version(" );
 .SH DESCRIPTION
 The \fBHunspell\fR library routines give the user word-level
 linguistic functions: spell checking and correction, stemming,
 morphological generation and analysis in item-and-arrangement style.
 .PP
 The optional C header provides a C interface to the C++ library. It
 uses Hunspell_create and Hunspell_destroy as constructor and destructor,
 and its wrapper functions take an extra HunHandle parameter (the
 allocated object); see the C header file \fBhunspell.h\fR.
 .PP
 The basic spelling functions, \fBspell()\fR and \fBsuggest()\fR, can
 also perform stemming, morphological generation and analysis when
 given XML input (see XML API).
 .
 .SS Constructor and destructor
 Hunspell's constructor takes the paths of the affix and dictionary files.
 See the \fBhunspell\fR(4) manual page for the dictionary format.
 The optional \fBkey\fR parameter is for dictionaries encrypted with
 the \fBhzip\fR tool of the Hunspell distribution.
 .
 .SS Extra dictionaries
 The add_dic() function loads an extra dictionary file.
 The extra dictionaries use the affix file of the allocated Hunspell
 object. The maximum number of extra dictionaries is fixed in the source code (20).
 .
 .SS Spelling and correction
 The spell() function returns non-zero if the input word is recognised
 by the spell checker, and zero if it is not. Optional reference
 variables return a bit array (info) and the root word of the input word.
 The info bits, tested with the SPELL_COMPOUND, SPELL_FORBIDDEN and
 SPELL_WARN macros, mark compound words, explicitly forbidden words and
 probably bad words.
 From version 1.3, the non-zero return value is 2 for dictionary
 words with the flag "WARN" (probably bad words).
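 .PP
 The return values described above can be given names in calling code.
 The following is only a sketch: the SpellResult type and the
 classify_spell_result() helper are hypothetical and not part of the
 Hunspell library. The helper takes the integer returned by spell() and
 assumes version 1.3 or later, where 2 marks words with the "WARN" flag.
 .PP
 .RS
 .nf
```cpp
#include <cassert>

// Hypothetical names for the documented spell() return values.
enum class SpellResult { Unknown, Accepted, Warned };

// Maps the int returned by spell() to a named result.
// Assumes version 1.3 or later, where 2 marks WARN-flagged words.
SpellResult classify_spell_result(int rc) {
    if (rc == 0) return SpellResult::Unknown;  // word not recognised
    if (rc == 2) return SpellResult::Warned;   // recognised, probably bad
    return SpellResult::Accepted;              // any other non-zero value
}
```
 .fi
 .RE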
 .PP
 The suggest() function takes two parameters: a reference variable
 for the output suggestion list, and the input word. The function returns
 the number of suggestions. The reference variable
 will contain the address of the newly allocated suggestion list, or NULL
 if the return value of suggest() is zero. The maximum number of suggestions
 is fixed in the source code.
 .PP
 The spell() and suggest() functions also recognize XML input; see the XML API section.
 .
 .SS Morphological functions
 The plain stem() and analyze() functions are similar to suggest(), but
 instead of suggestions they return stems and the results of the
 morphological analysis. The plain generate() expects a second word as
 well. This extra word and its affixation serve as the model for the
 morphological generation of the requested forms of the first word.
 .PP
 The extended stem() and generate() use the results of a
 morphological analysis:
 .PP
 .RS
 .nf
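 char ** result, ** result2;  // both are lists; each needs its own "**"
 int n1 = analyze(&result, "words");
 int n2 = stem(&result2, result, n1);
 free_list(&result2, n2);     // release both lists when done
 free_list(&result, n1);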
 .fi
 .RE
 .PP
 The morphological annotation of the Hunspell library uses fixed
 field identifiers (two letters and a colon); see the
 \fBhunspell\fR(4) manual page.
 .PP
 .RS
 .nf
 char ** result;
 char affix[] = "is:plural"; // the description also depends on the dictionary
 char * desc[] = { affix };  // generate() takes a writable char ** list
 int n = generate(&result, "word", desc, 1);
 for (int i = 0; i < n; i++) puts(result[i]);
 free_list(&result, n);
 .fi
 .RE
 .PP
 .SS Memory deallocation
 The free_list() function frees the memory allocated by the suggest(),
 analyze(), generate() and stem() functions.
 .SS Other functions
 The add(), add_with_affix() and remove() functions are helpers for a
 personal dictionary implementation: they add words to and remove words
 from the base dictionary at run time. The add_with_affix() function
 uses a second word as a model of the affixation allowed for the new word.
 .PP
 The get_dic_encoding() function returns "ISO8859-1" or the character
 encoding defined in the affix file with the "SET" keyword.
 .PP
 The get_csconv() function returns the 8-bit character case table of the
 encoding of the dictionary.
 .PP
 The get_wordchars() and get_wordchars_utf16() functions return the
 extra word characters defined in the affix file for tokenization by
 the "WORDCHARS" keyword.
 .PP
 The get_version() function returns the version string of the library.
 .SS XML API
 The spell() function returns non-zero for the input "<?xml?>",
 which indicates XML API support.
 .PP
 The suggest() function stems, analyzes and generates the forms of the
 input word if it is passed in one of the following "SPELLML" syntaxes:
 .PP
 .RS
 .nf
 <?xml?>
 <query type="analyze">
 <word>dogs</word>
 </query>
 .fi
 .RE
 .PP

 .PP
 .RS
 .nf
 <?xml?>
 <query type="stem">
 <word>dogs</word>
 </query>
 .fi
 .RE
 .PP

 .PP
 .RS
 .nf
 <?xml?>
 <query type="generate">
 <word>dog</word>
 <word>cats</word>
 </query>
 .fi
 .RE
 .PP

 .PP
 .RS
 .nf
 <?xml?>
 <query type="generate">
 <word>dog</word>
 <code><a>is:pl</a><a>is:poss</a></code>
 </query>
 .fi
 .RE
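 .PP
 The query syntaxes above are plain strings, so they can be assembled
 before being passed to suggest() as the input word. The following is a
 minimal sketch: word_element() and spellml_query() are hypothetical
 helpers, not part of the Hunspell library, and they perform no XML
 escaping of their inputs.
 .PP
 .RS
 .nf
```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of Hunspell: wraps a string in <word> tags.
std::string word_element(const std::string& w) {
    return "<word>" + w + "</word>";
}

// Hypothetical helper: builds a query in the layout shown above, with
// one element per line. No XML escaping is performed on the inputs.
std::string spellml_query(const std::string& type,
                          const std::vector<std::string>& elements) {
    assert(!type.empty());  // a query needs a type such as "stem"
    std::ostringstream out;
    out << "<?xml?>" << std::endl;
    out << "<query type=" << '"' << type << '"' << ">" << std::endl;
    for (const std::string& e : elements) out << e << std::endl;
    out << "</query>";
    return out.str();
}
```
 .fi
 .RE
 .PP
 For instance, spellml_query("generate", {word_element("dog"),
 word_element("cats")}) yields the text of the third query above.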
 .PP

 The outputs of the type="stem" query and the stem() library function
 are the same. The output of the type="analyze" query is a string containing
 a <code><a>result1</a><a>result2</a>...</code> element. This
 element can be used in the second syntax of the type="generate" query.
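 .PP
 When the individual results of such a <code> element are needed, the
 <a> items can be split out with ordinary string handling. The
 following is only a sketch: code_items() is a hypothetical helper, not
 part of the Hunspell library, and it scans for substrings without any
 XML unescaping.
 .PP
 .RS
 .nf
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of Hunspell: collects the text of each
// <a>...</a> item found in the given string, in order of appearance.
std::vector<std::string> code_items(const std::string& xml) {
    const std::string open = "<a>";
    const std::string close = "</a>";
    std::vector<std::string> items;
    std::string::size_type pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::string::size_type start = pos + open.size();
        const std::string::size_type end = xml.find(close, start);
        if (end == std::string::npos) break;  // unterminated item: stop
        items.push_back(xml.substr(start, end - start));
        pos = end + close.size();
    }
    return items;
}
```
 .fi
 .RE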
 .SH EXAMPLE
 See analyze.cxx in the Hunspell distribution.
 .SH AUTHORS
 Hunspell is based on Ispell's spell checking algorithms and OpenOffice.org's MySpell source code.
 .PP
 Author of International Ispell is Geoff Kuenning.
 .PP
 Author of MySpell is Kevin Hendricks.
 .PP
 Author of Hunspell is László Németh.
 .PP
 Author of the original C API is Caolan McNamara.
 .PP
 Author of the Aspell table-driven phonetic transcription algorithm and code is Björn Jacke.
 .PP
 See also the THANKS and Changelog files of the Hunspell distribution.