FreeLing 3.0
Class affixes implements suffixation and prefixation rules and dictionary search for affixed word forms.
#include <suffixes.h>

Public Member Functions | |
| affixes (const std::wstring &, const std::wstring &) | |
| Constructor. | |
| void | look_for_affixes (word &, dictionary &) |
| look up possible roots of a suffixed/prefixed form | |
Private Member Functions | |
| void | look_for_affixes_in_list (int, std::multimap< std::wstring, sufrule > &, word &, dictionary &) const |
| find all applicable affix rules for a word | |
| void | look_for_combined_affixes (std::multimap< std::wstring, sufrule > &, std::multimap< std::wstring, sufrule > &, word &, dictionary &) const |
| find all applicable prefix+suffix rule combinations for a word | |
| std::set< std::wstring > | GenerateRoots (int, const sufrule &, const std::wstring &) const |
| generate roots according to rules. | |
| void | SearchRootsList (std::set< std::wstring > &, const std::wstring &, sufrule &, word &, dictionary &) const |
| find roots in dictionary and apply matching rules | |
| void | ApplyRule (const std::wstring &, const std::list< analysis > &, const std::wstring &, sufrule &, word &, dictionary &) const |
| actually apply an affix rule | |
| void | CheckRetokenizable (const sufrule &, const std::wstring &, const std::wstring &, const std::wstring &, dictionary &, std::list< word > &, int) const |
| auxiliary method to deal with retokenization | |
Private Attributes | |
| accents | accen |
| Language-specific accent handler. | |
| std::multimap< std::wstring, sufrule > | affix [2] |
| all suffixation/prefixation rules | |
| std::multimap< std::wstring, sufrule > | affix_always [2] |
| suffixation/prefixation rules applied unconditionally | |
| std::set< unsigned int > | ExistingLength [2] |
| index of existing suffix/prefix lengths. | |
| unsigned int | Longest [2] |
| Length of longest suffix/prefix. | |
Class affixes implements suffixation and prefixation rules and dictionary search for affixed word forms.
affixes::affixes (const std::wstring & Lang, const std::wstring & sufFile)
Constructor.
Create a suffixed words analyzer.
References sufrule::acc, affix, affix_always, sufrule::always, sufrule::enc, ERROR_CRASH, ExistingLength, util::int2wstring(), sufrule::lema, Longest, sufrule::nomore, util::open_utf8_file(), sufrule::output, PREF, sufrule::retok, SUF, sufrule::term, and TRACE.
void affixes::ApplyRule (const std::wstring &, const std::list< analysis > &, const std::wstring &, sufrule &, word &, dictionary &) const [private]
actually apply an affix rule
Actually apply a rule.
References word::add_analysis(), util::capitalization(), CheckRetokenizable(), sufrule::cond, sufrule::expression, word::get_form(), word::get_lc_form(), sufrule::lema, sufrule::nomore, sufrule::output, word::set_found_in_dict(), analysis::set_retokenizable(), TRACE, and util::wstring2list().
Referenced by look_for_combined_affixes(), and SearchRootsList().
void affixes::CheckRetokenizable (const sufrule & suf, const std::wstring & form, const std::wstring & lem, const std::wstring & tag, dictionary & dic, std::list< word > & rtk, int caps) const [private]
auxiliary method to deal with retokenization
Check whether the suffix rule carries retokenization information and, if so, create the list of alternative words.
References word::add_analysis(), util::capitalize(), word::get_form(), word::get_lemma(), word::get_tag(), sufrule::retok, dictionary::search_form(), word::set_form(), TRACE, and util::wstring2list().
Referenced by ApplyRule().
set< wstring > affixes::GenerateRoots (int kind, const sufrule & suf, const std::wstring & rt) const [private]
generate roots according to rules.
Generate all candidate forms obtained by expanding root rt with each termination allowed by the given affix rule.
References PREF, SUF, sufrule::term, and TRACE.
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
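The root expansion described for GenerateRoots() can be pictured with a minimal sketch. It is not FreeLing's implementation: RuleSketch, generate_roots, and the numeric values of SUF and PREF are hypothetical stand-ins, and the sketch assumes that a suffix rule appends each termination to the stripped root while a prefix rule prepends it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not FreeLing's definitions.
enum AffixKind { SUF = 0, PREF = 1 };

struct RuleSketch {
  // Terminations to try on the root left after stripping the affix.
  std::vector<std::wstring> term;
};

// Expand root `rt` with every termination listed in the rule.
// Assumption: suffix rules append the termination, prefix rules prepend it.
std::set<std::wstring> generate_roots(int kind, const RuleSketch &rule,
                                      const std::wstring &rt) {
  assert(kind == SUF || kind == PREF);
  std::set<std::wstring> candidates;
  for (const std::wstring &t : rule.term) {
    candidates.insert(kind == SUF ? rt + t : t + rt);
  }
  return candidates;  // the set removes duplicate candidates
}
```

Returning a std::set, as the documented signature does, means each candidate root is looked up in the dictionary at most once.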
void affixes::look_for_affixes (word & w, dictionary & dic)
look up possible roots of a suffixed/prefixed form
Look up possible roots of a suffixed form.
Words that already have an analysis are checked only against the suffix rules marked "always". Words not yet recognized are checked against all the suffix rules.
References affix, affix_always, word::get_form(), word::get_n_analysis(), util::int2wstring(), look_for_affixes_in_list(), look_for_combined_affixes(), PREF, SUF, and TRACE.
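The rule-selection policy of look_for_affixes() can be sketched as follows. This is an illustration under stated assumptions, not the library's code: AffixTablesSketch, RuleSketch, and rules_for are hypothetical names, the two arrays only mirror the documented affix[2] and affix_always[2] attributes, and the indices SUF = 0 and PREF = 1 are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; not FreeLing's definitions.
enum AffixKind { SUF = 0, PREF = 1 };

struct RuleSketch {
  std::wstring output;  // placeholder payload for a rule
};

using RuleTable = std::multimap<std::wstring, RuleSketch>;

struct AffixTablesSketch {
  RuleTable affix[2];         // all rules, indexed by affix kind
  RuleTable affix_always[2];  // only the rules marked "always"

  // A word that already has analyses gets only the "always" rules;
  // a word with no analysis gets the full rule table.
  const RuleTable &rules_for(int kind, std::size_t n_analysis) const {
    assert(kind == SUF || kind == PREF);
    return n_analysis > 0 ? affix_always[kind] : affix[kind];
  }
};
```

A caller would pass the word's current analysis count (word::get_n_analysis() appears in the references above) and then scan whichever table is returned.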
void affixes::look_for_affixes_in_list (int kind, std::multimap< std::wstring, sufrule > & suff, word & w, dictionary & dic) const [private]
find all applicable affix rules for a word
References accen, ExistingLength, accents::fix_accentuation(), GenerateRoots(), word::get_lc_form(), util::int2wstring(), Longest, PREF, SearchRootsList(), SUF, and TRACE.
Referenced by look_for_affixes().
void affixes::look_for_combined_affixes (std::multimap< std::wstring, sufrule > & suff, std::multimap< std::wstring, sufrule > & pref, word & w, dictionary & dic) const [private]
find all applicable prefix+suffix rule combinations for a word
References accen, ApplyRule(), ExistingLength, accents::fix_accentuation(), word::found_in_dict(), GenerateRoots(), word::get_lc_form(), util::int2wstring(), Longest, PREF, SearchRootsList(), word::set_found_in_dict(), SUF, and TRACE.
Referenced by look_for_affixes().
void affixes::SearchRootsList (std::set< std::wstring > &, const std::wstring &, sufrule &, word &, dictionary &) const [private]
find roots in dictionary and apply matching rules
Search the candidate forms in the dictionary, discarding the invalid ones and annotating the valid ones.
References ApplyRule(), util::int2wstring(), dictionary::search_form(), and TRACE.
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
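The filtering step of SearchRootsList() can be sketched as a plain dictionary lookup over the candidate set. The sketch makes several assumptions: DictSketch is a std::map standing in for the real dictionary class, Analysis is a hypothetical (lemma, tag) pair, and the per-rule condition check that ApplyRule() performs is left out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not FreeLing's definitions.
using Analysis = std::pair<std::wstring, std::wstring>;  // (lemma, tag)
using DictSketch = std::map<std::wstring, std::vector<Analysis>>;

// Keep only the candidate roots that exist in the dictionary and
// collect their analyses; unknown candidates are silently discarded.
std::vector<Analysis> search_roots(const std::set<std::wstring> &roots,
                                   const DictSketch &dic) {
  std::vector<Analysis> found;
  for (const std::wstring &r : roots) {
    DictSketch::const_iterator it = dic.find(r);
    if (it == dic.end()) continue;  // not a valid form: discard
    assert(!it->second.empty());    // an entry is expected to carry analyses
    found.insert(found.end(), it->second.begin(), it->second.end());
  }
  return found;
}
```

In the documented method the surviving analyses are not returned but handed to ApplyRule(), which annotates the word.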
accents affixes::accen [private]
Language-specific accent handler.
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
std::multimap<std::wstring,sufrule> affixes::affix[2] [private]
all suffixation/prefixation rules
Referenced by affixes(), and look_for_affixes().
std::multimap<std::wstring,sufrule> affixes::affix_always[2] [private]
suffixation/prefixation rules applied unconditionally
Referenced by affixes(), and look_for_affixes().
std::set<unsigned int> affixes::ExistingLength[2] [private]
index of existing suffix/prefix lengths.
Referenced by affixes(), look_for_affixes_in_list(), and look_for_combined_affixes().
unsigned int affixes::Longest[2] [private]
Length of longest suffix/prefix.
Referenced by affixes(), look_for_affixes_in_list(), and look_for_combined_affixes().
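One plausible use of ExistingLength and Longest is to avoid testing word endings whose length matches no known affix. The sketch below shows that idea for suffixes only. LengthIndexSketch, add_affix, and candidate_suffixes are hypothetical names, and the requirement that at least one character remains as root is an assumption rather than documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one slot of ExistingLength[] and Longest[].
struct LengthIndexSketch {
  std::set<unsigned int> existing_length;  // lengths of known affixes
  unsigned int longest = 0;                // length of the longest affix

  // Record the length of an affix while the rule file is being loaded.
  void add_affix(const std::wstring &affix) {
    assert(!affix.empty());
    unsigned int len = static_cast<unsigned int>(affix.size());
    existing_length.insert(len);
    longest = std::max(longest, len);
  }
};

// Return the endings of `form` worth looking up in the rule table:
// one per recorded length, shortest first.
// Assumption: at least one character must remain as the root.
std::vector<std::wstring> candidate_suffixes(const std::wstring &form,
                                             const LengthIndexSketch &idx) {
  std::vector<std::wstring> out;
  for (unsigned int len : idx.existing_length) {  // ascending order
    if (len >= form.size()) break;
    out.push_back(form.substr(form.size() - len));
  }
  return out;
}
```

With rules for "s", "es", and "mente" loaded, only endings of length 1, 2, and 5 are tried, and lengths 3 and 4 are skipped without any table lookup.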