FreeLing 3.0
The class hmm_tagger implements the syntactic analyzer and is the main class, which uses all the others.
#include <hmm_tagger.h>


Public Member Functions | |
| hmm_tagger (const std::wstring &, const std::wstring &, bool, unsigned int, unsigned int kb=1) | |
| Constructor. | |
| void | annotate (sentence &) |
| analyze given sentence | |
| double | SequenceProb_log (const sentence &, int k=0) |
| Given an *annotated* sentence, compute (log) probability of k-th best sequence according to HMM parameters. | |
Private Member Functions | |
| bool | is_forbidden (const std::wstring &, sentence::const_iterator) const |
| check if a trigram is in the forbidden list. | |
| double | ProbA_log (const std::wstring &, const std::wstring &, sentence::const_iterator) |
| Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available. | |
| double | ProbB_log (const std::wstring &, const word &) |
| Compute emission log_probability for observation obs from state_i. | |
| double | ProbPi_log (const std::wstring &) const |
| Compute initial log_probability for state_i. | |
| std::list< emission_states > | FindStates (const sentence &) const |
| compute possible emission states for each word in sentence. | |
Private Attributes | |
| std::wstring | Language |
| std::map< std::wstring, double > | PTag |
| maps to store the probabilities | |
| std::map< std::wstring, double > | PBg |
| std::map< std::wstring, double > | PTrg |
| std::map< std::wstring, double > | PInitial |
| std::map< std::wstring, double > | PWord |
| std::multimap< std::wstring, std::wstring > | Forbidden |
| set of hand-specified forbidden bigram and trigram transitions | |
| std::map< std::wstring, double > | pA_cache |
| probability caches, to speed up computations | |
| std::map< std::wstring, double > | pB_cache |
| unsigned int | kbest |
| number of best paths to compute | |
| double | c [3] |
| coefficients to compute linear interpolation | |
hmm_tagger::hmm_tagger (const std::wstring & lang, const std::wstring & hmmFile, bool rtk, unsigned int force, unsigned int kb = 1)
Constructor: builds an HMM tagger, loading the probability tables.
References c, ERROR_CRASH, Forbidden, analysis::get_short_tag(), kbest, Language, util::open_utf8_file(), PBg, PInitial, PTag, PTrg, PWord, TRACE, util::vector2wstring(), WARNING, and util::wstring2vector().
void hmm_tagger::annotate (sentence & se) [virtual]
Analyzes the given sentence, disambiguating it with the provided options.
Implements POS_tagger.
References util::double2wstring(), FindStates(), util::int2wstring(), kbest, Language, ProbA_log(), ProbB_log(), ProbPi_log(), TRACE, and trellis::ZERO_logprob.
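The references to a trellis, to kbest and to the three log-probability functions suggest a Viterbi-style search. The following is a minimal sketch of that idea for the single best path only (not the k-best variant), and it is not the FreeLing implementation. The names State, HmmScores and viterbi_best_path are hypothetical, and the three callbacks merely stand in for ProbPi_log(), ProbA_log() and ProbB_log().

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

using State = std::wstring;

// Hypothetical scoring callbacks (stand-ins for ProbPi_log, ProbA_log, ProbB_log).
struct HmmScores {
    std::function<double(const State&)> log_pi;               // initial log-probability
    std::function<double(const State&, const State&)> log_a;  // transition log-probability
    std::function<double(const State&, std::size_t)> log_b;   // emission log-probability at a word position
};

// Return the single best state sequence. candidates[i] lists the states
// allowed at position i. An empty path is returned if the input is empty
// or any position has no candidate state.
inline std::vector<State> viterbi_best_path(
        const HmmScores& f,
        const std::vector<std::vector<State>>& candidates) {
    const std::size_t n = candidates.size();
    if (n == 0) return {};
    for (const auto& c : candidates)
        if (c.empty()) return {};

    const double neg_inf = -std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> delta(n);      // best score of a path ending in each state
    std::vector<std::vector<std::size_t>> back(n);  // index of the best predecessor state

    // Initialisation: initial probability plus emission of the first word.
    delta[0].resize(candidates[0].size());
    back[0].assign(candidates[0].size(), 0);
    for (std::size_t j = 0; j < candidates[0].size(); ++j)
        delta[0][j] = f.log_pi(candidates[0][j]) + f.log_b(candidates[0][j], 0);

    // Recursion: extend the best path into each state at position i.
    for (std::size_t i = 1; i < n; ++i) {
        delta[i].assign(candidates[i].size(), neg_inf);
        back[i].assign(candidates[i].size(), 0);
        for (std::size_t j = 0; j < candidates[i].size(); ++j) {
            for (std::size_t k = 0; k < candidates[i - 1].size(); ++k) {
                const double score =
                    delta[i - 1][k] + f.log_a(candidates[i - 1][k], candidates[i][j]);
                if (score > delta[i][j]) {
                    delta[i][j] = score;
                    back[i][j] = k;
                }
            }
            delta[i][j] += f.log_b(candidates[i][j], i);
        }
    }

    // Termination: pick the best final state, then follow the back-pointers.
    std::size_t best = 0;
    for (std::size_t j = 1; j < delta[n - 1].size(); ++j)
        if (delta[n - 1][j] > delta[n - 1][best]) best = j;

    std::vector<State> path(n);
    for (std::size_t i = n; i-- > 0;) {
        path[i] = candidates[i][best];
        best = back[i][best];
    }
    return path;
}
```

Working in log space turns products of probabilities into sums, which avoids numeric underflow on long sentences.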
list< emission_states > hmm_tagger::FindStates (const sentence & sent) const [private]
Compute possible emission states for each word in the sentence.
Obtain a list of the states that *may* have emitted the current observation (a sentence).
References Language, and TRACE.
Referenced by annotate().
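Since states are tag bigrams (s=t1.t2), one way to picture the candidate states is to pair each possible tag of a word with each possible tag of the preceding word. The sketch below works under that assumption. The function name bigram_states, the plain vector-of-tags input and the start_tag marker for the first word are all hypothetical and are not taken from FreeLing.

```cpp
#include <string>
#include <vector>

// tags[i] holds the candidate tags of word i. The result holds, for each
// word, the candidate bigram states "previous_tag.current_tag".
// start_tag is a hypothetical marker used as the "previous tag" of the first word.
inline std::vector<std::vector<std::wstring>> bigram_states(
        const std::vector<std::vector<std::wstring>>& tags,
        const std::wstring& start_tag) {
    std::vector<std::vector<std::wstring>> states;
    std::vector<std::wstring> prev{start_tag};
    for (const auto& cur : tags) {
        std::vector<std::wstring> here;
        for (const auto& p : prev)
            for (const auto& t : cur)
                here.push_back(p + L"." + t);
        states.push_back(here);
        prev = cur;  // the current tags become the left context of the next word
    }
    return states;
}
```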
bool hmm_tagger::is_forbidden (const std::wstring &, sentence::const_iterator) const [private]
Check if a trigram is in the forbidden list.
References Forbidden, Language, TRACE, util::vector2wstring(), and util::wstring2vector().
Referenced by ProbA_log().
double hmm_tagger::ProbA_log (const std::wstring & state_i, const std::wstring & state_j, sentence::const_iterator w) [private]
Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available.
If the trigram is in the "forbidden" list, the resulting probability is zero.
References c, util::double2wstring(), is_forbidden(), pA_cache, PBg, PTag, PTrg, and TRACE.
Referenced by annotate(), and SequenceProb_log().
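The three coefficients in c and the unigram, bigram and trigram tables point to linear interpolation smoothing. A minimal sketch of that technique follows. It assumes the tables hold plain probabilities keyed by dot-joined tags, which may differ from the actual file format and from the smoothing FreeLing really applies. TransitionModel, lookup and transition_logprob are hypothetical names.

```cpp
#include <cmath>
#include <limits>
#include <map>
#include <set>
#include <string>

// Hypothetical container; the class itself keeps this data in
// PTag, PBg, PTrg, Forbidden and c.
struct TransitionModel {
    std::map<std::wstring, double> unigram;  // P(t3), keyed "t3"
    std::map<std::wstring, double> bigram;   // P(t3 | t2), keyed "t2.t3"
    std::map<std::wstring, double> trigram;  // P(t3 | t1, t2), keyed "t1.t2.t3"
    std::set<std::wstring> forbidden;        // forbidden trigrams, keyed "t1.t2.t3"
    double c[3];                             // interpolation weights, expected to sum to 1
};

// Return the stored probability, or 0 when the n-gram was never observed.
inline double lookup(const std::map<std::wstring, double>& table,
                     const std::wstring& key) {
    auto it = table.find(key);
    return it == table.end() ? 0.0 : it->second;
}

// log P(t3 | t1, t2), smoothed by linear interpolation:
//   c[0]*P(t3) + c[1]*P(t3 | t2) + c[2]*P(t3 | t1, t2)
// Forbidden trigrams get probability zero, i.e. log-probability -infinity.
inline double transition_logprob(const TransitionModel& m,
                                 const std::wstring& t1,
                                 const std::wstring& t2,
                                 const std::wstring& t3) {
    const std::wstring bg = t2 + L"." + t3;
    const std::wstring trg = t1 + L"." + bg;
    if (m.forbidden.count(trg) > 0)
        return -std::numeric_limits<double>::infinity();
    const double p = m.c[0] * lookup(m.unigram, t3)
                   + m.c[1] * lookup(m.bigram, bg)
                   + m.c[2] * lookup(m.trigram, trg);
    return std::log(p);  // yields -infinity if every table lacks evidence
}
```

With interpolation, an unseen trigram still receives probability mass from the bigram and unigram estimates, so it does not collapse to zero.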
double hmm_tagger::ProbB_log (const std::wstring & state_i, const word & obs) [private]
Compute emission log_probability for observation obs from state_i.
Pb = P(word|state) = P(state|word) * P(word) / P(state)
Since states are bigrams: s=t1.t2
References util::double2wstring(), word::get_lc_form(), Language, PTag, PWord, and TRACE.
Referenced by annotate(), and SequenceProb_log().
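In log space, the Bayes inversion above becomes a sum and a difference. The sketch below shows only that arithmetic. The function name emission_logprob is hypothetical, and how the three probabilities are obtained from PTag, PWord and the word's analyses is not shown here.

```cpp
#include <cmath>

// log P(word | state) from Bayes' rule:
//   P(word | state) = P(state | word) * P(word) / P(state)
// All three arguments are plain probabilities in (0, 1].
inline double emission_logprob(double p_state_given_word,
                               double p_word,
                               double p_state) {
    return std::log(p_state_given_word) + std::log(p_word) - std::log(p_state);
}
```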
double hmm_tagger::ProbPi_log (const std::wstring & state_i) const [private]
Compute initial log_probability for state_i.
References PInitial, and trellis::ZERO_logprob.
Referenced by annotate(), and SequenceProb_log().
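The reference to trellis::ZERO_logprob suggests a table lookup with a fallback for states never seen in initial position. A minimal sketch under that assumption follows. The constant kZeroLogProb is a hypothetical stand-in whose value is not taken from FreeLing, and the table is assumed to already hold log-probabilities.

```cpp
#include <limits>
#include <map>
#include <string>

// Hypothetical stand-in for trellis::ZERO_logprob (log of probability zero).
const double kZeroLogProb = -std::numeric_limits<double>::infinity();

// Look up the initial log-probability of a state, falling back to the
// "zero" log-probability when the state is not in the table.
inline double initial_logprob(const std::map<std::wstring, double>& log_initial,
                              const std::wstring& state) {
    auto it = log_initial.find(state);
    return it == log_initial.end() ? kZeroLogProb : it->second;
}
```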
double hmm_tagger::SequenceProb_log (const sentence & se, int k = 0)
Given an *annotated* sentence, compute the (log) probability of the k-th best sequence according to the HMM parameters.
References Language, ProbA_log(), ProbB_log(), and ProbPi_log().
double hmm_tagger::c[3] [private]
coefficients to compute linear interpolation
Referenced by hmm_tagger(), and ProbA_log().
std::multimap<std::wstring, std::wstring> hmm_tagger::Forbidden [private]
set of hand-specified forbidden bigram and trigram transitions
Referenced by hmm_tagger(), and is_forbidden().
unsigned int hmm_tagger::kbest [private]
number of best paths to compute
Referenced by annotate(), and hmm_tagger().
std::wstring hmm_tagger::Language [private]
Referenced by annotate(), FindStates(), hmm_tagger(), is_forbidden(), ProbB_log(), and SequenceProb_log().
std::map<std::wstring,double> hmm_tagger::pA_cache [private]
probability caches, to speed up computations
Referenced by ProbA_log().
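A map-based cache of this kind typically memoizes a computed value under a string key, so that repeated queries skip the computation. The sketch below shows the general pattern only. The class LogProbCache and its key format are hypothetical and are not part of FreeLing.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Memoize a log-probability computation under a string key.
class LogProbCache {
public:
    explicit LogProbCache(std::function<double(const std::wstring&)> compute)
        : compute_(std::move(compute)) {}

    // Return the cached value for key, computing and storing it on first use.
    double get(const std::wstring& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        const double value = compute_(key);
        cache_.emplace(key, value);
        return value;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::function<double(const std::wstring&)> compute_;
    std::map<std::wstring, double> cache_;
};
```

Because a cache like this is modified on lookup, a member function that uses it cannot be const unless the cache is declared mutable.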
std::map<std::wstring,double> hmm_tagger::pB_cache [private]
std::map<std::wstring, double> hmm_tagger::PBg [private]
Referenced by hmm_tagger(), and ProbA_log().
std::map<std::wstring, double> hmm_tagger::PInitial [private]
Referenced by hmm_tagger(), and ProbPi_log().
std::map<std::wstring, double> hmm_tagger::PTag [private]
maps to store the probabilities
Referenced by hmm_tagger(), ProbA_log(), and ProbB_log().
std::map<std::wstring, double> hmm_tagger::PTrg [private]
Referenced by hmm_tagger(), and ProbA_log().
std::map<std::wstring, double> hmm_tagger::PWord [private]
Referenced by hmm_tagger(), and ProbB_log().