FreeLing  3.0
hmm_tagger Class Reference

The class hmm_tagger implements the syntactic analyzer and is the main class, which uses all the others.

#include <hmm_tagger.h>

Public Member Functions

 hmm_tagger (const std::wstring &, const std::wstring &, bool, unsigned int, unsigned int kb=1)
 Constructor.
void annotate (sentence &)
 analyze given sentence
double SequenceProb_log (const sentence &, int k=0)
 Given an *annotated* sentence, compute (log) probability of k-th best sequence according to HMM parameters.

Private Member Functions

bool is_forbidden (const std::wstring &, sentence::const_iterator) const
 check if a trigram is in the forbidden list.
double ProbA_log (const std::wstring &, const std::wstring &, sentence::const_iterator)
 Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available.
double ProbB_log (const std::wstring &, const word &)
 Compute emission log_probability for observation obs from state_i.
double ProbPi_log (const std::wstring &) const
 Compute initial log_probability for state_i.
std::list< emission_states > FindStates (const sentence &) const
 compute possible emission states for each word in sentence.

Private Attributes

std::wstring Language
std::map< std::wstring, double > PTag
 maps to store the probabilities
std::map< std::wstring, double > PBg
std::map< std::wstring, double > PTrg
std::map< std::wstring, double > PInitial
std::map< std::wstring, double > PWord
std::multimap< std::wstring, std::wstring > Forbidden
 set of hand-specified forbidden bigram and trigram transitions
std::map< std::wstring, double > pA_cache
 probability caches, to speed up computations
std::map< std::wstring, double > pB_cache
unsigned int kbest
 number of best paths to compute
double c [3]
 coefficients used for linear interpolation

Detailed Description

The class hmm_tagger implements the syntactic analyzer and is the main class, which uses all the others.


Constructor & Destructor Documentation

hmm_tagger::hmm_tagger ( const std::wstring &  lang,
const std::wstring &  hmmFile,
bool  rtk,
unsigned int  force,
unsigned int  kb = 1 
)

Constructor.

Build an HMM tagger, loading probability tables.

References c, ERROR_CRASH, Forbidden, analysis::get_short_tag(), kbest, Language, util::open_utf8_file(), PBg, PInitial, PTag, PTrg, PWord, TRACE, util::vector2wstring(), WARNING, and util::wstring2vector().


Member Function Documentation

void hmm_tagger::annotate ( sentence &  se) [virtual]

analyze given sentence

Disambiguate given sentences with provided options.
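
The references listed for this method (a trellis constant plus the initial, transition and emission log-probabilities) are consistent with a Viterbi-style search. The block below is a minimal sketch of 1-best Viterbi decoding in log space, not FreeLing's implementation: the function name viterbi, the integer state ids and the callback parameters log_pi, log_a and log_b are all assumptions made for illustration, and k-best decoding is not covered.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical helper (not part of FreeLing): 1-best Viterbi in log space.
// States are integer ids in [0, n_states); observations are indexed by position.
std::vector<int> viterbi(int n_obs, int n_states,
                         const std::function<double(int)>& log_pi,
                         const std::function<double(int, int)>& log_a,
                         const std::function<double(int, int)>& log_b) {
    assert(n_obs > 0 && n_states > 0);
    const double NEG_INF = -std::numeric_limits<double>::infinity();
    // delta[t][s]: best log score of any path ending in state s at position t.
    std::vector<std::vector<double>> delta(n_obs, std::vector<double>(n_states, NEG_INF));
    // back[t][s]: predecessor state on that best path.
    std::vector<std::vector<int>> back(n_obs, std::vector<int>(n_states, 0));

    for (int s = 0; s < n_states; ++s)
        delta[0][s] = log_pi(s) + log_b(s, 0);

    for (int t = 1; t < n_obs; ++t)
        for (int j = 0; j < n_states; ++j)
            for (int i = 0; i < n_states; ++i) {
                double cand = delta[t - 1][i] + log_a(i, j) + log_b(j, t);
                if (cand > delta[t][j]) {
                    delta[t][j] = cand;
                    back[t][j] = i;
                }
            }

    // Pick the best final state, then follow the back-pointers.
    int best = 0;
    for (int s = 1; s < n_states; ++s)
        if (delta[n_obs - 1][s] > delta[n_obs - 1][best]) best = s;

    std::vector<int> path(n_obs);
    path[n_obs - 1] = best;
    for (int t = n_obs - 1; t > 0; --t)
        path[t - 1] = back[t][path[t]];
    return path;
}
```

In the class itself, the three callbacks would correspond to ProbPi_log(), ProbA_log() and ProbB_log(), and the candidate states per position would come from FindStates() rather than a fixed range of ids.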

Implements POS_tagger.

References util::double2wstring(), FindStates(), util::int2wstring(), kbest, Language, ProbA_log(), ProbB_log(), ProbPi_log(), TRACE, and trellis::ZERO_logprob.

list< emission_states > hmm_tagger::FindStates ( const sentence &  sent) const [private]

compute possible emission states for each word in sentence.

Obtain a list with the states that *may* have emitted the current observation (a sentence).
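
As a rough illustration, and assuming (as the ProbB_log documentation states) that each state is a tag bigram t1.t2, the candidate states for a word can be formed by pairing every tag of the previous word with every tag of the current word. The sketch below is not FreeLing's code: the type emission_states_sketch, the function find_states, the plain tag vectors used instead of a sentence, and the start marker L"0" are all hypothetical.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for emission_states: a set of "t1.t2" bigram states.
using emission_states_sketch = std::set<std::wstring>;

// For each word, pair every tag of the previous word with every tag of this word.
// tags_per_word[i] holds the candidate tags of the i-th word.
std::list<emission_states_sketch> find_states(
        const std::vector<std::vector<std::wstring>>& tags_per_word) {
    std::list<emission_states_sketch> result;
    std::vector<std::wstring> prev{L"0"};  // assumed sentence-start marker
    for (const auto& tags : tags_per_word) {
        emission_states_sketch states;
        for (const auto& t1 : prev)
            for (const auto& t2 : tags)
                states.insert(t1 + L"." + t2);
        result.push_back(states);
        prev = tags;
    }
    return result;
}
```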

References Language, and TRACE.

Referenced by annotate().

bool hmm_tagger::is_forbidden ( const std::wstring &  ,
sentence::const_iterator   
) const [private]

check if a trigram is in the forbidden list.

References Forbidden, Language, TRACE, util::vector2wstring(), and util::wstring2vector().

Referenced by ProbA_log().

double hmm_tagger::ProbA_log ( const std::wstring &  state_i,
const std::wstring &  state_j,
sentence::const_iterator  w 
) [private]

Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available.

If the trigram is in the "forbidden" list, result is probability zero.
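
The members referenced here (c, PTag, PBg, PTrg, pA_cache) suggest a linearly interpolated trigram model with a result cache. The following is a minimal sketch under that assumption, not the library's code: the struct TransitionModel, the key format "t1.t2.t3", the choice of linear-space probabilities in the maps, and the constant ZERO_LOGPROB (a stand-in for trellis::ZERO_logprob, whose real value is not shown on this page) are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <string>

// Hypothetical stand-in for trellis::ZERO_logprob.
const double ZERO_LOGPROB = -std::numeric_limits<double>::max();

struct TransitionModel {
    double c[3];                            // interpolation weights, assumed to sum to 1
    std::map<std::wstring, double> p_tag;   // assumed: P(t3), keyed by "t3"
    std::map<std::wstring, double> p_bg;    // assumed: P(t3|t2), keyed by "t2.t3"
    std::map<std::wstring, double> p_trg;   // assumed: P(t3|t1,t2), keyed by "t1.t2.t3"
    std::map<std::wstring, double> cache;   // memoized log-probabilities

    // Missing entries count as probability 0 (no evidence).
    static double get(const std::map<std::wstring, double>& m, const std::wstring& k) {
        auto it = m.find(k);
        return it == m.end() ? 0.0 : it->second;
    }

    double log_prob(const std::wstring& t1, const std::wstring& t2,
                    const std::wstring& t3, bool forbidden = false) {
        if (forbidden) return ZERO_LOGPROB;  // forbidden trigram: probability zero
        const std::wstring key = t1 + L"." + t2 + L"." + t3;
        auto hit = cache.find(key);
        if (hit != cache.end()) return hit->second;
        // Back off smoothly from trigram to bigram to unigram evidence.
        double p = c[0] * get(p_tag, t3)
                 + c[1] * get(p_bg, t2 + L"." + t3)
                 + c[2] * get(p_trg, key);
        double lp = p > 0.0 ? std::log(p) : ZERO_LOGPROB;
        cache[key] = lp;
        return lp;
    }
};
```

With this scheme an unseen trigram still receives a non-zero probability as long as its bigram or unigram was observed, which is the "smoothed value" behaviour described above.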

References c, util::double2wstring(), is_forbidden(), pA_cache, PBg, PTag, PTrg, and TRACE.

Referenced by annotate(), and SequenceProb_log().

double hmm_tagger::ProbB_log ( const std::wstring &  state_i,
const word &  obs 
) [private]

Compute emission log_probability for observation obs from state_i.

Pb = P(word|state) = P(state|word) * P(word) / P(state)

Since states are bigrams, s = t1.t2:

  • we approximate P(s) ~= P(t2)
  • we approximate P(s|w) ~= P(t2|w)

Thus: Pb ~= P(t2|w) * P(w) / P(t2)
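
In log space the approximation above becomes a sum and a difference of logarithms. The sketch below illustrates only that arithmetic; the helper names second_tag and emission_log are hypothetical, the probabilities are passed in directly rather than looked up in PTag and PWord, and splitting the state at the first '.' assumes tags themselves contain no dot.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Extract t2 from a bigram state "t1.t2" (assumes tags contain no '.').
std::wstring second_tag(const std::wstring& state) {
    std::wstring::size_type dot = state.find(L'.');
    return dot == std::wstring::npos ? state : state.substr(dot + 1);
}

// log Pb ~= log P(t2|w) + log P(w) - log P(t2)
double emission_log(double p_tag_given_word, double p_word, double p_tag) {
    assert(p_tag_given_word > 0.0 && p_word > 0.0 && p_tag > 0.0);
    return std::log(p_tag_given_word) + std::log(p_word) - std::log(p_tag);
}
```

For example, with P(t2|w) = 0.5, P(w) = 0.01 and P(t2) = 0.1, the emission probability is 0.5 * 0.01 / 0.1 = 0.05, so emission_log returns log(0.05).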

References util::double2wstring(), word::get_lc_form(), Language, PTag, PWord, and TRACE.

Referenced by annotate(), and SequenceProb_log().

double hmm_tagger::ProbPi_log ( const std::wstring &  state_i) const [private]

Compute initial log_probability for state_i.
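
Given that this method references only PInitial and trellis::ZERO_logprob, a plausible reading is a table lookup with a fallback for unseen states. The sketch below shows that pattern; it assumes the map already stores log-probabilities, and both prob_pi_log and ZERO_LOGPROB are hypothetical stand-ins rather than FreeLing names or values.

```cpp
#include <cassert>
#include <limits>
#include <map>
#include <string>

// Hypothetical stand-in for trellis::ZERO_logprob.
const double ZERO_LOGPROB = -std::numeric_limits<double>::max();

// Return the stored initial log-probability of state_i,
// or ZERO_LOGPROB when the state was never seen in initial position.
double prob_pi_log(const std::map<std::wstring, double>& p_initial,
                   const std::wstring& state_i) {
    auto it = p_initial.find(state_i);
    return it != p_initial.end() ? it->second : ZERO_LOGPROB;
}
```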

References PInitial, and trellis::ZERO_logprob.

Referenced by annotate(), and SequenceProb_log().

double hmm_tagger::SequenceProb_log ( const sentence &  se,
int  k = 0 
)

Given an *annotated* sentence, compute (log) probability of k-th best sequence according to HMM parameters.

Given an *annotated* sentence, compute sequence (log) probability according to HMM parameters.
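
For a fixed state sequence, the HMM log-probability is the initial term plus one transition term and one emission term per position. The following sketch shows that sum under the same hypothetical conventions as a generic HMM (integer state ids, callbacks log_pi, log_a and log_b); it is an illustration of the formula, not the library's implementation, and it ignores the selection of the k-th best sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: log-probability of a given state path.
// states[t] is the state assigned to the observation at position t.
double sequence_log_prob(const std::vector<int>& states,
                         const std::function<double(int)>& log_pi,
                         const std::function<double(int, int)>& log_a,
                         const std::function<double(int, int)>& log_b) {
    assert(!states.empty());
    // Initial state and its emission.
    double lp = log_pi(states[0]) + log_b(states[0], 0);
    // One transition and one emission for every later position.
    for (std::size_t t = 1; t < states.size(); ++t)
        lp += log_a(states[t - 1], states[t]) + log_b(states[t], static_cast<int>(t));
    return lp;
}
```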

References Language, ProbA_log(), ProbB_log(), and ProbPi_log().


Member Data Documentation

double hmm_tagger::c[3] [private]

coefficients used for linear interpolation

Referenced by hmm_tagger(), and ProbA_log().

std::multimap<std::wstring, std::wstring> hmm_tagger::Forbidden [private]

set of hand-specified forbidden bigram and trigram transitions

Referenced by hmm_tagger(), and is_forbidden().

unsigned int hmm_tagger::kbest [private]

number of best paths to compute

Referenced by annotate(), and hmm_tagger().

std::wstring hmm_tagger::Language [private]

std::map<std::wstring,double> hmm_tagger::pA_cache [private]

probability caches, to speed up computations

Referenced by ProbA_log().

std::map<std::wstring,double> hmm_tagger::pB_cache [private]

std::map<std::wstring, double> hmm_tagger::PBg [private]

Referenced by hmm_tagger(), and ProbA_log().

std::map<std::wstring, double> hmm_tagger::PInitial [private]

Referenced by hmm_tagger(), and ProbPi_log().

std::map<std::wstring, double> hmm_tagger::PTag [private]

maps to store the probabilities

Referenced by hmm_tagger(), ProbA_log(), and ProbB_log().

std::map<std::wstring, double> hmm_tagger::PTrg [private]

Referenced by hmm_tagger(), and ProbA_log().

std::map<std::wstring, double> hmm_tagger::PWord [private]

Referenced by hmm_tagger(), and ProbB_log().


The documentation for this class was generated from the following files: