FreeLing  3.0
idioma Class Reference

Class "idioma" implements a visible Markov model that computes the probability that a text is written in a given language.

#include <idioma.h>


Public Member Functions

 idioma ()
 null constructor
 idioma (const std::wstring &)
 Constructor, given the model file to load.
double sequence_probability (std::wistream &, size_t &) const
 Calculates the probability that the text is in the instance language.
double compute_probability (const std::wstring &, double s=1.0) const
 Compute normalized language probability for given string.
void train (const std::wstring &, const std::wstring &, const std::wstring &)
 Create a new model for the language from the given input file and store it in the given filename, with the given language code.
void train (std::wistream &f, const std::wstring &, const std::wstring &)
std::wstring get_language_code () const
 Get the ISO code for the current language.

Private Member Functions

std::wstring from_writable (const std::wstring &) const
 Convert a trigram from its writable representation in the model file.
std::wstring to_writable (const std::wstring &) const
 Convert a trigram to a writable representation for the model file.
double ProbA (const std::wstring &, const std::wstring &) const
 Consult method for transition probabilities.
double ProbPi (const std::wstring &) const
 Consult method for initial probabilities.
void increment (std::map< std::wstring, double > &, const std::wstring &)
 Increase occurrences of an n-gram.
void increment (std::map< std::pair< std::wstring, std::wstring >, double > &, const std::wstring &, const std::wstring &)
 Increase occurrences of two chained trigrams.
void initial_trigram (std::wistream &, wchar_t &, wchar_t &, wchar_t &) const
 Initial trigram: two fictitious '\n' plus the first actual letter.
std::wstring trigram (wchar_t, wchar_t, wchar_t) const
 build actual trigram from iterators
void create_model (std::wistream &f)
 Create new model from given stream, with given language code.
void save_model (const std::wstring &) const
 Save current model in given file.

Private Attributes

std::wstring LangCode
std::map< std::wstring, double > pa_nom
 State transitions probabilities.
std::map< std::wstring, double > ppi_nom
 Initial probabilities.
std::map< std::wstring, double > pi
 auxiliary for training
std::map< std::wstring, double > A
std::map< std::pair< std::wstring, std::wstring >, double > B
 auxiliary for training
size_t nf
 auxiliary for training

Detailed Description

Class "idioma" implements a visible Markov model that computes the probability that a text is written in a given language.
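
As a rough illustration of the idea, the sketch below scores a text with a visible Markov model whose states are overlapping character trigrams. It is not the FreeLing implementation: the `TrigramModel` struct, the `log_probability` function, and the `floor` fallback for unseen events are all hypothetical names and choices made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a trained model; not the FreeLing data layout.
struct TrigramModel {
    std::map<std::wstring, double> initial;  // P(first trigram)
    std::map<std::pair<std::wstring, std::wstring>, double> transition;  // P(next | prev)
    double floor = 1e-6;  // assumed fallback probability for unseen events
};

// Log-probability of `text` under a visible Markov model over character
// trigrams. The text is padded with two '\n' characters so that the first
// state contains exactly one real character.
double log_probability(const TrigramModel& m, const std::wstring& text) {
    if (text.empty()) return 0.0;
    const std::wstring padded = L"\n\n" + text;

    std::wstring prev = padded.substr(0, 3);
    auto pi = m.initial.find(prev);
    double logp = std::log(pi != m.initial.end() ? pi->second : m.floor);

    // Slide the window one character at a time and add each transition.
    for (std::size_t i = 1; i + 3 <= padded.size(); ++i) {
        const std::wstring cur = padded.substr(i, 3);
        auto a = m.transition.find(std::make_pair(prev, cur));
        logp += std::log(a != m.transition.end() ? a->second : m.floor);
        prev = cur;
    }
    return logp;
}
```

Working in log space avoids underflow on long texts. To pick a language, one would score the same text against one model per language and keep the highest value.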


Constructor & Destructor Documentation

idioma::idioma ( )

null constructor

idioma::idioma ( const std::wstring &  )

Constructor, given the model file to load.


Member Function Documentation

double idioma::compute_probability ( const std::wstring &  ,
double  s = 1.0 
) const

Compute normalized language probability for given string.
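
This page does not say how the probability is normalized or what the parameter `s` does. One common normalization, shown here purely as an assumption, is the per-symbol geometric mean: divide the log-probability of the sequence by its length and exponentiate. The function name `per_symbol_probability` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Assumed normalization (not confirmed by the FreeLing documentation):
// geometric mean of the per-step probabilities, computed from a
// log-probability and the number of steps that produced it.
double per_symbol_probability(double log_prob, std::size_t length) {
    if (length == 0) return 0.0;  // nothing was scored
    return std::exp(log_prob / static_cast<double>(length));
}
```

Normalizing by length makes scores of short and long texts comparable, which a raw sequence probability is not.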

void idioma::create_model ( std::wistream &  f) [private]

Create new model from given stream, with given language code.

std::wstring idioma::from_writable ( const std::wstring &  ) const [private]

Convert a trigram from its writable representation in the model file.

std::wstring idioma::get_language_code ( ) const

Get the ISO code for the current language.

void idioma::increment ( std::map< std::wstring, double > &  ,
const std::wstring &   
) [private]

Increase occurrences of an n-gram.

void idioma::increment ( std::map< std::pair< std::wstring, std::wstring >, double > &  ,
const std::wstring &  ,
const std::wstring &   
) [private]

Increase occurrences of two chained trigrams.
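
Both overloads amount to bumping a counter in a map. A minimal sketch with the same parameter types follows; the free-function name `increment_count` is made up for this example and the bodies are an assumption, since the page only gives the signatures.

```cpp
#include <map>
#include <string>
#include <utility>

// Count one more occurrence of a single n-gram. operator[] value-initializes
// a missing entry to 0.0, so no existence check is needed.
void increment_count(std::map<std::wstring, double>& counts,
                     const std::wstring& ngram) {
    counts[ngram] += 1.0;
}

// Count one more occurrence of `second` directly following `first`.
void increment_count(std::map<std::pair<std::wstring, std::wstring>, double>& counts,
                     const std::wstring& first,
                     const std::wstring& second) {
    counts[std::make_pair(first, second)] += 1.0;
}
```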

void idioma::initial_trigram ( std::wistream &  ,
wchar_t &  ,
wchar_t &  ,
wchar_t &   
) const [private]

Initial trigram: two fictitious '\n' plus the first actual letter.
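
The padding step can be sketched as follows. `read_initial_trigram` is a hypothetical helper: unlike the member above it returns a `bool` to report an empty stream, which is a choice made for this example only.

```cpp
#include <istream>
#include <sstream>

// Fill (a, b, c) with two padding '\n' characters followed by the first
// character read from `in`. Returns false if no character could be read.
bool read_initial_trigram(std::wistream& in, wchar_t& a, wchar_t& b, wchar_t& c) {
    a = L'\n';
    b = L'\n';
    return static_cast<bool>(in.get(c));  // get() does not skip whitespace
}
```

Padding with characters that act as a start marker lets the very first letter of the text be scored in the same way as every later one.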

double idioma::ProbA ( const std::wstring &  ,
const std::wstring &   
) const [private]

Consult method for transition probabilities.

double idioma::ProbPi ( const std::wstring &  ) const [private]

Consult method for initial probabilities.

void idioma::save_model ( const std::wstring &  ) const [private]

Save current model in given file.

double idioma::sequence_probability ( std::wistream &  ,
size_t &   
) const

Calculates the probability that the text is in the instance language.

std::wstring idioma::to_writable ( const std::wstring &  ) const [private]

Convert a trigram to a writable representation for the model file.
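
Trigrams can contain spaces and newlines, which would break a whitespace-delimited text file, so some reversible encoding is needed. The sketch below shows one possible scheme. The placeholder characters (`_` for space, `#` for newline) and the function names are assumptions for illustration; the encoding FreeLing actually uses is not stated on this page.

```cpp
#include <string>

// Assumed placeholder scheme (not taken from FreeLing):
// space <-> '_' and newline <-> '#'.
std::wstring to_writable_sketch(std::wstring s) {
    for (wchar_t& ch : s) {
        if (ch == L' ') ch = L'_';
        else if (ch == L'\n') ch = L'#';
    }
    return s;
}

// Inverse mapping. The round trip is only lossless when the original
// trigram contains neither '_' nor '#'.
std::wstring from_writable_sketch(std::wstring s) {
    for (wchar_t& ch : s) {
        if (ch == L'_') ch = L' ';
        else if (ch == L'#') ch = L'\n';
    }
    return s;
}
```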

void idioma::train ( const std::wstring &  ,
const std::wstring &  ,
const std::wstring &   
)

Create a new model for the language from the given input file and store it in the given filename, with the given language code.
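
Training a model of this kind usually ends by turning raw occurrence counts into probabilities. A minimal sketch using plain relative frequencies is shown below; whether FreeLing applies this estimate, smoothing, or something else is not stated here, and `relative_frequencies` is a hypothetical name.

```cpp
#include <map>
#include <string>

// Convert occurrence counts into relative frequencies that sum to 1.
// Returns an empty map when there is nothing to normalize.
std::map<std::wstring, double> relative_frequencies(
        const std::map<std::wstring, double>& counts) {
    double total = 0.0;
    for (const auto& kv : counts) total += kv.second;

    std::map<std::wstring, double> probs;
    if (total <= 0.0) return probs;
    for (const auto& kv : counts) probs[kv.first] = kv.second / total;
    return probs;
}
```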

void idioma::train ( std::wistream &  f,
const std::wstring &  ,
const std::wstring &   
)

std::wstring idioma::trigram ( wchar_t  ,
wchar_t  ,
wchar_t   
) const [private]

build actual trigram from iterators


Member Data Documentation

std::map<std::wstring,double> idioma::A [private]
std::map<std::pair<std::wstring,std::wstring>,double> idioma::B [private]

auxiliary for training

std::wstring idioma::LangCode [private]
size_t idioma::nf [private]

auxiliary for training

std::map<std::wstring,double> idioma::pa_nom [private]

State transitions probabilities.

std::map<std::wstring,double> idioma::pi [private]

auxiliary for training

std::map<std::wstring,double> idioma::ppi_nom [private]

Initial probabilities.

