FreeLing 3.0
Class word stores all information related to a word: its form, its list of analyses, and its list of tokens (if it is a multiword).
#include <language.h>


Classes | |
| class | const_iterator |
| const_iterator over word analyses (either all, only selected, or only unselected) | |
| class | iterator |
| iterator over word analyses (either all, only selected, or only unselected) | |
Public Member Functions | |
| word () | |
| constructor | |
| word (const std::wstring &) | |
| constructor | |
| word (const std::wstring &, const std::list< word > &) | |
| constructor | |
| word (const std::wstring &, const std::list< analysis > &, const std::list< word > &) | |
| constructor | |
| word (const word &) | |
| Copy constructor. | |
| word & | operator= (const word &) |
| assignment | |
| void | copy_analysis (const word &) |
| copy analysis from another word | |
| int | get_n_selected (int k=0) const |
| Get the number of selected analysis. | |
| int | get_n_unselected (int k=0) const |
| get the number of unselected analysis | |
| bool | is_multiword () const |
| true iff the word is a multiword compound | |
| bool | is_ambiguous_mw () const |
| true iff the word is a multiword marked as ambiguous | |
| void | set_ambiguous_mw (bool) |
| set mw ambiguity status | |
| int | get_n_words_mw () const |
| get number of words in compound | |
| std::list< word > | get_words_mw () const |
| get the word objects that make up the multiword | |
| std::wstring | get_form () const |
| get word form | |
| std::wstring | get_lc_form () const |
| Get word form, lowercased. | |
| std::wstring | get_ph_form () const |
| Get word phonetic form. | |
| word::iterator | selected_begin (int k=0) |
| Get an iterator to the first selected analysis. | |
| word::const_iterator | selected_begin (int k=0) const |
| Get an iterator to the first selected analysis. | |
| word::iterator | selected_end (int k=0) |
| Get an iterator to the end of selected analysis list. | |
| word::const_iterator | selected_end (int k=0) const |
| Get an iterator to the end of selected analysis list. | |
| word::iterator | unselected_begin (int k=0) |
| Get an iterator to the first unselected analysis. | |
| word::const_iterator | unselected_begin (int k=0) const |
| Get an iterator to the first unselected analysis. | |
| word::iterator | unselected_end (int k=0) |
| Get an iterator to the end of unselected analysis list. | |
| word::const_iterator | unselected_end (int k=0) const |
| Get an iterator to the end of unselected analysis list. | |
| unsigned int | num_kbest () const |
| Get how many kbest tags the word has. | |
| std::wstring | get_lemma (int k=0) const |
| get lemma for the selected analysis in list | |
| std::wstring | get_tag (int k=0) const |
| get tag for the selected analysis | |
| std::wstring | get_short_tag (int k=0) const |
| get tag (short version) for the selected analysis, assuming eagles tagset | |
| std::wstring | get_short_tag (const std::wstring &, int k=0) const |
| get tag (short version) for the selected analysis | |
| std::list< std::pair < std::wstring, double > > | get_senses (int k=0) const |
| get sense list for the selected analysis | |
| std::wstring | get_senses_string (int k=0) const |
| get sense list (as string) for the selected analysis | |
| void | set_senses (const std::list< std::pair< std::wstring, double > > &, int k=0) |
| set sense list for the selected analysis | |
| unsigned long | get_span_start () const |
| get token span. | |
| unsigned long | get_span_finish () const |
| bool | found_in_dict () const |
| get in_dict | |
| void | set_found_in_dict (bool) |
| set in_dict | |
| bool | has_retokenizable () const |
| check if there is any retokenizable analysis | |
| void | lock_analysis () |
| mark word as having definitive analysis | |
| bool | is_locked () const |
| check if word is marked as having definitive analysis | |
| void | add_alternative (const word &, double) |
| add an alternative to the alternatives list | |
| void | set_alternatives (const std::list< std::pair< word, double > > &) |
| replace alternatives list with list given | |
| bool | has_alternatives () const |
| find out if the speller checked alternatives | |
| std::list< std::pair< word, double > > | get_alternatives () const |
| get alternatives list | |
| std::list< std::pair< word, double > >::iterator | alternatives_begin () |
| get alternatives begin iterator | |
| std::list< std::pair< word, double > >::iterator | alternatives_end () |
| get alternatives end iterator | |
| void | add_analysis (const analysis &) |
| add one analysis to current analysis list (no duplicate check!) | |
| void | set_analysis (const analysis &) |
| set analysis list to one single analysis, overwriting current values | |
| void | set_analysis (const std::list< analysis > &) |
| set analysis list, overwriting current values | |
| void | set_form (const std::wstring &) |
| set word form | |
| void | set_ph_form (const std::wstring &) |
| Set word phonetic form. | |
| void | set_span (unsigned long, unsigned long) |
| set token span | |
| bool | find_tag_match (boost::u32regex &) |
| look for an analysis with a tag matching given regexp | |
| int | get_n_analysis () const |
| get number of analysis in current list | |
| void | unselect_all_analysis (int k=0) |
| empty the list of selected analysis | |
| void | select_all_analysis (int k=0) |
| mark all analyses as selected | |
| void | select_analysis (word::iterator, int k=0) |
| add the given analysis to selected list. | |
| void | unselect_analysis (word::iterator, int k=0) |
| remove the given analysis from selected list. | |
| std::list< analysis > | get_analysis () const |
| get list of analysis (useful for perl API) | |
| word::iterator | analysis_begin () |
| get begin iterator to analysis list (useful for perl/java API) | |
| word::const_iterator | analysis_begin () const |
| word::iterator | analysis_end () |
| get end iterator to analysis list (useful for perl/java API) | |
| word::const_iterator | analysis_end () const |
Public Attributes | |
| std::vector< std::wstring > | user |
| user-managed data, we just store it. | |
Private Member Functions | |
| void | clone (const word &) |
| clone word (used by assignment/copy constructors) | |
Private Attributes | |
| std::wstring | form |
| lexical form | |
| std::wstring | lc_form |
| lexical form, lowercased | |
| std::wstring | ph_form |
| phonetic form | |
| std::list< word > | multiword |
| empty list if not a multiword | |
| bool | ambiguous_mw |
| whether the multiword presents segmentation ambiguity (i.e. it might not be a multiword) | |
| std::list< std::pair< word, double > > | alternatives |
| alternative words (with analysis) provided by spell checker | |
| unsigned long | start |
| token span | |
| unsigned long | finish |
| bool | in_dict |
| word form found in dictionary | |
| bool | locked |
| the word's morphological analysis shouldn't be further modified | |
Static Private Attributes | |
| static const int | SELECTED = 0 |
| Values for word::iterator types. | |
| static const int | UNSELECTED = 1 |
| static const int | ALL = 2 |
Class word stores all information related to a word: its form, its list of analyses, and its list of tokens (if it is a multiword).
| word::word | ( | ) |
constructor
Create an empty new word
| word::word | ( | const std::wstring & | ) |
constructor
| word::word | ( | const std::wstring & | , |
| const std::list< word > & | |||
| ) |
constructor
| word::word | ( | const std::wstring & | , |
| const std::list< analysis > & | , | ||
| const std::list< word > & | |||
| ) |
constructor
| word::word | ( | const word & | w | ) |
Copy constructor.
| void word::add_alternative | ( | const word & | w, |
| double | d | ||
| ) |
add an alternative to the alternatives list
| void word::add_analysis | ( | const analysis & | a | ) |
add one analysis to current analysis list (no duplicate check!)
Add one analysis to word analysis list.
Referenced by dictionary::annotate_word(), affixes::ApplyRule(), dictionary::check_contracted(), affixes::CheckRetokenizable(), and probabilities::guesser().
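Because add_analysis() performs no duplicate check, a caller that wants a duplicate-free list has to test for an existing entry itself. The following is a minimal sketch of such a guard. `toy_analysis`, `toy_word`, `same_analysis` and `add_analysis_once` are hypothetical stand-ins written for this illustration, not FreeLing classes or functions, and the sketch assumes that two analyses count as duplicates when both lemma and tag match.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an analysis: a lemma plus a PoS tag.
struct toy_analysis {
  std::wstring lemma;
  std::wstring tag;
};

// Assumption: two analyses are duplicates when lemma and tag both match.
inline bool same_analysis(const toy_analysis &a, const toy_analysis &b) {
  return a.lemma == b.lemma && a.tag == b.tag;
}

// Hypothetical stand-in for word, keeping only the analysis list.
class toy_word {
 public:
  // Mirrors the documented behaviour: append without any duplicate check.
  void add_analysis(const toy_analysis &a) { analyses_.push_back(a); }
  int get_n_analysis() const { return static_cast<int>(analyses_.size()); }
  const std::list<toy_analysis> &analyses() const { return analyses_; }

 private:
  std::list<toy_analysis> analyses_;
};

// Caller-side guard: add only if no equal analysis is already present.
// Returns true when the analysis was actually added.
inline bool add_analysis_once(toy_word &w, const toy_analysis &a) {
  for (const toy_analysis &x : w.analyses())
    if (same_analysis(x, a)) return false;
  w.add_analysis(a);
  return true;
}
```

With the plain add_analysis() of the stand-in, adding the same analysis twice leaves two entries in the list; with the guard, the second call is rejected.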
| list< pair< word, double > >::iterator word::alternatives_begin | ( | ) |
get alternatives begin iterator
| list< pair< word, double > >::iterator word::alternatives_end | ( | ) |
get alternatives end iterator
| word::iterator word::analysis_begin | ( | ) |
get begin iterator to analysis list (useful for perl/java API)
get begin iterator to analysis list.
| word::const_iterator word::analysis_begin | ( | ) | const |
| word::iterator word::analysis_end | ( | ) |
get end iterator to analysis list (useful for perl/java API)
get end iterator to analysis list.
| word::const_iterator word::analysis_end | ( | ) | const |
| void word::clone | ( | const word & | w | ) | [private] |
| void word::copy_analysis | ( | const word & | w | ) |
copy analysis from another word
Copy analysis list of given word.
| bool word::find_tag_match | ( | boost::u32regex & | re | ) |
look for an analysis with a tag matching given regexp
look for a tag in the analysis list of a word
Referenced by probabilities::annotate_word().
| bool word::found_in_dict | ( | ) | const |
get in_dict
Referenced by probabilities::annotate_word(), and affixes::look_for_combined_affixes().
| list< pair< word, double > > word::get_alternatives | ( | ) | const |
get alternatives list
| list< analysis > word::get_analysis | ( | ) | const |
get list of analysis (useful for perl API)
get list of analysis (only useful for perl API)
| wstring word::get_form | ( | ) | const |
get word form
Get word form.
Referenced by dictionary::annotate_word(), probabilities::annotate_word(), affixes::ApplyRule(), affixes::CheckRetokenizable(), affixes::look_for_affixes(), completer::matching_condition(), PrintDepTree(), PrintTree(), probabilities::smoothing(), and traces::trace_word().
| wstring word::get_lc_form | ( | ) | const |
Get word form, lowercased.
Referenced by affixes::ApplyRule(), probabilities::guesser(), affixes::look_for_affixes_in_list(), affixes::look_for_combined_affixes(), hmm_tagger::ProbB_log(), probabilities::smoothing(), and locutions::ValidMultiWord().
| wstring word::get_lemma | ( | int | k = 0 | ) | const |
get lemma for the selected analysis in list
Get lemma for the selected analysis in list.
Referenced by affixes::CheckRetokenizable(), check_lemma::eval(), completer::matching_condition(), PrintDepTree(), and PrintTree().
| int word::get_n_analysis | ( | ) | const |
get number of analysis in current list
Get length of analysis list.
Referenced by dictionary::annotate_word(), probabilities::annotate_word(), dictionary::check_contracted(), probabilities::guesser(), affixes::look_for_affixes(), and probabilities::smoothing().
| int word::get_n_selected | ( | int | k = 0 | ) | const |
Get the number of selected analysis.
| int word::get_n_unselected | ( | int | k = 0 | ) | const |
get the number of unselected analysis
Get the number of unselected analysis.
| int word::get_n_words_mw | ( | ) | const |
get number of words in compound
Get number of words in compound.
| wstring word::get_ph_form | ( | ) | const |
Get word phonetic form.
| list< pair< wstring, double > > word::get_senses | ( | int | k = 0 | ) | const |
get sense list for the selected analysis
| wstring word::get_senses_string | ( | int | k = 0 | ) | const |
get sense list (as string) for the selected analysis
References util::pairlist2wstring().
| wstring word::get_short_tag | ( | int | k = 0 | ) | const |
get tag (short version) for the selected analysis, assuming eagles tagset
get short PoS tag for the selected analysis, assuming eagles tagset
| std::wstring word::get_short_tag | ( | const std::wstring & | , |
| int | k = 0 |
| ) | const |
get tag (short version) for the selected analysis
| unsigned long word::get_span_finish | ( | ) | const |
Referenced by traces::trace_word().
| unsigned long word::get_span_start | ( | ) | const |
| wstring word::get_tag | ( | int | k = 0 | ) | const |
get tag for the selected analysis
Get PoS tag for the selected analysis in list.
Referenced by affixes::CheckRetokenizable(), check_pos::eval(), completer::matching_condition(), PrintDepTree(), and PrintTree().
| list< word > word::get_words_mw | ( | ) | const |
get the word objects that make up the multiword
Get list of words in compound.
Referenced by traces::trace_word(), and ner_module::ValidMultiWord().
| bool word::has_alternatives | ( | ) | const |
find out if the speller checked alternatives
| bool word::has_retokenizable | ( | ) | const |
check if there is any retokenizable analysis
Referenced by probabilities::annotate_word().
| bool word::is_ambiguous_mw | ( | ) | const |
true iff the word is a multiword marked as ambiguous
Check whether the word is an ambiguous multiword.
| bool word::is_locked | ( | ) | const |
check if word is marked as having definitive analysis
| bool word::is_multiword | ( | ) | const |
true iff the word is a multiword compound
Check whether the word is a compound.
Referenced by traces::trace_word().
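The multiword accessors can be read together with the `multiword` attribute, which is documented as an empty list when the word is not a multiword. The sketch below shows that relationship with a hypothetical stand-in class `toy_word` (not the FreeLing class). It assumes that is_multiword() is equivalent to the component list being non-empty, which the attribute description suggests but this page does not state outright.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for word, keeping only the form and the component list.
class toy_word {
 public:
  // Plain word: the component list stays empty.
  explicit toy_word(const std::wstring &f) : form_(f) {}
  // Multiword: form plus the words it is built from.
  toy_word(const std::wstring &f, const std::list<toy_word> &parts)
      : form_(f), multiword_(parts) {}

  // Assumption: a word is a multiword exactly when it has components.
  bool is_multiword() const { return !multiword_.empty(); }
  int get_n_words_mw() const { return static_cast<int>(multiword_.size()); }
  std::list<toy_word> get_words_mw() const { return multiword_; }
  std::wstring get_form() const { return form_; }

 private:
  std::wstring form_;
  std::list<toy_word> multiword_;  // empty list if not a multiword
};
```

A compound such as `New_York` built from the two words `New` and `York` then reports two components, while each component on its own is not a multiword.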
| void word::lock_analysis | ( | ) |
mark word as having definitive analysis
| unsigned int word::num_kbest | ( | ) | const |
Get how many kbest tags the word has.
Get how many kbest tags the word stores.
| void word::select_all_analysis | ( | int | k = 0 | ) |
mark all analyses as selected
mark all analysis as selected for k-th best sequence
Referenced by probabilities::annotate_word().
| void word::select_analysis | ( | word::iterator | tag, |
| int | k = 0 |
| ) |
add the given analysis to selected list.
Mark given analysis as selected.
| word::iterator word::selected_begin | ( | int | k = 0 | ) |
Get an iterator to the first selected analysis.
Get the first selected analysis iterator.
Referenced by traces::trace_word().
| word::const_iterator word::selected_begin | ( | int | k = 0 | ) | const |
Get an iterator to the first selected analysis.
Get the first selected analysis iterator.
| word::iterator word::selected_end | ( | int | k = 0 | ) |
Get an iterator to the end of selected analysis list.
Get the end of selected analysis list.
Referenced by traces::trace_word().
| word::const_iterator word::selected_end | ( | int | k = 0 | ) | const |
Get an iterator to the end of selected analysis list.
Get the end of selected analysis list.
| void word::set_alternatives | ( | const std::list< std::pair< word, double > > & | ) |
replace alternatives list with list given
| void word::set_ambiguous_mw | ( | bool | a | ) |
set mw ambiguity status
Set mw ambiguity status.
| void word::set_analysis | ( | const analysis & | a | ) |
set analysis list to one single analysis, overwriting current values
Set (override) word analysis list with one single analysis.
Referenced by probabilities::guesser().
| void word::set_analysis | ( | const std::list< analysis > & | ) |
set analysis list, overwriting current values
| void word::set_form | ( | const std::wstring & | ) |
set word form
Set word form.
References util::lowercase().
Referenced by affixes::CheckRetokenizable().
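The reference to util::lowercase() suggests that set_form() keeps the lowercased form in step with the new form, so that get_form() and get_lc_form() never disagree. That is an inference from this page, not something it states. The sketch below illustrates the idea with a hypothetical stand-in class `toy_word` and a hypothetical helper `to_lower`, which uses `std::towlower` in place of util::lowercase().

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical helper standing in for util::lowercase().
// It lowercases each character with std::towlower (current C locale rules).
inline std::wstring to_lower(const std::wstring &s) {
  std::wstring r(s);
  for (wchar_t &c : r)
    c = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));
  return r;
}

// Hypothetical stand-in for word, keeping only form and lowercased form.
class toy_word {
 public:
  // Assumption: setting the form also refreshes the cached lowercase form.
  void set_form(const std::wstring &f) {
    form_ = f;
    lc_form_ = to_lower(f);
  }
  std::wstring get_form() const { return form_; }
  std::wstring get_lc_form() const { return lc_form_; }

 private:
  std::wstring form_;
  std::wstring lc_form_;
};
```

Caching the lowercased string at set time means that frequent get_lc_form() calls do not repeat the conversion.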
| void word::set_found_in_dict | ( | bool | b | ) |
set in_dict
Referenced by dictionary::annotate_word(), affixes::ApplyRule(), and affixes::look_for_combined_affixes().
| void word::set_ph_form | ( | const std::wstring & | ) |
Set word phonetic form.
| void word::set_senses | ( | const std::list< std::pair< std::wstring, double > > & | , |
| int | k = 0 |
| ) |
set sense list for the selected analysis
| void word::set_span | ( | unsigned long | s, |
| unsigned long | e | ||
| ) |
set token span
Set token span.
| void word::unselect_all_analysis | ( | int | k = 0 | ) |
empty the list of selected analysis
Unmark all analyses as selected for the k-th best sequence
| void word::unselect_analysis | ( | word::iterator | tag, |
| int | k = 0 |
| ) |
remove the given analysis from selected list.
Unmark given analysis as selected.
| word::iterator word::unselected_begin | ( | int | k = 0 | ) |
Get an iterator to the first unselected analysis.
Get the first unselected analysis iterator.
Referenced by traces::trace_word().
| word::const_iterator word::unselected_begin | ( | int | k = 0 | ) | const |
Get an iterator to the first unselected analysis.
Get the first unselected analysis iterator.
| word::iterator word::unselected_end | ( | int | k = 0 | ) |
Get an iterator to the end of unselected analysis list.
Get the end of unselected analysis list.
Referenced by traces::trace_word().
| word::const_iterator word::unselected_end | ( | int | k = 0 | ) | const |
Get an iterator to the end of unselected analysis list.
Get the end of unselected analysis list.
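The selected_begin()/selected_end() and unselected_begin()/unselected_end() pairs delimit ranges in the usual begin/end style, each visiting only the analyses of one kind. The sketch below shows one plausible way such a filtering iterator could behave. `toy_analysis`, `toy_iterator`, `toy_word` and `tags` are hypothetical stand-ins, the `k` parameter for k-best sequences is left out, and the skipping logic is an assumption rather than FreeLing's implementation. Only the values SELECTED = 0, UNSELECTED = 1 and ALL = 2 are taken from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an analysis with a selection flag.
struct toy_analysis {
  std::wstring tag;
  bool selected;
};

// Iterator kinds, using the values documented for word.
enum kind { SELECTED = 0, UNSELECTED = 1, ALL = 2 };

// Forward iterator that skips analyses not matching its kind.
class toy_iterator {
 public:
  toy_iterator(const std::vector<toy_analysis> *v, std::size_t pos, kind k)
      : v_(v), pos_(pos), kind_(k) { skip(); }
  const toy_analysis &operator*() const { return (*v_)[pos_]; }
  toy_iterator &operator++() { ++pos_; skip(); return *this; }
  bool operator!=(const toy_iterator &o) const { return pos_ != o.pos_; }

 private:
  bool matches() const {
    if (kind_ == ALL) return true;
    return (*v_)[pos_].selected == (kind_ == SELECTED);
  }
  // Advance to the next matching element, or to the end of the list.
  void skip() { while (pos_ < v_->size() && !matches()) ++pos_; }

  const std::vector<toy_analysis> *v_;
  std::size_t pos_;
  kind kind_;
};

// Hypothetical stand-in for word, keeping only the analysis list.
class toy_word {
 public:
  void add(const std::wstring &tag, bool sel) { a_.push_back({tag, sel}); }
  toy_iterator selected_begin() const { return toy_iterator(&a_, 0, SELECTED); }
  toy_iterator selected_end() const { return toy_iterator(&a_, a_.size(), SELECTED); }
  toy_iterator unselected_begin() const { return toy_iterator(&a_, 0, UNSELECTED); }
  toy_iterator unselected_end() const { return toy_iterator(&a_, a_.size(), UNSELECTED); }

 private:
  std::vector<toy_analysis> a_;
};

// Collect the tags of all analyses in [first, last).
inline std::vector<std::wstring> tags(toy_iterator first, toy_iterator last) {
  std::vector<std::wstring> out;
  for (; first != last; ++first) out.push_back((*first).tag);
  return out;
}
```

Calling `tags(w.selected_begin(), w.selected_end())` on a stand-in word walks only the selected analyses in their stored order, and the unselected pair does the same for the rest.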
const int word::ALL = 2 [static, private] |
Referenced by word::iterator::operator++(), and word::const_iterator::operator++().
std::list<std::pair<word,double> > word::alternatives [private] |
alternative words (with analysis) provided by spell checker
Referenced by clone().
bool word::ambiguous_mw [private] |
whether the multiword presents segmentation ambiguity (i.e. it might not be a multiword)
Referenced by clone().
unsigned long word::finish [private] |
Referenced by clone().
std::wstring word::form [private] |
lexical form
Referenced by clone().
bool word::in_dict [private] |
word form found in dictionary
Referenced by clone().
std::wstring word::lc_form [private] |
lexical form, lowercased
Referenced by clone().
bool word::locked [private] |
the word's morphological analysis shouldn't be further modified
Referenced by clone().
std::list<word> word::multiword [private] |
empty list if not a multiword
Referenced by clone().
std::wstring word::ph_form [private] |
phonetic form
Referenced by clone().
const int word::SELECTED = 0 [static, private] |
Values for word::iterator types.
Referenced by word::iterator::operator++(), and word::const_iterator::operator++().
unsigned long word::start [private] |
token span
Referenced by clone().
const int word::UNSELECTED = 1 [static, private] |
| std::vector<std::wstring> word::user |
user-managed data, we just store it.
Referenced by clone().