User:Alvations/word sense induction and disambiguation

The word sense induction and disambiguation task consisted of three separate phases:
 * 1) In the training phase, task participants were asked to use a training dataset to induce sense inventories for a set of polysemous words. The training dataset consisted of a set of polysemous nouns and verbs and the sentence instances in which they occurred. No resources were allowed other than morphological and syntactic natural language processing components, such as morphological analyzers, part-of-speech taggers and syntactic parsers.
 * 2) In the testing phase, participants were provided with a test set for the disambiguation subtask, to be disambiguated using the sense inventory induced in the training phase.
 * 3) In the evaluation phase, answers to the testing phase were evaluated in a supervised and an unsupervised framework.

The unsupervised evaluation for WSI considered two measures: V-Measure (Rosenberg and Hirschberg, 2007) and paired F-Score (Artiles et al., 2009). This evaluation follows the supervised evaluation of the SemEval-2007 WSI task (Agirre and Soroa, 2007).
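The two unsupervised measures can be illustrated with a minimal Python sketch. It assumes that every test instance carries exactly one gold sense label and one induced cluster label; the function names are illustrative and are not taken from the task's official scorer, which may differ in details such as the handling of instances tagged with several senses.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(labels):
    """Entropy H(X) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) of two aligned label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(gold, induced):
    """Harmonic mean of homogeneity and completeness."""
    h_gold, h_induced = entropy(gold), entropy(induced)
    # Homogeneity: each cluster should contain instances of a single gold sense.
    homogeneity = 1.0 if h_gold == 0 else 1.0 - conditional_entropy(gold, induced) / h_gold
    # Completeness: each gold sense should be assigned to a single cluster.
    completeness = 1.0 if h_induced == 0 else 1.0 - conditional_entropy(induced, gold) / h_induced
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def same_label_pairs(labels):
    """Set of instance index pairs that share a label."""
    return {(i, j) for i, j in combinations(range(len(labels)), 2)
            if labels[i] == labels[j]}


def paired_f_score(gold, induced):
    """F-score over instance pairs placed together by the clustering and by the gold standard."""
    gold_pairs = same_label_pairs(gold)
    induced_pairs = same_label_pairs(induced)
    common = len(gold_pairs & induced_pairs)
    if common == 0:
        return 0.0
    precision = common / len(induced_pairs)
    recall = common / len(gold_pairs)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a clustering that puts every instance into one cluster gets perfect completeness but zero homogeneity, so its V-Measure is 0, while its paired F-Score can still be well above 0 because recall over pairs is perfect.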

Word Sense Induction and Disambiguation Example
Often in the induction process, stop words are considered semantically irrelevant and are therefore excluded when building the sense inventory. The induction process outputs clusters of candidate senses, each related to a latent semantic variable or sense cluster. Note that these sets of candidate senses should not be regarded as lexicographic meaning distinctions (like synsets in WordNet or BabelNet). Rather, they should be regarded as more coarse-grained, topic-related entities.

Target word: chip
Occurs in the contexts:
 * "An N.V. Philips unit has created a computer system that processes video images 3,000 times faster than conventional systems."
 * "Using reduced instruction - set computing, or RISC, chips made by Intergraph of Huntsville, Ala., the system splits the image it ‘sees’ into 20 digital representations, each processed by one chip."
Induced senses {Centroid:: Candidate senses}:
 * {computer:: cache, CPU, memory, microprocessor, processor, RAM, register}
Disambiguation of the target word in context (a.k.a. coarse-grained sense): {computer}
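The disambiguation step in this example can be sketched in Python as a simple word-overlap lookup. This is only one possible approach, not the method of any particular system: the stop word list is a small illustrative set, the "food" sense and its candidate words are hypothetical additions included to show a contrast, and `disambiguate` is an illustrative name.

```python
import re

# Illustrative stop word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "has", "that", "than", "of", "by", "it",
              "or", "as", "he", "into", "each", "one"}

# Induced sense inventory in {centroid: candidate senses} form.
# The "computer" entry comes from the example; "food" is hypothetical.
SENSE_INVENTORY = {
    "computer": {"cache", "cpu", "memory", "microprocessor", "processor",
                 "ram", "register"},
    "food": {"potato", "snack", "salt", "fried"},
}


def tokenize(text):
    """Lowercase the text, split it into word tokens and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]


def disambiguate(context, inventory=SENSE_INVENTORY):
    """Return the centroid whose sense cluster overlaps most with the context.

    Returns None when no context word matches any sense cluster.
    """
    tokens = set(tokenize(context))
    best_centroid, best_score = None, 0
    for centroid, candidates in inventory.items():
        # The centroid itself counts as evidence for its own cluster.
        score = len(tokens & (candidates | {centroid}))
        if score > best_score:
            best_centroid, best_score = centroid, score
    return best_centroid


context = ("An N.V. Philips unit has created a computer system that processes "
           "video images 3,000 times faster than conventional systems.")
print(disambiguate(context))  # prints "computer"
```

Exact token overlap is deliberately naive: in the second context above, "computing" does not match "computer", so a practical system would need stemming, lemmatization or a similarity measure rather than string equality.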