Talk:Semantic similarity

Semantic relatedness
I think Semantic relatedness should be merged into this article because:
 * the article on Semantic relatedness does not define the concept of semantic relatedness. It is little more than a list.
 * this article states that "semantic similarity [is] also called semantic relatedness", i.e., the two terms are synonymous, so there is no need for a separate page for Semantic relatedness.

Jvhertum 10:20, 16 December 2006 (UTC)

sounds good

No, semantic similarity should be clearly distinguished from semantic relatedness. Refer to Budanitsky and Hirst 2006, and Gabrilovich and Markovitch.

I think we should remove the suggestion for merging, if we agree both concepts are different.


 * Thanks, I have removed the merge tags. The article now clearly states that the two concepts are different and not synonymous. - Jvhertum 07:51, 4 May 2007 (UTC)

I think the first reference to distinguishing Semantic Similarity from Semantic Relatedness was (Budanitsky and Hirst 2001). Also, it seems like the Semantic Relatedness article only focuses on works from cognitive science. Could the Semantic Relatedness article be expanded to include measures used in Computer Science / NLP 'similar' to the Similarity measures on here (such as the Lesk algorithm)? --Tromtone 22:56, 13 November 2007 (UTC)

Merge
Currently both Semantic relatedness and Semantic similarity describe the same thing, and therefore their contents should be merged, regardless of whether the terms themselves are synonymous or not. --Lysytalk 11:52, 19 November 2010 (UTC)

introduction to semantic similarity
I find the first paragraph convoluted and far too complex: surely a more lucid definition could be found for an educated but less-informed audience, especially as an introduction?--Wikitrishslp (talk) 01:57, 16 September 2014 (UTC)

Introduction
Forgive my ignorance: If "semantic content" is opposed to "semantic similarity", then I find the which-clause ambiguous. What does "these" refer to in the second sentence? If it refers to the metric, then could the distinction between what it is used for and how it is applied be made clearer, i.e. in two sentences? Perhaps the idea of a metric should be explained a bit, or the word linked to another page? Should it read, "between units of language AND concepts or instances..."? If semantic similarity and semantic relatedness are not merged, how is a semantic similarity network different? It all becomes a bit woolly for the uninitiated.--Wikitrishslp (talk) 02:46, 16 September 2014 (UTC)

The article on Semantic Measures linked at the bottom of the introduction contains some of the same information and the same wording, e.g. "Semantic measures are widely used today to estimate the strength of the semantic relationship between elements of various types: units of language (e.g., words, sentences, documents), concepts or even instances semantically characterized (e.g., diseases, genes, geographical locations)." The paragraph in this link, although it deals more specifically with measures, nonetheless provides a broader overview and is easier to understand than the Lead section itself.--Wikitrishslp (talk) 05:17, 19 September 2014 (UTC)

an easier to understand overview..
In spite of the decision not to merge semantic similarity and relatedness, more of a distinction could be made in the introduction, such as the difference simply described in the subheading Taxonomy. Different types of measures of each are also of interest. Diagrams would be helpful to get an overview of the subject matter, such as a simple tree drawing showing the semantic similarity between, say, (car and fork), which is in the article by Jiang and Conrath. Under the subsection Visualisation, the idea of mind maps as an example of finding semantic similarity could also be a helpful idea as a starting point. It seems that there are a lot of relevant applications of semantic measures used today which could also introduce the article. --Wikitrishslp (talk) 06:34, 19 September 2014 (UTC)

I think the summary information about PMI is not correct.
It says: "PMI (Pointwise mutual information) (+) large vocab, because it uses any search engine (like Google); (−) cannot measure relatedness between whole sentences or documents". But as far as I have been able to tell from reading up on the topic, this measure does not itself use any search engine. — Preceding unsigned comment added by Gneusch (talk • contribs) 14:56, 29 November 2020 (UTC)
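
To illustrate the point above: PMI is defined purely in terms of probabilities, PMI(x, y) = log( p(x, y) / (p(x) p(y)) ), and those probabilities can be estimated from co-occurrence counts in any corpus. Search-engine hit counts are one possible source of such counts (presumably what the summary alludes to), but nothing in the definition requires them. Below is a minimal sketch, assuming document-level co-occurrence and a made-up four-document toy corpus; the function name `pmi_table` and the corpus are illustrative only and not taken from the article.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return {(w1, w2): PMI in bits} for word pairs that co-occur in a document.

    Assumption: probabilities are estimated as the fraction of documents
    that contain a word (or both words of a pair).
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        # Unique, sorted words so each pair has one canonical key order.
        words = sorted(set(doc.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    table = {}
    for (w1, w2), n_pair in pair_counts.items():
        p_pair = n_pair / n_docs
        p1 = word_counts[w1] / n_docs
        p2 = word_counts[w2] / n_docs
        table[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return table


# Hypothetical toy corpus; no search engine is involved.
corpus = [
    "car road wheel",
    "car wheel engine",
    "fork spoon plate",
    "fork plate",
]
scores = pmi_table(corpus)
print(scores[("car", "wheel")])
print(("car", "fork") in scores)
```

Pairs that never co-occur, such as "car" and "fork" here, have a joint count of zero and therefore no finite PMI, so the sketch simply leaves them out of the table.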