User:Priyap97/sandbox

This will be the shared sandbox for Lexical Substitution. Priyap97 (talk) 03:51, 19 February 2018 (UTC)

April 5 Assignment: Peer Review response - Gerardo's work
After reviewing the peer feedback, I can explain the skip gram model with more clarity for readers who do not have any background in computer science or math.

Summary: What is the Skip Gram Model?
The Skip Gram Model maps words into a vector space (a collection of objects that can be added together and multiplied by numbers) so that words with similar meanings end up close to each other in N dimensions (a list of N numbers per word). A neural network (a computer system loosely modeled on the human brain) learns these vectors, and related words form clusters in the space. This all occurs in the dimensions of the vocabulary that has been generated for the network.

Example 1
In a sentence like "The dog walked at a quick pace", each word has a specific vector in relation to the others. The vector for "The" would be [1,0,0,0,0,0,0]: the 1 marks the position of "The" in the seven-word vocabulary, and the 0s mark every other word in that vocabulary. This kind of representation is often called a one-hot vector.
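The one-hot vectors described above can be sketched in a few lines of code. This is a minimal illustration of the example sentence, not any particular system's implementation; the function names are invented.

```python
# Hypothetical sketch: one-hot vectors for the seven-word example sentence.
sentence = "The dog walked at a quick pace".lower().split()
vocab = sorted(set(sentence))                      # seven unique words
index = {word: i for i, word in enumerate(vocab)}  # each word gets one slot

def one_hot(word):
    """Return a 7-dimensional vector with a single 1 at the word's slot."""
    vec = [0] * len(vocab)
    vec[index[word]] = 1
    return vec

print(one_hot("the"))  # a single 1 marks "the"; the 0s mark every other word
```

Each word's vector has exactly one 1, so the vector's length always equals the vocabulary size.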

Additions
When placing this in the main space, the best addition would be links to other Wikipedia articles for the terms with more difficult definitions, like "vector space" and "N-dimensions". I have already put a simple, brief definition next to each of these words to give readers an idea of what they mean. Gerardo1006617 (talk) 19:16, 8 April 2018 (UTC)

April 5 Assignment: Response to Peer Review - Priya's Work
Thank you for the peer reviews! This week I will focus on providing clearer background information about lexical substitution.

Summary: What is Lexical Substitution?
Lexical substitution is the process of replacing one word with another in ordinary speech. Generally speaking, the two words appear in the same semantic contexts and have similar denotations. Current work in lexical substitution includes using computational algorithms to predict suitable substitutions. Teams compete on tasks dealing with lexical substitution at the SemEval conferences.

At the SemEval 2007 conference, the computational linguistics task for lexical substitution was to receive a target word as input and find acceptable lexical substitutes. The 8 teams that completed this task used word sense disambiguation to determine the contexts in which the target word appears and the contexts of its synonyms, then matched these contexts against each other to output the synonyms most likely to function in the target word's context. Finally, the alternate words were scored on how closely they matched the original in denotation and context.
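The context-matching idea described above can be sketched with a toy bag-of-words model. All names, example contexts, and the overlap measure here are invented for illustration; the actual SemEval systems were far more sophisticated.

```python
# Hedged sketch of context matching: rank candidate substitutes by how much
# their typical contexts overlap with the target word's contexts.
from collections import Counter

def context_vector(snippets):
    """Bag-of-words counts over the snippets a word appears in."""
    counts = Counter()
    for s in snippets:
        counts.update(s.lower().split())
    return counts

def overlap(a, b):
    """Simple similarity: shared word mass between two context vectors."""
    return sum(min(a[w], b[w]) for w in a)

# Invented toy data: contexts for the target "bright" and two candidates.
target_contexts = ["a bright smile", "a bright idea"]
candidates = {
    "brilliant": ["a brilliant idea", "a brilliant mind"],
    "shiny": ["a shiny coin"],
}

tv = context_vector(target_contexts)
ranked = sorted(candidates,
                key=lambda c: overlap(tv, context_vector(candidates[c])),
                reverse=True)
print(ranked)  # candidates whose contexts best match "bright" come first
```

Here "brilliant" outranks "shiny" because its contexts share more words with the target's contexts, mirroring the match-then-score pipeline described above.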

Additions for the future
In the future I need to further research SemEval 2010; this will likely be next week's task. Priyap97 (talk) 03:59, 6 April 2018 (UTC)

March 22 Assignment: First on the Article - Gerardo's Work
For this week, I will focus on preparing the details on Skip Gram Model research and making it Wikipedia acceptable.

=== Skip Gram Model Findings ===

The Skip Gram Model maps words into a vector space so that words with similar meanings end up close to each other in N dimensions. A neural network learns these vectors, and related words form clusters in the space. This all occurs in the dimensions of the vocabulary that has been generated for the network.

Example 1
In a sentence like "The dog walked at a quick pace", each word has a specific vector in relation to the others. The vector for "The" would be [1,0,0,0,0,0,0]: the 1 marks the position of "The" in the seven-word vocabulary, and the 0s mark every other word in that vocabulary.

More research still needs to be done to settle on the correct wording for particular phrases. Many of the terms used in this section can be linked to other Wikipedia articles. There are also diagrams of the skip gram model that could be added to give readers a better understanding of how the model works. Gerardo1006617 (talk) 00:27, 23 March 2018 (UTC)

March 22 Assignment: Working on the Article - Priya's Work
This week, I will focus on adding more details to last week's research on SemEval 2007 and converting it to language that is permissible for Wikipedia.

=== SemEval 2007 Findings ===

At the SemEval 2007 conference, the computational linguistics task for lexical substitution was to receive a target word as input and find acceptable lexical substitutes. The 8 teams that completed this task used word sense disambiguation to determine the contexts in which the target word appears and the contexts of its synonyms, then matched these contexts against each other to output the synonyms most likely to function in the target word's context.

Example
This algorithm finds that the word happy is most closely matched with the words glad and merry, and that jovial is a less direct match but still a functioning synonym.
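The ranking in this example can be illustrated with cosine similarity over word vectors. The vectors below are invented toy values chosen only to reproduce the ordering described; real systems learn these embeddings from data.

```python
import math

# Toy embeddings, invented for illustration only; real systems learn these.
vectors = {
    "happy":  [0.9, 0.8, 0.1],
    "glad":   [0.85, 0.75, 0.15],
    "merry":  [0.8, 0.7, 0.2],
    "jovial": [0.5, 0.4, 0.6],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

target = vectors["happy"]
for word in ("glad", "merry", "jovial"):
    print(word, round(cosine(target, vectors[word]), 3))
# "glad" and "merry" score higher than "jovial", mirroring the example above
```

A real substitution system would pick the highest-scoring candidates as the closest matches and keep lower-scoring ones like "jovial" as weaker but still functioning synonyms.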

More research still needs to be done on the scoring system, since I appear to have been mistaken about the scores being just 0 and 1. It appears to be a more complex system based on mathematical formulas; these will not need to be included in the final article, but they should be analyzed to gain a general sense of the criteria used to produce a score for each alternate word. Priyap97 (talk) 00:56, 23 March 2018 (UTC)

March 4 assignment: Working on the Article - Gerardo's Work
The research I have found involves the use of the Skip Gram Model and word embeddings in lexical substitution.

=== Skip Gram Model Findings ===

I looked for articles explaining what the skip gram model is, but all I found were websites that explain this model.
 * The skip gram model places similar words close to each other in an N-dimensional vector space.
 * For example, in a sentence like "The dog walked at a quick pace", each word has a specific vector in relation to the others. The vector for "The" would be [1,0,0,0,0,0,0]: the 1 marks the word's position in the vocabulary, and the 0s mark the surrounding words.
 * A variety of neural networks can be formed while these vectors and networks are related to one another.
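The skip gram model is trained on (target, context) word pairs drawn from a sliding window over a sentence. A minimal sketch of that pair generation, assuming a window size of 2 (the window size is an assumption, not from the source):

```python
# Sketch: deriving (target, context) training pairs the way skip gram does.
def skipgram_pairs(tokens, window=2):
    """For each target word, pair it with every word within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

tokens = "the dog walked at a quick pace".split()
print(skipgram_pairs(tokens)[:4])
# first pairs: ('the', 'dog'), ('the', 'walked'), ('dog', 'the'), ('dog', 'walked')
```

Training a network to predict the context word from the target word is what pushes words used in similar contexts toward nearby vectors.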

Word Embedding

 * Word embeddings place semantically similar target words close together in a given context.
 * An example can be found in the lexical substitution model in figure two. The target word "acquire" is close to the word "learn", which might suggest "learn" as the substitute, but it is also close to the word "buy", which turns out to be the better choice in this context.
 * This shows evidence for lexical substitution in this scenario.
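The "acquire" example above can be sketched as a context-sensitive nearest-neighbour lookup. All the vectors and the context below are invented toy values meant only to show how context can tip the choice between two nearby candidates:

```python
# Illustrative sketch (vectors invented): context shifts which neighbour wins.
def add(u, v):
    return [a + b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

emb = {
    "acquire": [0.5, 0.5],   # sits between the two senses
    "learn":   [0.9, 0.1],   # "study"-like direction
    "buy":     [0.1, 0.9],   # "purchase"-like direction
}

context = [0.0, 0.4]  # e.g. "acquire the company" pulls toward the purchase sense
query = add(emb["acquire"], context)
best = max(("learn", "buy"), key=lambda w: dot(query, emb[w]))
print(best)  # "buy"
```

With no context vector, "learn" and "buy" would be tied; the context is what makes "buy" the better choice, as in the figure-two example.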

Implementation of research and looking forward
The research I found today provides more background information on what is related to lexical substitution. Some further implementations that can be added include: Gerardo1006617 (talk) 23:54, 4 March 2018 (UTC)
 * Linking more words that relate to a specific topic within the article
 * If there aren't any words that can be linked, mentioning the topic as a section of the article

March 4 Due Date: Begin Working on Article - Priya's Work
The research I completed this week mostly deals with SemEval 2007 and 2010 research findings on lexical substitution.

=== SemEval 2007 Findings ===
 * 8 teams participated in the SemEval lexical substitution task of 2007. The task was to take a target word, find synonyms for it, use word sense disambiguation to find the context of the word, and then produce acceptable alternative words.
 * For example, the word happy would generate words such as cheerful and glad.
 * Each alternate word was scored on a scale of 0 to 1, with higher numbers signifying closer matches.
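One simple way to land on a 0-to-1 scale like the one described above is to score each alternate by the fraction of annotators who proposed it. This is a hypothetical sketch with invented annotations, not the actual SemEval scoring formula:

```python
# Hypothetical scoring sketch: a substitute's score as the fraction of
# annotators who proposed it (one way to produce a 0-to-1 scale).
gold = ["glad", "glad", "cheerful", "glad", "merry"]  # invented annotations

def score(candidate, annotations):
    """Higher scores mean more annotators agreed the candidate is a good match."""
    return annotations.count(candidate) / len(annotations)

for word in ("glad", "cheerful", "jovial"):
    print(word, score(word, gold))
# glad 0.6, cheerful 0.2, jovial 0.0
```

As noted in the later entries above, the real scoring system turned out to be more complex than a single 0-to-1 number.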

Future outlook for SemEval 2007

 * The researchers concluded that more work could be done to improve the scoring heuristics of alternate words since multiple alternates might be acceptable in a specific context.

=== SemEval 2010 Findings ===
 * The 2010 task was similar to the 2007 task, but involved Spanish rather than English.
 * The focus was more on enabling human translators to more easily find correct substitutions.
 * One of the scoring systems used this year is similar to the scoring system used at SemEval 2007. However, the organizers began to introduce different metrics to discuss how directly applicable an alternate word is to the original context.
 * The conclusion from this year's research is that using several different metrics to score words is far more effective than a single scoring system.

Applications of today's research and further steps
This research will be used to further develop the Evaluations section of the article as previously discussed in our group proposal. Some further steps I can take include: Priyap97 (talk) 21:19, 4 March 2018 (UTC)
 * More research about the specific algorithms presented and how they changed between 2007 and 2010.
 * More research about any new developments in the lexical substitution research field and more recent SemEvals.

Priya's Individual Proposal
To fix this Wikipedia article on lexical substitution, here are several actions that I can take over the course of the semester:
 * 1. Provide more concrete examples of lexical substitution:
 * a. Currently, this article only includes one example of lexical substitution: the synonyms match and game can be used interchangeably in some contexts. To add to this, I would like to provide other simple examples like this one, but also perform research about lexical substitution studies which may suggest common trends in lexical substitution; do particular syntactic or semantic categories see more substitution than others?
 * 2. Further develop the "Evaluation" section of this article:
 * a. This section of the article is only two sentences long and simply states that evaluation of automatic lexical substitution systems has been performed at SemEval 2007 and SemEval 2010. I would like to find out what kinds of automatic systems were presented and the algorithms behind their functionality, and present these in a concise manner to Wikipedia readers.
 * b. The purpose of a Wikipedia article is to provide a summary of existing information about a subject, and these evaluations of existing lexical substitution systems seem critical to the understanding of the topic. I would also like to see if there are any more recent developments in lexical substitution systems and, if possible, add those to the article as well.
 * 3. Structure the article more clearly overall:
 * a. The introduction to the article is confusing because it does not properly explain the process of lexical substitution; instead it attempts to explain the difference between lexical substitution and word sense disambiguation, without quite explaining what either of them does. I would like to more explicitly define the process of lexical substitution: in what contexts does it most often occur, and why does it happen?

I hope to be able to accomplish these three main goals with my group this semester through collaboration and thorough research. Priyap97 (talk) 04:26, 19 February 2018 (UTC)

Gerardo's Ideas
Article's quality
The article is currently rated stub-class on the quality scale on the talk page. The best way to address this would be to add more content to the article, because in its current state there is only a brief description of what lexical substitution is. The article needs more than a basic dictionary definition.

Adding citations
Currently there are not many citations on the article. There are 4 citations that have been cited correctly and lead to PDFs that give more context to the actual article. As more edits are made to this article, the same quality of citations should be added to this page.