
Psychology of Language

Stages of Production
Human speech is produced through an intricate system of vocal organs, which receive energy from the lungs and diaphragm (Lemmetty, 1999). However, speech production is more than the vocalization of a sound. The process starts in the brain and is studied in the field of psycholinguistics (Trujillo). Theories of speech production focus on explaining how a single word is selected and retrieved (Ferreira, 2010). According to the theories that attempt to account for speech production, the production of spoken language involves three major levels of processing: the conceptual or semantic representation of the word to be produced, the retrieval of its phonological properties, and finally the articulation of the word (Levelt, 1999).

Conceptualization
The first step in the process of speech production is conceptualization, in which the intention to speak links a desired concept to a particular spoken word to be expressed. This is where preverbal messages are formulated; these messages specify the concepts to be verbally expressed. In forming the preverbal message, the goal of the utterance must be determined (Ferreira & Slevc, 2007). The goal of the utterance is called a speech act. In his book Speech Acts: An Essay in the Philosophy of Language, Searle argues that there are five categories of speech acts: representatives (the speaker states what he or she believes to be fact), directives (the speaker is trying to get the listener to perform a task), expressives (the speaker reveals his or her psychological state), declaratives (the speaker brings about a novel state of events), and commissives (the speaker commits to an event in the future) (Harley, 2010).

The effect of a speech act is determined by considering the force of a word, meaning the manner in which the word should be perceived by the person who receives the message (Mitchell, 2009). Three types of force affect speech acts: locutionary, illocutionary, and perlocutionary. Locutionary force refers to the literal meaning of the utterance, illocutionary force is the action that the speaker is trying to get the listener to perform, and perlocutionary force is the effect of the utterance on the listener (Harley, 2010). The speech register must also be considered in forming the preverbal message. The speech register refers to the manner in which a speaker varies his or her language use in a given situation (Krauss & Pardo). The register affects every level of linguistic analysis (lexical, syntactic, and phonological), given that it reflects the meaning the speaker places on a situation and the role the speaker takes on in that situation (Krauss & Pardo). For instance, a formal register would prevent the use of casual language when the word is finally articulated.

Formulating a Linguistic Plan
The second step is the formulation of a linguistic plan, in which the linguistic form required for the word's expression is created. This involves the generation of a syntactic frame and phonological encoding, which specifies the phonetic form of the intended utterance. At this stage a lemma is selected: the abstract form of a word that lacks information about its sounds, and thus cannot yet be pronounced. Lemmas serve two functions in this process: they allow access to the syntactic properties of the word (its part of speech and, in gendered languages, whether it is masculine or feminine) and to the word's proper pronunciation (Levelt, 1999). A lemma is chosen based on its syntactic and semantic appropriateness in a given situation. There are various theories concerning how the syntactic and phonological properties of a word are retrieved; these theories take either a serial or a parallel approach to a network model. Serial models assume that the planning units involved in the phonological process are independent of one another and act in sequence (Vaid, 2005). Parallel models assume that the planning units all act simultaneously and that there are four types of nodes in memory (semantic, syntactic, morphological, and phonological) (Vaid, 2005).

Articulation
The third stage is articulation. Three systems of muscles are involved in the articulatory process: the respiratory system (regulates air flow from the lungs to the vocal tract), the laryngeal system (the vocal cords, which produce the distinction between voiced and voiceless sounds), and the supralaryngeal system (the part of the vocal tract that includes the tongue, lips, teeth, jaw, and velum) (Vaid, 2005). The articulatory process governs the retrieval of the motor phonetics of a word and the coordination of phonation and articulation by the lungs, glottis, larynx, tongue, lips, jaw, and other parts of the vocal apparatus. The smallest unit of sound is the phoneme, and phonemes are distinguished from one another by resonance in the vocal tract (Scavone, 1999). Speech sounds can be told apart by the place and manner of their articulation (Trujillo). Consonants, for instance, may be classified by their manner of articulation as plosive (p, b, t), fricative (f, s, sh), nasal (m, n, ng), liquid (r, l), or semivowel (w, y) (Scavone, 1999).
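The classification just described amounts to a simple lookup table. The following Python sketch is purely illustrative (it lists only the example consonants given above, not a complete inventory):

```python
# Manner-of-articulation classes with the example consonants
# listed above (Scavone, 1999); not a complete inventory.
MANNER_OF_ARTICULATION = {
    "plosive":   ["p", "b", "t"],
    "fricative": ["f", "s", "sh"],
    "nasal":     ["m", "n", "ng"],
    "liquid":    ["r", "l"],
    "semivowel": ["w", "y"],
}

def classify_consonant(sound):
    """Return the manner of articulation for a consonant, if listed."""
    for manner, sounds in MANNER_OF_ARTICULATION.items():
        if sound in sounds:
            return manner
    return "unknown"

print(classify_consonant("sh"))  # fricative
print(classify_consonant("ng"))  # nasal
```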

Serial Models
Serial models of language production posit discrete stages of production. These stages do not overlap, so a stage cannot begin before the previous stage has finished (Harley, 2010).

Levelt’s Model
Levelt proposed a two-stage model built around the lemma and the lexeme. The lemma is the concept: the idea and meaning of a word. The lexeme is the phonological form of a word (Levelt, 1993). These stages are discrete and proceed in only one direction: the lemma is chosen first, and then the lexeme is retrieved from that lemma.
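The discreteness of the two stages can be caricatured in a few lines of Python. The toy lexicon below is invented for illustration; the point is only that the lexeme stage consults nothing but the output of the lemma stage, with no feedback between them:

```python
# A minimal sketch of discrete two-stage retrieval, using an
# invented toy lexicon. Stage 2 sees only the lemma chosen in
# stage 1; no information flows backward.
LEMMAS = {"small feline pet": "cat", "loyal canine pet": "dog"}  # concept -> lemma
LEXEMES = {"cat": "/kat/", "dog": "/dog/"}                       # lemma -> phonological form

def produce(concept):
    lemma = LEMMAS[concept]    # stage 1: lemma (meaning/syntax) selection
    lexeme = LEXEMES[lemma]    # stage 2: lexeme (phonological) retrieval
    return lexeme

print(produce("small feline pet"))  # /kat/
```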

Evidence for this model comes, first, from brain imaging of people naming pictures. Activation sites were examined and analyzed for their linguistic significance. While a picture is being named, activation travels from the occipital lobe, where visual processing happens, to the temporal gyrus, which is connected to semantic processing, then to Wernicke's area, where meaning is associated with broad phonology, and last to Broca's area, where syllables, phonemes, and prosody are assembled (Harley, 2010). This shows that meaning is generated first, then sounds.

Another piece of evidence for this model comes from Levelt's study of picture-word interference. In this study, participants were asked to name a picture on a screen while a distractor word was superimposed on it. This word was either related in meaning (semantically), related in sound (phonologically), or unrelated, and the timing of its presentation relative to the picture varied. When a semantically related word was presented early, picture-naming response times were slower; when a phonologically related word was presented later, response times were also slower. Semantically related words presented later had little effect on response times (Harley, 2010). These data show that meaning is processed earlier than sound, supporting Levelt's two-stage theory.

Fromkin’s Model
Fromkin's model has six stages. The first stage is the identification of meaning, where the message to be conveyed is generated. Next is the selection of a syntactic structure, where a basic grammatical outline of the sentence is generated. The third stage is the generation of intonation contour, in which the stress values of word slots in the sentence are defined. Next is the insertion of content words, where nouns, verbs, and adjectives fill the word slots. After content words comes the insertion of affixes and function words, where function words, prefixes, and suffixes are added to the sentence. Finally comes the specification of phonetic segments, where the sentence is articulated according to phonological rules (Carroll, 2008).

One piece of evidence for Fromkin's model comes from certain kinds of speech errors. When two words are switched in a sentence, the resulting error indicates that affixes and function words are inserted after content words. An example is "I put a table on the apple" instead of "I put an apple on the table." When two words are exchanged, the rules of grammar are still preserved: the article is produced as "a" before "table" rather than the "an" that would have accompanied "apple." This means that function words are applied after content words, as the sketch below illustrates.
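A short Python sketch can make the ordering concrete. The helper function and frame below are invented for illustration and are not Fromkin's formalism; what matters is that the article's form is computed only after the content words have landed in their slots:

```python
VOWELS = "aeiou"

def indefinite_article(noun):
    """Choose 'a' or 'an' from the word that actually follows the article."""
    return "an" if noun[0].lower() in VOWELS else "a"

def realize(first_noun, second_noun):
    # Content words are inserted first; the function word's form
    # is then fixed by whichever noun now occupies the first slot.
    article = indefinite_article(first_noun)
    return f"I put {article} {first_noun} on the {second_noun}"

print(realize("apple", "table"))  # intended: I put an apple on the table
print(realize("table", "apple"))  # exchange error: I put a table on the apple
```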

Garrett's Model
Garrett's model is similar to Fromkin's, although it has only five stages of language production. First is the message level, which is similar to the identification-of-meaning stage of Fromkin's model; here an intention to convey a message is formed. Next is the functional level, where the concepts to be expressed and the relationships between them are specified. Word errors arise at this stage. Third is the positional level, where the syntactic frame of the sentence is generated. After that is the sound level, where the phonological forms of content words and function words are inserted into the word slots. Sound errors arise at this stage. Last comes articulation of the utterance (Harley, 2010).

Evidence for this model involves the study of speech errors. There is a difference between errors involving whole words and errors involving sounds. Word errors involve switching words of the same type: nouns are exchanged with other nouns, not with verbs (Harley, 2010). Sound errors, on the other hand, are constrained by distance: sounds that are closer together and in the same clause are more likely to be affected than sounds that are further apart. This provides evidence that sound and word errors happen at different times, in different stages. Word errors happen at the functional level, where specific concepts are assigned (Harley, 2010). The form of the sentence, or its syntactic detail, has not been assigned yet, so words are known only by their type; this is why words of the same type are switched and why distance does not matter. Sound errors happen at the sound level, where distance matters because we work on one section of the sentence at a time.

Other evidence for this model is the tip-of-the-tongue phenomenon, which most people have experienced: the idea of a word is available, but the actual word cannot be retrieved. Some of its features may be known, such as its meaning or what it is related to, but not the phonological form of the word. This supports the view that there is a stage before the sound level at which only the idea of a word, the lemma, is present (Harley, 2010).

Parallel Models
Parallel models differ from serial models in that the stages are simultaneous: they can occur at the same time, and later stages can activate previous levels of production (Harley, 2010). Support for parallel models over serial models comes from the lexical bias effect: people are more likely to make phoneme errors that result in actual words than in non-words (Harley, 2010). Because the sound level comes after the word level, if the levels did not interact in both directions, sound errors resulting in real words should be just as likely as those resulting in non-words. However, this is not the case. Because real words have a meaning and non-words do not, non-words cannot be activated by backward spreading to the previous level where meaning is generated, while real words can (Harley, 2010).

Dell’s Model
According to Dell, there are four layers of processing: semantic, syntactic, morphological, and phonological. These layers work both in parallel and in series, with activation at each level, and interference can occur at any stage. Production begins with a concept, the idea of a word, which activates all the words with similar features. These words in turn activate their corresponding morphological and phonological forms, and the selected word then settles on its morphological and phonological data. The distinction of this model is that, during this process, words similar in sound and meaning can cause conceptual interference (Dell, 1999).

The phonemic similarity effect provides support for Dell's model. This effect is that people are more likely to erroneously produce a word that is phonologically similar to the word they are trying to say. According to Dell's model, this happens because each activated word spreads activation to its own set of sounds, and these sounds then feed back and activate words with similar sounds (Dell, 1999).
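A toy spreading-activation network shows how such feedback produces the effect. The network, weights, and update rule below are illustrative assumptions, not Dell's published model: a target word activates its phonemes, the phonemes feed activation back to every word containing them, and words sharing sounds with the target end up more active than unrelated words:

```python
# Toy interactive-activation sketch: activation spreads from a target
# word down to its phonemes and feeds back up to all words sharing
# those phonemes. Weights and update rule are invented for illustration.
WORD_PHONEMES = {
    "cat": ["k", "ae", "t"],
    "mat": ["m", "ae", "t"],   # phonologically similar to "cat"
    "dog": ["d", "o", "g"],    # unrelated control
}

def activate(target, steps=2, weight=0.5):
    words = {w: 0.0 for w in WORD_PHONEMES}
    phonemes = {p: 0.0 for ps in WORD_PHONEMES.values() for p in ps}
    words[target] = 1.0
    for _ in range(steps):
        # Downward spread: each word passes activation to its phonemes.
        for w, ps in WORD_PHONEMES.items():
            for p in ps:
                phonemes[p] += weight * words[w]
        # Feedback: each phoneme activates every word that contains it.
        for w, ps in WORD_PHONEMES.items():
            words[w] += weight * sum(phonemes[p] for p in ps)
    return words

print(activate("cat"))
# "mat" ends up far more active than "dog" because its shared phonemes
# feed activation back: the phonemic similarity effect described above.
```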

Bock’s Model
Many sentences can be produced in a variety of ways: "Jill gave Jack the bucket" can be said as "Jill gave the bucket to Jack," "Jack was given the bucket by Jill," or "The bucket was given to Jack by Jill." Bock proposed a model of how we decide which form to use when planning the syntax of a message. More accessible items are placed earlier in the sentence (Dell, 1999). The more accessible words are those that are said more often (high-frequency words), concrete rather than abstract words, and those that have been mentioned recently in the conversation. The placement of these words early in an utterance constrains the syntax of its later parts.
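The accessibility idea can be sketched as a simple scoring and ordering rule. The weights and referent values below are invented for illustration; the only claim illustrated is that more accessible referents surface earlier, which in turn constrains the syntactic frame (for example, active versus passive voice):

```python
# Invented accessibility scores: higher for frequent, concrete,
# recently mentioned referents (weights are illustrative only).
def accessibility(frequency, is_concrete, recently_mentioned):
    return frequency + (1.0 if is_concrete else 0.0) + (1.0 if recently_mentioned else 0.0)

referents = {
    "Jill":   accessibility(2.0, True, True),   # just mentioned, concrete
    "Jack":   accessibility(1.5, True, False),
    "bucket": accessibility(1.0, True, False),
}

# Order constituents by accessibility; the syntactic frame is then
# chosen to accommodate this order.
order = sorted(referents, key=referents.get, reverse=True)
print(order)  # ['Jill', 'Jack', 'bucket'] -> "Jill gave Jack the bucket"
```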

Bock also described structural priming: the syntactic forms a person uses can be influenced by exposing them to utterances with a particular syntactic structure (Carroll, 2008).

Speech Acts
When producing language, everything we say is said for a reason. Everything that a person says can be classified as a speech act: an act of communication that always has some kind of goal (Harley, 2010). John Searle (1969) originally discussed these speech acts, and Harley describes five broad categories of goals. Representatives occur when a speaker states a fact. Directives occur when the speaker's goal is to get the listener to take some type of action. Expressives occur when the speaker expresses something emotional or psychological. Declaratives occur when the speaker states something that is new to the current environment or situation. Commissives occur when the speaker makes a promise to take action in the future (Harley, 2010).

Every speech act falls into one of these categories, but in addition there are three "forces" that go into a speech act. The illocutionary force is what the speaker wants from the listener. The locutionary force is the literal meaning of what the speaker is saying. The perlocutionary force is the result or final effect of the speech act (Harley, 2010). For example, suppose a stranger approaches another person on the street and asks, "Do you know what time it is?" The illocutionary force is that the speaker wants to know the time. The locutionary force is the literal question of whether or not the person knows the time. If the listener responds with the time, this perlocutionary act shows that the listener properly understood the speaker. If the listener responds by simply saying "yes," the perlocutionary act shows that the listener took the question too literally and did not answer the speaker's question properly.

Prosody
One of the important aspects of language production is prosody, the rhythmic aspect of language. Prosody is produced largely unconsciously, and the listener draws on non-cognitive and emotional processes to interpret it. Prosody bears on all of the main aspects of language, such as semantics, syntax, morphology, and segmentation (Pechmann, 2004). It can be very important in determining the meaning of a sentence. Prosodic cues are usually not available in written text, but in speech the speaker can use prosody to make the meaning of a sentence clear. This works because, in most cases, prosodic boundaries occur in the same places as important syntactic boundaries (Pauker et al., 2011). Prosody can be especially useful for sentences, such as garden-path sentences, that could be interpreted in multiple ways, because it adds an extra cue that helps disambiguate them.

Taking the audience into account: Common and Privileged Ground
One of the main purposes of language production is to hold a conversation. In conversation, the speaker must account for many things in order to produce language the listener can best understand: whom they are speaking to, and what the listener does and does not already know. Psycholinguists refer to this as the ground of a conversation. The common ground is the information shared between the speaker and the listener. The privileged ground is the information that the speaker knows but the listener or listeners do not. To convey something, the speaker must integrate privileged-ground information with common-ground information so that the utterance makes sense to the listener (Ferreira, 2010). For example, a speaker may say to a listener: "The show is at 7:30 tonight." For this utterance to make sense, "the show" being referred to must be common ground. The time of the show may be privileged ground if the listener was unaware of it. If the speaker wrongly assumes that the show being referenced is common ground, the utterance will not make sense and the speaker will have to go back and clarify his or her statement.

To do this effectively, the speaker has to separate privileged from common ground, working out what the listener does and does not know. Ferreira describes two possible ways of making this distinction. One is that the speaker assumes everything is known by both parties and then pulls back to identify the privileged ground. The other is that the speaker learns from experience, becoming skilled at judging what is privileged or common ground based on cues present in the environment (Ferreira, 2010).

Alignment
Often, when speakers engage in conversation, they begin to speak similarly; this is known as alignment. Ferreira identifies several ways in which speakers align. Participants in a conversation tend to copy each other's syntactic structures, to pronounce things more similarly, and to choose the same words and phrases. They also tend to talk about situations that fall within the shared frame of the conversation (Ferreira, 2010).

Research Methods
Research on language production falls into two categories: opportunistic and experimental. The first studies errors made during the production of speech and uses them to gain insight into how words are stored and eventually selected when speech occurs. Research focusing on speech errors, while providing a more natural picture of speech production, is limited by "the relative scarcity of speech errors, the ambiguity of categorization and the potential for bias in the collection of errors" (Bock, 1996). In the experimental tradition, by contrast, variables are manipulated in controlled tasks, enabling researchers to focus on particular aspects of the language production process (Bock, 1996).

Opportunistic Research
Starting in the second half of the 19th century, researchers began to focus on speech errors and dysfunction in order to gain better insight into how speech is produced. Paul Broca studied patients who were unable to produce speech although they had previously had the ability to do so. He worked with them while they were alive to determine the extent of their deficits, and performed autopsies to examine the structure of their brains. Several such examinations led Broca to conclude that an area toward the back of the left frontal lobe, now known as Broca's area, is critical for producing speech (Harley, 2010).

Meringer and Mayer (1895) also studied speech errors, but their approach was more methodical and empirical. They built a corpus of speech errors and analyzed the data they had collected. Their work produced terminology that has been integral to research on speech production. According to their results, the most common types of speech errors were exchanges, shifts, anticipations, perseverations, and blends. They also drew the distinction between meaning-based errors (exchanging Ihre 'your' for meine 'my') and form-based errors (Stunden 'hours' for Studien 'studies'). Their methods helped establish a valid way of studying speech errors and speech production (Levelt, 1999).

This method of studying speech production has been used by many researchers, and it came back into prominence in the 1960s. Most notably, Fromkin (1973) and the MIT-CU group developed extensive corpora that allowed for further study of speech errors. Garrett (1975), working with a corpus of speech errors, concluded that there are two modular levels of speech production: one at which syntactic functions are assigned and another at which the order of forms (morphemes, phonemes) is organized. Another major advance in the study of speech errors came when Dell (1986) developed a computational model that could account for the types of speech errors that are most commonly made (Levelt, 1999).

Experimental Research
Experimental research into language production began with Cattell in 1885, when he observed that it took twice as long to read a list of symbols as it did to read the words that the symbols represented. This observation prompted further research on naming objects and words, and on how long each of these tasks took. In 1935, Stroop developed a task that enabled him to manipulate the language production process more directly. In this task, now called the "Stroop task," participants were shown colored words and asked to name either the word or the color in which it was printed. Stroop was mainly interested in what would happen when the word was the name of a color printed in a different color, for instance the word red printed in blue. He found that this condition resulted in slower production of the target word (Levelt, 1999).

Rosinski et al. (1975) took this idea and modified it to study semantic interference further, using objects and words instead of words and colors. Participants were shown pictures of objects with a word embedded within each picture and were told either to ignore the image and name the word, or to ignore the word and name the image. The embedded words were either semantically related or unrelated to the pictured object, and semantically related words produced more interference than unrelated ones (Levelt, 1999).