Chunking (psychology)

In cognitive psychology, chunking is a process by which small individual pieces of a set of information are bound together to create a meaningful whole later on in memory. The chunks, by which the information is grouped, are meant to improve short-term retention of the material, thus bypassing the limited capacity of working memory and allowing the working memory to be more efficient. A chunk is a collection of basic units that are strongly associated with one another, and have been grouped together and stored in a person's memory. These chunks can be retrieved easily due to their coherent grouping. It is believed that individuals create higher-order cognitive representations of the items within the chunk. The items are more easily remembered as a group than as the individual items themselves. These chunks can be highly subjective because they rely on an individual's perceptions and past experiences, which are linked to the information set. The size of the chunks generally ranges from two to six items but often differs based on language and culture.

According to Johnson (1970), there are four main concepts associated with the memory process of chunking: chunk, memory code, decode and recode. The chunk, as described above, is a sequence of to-be-remembered information that can be composed of adjacent terms. These items or information sets are stored under the same memory code. Recoding is the process by which one learns the code for a chunk, and decoding is the process by which the code is translated back into the information that it represents.

The phenomenon of chunking as a memory mechanism is easily observed in the way individuals group numbers, and information, in day-to-day life. For example, when recalling a number such as 12101946, if numbers are grouped as 12, 10, and 1946, a mnemonic is created for this number as a month, day, and year. It would be stored as December 10, 1946, instead of a string of numbers. Similarly, another illustration of the limited capacity of working memory as suggested by George Miller can be seen from the following example: While recalling a mobile phone number such as 9849523450, we might break this into 98 495 234 50. Thus, instead of remembering 10 separate digits that are beyond the putative "seven plus-or-minus two" memory span, we are remembering four groups of numbers. An entire chunk can also be remembered simply by storing the beginnings of a chunk in the working memory, resulting in the long-term memory recovering the remainder of the chunk.
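
The two groupings described above can be sketched in a few lines of Python. This is only an illustration of the grouping step, not a model of memory; the function name chunk_digits and its sizes parameter are hypothetical and chosen for this example.

```python
def chunk_digits(digits: str, sizes: list[int]) -> list[str]:
    """Split a digit string into consecutive groups of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("group sizes must add up to the number of digits")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(digits[start:start + size])
        start += size
    return chunks


# The date example: eight digits become three chunks (month, day, year).
date_chunks = chunk_digits("12101946", [2, 2, 4])
# The phone number example: ten digits become four chunks.
phone_chunks = chunk_digits("9849523450", [2, 3, 3, 2])

print(date_chunks)   # ['12', '10', '1946']
print(phone_chunks)  # ['98', '495', '234', '50']
```

In both cases the number of units to be held drops below the number of raw digits, which is the point of the examples in the text.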

Modality effect
A modality effect is present in chunking. That is, the mechanism used to convey the list of items to the individual affects how much "chunking" occurs.

Experimentally, it has been found that auditory presentation results in a larger amount of grouping in the responses of individuals than visual presentation does. Previous literature, such as George Miller's The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information (1956) has shown that the probability of recall of information is greater when the chunking strategy is used. As stated above, the grouping of the responses occurs as individuals place them into categories according to their inter-relatedness based on semantic and perceptual properties. Lindley (1966) showed that since the groups produced have meaning to the participant, this strategy makes it easier for an individual to recall and maintain information in memory during studies and testing. Therefore, when "chunking" is used as a strategy, one can expect a higher proportion of correct recalls.

Memory training systems, mnemonic
Various kinds of memory training systems and mnemonics include training and drills in specially designed recoding or chunking schemes. Such systems existed before Miller's paper, but there was no convenient term to describe the general strategy and no substantive and reliable research. The term "chunking" is now often used in reference to these systems. As an illustration, patients with Alzheimer's disease typically experience working memory deficits; chunking is an effective method to improve patients' verbal working memory performance. Patients with schizophrenia also experience working memory deficits which influence executive function; memory training procedures positively influence cognitive and rehabilitative outcomes. Chunking has been shown to decrease the load on working memory in several ways. As well as remembering chunked information more easily, a person can also recall other non-chunked memories more easily because of the benefits chunking has for working memory. For instance, in one study, participants with more specialized knowledge could reconstruct sequences of chess moves because they had larger chunks of procedural knowledge, which suggests that the level of expertise and the order in which information is retrieved both influence the procedural knowledge chunks retained in short-term memory. Chunking has also been shown to have an influence in linguistics, for example in boundary perception.

Efficient chunk sizes
Dirlam (1972) conducted a mathematical analysis to determine the most efficient chunk size. The range of sizes that chunks can take was already familiar, but which size is most efficient was not. The analysis found that three or four items per chunk is optimal.

Channel capacity, "Magic number seven", Increase of short-term memory
The word chunking comes from a famous 1956 paper by George A. Miller, "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information". At a time when information theory was beginning to be applied in psychology, Miller observed that some human cognitive tasks fit the model of a "channel capacity" characterized by a roughly constant capacity in bits, but short-term memory did not. A variety of studies could be summarized by saying that short-term memory had a capacity of about "seven plus-or-minus two" chunks (see also memory span). Miller (1956) wrote, "With binary items, the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require. The span of immediate memory seems to be almost independent of the number of bits per chunk, at least over the range that has been examined to date." Miller acknowledged that "we are not very definite about what constitutes a chunk of information."

Miller (1956) noted that according to this theory, it should be possible to increase short-term memory for low-information-content items effectively by mentally recoding them into a smaller number of high-information-content items. He imagined this process is useful in scenarios such as "a man just beginning to learn radio-telegraphic code hears each dit and dah as a separate chunk. Soon he is able to organize these sounds into letters and then he can deal with the letters as chunks. Then the letters organize themselves as words, which are still larger chunks, and he begins to hear whole phrases." Thus, a telegrapher can effectively "remember" several dozen dits and dahs as a single phrase. Naïve subjects can remember a maximum of only nine binary items, but Miller reports a 1954 experiment in which people were trained to listen to a string of binary digits and (in one case) mentally group them into groups of five, recode each group into a name (for example, "twenty-one" for 10101), and remember the names. With sufficient practice, people found it possible to remember as many as forty binary digits. Miller wrote:

"It is a little dramatic to watch a person get 40 binary digits in a row and then repeat them back without error. However, if you think of this merely as a mnemonic trick for extending the memory span, you will miss the more important point that is implicit in nearly all such mnemonic devices. The point is that recoding is an extremely powerful weapon for increasing the amount of information that we can deal with."

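The recoding scheme Miller describes (groups of five binary digits, each renamed as a single number such as "twenty-one" for 10101) can be sketched as follows. The function names and the 40-digit test string are made up for this illustration, and the sketch returns integers rather than spoken number names.

```python
def recode_binary(bits: str, group_size: int = 5) -> list[int]:
    """Recode a binary string into one integer per group of group_size digits."""
    if len(bits) % group_size != 0:
        raise ValueError("length must be a multiple of the group size")
    return [int(bits[i:i + group_size], 2) for i in range(0, len(bits), group_size)]


def decode_binary(codes: list[int], group_size: int = 5) -> str:
    """Translate the codes back into the original binary string."""
    return "".join(format(code, f"0{group_size}b") for code in codes)


# A hypothetical 40-digit sequence, written in four pieces of ten digits.
bits = "1010100011" "1111100000" "0101011000" "0011110010"
codes = recode_binary(bits)

print(len(bits), "binary digits ->", len(codes), "chunks")  # 40 binary digits -> 8 chunks
print(codes)                                                # [21, 3, 31, 0, 10, 24, 7, 18]
print(decode_binary(codes) == bits)                         # True
```

Forty low-information items become eight higher-information items, which is within the span of about seven plus or minus two chunks discussed above.
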
Expertise and skilled memory effects
Studies have shown that people have better memories when they are trying to remember items with which they are familiar. Similarly, people tend to create familiar chunks. This familiarity allows one to remember more individual pieces of content, and also more chunks as a whole. One well-known chunking study was conducted by Chase and Ericsson, who worked with an undergraduate student, SF, for over two years. They wanted to see if a person's digit span memory could be improved with practice. SF began the experiment with a normal span of 7 digits. SF was a long-distance runner, and chunking strings of digits into race times increased his digit span. By the end of the experiment, his digit span had grown to 80 digits. A later description of the research in The Brain-Targeted Teaching Model for 21st Century Schools states that SF later expanded his strategy by incorporating ages and years, but his chunks were always familiar, which allowed him to recall them more easily. Someone who does not have knowledge in the expert domain (e.g. being familiar with mile/marathon times) would have difficulty chunking with race times and would ultimately be unable to memorize as many numbers using this method. The same dependence on domain knowledge was seen in an experiment that tested whether novice and expert hikers could remember different mountain scenes: the expert hikers had better recall and recognition of structured stimuli. Another example is expert musicians, who are able to chunk and recall encoded material that best meets the demands they face at any given moment during a performance.

Chunking and memory in chess revisited

Previous research has shown that chunking is an effective tool for enhancing memory capacity because it groups individual pieces into larger, more meaningful units that are easier to remember. Chunking is widely used by chess players, especially masters. Chase and Simon (1973a) found that the skill levels of chess players can be attributed to long-term memory storage and the ability to copy and recollect thousands of chunks. The process helps acquire knowledge at a faster pace. Since it is an effective tool for enhancing memory, a chess player who utilizes chunking has a higher chance of success. In a re-examination, Chase and Simon (1973b) reported that an expert chess master is able to access information in long-term memory quickly due to the ability to recall chunks. Chunks stored in long-term memory are related to decisions about moving pieces on the board because of the recognizable patterns they contain.

Chunking models for education

Many years of research have concluded that chunking is a reliable process for gaining knowledge and organizing information. Chunking helps explain the behavior of experts, such as teachers. A teacher can use chunking in the classroom as a way to teach the curriculum. Gobet (2005) proposed that teachers can use chunking as a method to segment the curriculum into natural components. A student learns better when focusing on key features of the material, so it is important to create segments that highlight the important information. By understanding the process by which an expert is formed, it is possible to find general mechanisms for learning that can be implemented in classrooms.

Chunking in motor learning
Chunking is a method of learning that can be applied in a number of contexts and is not limited to learning verbal material. Karl Lashley, in his classic paper on serial order, argued that the sequential responses that appear to be organized in a linear and flat fashion concealed an underlying hierarchical structure. This was then demonstrated in motor control by Rosenbaum et al. in 1983. Thus sequences can consist of sub-sequences and these can, in turn, consist of sub-sub-sequences. Hierarchical representations of sequences have an advantage over linear representations: They combine efficient local action at low hierarchical levels while maintaining the guidance of an overall structure. While the representation of a linear sequence is simple from a storage point of view, there can be potential problems during retrieval. For instance, if there is a break in the sequence chain, subsequent elements will become inaccessible. On the other hand, a hierarchical representation would have multiple levels of representation. A break in the link between lower-level nodes does not render any part of the sequence inaccessible, since the control nodes (chunk nodes) at the higher level would still be able to facilitate access to the lower-level nodes.
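
The contrast between the two representations can be illustrated with a toy sketch. It assumes a six-element sequence, a linear chain stored as next-links, and a two-level hierarchy in which each chunk node addresses its elements directly; these data structures and names are assumptions made for the example, not a model taken from Lashley or Rosenbaum et al.

```python
def recall_linear(start: str, next_link: dict[str, str]) -> list[str]:
    """Follow next-links from the start element and stop at the first missing link."""
    recalled, current = [], start
    while current is not None:
        recalled.append(current)
        current = next_link.get(current)
    return recalled


def recall_hierarchical(chunk_nodes: list[list[str]]) -> list[str]:
    """Visit each higher-level chunk node in order and read out its elements directly."""
    return [element for chunk in chunk_nodes for element in chunk]


sequence = ["A", "B", "C", "D", "E", "F"]

# Linear representation: each element only points to its successor.
links = dict(zip(sequence, sequence[1:]))
del links["B"]  # simulate a break in the chain between B and C

# Hierarchical representation: two chunk nodes, each with direct access to three elements.
chunk_nodes = [["A", "B", "C"], ["D", "E", "F"]]

print(recall_linear("A", links))         # ['A', 'B']
print(recall_hierarchical(chunk_nodes))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

With the broken link, the linear chain loses everything after B, while the hierarchical readout still reaches every element through the chunk nodes.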

Terrace (2001) identified chunks in motor learning by the pauses between successive actions. He also suggested that during the sequence performance stage (after learning), participants download list items as chunks during pauses, and he argued for an operational definition of chunks that distinguishes input chunks from output chunks, drawing on the ideas of short-term and long-term memory. Input chunks reflect the limitation of working memory during the encoding of new information (how new information is stored in long-term memory), and how it is retrieved during subsequent recall. Output chunks reflect the organization of over-learned motor programs that are generated on-line in working memory. Sakai et al. (2003) showed that participants spontaneously organize a sequence into a number of chunks across a few sets and that these chunks were distinct among participants tested on the same sequence. They also demonstrated that the performance of a shuffled sequence was poorer when the chunk patterns were disrupted than when the chunk patterns were preserved. Chunking patterns also seem to depend on the effectors used.
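
Identifying chunks from pauses can be sketched as a simple thresholding rule: a new chunk starts whenever the pause before an action exceeds some cutoff. The function name, the pause values, and the 0.6-second threshold below are all hypothetical; published studies use their own criteria for what counts as a chunk boundary.

```python
def segment_by_pauses(items: list[str], pauses: list[float], threshold: float) -> list[list[str]]:
    """Group items into chunks; pauses[i] is the pause between items[i] and items[i + 1]."""
    if not items:
        return []
    if len(pauses) != len(items) - 1:
        raise ValueError("need exactly one pause between each pair of successive items")
    chunks = [[items[0]]]
    for item, pause in zip(items[1:], pauses):
        if pause > threshold:
            chunks.append([item])    # long pause: treat as a chunk boundary
        else:
            chunks[-1].append(item)  # short pause: same chunk continues
    return chunks


actions = list("ABCDEFG")
pauses = [0.2, 0.3, 0.9, 0.2, 0.25, 1.1]  # hypothetical inter-response times in seconds

print(segment_by_pauses(actions, pauses, threshold=0.6))
# [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
```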

Perlman found in his series of experiments that larger tasks broken down into smaller sections produced faster responses than the same task performed as a large whole. The study suggests that chunking a larger task into smaller, more manageable tasks can produce a better outcome. The research also found that completing the tasks in a coherent order, rather than switching from one task to another, can also produce a better outcome.

Chunking in infants
Adults chunk in different ways, drawing on low-level perceptual features, category membership, semantic relatedness, and statistical co-occurrences between items. Recent studies indicate that infants also use chunking. They draw on different types of knowledge to do so, including conceptual knowledge, knowledge of spatiotemporal cues, and knowledge of their social domain.

Studies have used different chunking models, such as PARSER and Bayesian models. PARSER is a chunking model designed to account for human behavior by implementing psychologically plausible processes of attention, memory, and associative learning. A recent study found that infants' behavior is matched better by chunking models like PARSER than by Bayesian models. One reason given is that PARSER is typically endowed with the ability to process up to three chunks simultaneously.

When infants use their social knowledge, they need to rely on abstract knowledge and subtle cues because they cannot form a perception of their social group on their own. Infants can form chunks using shared features or spatial proximity between objects.

Chunking in seven-month-old infants
Previous research shows that the mechanism of chunking is available in seven-month-old infants. This means that chunking can occur even before the working memory capacity has completely developed. Knowing that the working memory has a very limited capacity, it can be beneficial to utilize chunking. In infants, whose working memory capacity is not completely developed, it can be even more helpful to chunk memories. These studies were done using the violation-of-expectation method and recording the amount of time the infants watched the objects in front of them. Although the experiment showed that infants can use chunking, researchers also concluded that an infant's ability to chunk memories will continue to develop over the next year of their lives.

Chunking in 14-month-old infants
Working memory appears to store no more than three objects at a time in infants and young toddlers. A study conducted in 2014, "Infants use temporal regularities to chunk objects in memory", added new information on this point. This research showed that 14-month-old infants, like adults, can chunk using their knowledge of object categories: they remembered four total objects when an array contained two tokens of two different types (e.g., two cats and two cars), but not when the array contained four tokens of the same type (e.g., four different cats). It demonstrates that infants may use spatial proximity to bind representations of particular items into chunks, benefiting memory performance as a result. Although infants' working memory capacity is restricted, they may use several forms of information to bind representations of individual objects into chunks, enhancing memory efficiency.

Chunking as the learning of long-term memory structures
This usage derives from Miller's (1956) idea of chunking as grouping, but the emphasis is now on long-term memory rather than only on short-term memory. A chunk can then be defined as "a collection of elements having strong associations with one another, but weak associations with elements within other chunks". The emphasis of chunking on long-term memory is supported by the idea that chunking only exists in long-term memory, but it assists with reintegration, which is involved in the recall of information in short-term memory. It may be easier to recall information in short-term memory if the information has been represented through chunking in long-term memory. Norris and Kalm (2021) argued that "reintegration can be achieved by treating recall from memory as a process of Bayesian inference whereby representations of chunks in LTM (long-term memory) provide the priors that can be used to interpret a degraded representation in STM (short-term memory)". In Bayesian inference, priors refer to the initial beliefs regarding the relative frequency of an event occurring instead of other plausible events occurring. When one who holds the initial beliefs receives more information, one will determine the likelihood of each of the plausible events that could happen and thus predict the specific event that will occur. Chunks in long-term memory are involved in forming the priors, and they assist with determining the likelihood and prediction of the recall of information in short-term memory. For example, if an acronym and its full meaning already exist in long-term memory, the recall of information regarding that acronym will be easier in short-term memory.
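
The general idea of reintegration as Bayesian inference can be illustrated with a toy calculation. This is not Norris and Kalm's model: the acronyms, the prior values, the per-letter noise rate, and the use of "?" for a lost position are all assumptions made for this sketch. Chunks known in long-term memory supply the priors, a degraded short-term trace supplies the evidence, and the posterior is proportional to prior times likelihood.

```python
def reintegrate(trace: str, priors: dict[str, float], noise: float = 0.1) -> dict[str, float]:
    """Return posterior probabilities over known chunks given a degraded trace.

    A "?" in the trace marks a position whose content was lost and is uninformative.
    """
    unnormalized = {}
    for chunk, prior in priors.items():
        if len(chunk) != len(trace):
            continue  # a chunk of a different length cannot explain this trace
        likelihood = 1.0
        for seen, stored in zip(trace, chunk):
            if seen == "?":
                continue
            likelihood *= (1.0 - noise) if seen == stored else noise
        unnormalized[chunk] = prior * likelihood
    total = sum(unnormalized.values())
    if total == 0:
        return {}
    return {chunk: value / total for chunk, value in unnormalized.items()}


# Hypothetical priors: how strongly each acronym is established in long-term memory.
priors = {"FBI": 0.5, "CIA": 0.3, "FYI": 0.2}
posterior = reintegrate("F?I", priors)

best = max(posterior, key=posterior.get)
print(best)  # FBI
```

The trace "F?I" is consistent with both "FBI" and "FYI"; the stronger prior for "FBI" decides the outcome, which mirrors the acronym example in the text.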

Chase and Simon in 1973 and later Gobet, Retschitzki, and de Voogt in 2004 showed that chunking could explain several phenomena linked to expertise in chess. Following a brief exposure to pieces on a chessboard, skilled chess players were able to encode and recall much larger chunks than novice chess players. However, this effect is mediated by specific knowledge of the rules of chess; when pieces were distributed randomly (including scenarios that were not common or allowed in real games), the difference in chunk size between skilled and novice chess players was significantly reduced. Several successful computational models of learning and expertise have been developed using this idea, such as EPAM (Elementary Perceiver and Memorizer) and CHREST (Chunk Hierarchy and Retrieval Structures). Chunking has also been demonstrated in the acquisition of a memory skill by S. F., an undergraduate student with average memory and intelligence, who increased his digit span from seven to almost 80 within 20 months, after at least 230 hours of practice. S. F. was able to improve his digit span partly through mnemonic associations, which are a form of chunking. S. F. associated digits, which were unfamiliar information to him, with running times, ages, and dates, which were familiar information to him. Ericsson et al. (1980) initially hypothesized that S. F.'s increased digit span was due to an increase in his short-term memory capacity. However, they rejected this hypothesis when they found that his short-term memory capacity had stayed the same, considering that he "chunked" only three to four digits at once. Furthermore, he never rehearsed more than six digits at once nor rehearsed more than four groups in a supergroup. Lastly, if his short-term memory capacity had increased, he would have shown a greater span for letters of the alphabet; he did not. Based on these contradictions, Ericsson et al. (1980) concluded that S. F. was able to increase his digit span due to "the use of mnemonic associations in long-term memory," which further supports the view that chunking may exist in long-term memory rather than short-term memory.

Chunking has also been used in models of language acquisition. Chunk-based learning in language has been shown to be helpful. Teaching a group of basic words and then providing different categories of associated words to build on comprehension has been shown to be an effective way to teach reading and language to children. Research studies have found that adults and infants were able to parse the words of a made-up language when they were exposed to a continuous auditory sequence of words arranged in random order. One explanation was that they may parse the words using small chunks that correspond to the made-up language. Subsequent studies have supported the view that when learning involves statistical probabilities (e.g., transitional probabilities in language), it may be better explained via chunking models. Franco and Destrebecqz (2012) further studied chunking in language acquisition and found that the presentation of a temporal cue was associated with a reliable prediction of the chunking model regarding learning, but the absence of the cue was associated with increased sensitivity to the strength of transitional probabilities. Their findings suggest that the chunking model can explain only certain aspects of learning, specifically language acquisition.
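
The transitional probabilities mentioned above can be computed directly from a syllable stream: the probability of syllable b following syllable a is the number of times the pair a, b occurs divided by the number of times a occurs with any successor. The sketch below uses three made-up words and a hypothetical presentation order; it shows the statistic itself and is not one of the chunking models discussed in the text.

```python
from collections import Counter


def transitional_probabilities(syllables: list[str]) -> dict[tuple[str, str], float]:
    """Estimate P(next syllable | current syllable) from adjacent pairs in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable has no successor
    return {pair: count / first_counts[pair[0]] for pair, count in pair_counts.items()}


# Three invented words of a made-up language, presented as one continuous stream.
words = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
order = ["A", "B", "C", "A", "C", "B", "A", "B"]
stream = [syllable for name in order for syllable in words[name]]

tp = transitional_probabilities(stream)
print(tp[("tu", "pi")])            # 1.0 (inside a word)
print(round(tp[("ro", "go")], 2))  # 0.67 (across a word boundary)
```

In this stream, transitions inside a word always have probability 1.0, while transitions across a word boundary are lower, which is the cue a learner could use to locate word boundaries.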

Chunking learning style and short-term memory
Norris conducted a study in 2020 of chunking and short-term memory recollection, finding that when a chunk is presented, it is stored as a single item despite containing a relatively large amount of information. This finding suggests that chunks should be less susceptible to decay or interference when they are recalled. The study used visual stimuli in which all the items were presented simultaneously. Chunks of two and three items were recalled more easily than singles, and more singles were recalled when they appeared in a group with three-item chunks.

Chunking can be a form of data compression that allows more information to be stored in short-term memory. Rather than measuring verbal short-term memory by the number of items stored, Miller (1956) suggested that its contents are stored as chunks. Later studies were done to determine whether chunking is a form of data compression when memory space is limited. Chunking works as data compression when it comes to redundant information, and it allows more information to be stored in short-term memory. However, memory capacity may vary.

Chunking and working memory
An experiment was done to see how chunking could be beneficial to patients who had Alzheimer's disease. This study was based on how chunking was used to improve working memory in normal young people. Working memory is impaired in the early stages of Alzheimer's disease which affects the ability to do everyday tasks. It also affects executive control of working memory. It was found that participants who had mild Alzheimer's disease were able to use working memory strategies to enhance verbal and spatial working memory performance.

It has long been thought that chunking can improve working memory. A study was done to see how chunking can improve working memory when it came to symbolic sequences and gating mechanisms. This was done by having 25 participants learn 16 sequences through trial and error. The target was presented alongside a distractor, and participants were to identify the target by using the right or left button on a computer mouse. The final analysis was done on only 19 participants. The results showed that chunking does improve symbolic sequence performance through decreasing cognitive load and real-time strategy. Chunking has proved to be effective in reducing the load of adding items into working memory. Chunking allows more items to be encoded into working memory, with more available to transfer into long-term memory.

Chunking and Two-Factor Theory
Chekaf, Cowan, and Mathy (2016) looked at how immediate memory relates to the formation of chunks and proposed a two-factor theory of chunk formation in immediate memory. The two factors are compressibility and the order of the information. Compressibility refers to making information more compact and condensed: the material is transformed from something complex into something simpler. Compressibility thus relates to chunking through predictability. As for the second factor, the sequence in which the information is presented can affect which regularities are discovered. The order, along with the process of compressing the material, may therefore increase the probability that chunking occurs. These two factors interact with one another and both matter in chunking. Chekaf, Cowan, and Mathy (2016) gave an example where the material "1,2,3,4" can be compressed to "numbers one through four." However, if the material is presented as "1,3,2,4", it cannot be compressed in this way because the order in which it is presented is different. Therefore, compressibility and order play an important role in chunking.
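
The interaction between compressibility and order in the "1,2,3,4" example can be illustrated with a simple run detector. This is only a sketch of the example, assuming that the only available compression rule is "consecutive ascending integers"; it is not the compressibility measure used by Chekaf, Cowan, and Mathy, and the function name is hypothetical.

```python
def compress_runs(numbers: list[int]) -> list[tuple[int, int]]:
    """Collapse each run of consecutive ascending integers into a (first, last) pair."""
    runs: list[tuple[int, int]] = []
    for n in numbers:
        if runs and n == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], n)  # extend the current run
        else:
            runs.append((n, n))          # start a new run
    return runs


print(compress_runs([1, 2, 3, 4]))  # [(1, 4)]
print(compress_runs([1, 3, 2, 4]))  # [(1, 1), (3, 3), (2, 2), (4, 4)]
```

The same four items yield one unit ("one through four") in the first order and four separate units in the second, so under this rule the presentation order determines whether the material can be compressed at all.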