Dialectology

Dialectology (from Greek διάλεκτος, dialektos, "talk, dialect"; and -λογία, -logia) is the scientific study of dialects: subsets of languages. In the 19th century a branch of historical linguistics, dialectology is often now considered a sub-field of sociolinguistics. It studies variations in language based primarily on geographic distribution and their associated features. Dialectology deals with such topics as divergence of two local dialects from a common ancestor and synchronic variation.

Dialectologists are ultimately concerned with grammatical, lexical and phonological features that correspond to regional areas. Thus they usually deal not only with populations that have lived in certain areas for generations, but also with migrant groups that bring their languages to new areas (see language contact).

Commonly studied concepts in dialectology include the problem of mutual intelligibility in defining languages and dialects; situations of diglossia, where two dialects are used for different functions; dialect continua including a number of dialects of varying intelligibility; and pluricentrism, where a single language has two or more standard varieties.

Hans Kurath and William Labov are among the most prominent researchers in this field.

Dialects of English
In London, comments on the different dialects were recorded in 12th-century sources, and a large number of dialect glossaries (focussing on vocabulary) were published in the 19th century. Philologists also studied dialects because they preserved earlier forms of words.

In Britain, the philologist Alexander John Ellis described the pronunciation of English dialects in an early phonetic system in volume 5 of his series On Early English Pronunciation. The English Dialect Society was later set up by Joseph Wright to record dialect words in the British Isles. This culminated in the production of the six-volume English Dialect Dictionary in 1905. The English Dialect Society was then disbanded, as its work was considered complete, although some regional branches (e.g. the Yorkshire Dialect Society) still operate today.

Traditional studies in dialectology were generally aimed at producing dialect maps, in which lines were drawn to indicate boundaries between different dialect areas. The move away from traditional methods of language study, however, caused linguists to become more concerned with social factors. Dialectologists therefore began to study social as well as regional variation. The Linguistic Atlas of the United States (1930s) was amongst the first dialect studies to take social factors into account.

Under the leadership of Harold Orton, the University of Leeds became a centre for the study of English dialect and set up an Institute of Dialect and Folk Life Studies. In the 1950s, the university undertook the Survey of English Dialects, which covered all of England, some bordering areas of Wales, and the Isle of Man. In addition, the university produced more than 100 monographs on dialect before the death of Harold Orton in 1975. The Institute closed in September 1983 to accommodate budget cuts at the University, but its dialectological studies are now part of a special collection, the Leeds Archive of Vernacular Culture, in the university's Brotherton Library.

The shift in interest towards social factors consequently gave rise to sociolinguistics, which combines dialectology with the social sciences. However, Graham Shorrocks has argued that there was always a sociological element to dialectology and that many of the conclusions of sociolinguists (e.g. the relationships with gender, class and age) can be found in earlier work by traditional dialectologists.

In the US, Hans Kurath began the Linguistic Atlas of the United States project in the 1930s, intended to consist of a series of in-depth dialectological studies of regions of the country. The first of these, the Linguistic Atlas of New England, was published in 1939. Later works in the same project were published or planned for the Middle Atlantic and South Atlantic states, the North Central States, the Upper Midwest, the Rocky Mountain States, the Pacific Coast and the Gulf States, though in less detail owing to the huge amount of work that would be necessary to fully process the data.

Later large-scale and influential studies of American dialectology have included the Dictionary of American Regional English, based on data collected in the 1960s and published between 1985 and 2013, focusing on lexicon; and the Atlas of North American English, based on data collected in the 1990s and published in 2006, focusing on pronunciation.

Dialects of French
Jules Gilliéron published a linguistic atlas of 25 French-speaking locations in Switzerland in 1880. In 1888, Gilliéron responded to a call from Gaston Paris for a survey of the dialects of French, which were thought likely to be superseded by Standard French in the near future, by proposing the Atlas Linguistique de la France. The principal fieldworker for the atlas, Edmond Edmont, surveyed 639 rural locations in French-speaking areas of France, Belgium, Switzerland and Italy. The questionnaire initially included 1400 items, but this was later increased to over 1900. The atlas was published in 13 volumes between 1902 and 1910.

Dialects of German
The first comparative dialect study in Germany was The Dialects of Bavaria in 1821 by Johann Andreas Schmeller, which included a linguistic atlas.

In 1873, a parson named L. Liebich surveyed the German-speaking areas of Alsace by means of a postal questionnaire that covered phonology and grammar. He never published any of his findings.

In 1876, Eduard Sievers published Elements of Phonetics and a group of scholars formed the Neogrammarian school. This work in linguistics covered dialectology in German-speaking countries. In the same year, Jost Winteler published a monograph on the dialect of Kerenzen in the Canton of Glarus in Switzerland, which became a model for monographs on particular dialects.

Also in 1876, Georg Wenker, a young school librarian from Düsseldorf based in Marburg, sent postal questionnaires out over Northern Germany. These questionnaires contained a list of sentences written in Standard German, which respondents then transcribed into the local dialect, reflecting dialectal differences. He later expanded his work to cover the entire German Empire, including dialects in the east that have become extinct since that territory was lost by Germany. Wenker's work later became the Deutscher Sprachatlas at the University of Marburg. After Wenker's death in 1911, work continued under Ferdinand Wrede, and later questionnaires covered Austria as well as Germany.

Dialects of Italian and Corsican
The first treatment of Italian dialects is provided by Dante Alighieri in his treatise De vulgari eloquentia in the early fourteenth century.

The founder of scientific dialectology in Italy was Graziadio Isaia Ascoli, who, in 1873, founded the journal Archivio glottologico italiano, still active today together with L'Italia dialettale, which was founded by Clemente Merlo in 1924.

After completing his work in France, Edmond Edmont surveyed 44 locations in Corsica for the Atlas Linguistique de la Corse.

Two students from the French atlas project, Karl Jaberg and Jakob Jud, surveyed Italian dialects in Italy and southern Switzerland for the Sprach- und Sachatlas Italiens und der Südschweiz. This survey influenced the work of Hans Kurath in the United States.

Dialects of Scots and Gaelic
The Linguistic Survey of Scotland began in 1949 at the University of Edinburgh.

The first part of the survey researched dialects of Scots in the Scottish Lowlands, the Shetland Islands, the Orkney Islands, Northern Ireland, and the two northernmost counties of England: Cumberland (since merged into Cumbria) and Northumberland. Three volumes of results were published between 1975 and 1985.

The second part studied dialects of Gaelic, including mixed use of Gaelic and English, in the Scottish Highlands and Western Isles. Results were published under the name of Cathair Ó Dochartaigh in five volumes between 1994 and 1997.

Methods of data collection
A variety of methods are used to collect data on regional dialects and to choose informants from whom to collect it. Early dialect research aimed to document the most conservative forms of regional dialects, those least contaminated by ongoing change or contact with other dialects, and so collected data primarily from older informants in rural areas. More recently, under the umbrella of sociolinguistics, dialectology has developed greater interest in the ongoing linguistic innovations that differentiate regions from each other, devoting more attention to the speech of younger speakers in urban centers.

Some of the earliest dialectology collected data by use of written questionnaires asking informants to report on features of their dialect. This methodology has seen a comeback in recent decades, especially with the availability of online questionnaires that can be used to collect data from a huge number of informants at little expense to the researcher.

Dialect research in the 20th century predominantly used face-to-face interview questionnaires to gather data. There are two main types of questionnaires: direct and indirect. Researchers using the direct method for their face-to-face interviews present the informant with a set of questions that demand a specific answer and are designed to gather lexical or phonological information, or both. For example, the linguist may ask the subject the name for various items, or ask them to repeat certain words.

Indirect questionnaires are typically more open-ended and take longer to complete than direct questionnaires. A researcher using this method will sit down with a subject and begin a conversation on a specific topic. For example, the researcher may question the subject about farm work, food and cooking, or some other topic, and gather lexical and phonological information from the subject's answers. The researcher may also begin a sentence and allow the subject to finish it, or ask a question that does not demand a specific answer, such as "What are the most common plants and trees around here?" The sociolinguistic interview may be used for dialectological purposes as well: informants are engaged in a long-form, open-ended conversation intended to allow them to produce a large volume of speech in a vernacular style.

Whereas lexical, phonological and inflectional variations can be easily discerned, information related to larger forms of syntactic variation is much more difficult to gather. Another problem is that informants may feel inhibited and refrain from using dialectal features.

Researchers may collect relevant excerpts from books that are entirely or partially written in a dialect. The major drawback is the authenticity of the material, which may be difficult to verify. Since the advent of social media, it has become possible for researchers to collect large volumes of geotagged posts from platforms such as Twitter, in order to document regional differences in the way language is used in such posts.
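
As a rough illustration of how geotagged posts might be tallied, the following Python sketch counts how often competing lexical variants occur in each region. The sample posts, the region labels and the variant pair "soda"/"pop" are invented for illustration and are not drawn from any real corpus or platform API; a real study would also need to filter out unrelated senses of a word.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexical variable: two competing words for the same item.
VARIANTS = {"soda", "pop"}

# Invented sample data; a real study would load geotagged posts instead.
POSTS = [
    {"region": "North", "text": "Grabbing a pop before the game"},
    {"region": "North", "text": "Who wants pop with lunch?"},
    {"region": "North", "text": "Out of soda again"},
    {"region": "South", "text": "Cold soda on the porch"},
    {"region": "South", "text": "Soda run, anyone?"},
]


def variant_counts_by_region(posts, variants):
    """Count occurrences of each variant per region (case-insensitive)."""
    counts = defaultdict(Counter)
    for post in posts:
        # Naive tokenisation: lowercase runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", post["text"].lower()):
            if token in variants:
                counts[post["region"]][token] += 1
    return counts


def variant_shares(counts):
    """Convert raw counts into each variant's share within its region."""
    shares = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        shares[region] = {v: n / total for v, n in counter.items()}
    return shares


if __name__ == "__main__":
    for region, share in variant_shares(
        variant_counts_by_region(POSTS, VARIANTS)
    ).items():
        print(region, share)
```

Reporting shares rather than raw counts keeps regions comparable when they contribute very different numbers of posts.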

Mutual intelligibility
Some have attempted to distinguish dialects from languages by saying that dialects of the same language are understandable to each other's speakers. This simple criterion proves untenable, as the case of Italian and Spanish shows. While native speakers of the two may enjoy mutual understanding ranging from limited to considerable, depending on the topic of discussion and the speakers' experience with linguistic variety, few people would want to classify Italian and Spanish as dialects of the same language in any sense other than historical. Spanish and Italian are similar and to varying extents mutually comprehensible, but their phonology, syntax, morphology, and lexicon are sufficiently distinct that the two cannot be considered dialects of the same language; rather, both developed from their common ancestor, Latin.

Diglossia
Diglossia is a situation in which a given society uses two closely related languages: one of high prestige, generally used by the government and in formal texts, and one of low prestige, usually the spoken vernacular tongue. An example is Sanskrit, which was considered the proper way to speak in northern India but was accessible only to the upper class, alongside Prakrit, which was the common (and informal or vernacular) speech at the time.

Varying degrees of diglossia are still common in many societies around the world.

Dialect continuum
A dialect continuum is a network of dialects in which geographically adjacent dialects are mutually comprehensible, but with comprehensibility steadily decreasing as distance between the dialects increases. An example is the Dutch-German dialect continuum, a large network of dialects with two recognized literary standards. Although mutual intelligibility between standard Dutch and standard German is fairly limited, a chain of dialects connects them. Due to several centuries of influence by standard languages (especially in Northern Germany, where even today the original dialects struggle to survive) there are now many breaks in complete intelligibility between geographically adjacent dialects along the continuum, but in the past these breaks were virtually nonexistent.
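
The chain-like structure described above can be sketched with a toy model in Python. The five varieties and their binary feature values below are invented, and the share of features two varieties have in common is only a crude stand-in for mutual comprehensibility, not an established measure.

```python
# Hypothetical varieties listed in geographic order along a line. Each tuple
# holds invented binary feature values (e.g. whether a sound change applies).
VARIETIES = [
    ("V1", (0, 0, 0, 0)),
    ("V2", (1, 0, 0, 0)),
    ("V3", (1, 1, 0, 0)),
    ("V4", (1, 1, 1, 0)),
    ("V5", (1, 1, 1, 1)),
]


def similarity(a, b):
    """Fraction of features on which two varieties agree (0.0 to 1.0)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_by_separation(varieties):
    """Map separation in the chain (1 = neighbours) to mean similarity."""
    result = {}
    n = len(varieties)
    for gap in range(1, n):
        scores = [
            similarity(varieties[i][1], varieties[i + gap][1])
            for i in range(n - gap)
        ]
        result[gap] = sum(scores) / len(scores)
    return result


if __name__ == "__main__":
    for gap, score in similarity_by_separation(VARIETIES).items():
        print(f"{gap} step(s) apart: mean similarity {score:.2f}")
```

In this toy chain every pair of neighbours shares three of four features, while the two endpoints share none, even though no single step along the chain involves a sharp break.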

The Romance languages (Galician/Portuguese, Spanish, Sicilian, Catalan, Occitan/Provençal, French, Sardinian, Romanian, Romansh, Friulan, other Italian, French, and Ibero-Romance dialects, and others) form another well-known continuum, with varying degrees of mutual intelligibility.

In both areas, the Germanic and the Romance linguistic continua, the relational notion of the term dialect is often vastly misunderstood, and today gives rise to considerable difficulties in the implementation of European Union directives regarding support of minority languages. This is perhaps nowhere more evident than in Italy, where some of the population still use their local language (dialetto 'dialect') as the primary means of communication at home and, to a varying but lesser extent, in the workplace. Difficulties arise from terminological confusion. The languages conventionally referred to as Italian dialects can be regarded as Romance sister languages of Italian, not variants of Italian; the latter are commonly and properly called italiano regionale ('regional Italian'). The label "Italian dialect" as conventionally used is geopolitical rather than linguistic in meaning: Bolognese and Neapolitan, for example, are termed Italian dialects, yet resemble each other less than do Italian and Spanish. Misunderstandings ensue if "Italian dialect" is taken to mean 'dialect of Italian' rather than 'minority language spoken on Italian soil', i.e. part of the network of the Romance linguistic continuum. The indigenous Romance language of Venice, for example, is cognate with Italian, but quite distinct from the national language in phonology, morphology, syntax, and lexicon, and in no way a derivative or a variety of the national language. Venetian can be said to be an Italian dialect both geographically and typologically, but it is not a dialect of Italian.

Pluricentrism
A pluricentric language is a language that has two or more standard forms. An example is Hindustani, which encompasses two standard varieties, Urdu and Hindi. Another example is Norwegian, with Bokmål having developed closely with Danish and Swedish, and Nynorsk as a partly reconstructed language based on old dialects. Both are recognized as official languages in Norway.

In a sense, the set of dialects can be understood as being part of a single diasystem, an abstraction that each dialect is part of. In generative phonology, the differences can be acquired through rules. Occitan (a cover term for a set of related varieties spoken in Southern France) provides an example: 'cavaL' (from late Latin caballus, "horse") is the diasystemic form for the following realizations:
 * Languedocien dialect: caval (-L > -l, sometimes velar, used concurrently with French borrowed forms chival or chivau);
 * Limousin dialect: chavau (ca > cha and -L > -u);
 * Provençal dialect: cavau (-L > -u, used concurrently with French borrowed forms chival or chivau);
 * Gascon dialect: cavath (final -L > -th, sometimes palatalized, and used concurrently with French borrowed form chibau);
 * Auvergnat and Vivaro-alpine dialects: chaval (same treatment of ca cluster as in Limousin dialect).
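
The derivation of these realizations from the diasystemic form can be sketched as ordered rewrite rules. The Python below is a minimal illustration working on spellings only: the rule table is read off the forms listed above, the names RULES and realize are hypothetical, and the sketch makes no claim to be a full phonological analysis.

```python
import re

DIASYSTEMIC_FORM = "cavaL"

# Ordered (pattern, replacement) rules per dialect, read off the list above.
# "^ca" targets the word-initial ca cluster; "L$" targets the final -L.
RULES = {
    "Languedocien": [(r"L$", "l")],
    "Limousin": [(r"^ca", "cha"), (r"L$", "u")],
    "Provençal": [(r"L$", "u")],
    "Gascon": [(r"L$", "th")],
    "Auvergnat / Vivaro-alpine": [(r"^ca", "cha"), (r"L$", "l")],
}


def realize(form, rules):
    """Apply each rewrite rule in order to the diasystemic form."""
    for pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form


if __name__ == "__main__":
    for dialect, rules in RULES.items():
        print(f"{dialect}: {realize(DIASYSTEMIC_FORM, rules)}")
```

Keeping one shared underlying form and a small rule list per dialect mirrors the idea of a diasystem: the commonalities live in the form, the differences in the rules.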

The pluricentric approach may be used in practical situations. For instance, when such a diasystem is identified, it can be used to construct a diaphonemic orthography that emphasizes the commonalities between the varieties. Such a goal may or may not fit with sociopolitical preferences. Conversely, dialectological field-internal traditions may or may not delay the diversification of a given language into multiple standards (see Luxembourgish for an example of the latter, and the One Standard German Axiom for the former).

The abstand and ausbau languages framework
One analytical paradigm, developed by Heinz Kloss, is known as the abstand and ausbau languages framework. It has proven popular among linguists in Continental Europe, but is less well known in English-speaking countries, especially among people who are not trained linguists. Although only one of many possible paradigms, it has the advantage of being constructed by trained linguists for the particular purpose of analyzing and categorizing varieties of speech. It has the additional merit of replacing such loaded words as "language" and "dialect" with the German terms ausbau language and abstand language, which are not (yet) loaded with political, cultural, or emotional connotations.