Slovene alphabet

The Slovene alphabet (slovenska abeceda, or slovenska gajica) is an extension of the Latin script used to write Slovene. The standard language uses a slight modification of Gaj's Latin alphabet, which was devised for Croatian. It consists of 25 letters, each with a lower- and upper-case form: a, b, c, č, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, š, t, u, v, z, ž.

Characters
The following Latin letters are also found separately alphabetized in words of non-Slovene origin: Ć (mehki č), Đ (dže), Q (ku), W (dvojni ve), X (iks), and Y (ipsilon).

Diacritics
To compensate for the shortcomings of the standard orthography, Slovene also uses standardized diacritics or accent marks to denote stress, vowel length and pitch accent, much like the closely related Serbo-Croatian. As in Serbo-Croatian, however, such accent marks are restricted to dictionaries, language textbooks and linguistic publications. In normal writing, the diacritics are almost never used, except in a few minimal pairs where real ambiguity could arise.

Two different and mutually incompatible systems of diacritics are used. The first is the simpler non-tonemic system, which can be applied to all Slovene dialects. It is more widely used and is the standard representation in dictionaries such as SSKJ. The tonemic system also includes tone as part of the representation. However, neither system reliably distinguishes schwa from the front mid-vowels, nor vocalised l from regular l. Some sources write these as ə and ł, respectively, but this is not as common.

Non-tonemic diacritics
In the non-tonemic system, the distinction between the two mid-vowels is indicated, as well as the placement of stress and length of vowels:


 * Long stressed vowels are notated with an acute diacritic: á é í ó ú ŕ.
 * However, the rarer long stressed low-mid vowels /ɛː/ and /ɔː/ are notated with a circumflex: ê ô.
 * Short stressed vowels are notated with a grave: à è ì ò ù. Some systems may also include ə̀ for the schwa.

Tonemic diacritics
The tonemic system uses the diacritics somewhat differently from the non-tonemic system. The high-mid vowels /e/ and /o/ are written ẹ ọ with a subscript dot, while the low-mid vowels /ɛ/ and /ɔ/ are written as plain e o.

Pitch accent and length are indicated by four diacritical marks:


 * The acute ( ´ ) indicates long and low pitch: á é ẹ́ í ó ọ́ ú ŕ.
 * The inverted breve (  ̑ ) indicates long and high pitch: ȃ ȇ ẹ̑ ȋ ȏ ọ̑ ȗ ȓ.
 * The grave ( ` ) indicates short and low pitch. This occurs only on è, which here stands for the schwa and is optionally written as ə̀.
 * The double grave (  ̏ ) indicates short and high pitch: ȁ ȅ ȉ ȍ ȕ (IPA: á ɛ́ í ɔ́ ú). ȅ is also used for the schwa, optionally written as ə̏.

The schwa vowel is written ambiguously as e, but its accentuation will sometimes distinguish it: a long vowel mark can never appear on a schwa, while a grave accent can appear only on a schwa. Thus, only ȅ and unstressed e are truly ambiguous.
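
As an illustration of how a dictionary form relates to the ordinary spelling, the following minimal sketch removes the accent marks described in this section while keeping the caron of č, š and ž. The function name and the set of marks are assumptions chosen for this example, not part of any standard tool. The sketch does not convert the optional symbols ə and ł back to e and l, and it would also strip the acute from the foreign letter ć.

```python
import unicodedata

# Combining marks used by the two accent systems (an assumed list for this sketch).
ACCENT_MARKS = {
    "\u0300",  # grave
    "\u0301",  # acute
    "\u0302",  # circumflex
    "\u030f",  # double grave
    "\u0311",  # inverted breve
    "\u0323",  # dot below
}


def strip_accent_marks(text):
    """Return text without stress, length and tone marks.

    The caron (U+030C) is not in ACCENT_MARKS, so č, š and ž survive.
    """
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in ACCENT_MARKS)
    # NFC recombines the remaining sequences, such as c + caron into č.
    return unicodedata.normalize("NFC", kept)


print(strip_accent_marks("jesén"))   # jesen
print(strip_accent_marks("prècej"))  # precej
```

Working on the decomposed form means that stacked marks, such as the dot below plus acute in ẹ́, are removed in a single pass.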

Others
Ordinary writing occasionally uses accent marks to disambiguate similar words with different meanings. For example:


 * gòl (naked) | gól (goal),
 * jêsen (ash (tree)) | jesén (autumn),
 * kót (angle, corner) | kot (as, like),
 * kózjak (goat's dung) | kozják (goat-shed),
 * med (between) | méd (brass) | méd (honey),
 * pól (pole) | pól (half (of)) | pôl (expresses half an hour before the given hour),
 * prècej (at once) | precéj (a great deal (of)),
 * remí (draw) | rémi (rummy, a card game),
 * je (he/she is) | jé (he/she eats).

Foreign words
There are 5 letters for vowels (a, e, i, o, u) and 20 for consonants. The letters q, w, x, y are excluded from the standard spelling, as are some Serbo-Croatian graphemes (ć, đ); however, they are collated as independent letters in some encyclopedias and dictionary listings. Foreign proper nouns and toponyms are often not adapted to Slovene orthography, unlike in some other Slavic languages, where they are adapted partly (as in Russian) or entirely (as in the Serbian standard of Serbo-Croatian).

In addition, the graphemes ö and ü are used in certain non-standard dialect spellings (usually representing loanwords from German, Hungarian or Turkish) – for example, dödöli (Prekmurje potato dumplings) and Danilo Türk (a politician).

Encyclopedic listings (such as in the 2001 Slovenski pravopis and the 2006 Leksikon SOVA) use this alphabet:
 * a, b, c, č, ć, d, đ, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, š, t, u, v, w, x, y, z, ž.
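
A plain code-point sort would place č, š and ž after z, so software that follows this listing needs an explicit ordering. The sketch below is one minimal way to do that, assuming the 31-letter order quoted above; the names ALPHABET, collation_key and sort_words are hypothetical, and a production system would more likely rely on a full collation library.

```python
# Collation order taken from the encyclopedic listing quoted above.
ALPHABET = "abcčćdđefghijklmnopqrsštuvwxyzž"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def collation_key(word):
    """Map a word to a list of letter ranks; unknown characters sort last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


def sort_words(words):
    """Sort words letter by letter according to ALPHABET."""
    return sorted(words, key=collation_key)


words = ["žaba", "zob", "cesta", "čas", "dan", "šola", "sol"]
print(sort_words(words))
# ['cesta', 'čas', 'dan', 'sol', 'šola', 'zob', 'žaba']
```

Comparing lists of ranks gives each letter its own position in the order, which matches the statement that these letters are collated as independent letters.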

Therefore, Newton and New York remain the same and are not transliterated to Njuton or Njujork; transliterated forms would seem very odd to a Slovene. However, the unit of force is written as njuton as well as newton. Some place names are transliterated (e.g. Philadelphia – Filadelfija; Hawaii – Havaji). Other names from non-Latin languages are transliterated in a fashion similar to that used by other European languages, albeit with some adaptations. Japanese, Indonesian and Arabic names such as Kajibumi, Jakarta and Jabar are written as Kadžibumi, Džakarta and Džabar, where j is replaced with dž. Except for ć and đ, graphemes with diacritical marks from other foreign alphabets (e.g., ä, å, æ, ç, ë, ï, ń, ö, ß, ş, ü) are not used as independent letters.

History
The modern alphabet (abeceda) was standardised in the mid-1840s from an arrangement devised by the Croatian national revivalist Ljudevit Gaj, which would become the Croatian alphabet and was in turn patterned on the Czech alphabet. Before the current alphabet became standard, š was, for example, written as ʃ, ʃʃ or ſ; č as tʃch, cz, tʃcz or tcz; i sometimes as y as a relic of the letter now rendered as Ы (yery) in modern Russian; j as y; l as ll; v as w; ž as ʃ, ʃʃ or ʃz.

In the old alphabet used by most distinguished writers, the Bohorič alphabet (bohoričica), developed by Adam Bohorič, the characters č, š and ž would be spelt as zh, ſh and sh respectively, and c, s and z would be spelt as z, ſ and s respectively. To remedy this, so that there was a one-to-one correspondence between sounds and letters, Jernej Kopitar urged the development of a new alphabet.

In 1825, Franc Serafin Metelko proposed his version of the alphabet (the Metelko alphabet, metelčica). However, it was banned in 1833 in favour of the Bohorič alphabet after the so-called "Suit of the Letters" (Črkarska pravda) (1830–1833), which was won by France Prešeren and Matija Čop. Another alphabet, the Dajnko alphabet (dajnčica), was developed by Peter Dajnko in 1824, but did not catch on as widely as the Metelko alphabet; it was banned in 1838 because it mixed Latin and Cyrillic characters, which was seen as a poor way to handle missing characters.

Gaj's Latin alphabet (gajica) was adopted afterwards, although it still fails to distinguish all the phonemes of Slovene.

Computer encoding
The preferred character encodings (writing codes) for Slovene texts are UTF-8 (Unicode), UTF-16, and ISO/IEC 8859-2 (Latin-2), which generally supports Central and Eastern European languages that are written in the Latin script.

Within the original 7-bit ASCII character set, which has no č, š or ž, Slovene text has been written using substitutions such as these:


 * a, b, c, *c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, *s, t, u, v, z, *z
 * a, b, c, "c, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, "s, t, u, v, z, "z
 * a, b, c, c(, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s(, t, u, v, z, z(
 * a, b, c, c^, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, s^, t, u, v, z, z^
 * a, b, c, cx, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, sx, t, u, v, z, zx
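
The substitutions listed above all pair a base letter with an ASCII marker placed before or after it. A minimal sketch of that idea follows; the function name to_ascii and its marker and prefix parameters are assumptions made for this example.

```python
# Base letters for the three Slovene letters that ASCII lacks.
CARON_LETTERS = {"č": "c", "š": "s", "ž": "z", "Č": "C", "Š": "S", "Ž": "Z"}


def to_ascii(text, marker="^", prefix=False):
    """Replace č, š, ž with their base letter plus an ASCII marker.

    prefix=True puts the marker before the letter (as in *c or "c),
    prefix=False puts it after (as in c^, c( or cx).
    """
    out = []
    for ch in text:
        base = CARON_LETTERS.get(ch)
        if base is None:
            out.append(ch)  # every other character passes through unchanged
        elif prefix:
            out.append(marker + base)
        else:
            out.append(base + marker)
    return "".join(out)


print(to_ascii("čaša"))                           # c^as^a
print(to_ascii("žaba", marker="x"))               # zxaba
print(to_ascii("šola", marker="*", prefix=True))  # *sola
```

Reversing the substitution is only safe when the marker cannot otherwise appear next to c, s or z in the text.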

In ISO/IEC 8859-1 (Latin-1) typical workarounds for missing characters Č (č), Š (š), and Ž (ž) can be C~ (c~), S~ (s~), Z~ (z~) or similar as for ASCII encoding.

Under DOS and Microsoft Windows, code page 852 and Windows-1250, respectively, also fully support the Slovene alphabet.
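
The encodings named in this section can be compared with a short script. This is a minimal sketch using Python's built-in codecs; the codec names are Python's own aliases (iso8859_2 for Latin-2, cp852 and cp1250 for the two code pages), and the helper names are hypothetical.

```python
SAMPLE = "čšž ČŠŽ"

# Python codec names for the encodings discussed above.
CODECS = ["utf-8", "utf-16", "iso8859_2", "cp852", "cp1250"]


def encode_all(text):
    """Encode text with every codec in CODECS and return the byte strings."""
    return {name: text.encode(name) for name in CODECS}


def can_encode(text, codec):
    """Return True if the codec has a representation for every character."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for name, data in encode_all(SAMPLE).items():
    print(f"{name:10} {len(data):2d} bytes  {data.hex(' ')}")

# Latin-1 and ASCII lack č, š and ž, hence the workarounds described above.
print(can_encode(SAMPLE, "latin-1"))  # False
print(can_encode(SAMPLE, "ascii"))    # False
```

The single-byte code pages use one byte per letter, while UTF-8 uses two bytes for each of č, š and ž.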

In TeX notation, č, š and ž become \v c, \v s, \v z or \v{c}, \v{s}, \v{z}; in their macro versions, "c, "s and "z; or, in other representations, \~, \{, \' for lowercase and \^, \[, \@ for uppercase.
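
The braced form \v{c} can be generated mechanically from Unicode text. The following minimal sketch, with the hypothetical function name to_tex, uses Unicode decomposition to find letters that carry a caron and rewrites only those.

```python
import unicodedata

CARON = "\u030c"  # combining caron


def to_tex(text):
    """Rewrite letters with a caron as the TeX accent command \\v{...}."""
    out = []
    for ch in text:
        # NFD turns č into the two-character sequence c + combining caron.
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] == CARON:
            out.append("\\v{" + decomposed[0] + "}")
        else:
            out.append(ch)
    return "".join(out)


print(to_tex("čaša"))  # \v{c}a\v{s}a
print(to_tex("Žaba"))  # \v{Z}aba
```

With a UTF-8 aware TeX engine or input encoding package this conversion is usually unnecessary, since the letters can then be typed directly.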

IETF language tags include variant subtags for the different orthographies of Slovene:
 * sl-bohoric (Bohorič alphabet)
 * sl-dajnko (Dajnko alphabet)
 * sl-metelko (Metelko alphabet)
 * sl-rozaj-1994 (Standardized Resian orthography).