Voynich manuscript



The Voynich manuscript is an illustrated codex, hand-written in an unknown script referred to as Voynichese. The vellum on which it is written has been carbon-dated to the early 15th century (1404–1438). Stylistic analysis has indicated the manuscript may have been composed in Italy during the Italian Renaissance. The origins, authorship, and purpose of the manuscript are still debated. Hypotheses range from a script for a natural or constructed language, to an unread code, cypher, or other form of cryptography, to a hoax, a reference work (such as a folkloric index or compendium), or a work of fiction (e.g. science fantasy, mythopoeia, metafiction, or speculative fiction). Without a translation and further context, none of these possibilities can be properly confirmed or eliminated.

The manuscript consists of around 240 pages, but there is evidence that pages are missing. The text is written from left to right, and some pages are foldable sheets of varying sizes. Most of the pages have fantastical illustrations and diagrams, some crudely coloured, with sections of the manuscript showing people, fictitious plants, astrological symbols, etc. The manuscript is named after Wilfrid Voynich, a Polish book dealer who purchased it in 1912. Since 1969, it has been held in Yale University's Beinecke Rare Book and Manuscript Library. In 2020, Yale University published the manuscript online in its entirety in their digital library.

The Voynich manuscript has been studied by both professional and amateur cryptographers, including American and British codebreakers from both World War I and World War II. The codebreakers Prescott Currier, William Friedman, Elizebeth Friedman, and John Tiltman all failed to decipher it.

The manuscript has never been demonstrably deciphered, and none of the proposed hypotheses have been independently verified. The mystery of its meaning and origin has excited speculation and provoked study.

Codicology
The codicology, or physical characteristics of the manuscript, has been studied by researchers. The manuscript measures 23.5 by, with hundreds of vellum pages collected into 18 quires. The total number of pages is around 240, but the exact number depends on how the manuscript's unusual foldouts are counted. The quires have been numbered from 1 to 20 in various locations, using a style of numerals consistent with those used in the 15th century, and the top righthand corner of each recto (righthand) page has been numbered from 1 to 116, using a style of numerals that originated at a later date. From the various numbering gaps in the quires and pages, it seems likely that in the past, the manuscript had at least 272 pages in 20 quires, some of which were already missing when Wilfrid Voynich acquired the manuscript in 1912. There is strong evidence that many of the book's bifolios were reordered at various points in the book's history, and that its pages were originally in a different order than the order they are in today.

Parchment, covers, and binding
Samples from various parts of the manuscript were radiocarbon dated at the University of Arizona in 2009. The results were consistent for all samples tested and indicated a date for the parchment between 1404 and 1438. Protein testing in 2014 revealed that the parchment was made from calfskin, and multispectral analysis showed that it had not been written on before the manuscript was created (i.e., it is not a palimpsest). The quality of the parchment is average, with deficiencies such as holes and tears that are common in parchment codices, but it was also prepared with so much care that the skin side is largely indistinguishable from the flesh side. The parchment is prepared from "at least fourteen or fifteen entire calfskins".

Some folios (such as 42 and 47) are thicker than the usual parchment.

The goat skin binding and covers are not original to the book, but date to its possession by the Collegio Romano. Insect holes are present on the first and last folios of the manuscript in the current order and suggest that a wooden cover was present before the later covers. Discolouring on the edges points to a tanned leather inside cover.

Ink
Many pages contain substantial drawings or charts which are coloured with paint. Modern analysis using polarized light microscopy (PLM) has determined that a quill pen and iron gall ink were used for the text and figure outlines. The inks of the drawings, text, and page and quire numbers have similar microscopic characteristics. In 2009, energy-dispersive X-ray spectroscopy (EDS) revealed that the inks contained major amounts of carbon, iron, sulfur, potassium and calcium with trace amounts of copper and occasionally zinc. EDS did not show the presence of lead, while X-ray diffraction (XRD) identified potassium lead oxide, potassium hydrogen sulphate, and syngenite in one of the samples tested. The similarity between the drawing inks and text inks suggested a contemporaneous origin.

Paint
Coloured paint was applied (somewhat crudely) to the ink-outlined figures, possibly at a later date. The blue, white, red-brown, and green paints of the manuscript have been analysed using PLM, XRD, EDS, and scanning electron microscopy (SEM).


 * The blue paint proved to be ground azurite with minor traces of the copper oxide cuprite.
 * The white paint is likely a mixture of egg-white and calcium carbonate.
 * The green paint is tentatively characterised by copper and copper-chlorine resinate; the crystalline material might be atacamite or some other copper-chlorine compound.
 * Analysis of the red-brown paint indicated a red ochre with the crystal phases hematite and iron sulfide. Minor amounts of lead sulfide and palmierite are possibly present in the red-brown paint.

The pigments used were deemed inexpensive.

Retouching
Computer scientist Jorge Stolfi of the University of Campinas highlighted that parts of the text and drawings have been modified, using darker ink over a fainter, earlier script. Evidence for this is visible in various folios, for example f1r, f3v, f26v, f57v, f67r2, f71r, f72v1, f72v3 and f73r.

Text
Every page in the manuscript contains text, mostly in an unidentified language, but some have extraneous writing in Latin script. The bulk of the text in the 240-page manuscript is written in an unknown script, running left to right. Most of the characters are composed of one or two simple pen strokes. There exists some dispute as to whether certain characters are distinct, but a script of 20–25 characters would account for virtually all of the text; the exceptions are a few dozen rarer characters that occur only once or twice each. There is no obvious punctuation.

Much of the text is written in a single column in the body of a page, with a slightly ragged right margin and paragraph divisions and sometimes with stars in the left margin. Other text occurs in charts or as labels associated with illustrations. The ductus flows smoothly, giving the impression that the symbols were not enciphered; there is no delay between characters, as would normally be expected in written encoded text.

Extraneous writing
Only a few of the words in the manuscript are thought not to have been written in the unknown script:


 * f1r: A sequence of Latin letters in the right margin, parallel with characters from the unknown script. The now-unreadable signature of "Jacobj à Tepenece" is found in the bottom margin.
 * f17r: A line of writing in the Latin script in the top margin.
 * f66r: A small number of words in the bottom left corner near a drawing of a nude man have been read as der Mussteil, a High German phrase for 'a widow's share'.
 * f70v–f73v: The astrological series of diagrams in the astronomical section has the names of ten of the months (from March to December) written in Latin script, with spelling suggestive of the medieval languages of France, northwest Italy, or the Iberian Peninsula.
 * f116v: Four lines written in rather distorted Latin script, referred to as "Michitonese", except for two words in the unknown script. The words in Latin script appear to be distorted with characteristics of the unknown language. The lettering resembles European alphabets of the late 14th and 15th centuries, but the words do not seem to make sense in any language. Whether these bits of Latin script were part of the original text or were added later is not known.

Transcription
Various transcription alphabets have been created to equate Voynich characters with Latin characters to help with cryptanalysis, such as the Extensible (originally: European) Voynich Alphabet (EVA). The first major one was created in the 1940s by the "First Study Group", led by cryptographer William F. Friedman, which transcribed each line of the manuscript onto an IBM punch card to make it machine readable.



Statistical patterns
The text consists of over 170,000 characters, with spaces dividing the text into about 35,000 groups of varying length, usually referred to as "words" or "word tokens" (37,919); 8,114 of those words are considered unique "word types". The structure of these words seems to follow phonological or orthographic laws of some sort; for example, certain characters must appear in each word (like English vowels), some characters never follow others, and some may be doubled or tripled while others may not. The distribution of letters within words is also rather peculiar: some characters occur only at the beginning of a word, some only at the end (like Greek ς), and some always in the middle section.
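The counts and positional restrictions described above can be reproduced mechanically from any transliteration. The following is a minimal sketch, assuming a space-separated transliteration in an EVA-like alphabet; the function names and the `SAMPLE` string are hypothetical illustrations, not an actual transcription of any folio.

```python
def token_type_counts(text):
    """Return (number of word tokens, number of distinct word types)."""
    tokens = text.split()
    return len(tokens), len(set(tokens))


def positional_classes(text):
    """Map each character to the set of word positions it was seen in.

    Positions are "initial", "medial" and "final". A one-letter word
    is counted as "initial" only (a simplifying assumption).
    """
    seen = {}
    for word in text.split():
        last = len(word) - 1
        for i, ch in enumerate(word):
            if i == 0:
                pos = "initial"
            elif i == last:
                pos = "final"
            else:
                pos = "medial"
            seen.setdefault(ch, set()).add(pos)
    return seen


# Hypothetical sample for illustration only (not a real transcription).
SAMPLE = "daiin chedy qokeedy daiin shedy qokaiin chedy daiin"

tokens, types = token_type_counts(SAMPLE)
classes = positional_classes(SAMPLE)
# Characters restricted to a single position in this sample.
restricted = {ch: pos for ch, pos in classes.items() if len(pos) == 1}
```

On the sample string this gives 8 tokens and 5 types, and reports "q" as initial-only and "y" as final-only. On a full transliteration the same loop would show which characters are confined to one position, subject to the transcription alphabet chosen.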

Many researchers have commented upon the highly regular structure of the words. Professor Gonzalo Rubio, an expert in ancient languages at Pennsylvania State University, stated:

"The things we know as grammatical markers – things that occur commonly at the beginning or end of words, such as 's' or 'd' in our language, and that are used to express grammar, never appear in the middle of 'words' in the Voynich manuscript. That's unheard of for any Indo-European, Hungarian, or Finnish language."

Stephan Vonfelt studied statistical properties of the distribution of letters and their correlations (properties which can be vaguely characterised as rhythmic resonance, alliteration, or assonance) and found that in that respect Voynichese is more similar to the Mandarin Chinese text of the Records of the Grand Historian than to the text of works from European languages, although the numerical differences between Voynichese and Mandarin Chinese pinyin look larger than those between Mandarin Chinese pinyin and European languages.

Practically no words have fewer than two letters or more than ten. Some words occur in only certain sections, or in only a few pages; others occur throughout the manuscript. Few repetitions occur among the thousand or so labels attached to the illustrations. There are instances where the same common word appears up to three times in a row (see Zipf's law). Words that differ by only one letter also repeat with unusual frequency, causing single-substitution alphabet decipherings to yield babble-like text. In 1962, cryptanalyst Elizebeth Friedman described such statistical analyses as "doomed to utter frustration".
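The near-repetitions mentioned above can be located with a simple scan over adjacent words. This is a sketch under the assumption that "differ by only one letter" means a single substituted character in words of equal length (insertions and deletions are ignored); the function names and the sample tokens are hypothetical.

```python
def differs_by_one(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def near_repeats(tokens):
    """Adjacent token pairs that are identical or differ by one substitution."""
    return [
        (a, b)
        for a, b in zip(tokens, tokens[1:])
        if a == b or differs_by_one(a, b)
    ]


# Hypothetical token sequence for illustration only.
pairs = near_repeats("qokedy qokeey chol chol daiin".split())
```

Here `pairs` contains `("qokedy", "qokeey")` and `("chol", "chol")`. Counting such pairs per page, and comparing against the same count for a text in a known language, is one way to quantify how unusual the repetition is.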

In 2014, a team led by Diego Amancio of the University of São Paulo published a study using statistical methods to analyse the relationships of the words in the text. Instead of trying to find the meaning, Amancio's team looked for connections and clusters of words. By measuring the frequency and intermittence of words, Amancio claimed to identify the text's keywords and produced three-dimensional models of the text's structure and word frequencies. The team concluded that, in 90% of cases, the Voynich systems are similar to those of other known books, indicating that the text is in an actual language, not random gibberish.

"The use of the framework was exemplified with the analysis of the Voynich manuscript, with the final conclusion that it differs from a random sequence of words, being compatible with natural languages. Even though our approach is not aimed at deciphering Voynich, it was capable of providing keywords that could be helpful for decipherers in the future."

Linguists Claire Bowern and Luke Lindemann have applied statistical methods to the Voynich manuscript, comparing it to other languages and encodings of languages, and have found both similarities and differences in statistical properties. The predictability of character sequences in a language is measured using a metric called h2, or second-order conditional entropy. Natural languages tend to have an h2 between 3 and 4, but Voynichese has much more predictable character sequences, with an h2 around 2. However, at higher levels of organisation, the Voynich manuscript displays properties similar to those of natural languages. Based on this, Bowern dismisses theories that the manuscript is gibberish and considers it likely to be an encoded natural language or a constructed language. Bowern also concludes that the statistical properties of the Voynich manuscript are not consistent with the use of a substitution cipher or polyalphabetic cipher.
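A conditional character entropy of this kind can be estimated from bigram counts. The sketch below assumes that h2 is the entropy, in bits, of a character given the character immediately before it, computed as H(bigrams) minus H(first characters); the exact estimator used in the cited work may differ, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text):
    """Estimate H(next char | previous char) in bits from bigram counts.

    Uses the identity H(Y | X) = H(X, Y) - H(X), where X ranges over the
    first character of each adjacent pair.
    """
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)


# A fully predictable sequence has (near) zero conditional entropy,
# while a less regular one scores higher.
predictable = conditional_entropy("abababab")
less_regular = conditional_entropy("aabbaabb")
```

A lower value means the next character is easier to guess from the current one, which is the sense in which Voynichese is described as more predictable than typical natural-language text.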

As noted in Bowern's review, multiple scribes or "hands" may have written the manuscript, possibly using two methods of encoding at least one natural language. The "language" Voynich A appears in the herbal and pharmaceutical parts of the manuscript. The "language" known as Voynich B appears in the balneological section, some parts of the medicinal and herbal sections, and the astrological section. The most common vocabulary items of Voynich A and Voynich B are substantially different. Topic modeling of the manuscript suggests that pages identified as written by a particular scribe may relate to a different topic.

In terms of morphology, if visual spaces in the manuscript are assumed to indicate word breaks, there are consistent patterns that suggest a three-part word structure of prefix, root or midfix, and suffix. Certain characters and character combinations are more likely to appear in particular fields. There are minor variations between Voynich A and Voynich B. The predictability of certain letters in a relatively small number of combinations in certain parts of words appears to explain the low entropy (h2) of Voynichese. In the absence of obvious punctuation, some variants of the same word appear to be specific to typographical positions, such as the beginning of a paragraph, line, or sentence.

The Voynich word frequencies of both variants appear to conform to a Zipfian distribution, supporting the idea that the text has linguistic meaning. This has implications for the encoding methods most likely to have been used, since some forms of encoding interfere with the Zipfian distribution. Measures of the proportional frequency of the ten most common words are similar to those of the Semitic, Iranian, and Germanic languages. Another measure of morphological complexity, the Moving-Average Type–Token Ratio (MATTR) index, is similar to that of Iranian, Germanic, and Romance languages.
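Both measures named above are straightforward to compute from a list of word tokens. The following is a minimal sketch: `rank_frequency` produces the sorted counts that a Zipf plot is drawn from, and `mattr` averages the type-token ratio over every window of consecutive tokens. The default window of 500 tokens is an assumption made here for illustration, not a value taken from the studies cited.

```python
from collections import Counter


def rank_frequency(tokens):
    """Word counts sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def mattr(tokens, window=500):
    """Moving-Average Type-Token Ratio.

    Mean of (distinct types / window) over every run of `window`
    consecutive tokens, maintained incrementally with a sliding counter.
    """
    if window <= 0 or len(tokens) < window:
        raise ValueError("window must be positive and no longer than the text")
    counts = Counter(tokens[:window])
    ratios = [len(counts) / window]
    for i in range(window, len(tokens)):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]  # type no longer present in the window
        counts[tokens[i]] += 1
        ratios.append(len(counts) / window)
    return sum(ratios) / len(ratios)
```

Under Zipf's law the count at rank r falls off roughly in proportion to 1/r, so plotting `rank_frequency` on log-log axes should give an approximately straight line. MATTR is used instead of a plain type-token ratio because it is less sensitive to the total length of the text.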

Illustrations
Because the text cannot be read, the manuscript is conventionally divided into sections based on its illustrations. Most of the manuscript forms six different sections, each typified by illustrations with different styles and supposed subject matter except for the last section, in which the only drawings are small stars in the margin. The conventional sections are:


 * Herbal, 112 folios: Each page displays one or two plants and a few paragraphs of text, a format typical of European herbals of the time. Some parts of these drawings are larger and cleaner copies of sketches seen in the "pharmaceutical" section. None of the plants depicted are unambiguously identifiable.
 * Astronomical, 21 folios: Contains circular diagrams suggestive of astronomy or astrology, some of them with suns, moons, and stars. One series of 12 diagrams depicts conventional symbols for the zodiacal constellations (two fish for Pisces, a bull for Taurus, a hunter with crossbow for Sagittarius, etc.). Each of these has 30 female figures arranged in two or more concentric bands. Most of the females are at least partly nude, and each holds what appears to be a labelled star or is shown with the star attached to either arm by what could be a tether or cord of some kind. The last two pages of this section were lost (Aquarius and Capricornus, roughly February and January), while Aries and Taurus are split into four paired diagrams with 15 women and 15 stars each. Some of these diagrams are on fold-out pages.
 * Balneological, 20 folios: A dense, continuous text interspersed with drawings, mostly showing small nude women, some wearing crowns, bathing in pools or tubs connected by an elaborate network of pipes. The bifolio consisting of folios 78 (verso) and 81 (recto) forms an integrated design, with water flowing from one folio to the other.
 * Cosmological, 13 folios: More circular diagrams, but they are of an obscure nature. This section also has foldouts; one of them spans six pages, commonly called the Rosettes folio, and contains a map or diagram with nine "islands" or "rosettes" connected by "causeways" and containing castles, as well as what might be a volcano.
 * Pharmaceutical, 34 folios: Many labelled drawings of isolated plant parts (roots, leaves, etc.), objects resembling apothecary jars, ranging in style from the mundane to the fantastical, and a few text paragraphs.
 * Recipes, 22 folios: Full pages of text broken into many short paragraphs, each marked with a star in the left margin.

Five folios contain only text, and at least 14 folios (28 pages) are missing from the manuscript.

Purpose
The overall impression given by the surviving leaves of the manuscript is that it was meant to serve as a pharmacopoeia or to address topics in medieval or early modern medicine. However, the puzzling details of the illustrations have fuelled many theories about the book's origin, the contents of its text, and the purpose for which it was intended.

The first section of the book is almost certainly herbal, but attempts have failed to identify the plants, either with actual specimens or with the stylised drawings of contemporaneous herbals. Only a few of the plant drawings can be identified with reasonable certainty, such as a wild pansy and the maidenhair fern. The herbal pictures that match pharmacological sketches appear to be clean copies of them, except that missing parts were completed with improbable details. In fact, many of the plant drawings in the herbal section seem to be composite: the roots of one species have been fastened to the leaves of another, with flowers from a third.

Astrological considerations frequently played a prominent role in herb gathering, bloodletting, and other medical procedures common during the likeliest dates of the manuscript. However, interpretation remains speculative, apart from the obvious Zodiac symbols and one diagram possibly showing the classical planets.

History


Much of the book's early provenance is unknown, though the text and illustrations are all characteristically European. In 2009, University of Arizona researchers radiocarbon dated the manuscript's vellum to between 1404 and 1438. In addition, McCrone Associates in Westmont, Illinois, found that the paints in the manuscript were of materials to be expected from that period of European history. There have been erroneous reports that McCrone Associates indicated that much of the ink was added not long after the creation of the parchment, but their official report contains no such statement.

The first confirmed owner was Georg Baresch, a 17th-century alchemist from Prague. Baresch was apparently puzzled about this "Sphynx" that had been "taking up space uselessly in his library" for many years. He learned that Jesuit scholar Athanasius Kircher from the Collegio Romano had published a Coptic (Egyptian) dictionary and claimed to have deciphered the Egyptian hieroglyphs; Baresch twice sent a sample copy of the script to Kircher in Rome, asking for clues. The 1639 letter from Baresch to Kircher is the earliest known mention of the manuscript to have been confirmed.

Whether Kircher answered the request or not is not known, but he was apparently interested enough to try to acquire the book, which Baresch refused to yield. Upon Baresch's death, the manuscript passed to his friend Jan Marek Marci (also known as Johannes Marcus Marci), then rector of Charles University in Prague. A few years later, Marci sent the book to Kircher, his longtime friend and correspondent.

Marci also sent Kircher a cover letter (in Latin, dated 19 August 1665 or 1666) that was still attached to the book when Voynich acquired it:

"Reverend and Distinguished Sir, Father in Christ:

This book, bequeathed to me by an intimate friend, I destined for you, my very dear Athanasius, as soon as it came into my possession, for I was convinced that it could be read by no one except yourself.

The former owner of this book asked your opinion by letter, copying and sending you a portion of the book from which he believed you would be able to read the remainder, but he at that time refused to send the book itself. To its deciphering he devoted unflagging toil, as is apparent from attempts of his which I send you herewith, and he relinquished hope only with his life. But his toil was in vain, for such Sphinxes as these obey no one but their master, Kircher. Accept now this token, such as it is and long overdue though it be, of my affection for you, and burst through its bars, if there are any, with your wonted success.

Dr. Raphael, a tutor in the Bohemian language to Ferdinand III, then King of Bohemia, told me the said book belonged to the Emperor Rudolf and that he presented to the bearer who brought him the book 600 ducats. He believed the author was Roger Bacon, the Englishman. On this point I suspend judgement; it is your place to define for us what view we should take thereon, to whose favor and kindness I unreservedly commit myself and remain

At the command of your Reverence,

Joannes Marcus Marci of Cronland

Prague, 19th August, 1665 [or 1666]"

The "Dr. Raphael" is believed to be Raphael Sobiehrd-Mnishovsky. Taking a gold ducat at roughly 3.5 g, the sum of 600 ducats corresponds to about 2.1 kg of gold. The only matching transaction in Rudolf's records is the 1599 purchase of "a couple of remarkable/rare books" from Karl Widemann for the sum of 600 florins. Widemann was a prolific collector of esoteric and alchemical manuscripts, so his ownership of the manuscript is plausible, but unproven.

While Wilfrid Voynich took Raphael's claims at face value, the Bacon authorship theory has been largely discredited. However, a piece of evidence supporting Rudolf's ownership is the now almost invisible name or signature, on the first page of the book, of Jacobus Horcicky de Tepenecz, the head of Rudolf's botanical gardens in Prague. Rudolf died still owing money to de Tepenecz, and it is possible that de Tepenecz may have been given the book (or simply taken it) in partial payment of that debt.

No records of the book for the next 200 years have been found, but in all likelihood, it was stored with the rest of Kircher's correspondence in the library of the Collegio Romano (now the Pontifical Gregorian University). It probably remained there until the troops of Victor Emmanuel II of Italy captured the city in 1870 and annexed the Papal States. The new Italian government decided to confiscate many properties of the Church, including the library of the Collegio. Many books of the university's library were hastily transferred to the personal libraries of its faculty just before this happened, according to investigations by Xavier Ceccaldi and others, and those books were exempt from confiscation. Kircher's correspondence was among those books, and so, apparently, was the Voynich manuscript, as it still bears the ex libris of Petrus Beckx, head of the Jesuit order and the university's rector at the time.

Beckx's private library was moved to the Villa Mondragone, Frascati, a large country palace near Rome that had been bought by the Society of Jesus in 1866 and housed the headquarters of the Jesuits' Ghislieri College.

In 1903, the Society of Jesus (Collegio Romano) was short of money and decided to sell some of its holdings discreetly to the Vatican Library. The sale took place in 1912, but not all of the manuscripts listed for sale ended up going to the Vatican. Wilfrid Voynich acquired 30 of these manuscripts, among them the one which now bears his name. He spent the next seven years attempting to interest scholars in deciphering the script, while he worked to determine the origins of the manuscript.

After Wilfrid's death in 1930, the manuscript was inherited by his widow Ethel Voynich, author of the novel The Gadfly and daughter of mathematician George Boole. She died in 1960 and left the manuscript to her close friend Anne Nill. In 1961, Nill sold the book to antique book dealer Hans P. Kraus. Kraus was unable to find a buyer and donated the manuscript to Yale University in 1969, where it was catalogued as "MS 408", sometimes also referred to as "Beinecke MS 408".

Timeline of ownership
In outline, the timeline of ownership of the Voynich manuscript runs from its possible creation in the early 1400s (based on carbon dating of the vellum), through the commonly accepted owners of the 17th century, the long period of storage in the Collegio Romano, and Frascati in the late 1800s, where Wilfrid Voynich allegedly acquired the manuscript, to Voynich's own ownership and the modern owners. The record also includes periods of unknown ownership.

Authorship hypotheses
Many people have been proposed as possible authors of the Voynich manuscript, among them Roger Bacon, John Dee or Edward Kelley, Giovanni Fontana, and Voynich himself.

Early history
Marci's 1665/1666 cover letter to Kircher says that, according to his friend the late Raphael Mnishovsky, the book had once been bought by Rudolf II, Holy Roman Emperor and King of Bohemia, for 600 ducats (about 2.1 kg of gold, taking a ducat at roughly 3.5 g). Mnishovsky had died in 1644, more than 20 years before the letter, and the deal must have occurred before Rudolf's abdication in 1611, at least 55 years before it. Karl Widemann, however, is known to have sold books to Rudolf II in March 1599.

According to the letter, Mnishovsky (but not necessarily Rudolf) speculated that the author was 13th-century Franciscan friar and polymath Roger Bacon. Marci said that he was suspending judgment about this claim, but it was taken quite seriously by Wilfrid Voynich, who did his best to confirm it. Voynich contemplated the possibility that the author was Albertus Magnus if not Roger Bacon.

The assumption that Bacon was the author led Voynich to conclude that John Dee sold the manuscript to Rudolf. Dee was a mathematician and astrologer at the court of Queen Elizabeth I of England who was known to have owned a large collection of Bacon's manuscripts.

Dee and his scrier (spirit medium) Edward Kelley lived in Bohemia for several years, where they had hoped to sell their services to the emperor. However, this sale seems quite unlikely, according to John Schuster, because Dee's meticulously kept diaries do not mention it. If Bacon did not create the Voynich manuscript, a supposed connection to Dee is much weakened. It was thought possible, prior to the carbon dating of the manuscript, that Dee or Kelley might have written it and spread the rumour that it was originally a work of Bacon's in the hopes of later selling it.

Fabrication by Voynich
Some suspect Voynich of having fabricated the manuscript himself. As an antique book dealer, he probably had the necessary knowledge and means, and a lost book by Roger Bacon would have been worth a fortune. Furthermore, Baresch's letter and Marci's letter only establish the existence of a manuscript, not that the Voynich manuscript is the same one mentioned. These letters could possibly have been the motivation for Voynich to fabricate the manuscript, assuming that he was aware of them. However, many consider the expert internal dating of the manuscript and the June 1999 discovery of Baresch's letter to Kircher to have eliminated this possibility.

Eamon Duffy says that the radiocarbon dating of the parchment (or, more accurately, vellum) "effectively rules out any possibility that the manuscript is a post-medieval forgery", as the consistency of the pages indicates origin from a single source, and "it is inconceivable" that a quantity of unused parchment comprising "at least fourteen or fifteen entire calfskins" could have survived from the early 15th century.

Giovanni Fontana
It has been suggested that some illustrations in the books of an Italian engineer, Giovanni Fontana, slightly resemble Voynich illustrations. Fontana was familiar with cryptography and used it in his books, although he did not use the Voynich script but a simple substitution cipher. In the book Secretum de thesauro experimentorum ymaginationis hominum (Secret of the treasure-room of experiments in man's imagination), written c. 1430, Fontana described mnemonic machines, written in his cypher. That book and his Bellicorum instrumentorum liber both used a cryptographic system, described as a simple, rational cipher, based on signs without letters or numbers.

Other theories
Sometime before 1921, Voynich was able to read a name faintly written at the foot of the manuscript's first page: "Jacobj à Tepenece". This is taken to be a reference to Jakub Hořčický of Tepenec, also known by his Latin name Jacobus Sinapius. Rudolf II had ennobled him in 1607, had appointed him his Imperial Distiller, and had made him curator of his botanical gardens as well as one of his personal physicians. Voynich (and many other people after him) concluded that Jacobus owned the Voynich manuscript prior to Baresch, and he drew a link from that to Rudolf's court, in confirmation of Mnishovsky's story.

Jacobus's name has faded further since Voynich saw it, but is still legible under ultraviolet light. It does not match the copy of his signature in a document located by Jan Hurych in 2003. As a result, it has been suggested that the signature was added later, possibly even fraudulently by Voynich himself.

Baresch's letter bears some resemblance to a hoax that orientalist Andreas Müller once played on Athanasius Kircher. Müller sent some unintelligible text to Kircher with a note explaining that it had come from Egypt, and asking him for a translation. Kircher reportedly solved it. It has been speculated that these were both cryptographic tricks played on Kircher to make him look foolish.

Raphael Mnishovsky, the friend of Marci who was the reputed source of the Bacon story, was himself a cryptographer and apparently invented a cipher which he claimed was uncrackable (c. 1618). This has led to the speculation that Mnishovsky might have produced the Voynich manuscript as a practical demonstration of his cipher and made Baresch his unwitting test subject. Indeed, the disclaimer in the Voynich manuscript cover letter could mean that Marci suspected some kind of deception.

In his 2006 book, Nick Pelling proposed that the Voynich manuscript was written by 15th-century North Italian architect Antonio Averlino (also known as "Filarete"), a theory broadly consistent with the radiocarbon dating.

Jules Janick and Arthur O. Tucker, based on plant and animal identification, and the kabbalah map of central Mexico (folio 86v), argued that it was composed in Mexico between 1562 and 1572.

Language hypotheses
Many hypotheses have been developed about the Voynich manuscript's "language", called Voynichese.

Ciphers
According to the "letter-based cipher" theory, the Voynich manuscript contains a meaningful text in some European language that was intentionally rendered obscure by mapping it to the Voynich manuscript "alphabet" through a cipher of some sort, that is, an algorithm that operated on individual letters. This was the working hypothesis for most 20th-century deciphering attempts, including that of an informal team of NSA cryptographers led by William F. Friedman in the early 1950s.

The counterargument is that almost all cipher systems consistent with that era fail to match what is seen in the Voynich manuscript. For example, simple substitution ciphers would be excluded because the distribution of letter frequencies does not resemble that of any known language, while the small number of different letter shapes used implies that nomenclator and homophonic ciphers should be ruled out, because these typically employ larger cipher alphabets. Polyalphabetic ciphers were invented by Alberti in the 1460s and included the later Vigenère cipher, but they usually yield ciphertexts where all cipher shapes occur with roughly equal probability, quite unlike the language-like letter distribution which the Voynich manuscript appears to have.

However, the presence of many tightly grouped shapes in the Voynich manuscript (such as "or", "ar", "ol", "al", "an", "ain", "aiin", "air", "aiir", "am", "ee", "eee", among others) does suggest that its cipher system may make use of a "verbose cipher", where single letters in a plaintext get enciphered into groups of fake letters. For example, the first two lines of page f15v (seen above) contain "oror or" and "or or oro r", which strongly resemble how Roman numerals such as "CCC" or "XXXX" would look if verbosely enciphered.
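The idea of a verbose cipher can be illustrated with a minimal sketch. The table below is invented purely for illustration and is not a proposed key for the manuscript; it only shows how repeated plaintext letters such as Roman numerals would turn into repeated letter groups.

```python
# Hypothetical verbose-cipher table: each plaintext letter becomes a short
# group of cipher letters. The groups are invented for illustration only.
VERBOSE_TABLE = {
    "C": "or",
    "X": "ol",
    "I": "ar",
    "V": "al",
}


def verbose_encipher(plaintext: str, table: dict[str, str] = VERBOSE_TABLE) -> str:
    """Replace every plaintext letter with its multi-letter cipher group."""
    return "".join(table[letter] for letter in plaintext)


print(verbose_encipher("CCC"))   # ororor
print(verbose_encipher("XXXX"))  # olololol
```

Under this assumed table, a run of identical plaintext letters yields a run of identical groups, which is the pattern the "oror or" example points to.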

Shorthand
In 1943, Joseph Martin Feely claimed that the manuscript was a scientific diary written in shorthand. According to D'Imperio, this was "Latin, but in a system of abbreviated forms not considered acceptable by other scholars, who unanimously rejected his readings of the text".

Steganography
This theory holds that the text of the Voynich manuscript is mostly meaningless, but contains meaningful information hidden in inconspicuous details—e.g., the second letter of every word, or the number of letters in each line. This technique, called steganography, is very old and was described by Johannes Trithemius in 1499. Though the plain text was speculated to have been extracted by a Cardan grille (an overlay with cut-outs for the meaningful text) of some sort, this seems somewhat unlikely because the words and letters are not arranged on anything like a regular grid. Still, steganographic claims are hard to prove or disprove, because stegotexts can be arbitrarily hard to find.
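The two hiding places mentioned above (the second letter of every word, or the number of letters in each line) can be sketched as follows. The cover text is invented for illustration, and nothing here is claimed to apply to the actual manuscript.

```python
def second_letters(cover_text: str) -> str:
    """Collect the second letter of every word that has at least two letters."""
    return "".join(word[1] for word in cover_text.split() if len(word) >= 2)


def letters_per_line(cover_text: str) -> list[int]:
    """Count the alphabetic characters on each line of the cover text."""
    return [sum(ch.isalpha() for ch in line) for line in cover_text.splitlines()]


# Invented cover text: the second letters of its words spell "herb".
cover = "the pen or abbey"
print(second_letters(cover))                 # herb
print(letters_per_line("ab c\ndefg h"))      # [3, 5]
```

Because any such rule could be chosen, a reader who does not know the rule has to test many candidate extractions, which is why claims of this kind are hard to confirm or refute.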

It has been suggested that the meaningful text could be encoded in the length or shape of certain pen strokes.

Natural language
Statistical analysis of the text reveals patterns similar to those of natural languages. For instance, the word entropy (about 10 bits per word) is similar to that of English or Latin texts. Amancio et al. (2013) argued that the Voynich manuscript "is mostly compatible with natural languages and incompatible with random texts". Building on this theory, deep learning analysis conducted in 2023 determined that the alphabet was strikingly similar to Khojiki script.
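As a rough illustration of what "word entropy" means here, the sketch below computes the Shannon entropy of a text's word-frequency distribution, in bits per word. It assumes the simplest (unigram) definition; published figures for the manuscript may rest on a different transliteration or estimator.

```python
import math
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy of the word-frequency distribution, in bits per word."""
    words = text.split()
    total = len(words)
    if total == 0:
        return 0.0
    counts = Counter(words)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Four equally frequent words carry log2(4) = 2 bits per word.
print(word_entropy("a b c d"))  # 2.0
```

A text that repeats a single word scores 0 bits, while a large and evenly used vocabulary pushes the value up.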

The linguist Jacques Guy once suggested that the Voynich manuscript text could be some little-known natural language, written in plaintext with an invented alphabet. He suggested Chinese in jest, but later comparison of word length statistics with Vietnamese and Chinese made him view that hypothesis seriously. In many language families of East and Central Asia, mainly Sino-Tibetan (Chinese, Tibetan, and Burmese), Austroasiatic (Vietnamese, Khmer, etc.) and possibly Tai (Thai, Lao, etc.), morphemes generally have only one syllable.

Child (1976), a linguist of Indo-European languages for the U.S. National Security Agency, proposed that the manuscript was written in a "hitherto unknown North Germanic dialect". He identified in the manuscript a "skeletal syntax several elements of which are reminiscent of certain Germanic languages", while the content is expressed using "a great deal of obscurity".

In February 2014, Professor Stephen Bax of the University of Bedfordshire made public his research into using "bottom up" methodology to understand the manuscript. His method involved looking for and translating proper nouns, in association with relevant illustrations, in the context of other languages of the same time period. A paper he posted online offers tentative translation of 14 characters and 10 words. He suggested the text is a treatise on nature written in a natural language, rather than a code, but no further work has been done since Bax's death in 2017.

Tucker & Talbert (2014) published a paper claiming to have positively identified 37 plants, 6 animals, and one mineral referenced in the manuscript with plant drawings in the Libellus de Medicinalibus Indorum Herbis, or Badianus manuscript, an Aztec herbal written in 1552. Together with the presence of atacamite in the paint, they argued that the plants were from colonial New Spain and that the text represented Nahuatl, the language of the Aztecs. They dated the manuscript to between 1521 (the date of the Spanish conquest of the Aztec Empire) and circa 1576. These dates contradict the earlier radiocarbon date of the vellum and other elements of the manuscript. However, they argued that the vellum could have been stored and used at a later date. The analysis has been criticised by other Voynich manuscript researchers, who argued that a skilled forger could construct plants that coincidentally have a passing resemblance to theretofore undiscovered existing plants. Nahuatl specialist M.P. Hansen has rejected their proposed readings as pure nonsense.

Constructed language
The peculiar internal structure of Voynich manuscript words led William F. Friedman to conjecture that the text could be a constructed language. In 1950, Friedman asked the British army officer John Tiltman to analyse a few pages of the text, but Tiltman did not share this conclusion. In a paper in 1967, Brigadier Tiltman said: "After reading my report, Mr. Friedman disclosed to me his belief that the basis of the script was a very primitive form of synthetic universal language such as was developed in the form of a philosophical classification of ideas by Bishop Wilkins in 1667 and Dalgarno a little later. It was clear that the productions of these two men were much too systematic, and anything of the kind would have been almost instantly recognisable. My analysis seemed to me to reveal a cumbersome mixture of different kinds of substitution."

The concept of a constructed language is quite old, as attested by John Wilkins's Philosophical Language (1668), but still postdates the generally accepted origin of the Voynich manuscript by two centuries. In most known examples, categories are subdivided by adding suffixes (fusional languages); as a consequence, a text in a particular subject would have many words with similar prefixes—for example, all plant names would begin with similar letters, and likewise for all diseases, etc. This feature could then explain the repetitive nature of the Voynich text. However, no one has been able yet to assign a plausible meaning to any prefix or suffix in the Voynich manuscript.

Hoax


The fact that the manuscript has defied decipherment thus far has led various scholars to propose that the text does not contain meaningful content in the first place, implying that it may be a medieval hoax.

In 2003, computer scientist Gordon Rugg showed that text with characteristics similar to the Voynich manuscript could have been produced using a table of word prefixes, stems, and suffixes, which would have been selected and combined by means of a perforated paper overlay. The latter device, known as a Cardan grille, was invented around 1550 as an encryption tool, more than 100 years after the estimated creation date of the Voynich manuscript. Some maintain that the similarity between the pseudo-texts generated in Gordon Rugg's experiments and the Voynich manuscript is superficial, and the grille method could be used to emulate any language to a certain degree.
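The table-and-grille idea can be sketched in a few lines. This is a simplified model under stated assumptions, not a reconstruction of Rugg's actual tables: the syllables in `TABLE` and the hole positions in `GRILLE` are invented, and the grille is reduced to one row offset per column.

```python
# Invented syllable table: each row holds a (prefix, stem, suffix) triple.
TABLE = [
    ("qo", "ke", "dy"),
    ("o", "te", "y"),
    ("", "che", "ol"),
    ("ch", "ai", "in"),
    ("sh", "e", "ey"),
    ("d", "a", "r"),
]

# The grille is modelled as three row offsets, one per column: the holes for
# the prefix, the stem and the suffix may sit on different rows.
GRILLE = (0, 2, 1)


def grille_words(table, grille, count):
    """Slide the grille down the table and read one word per position."""
    words = []
    for position in range(count):
        parts = [
            table[(position + offset) % len(table)][column]
            for column, offset in enumerate(grille)
        ]
        words.append("".join(parts))
    return words


print(" ".join(grille_words(TABLE, GRILLE, 6)))
```

Changing the grille offsets while keeping the same table produces a different but related stream of words, which is how a small table could yield a large amount of varied, word-like text.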

In April 2007, a study by Austrian researcher Andreas Schinner published in Cryptologia supported the hoax hypothesis. Schinner posited that the statistical properties of the manuscript's text were more consistent with meaningless gibberish produced using a quasi-stochastic method, such as the one described by Rugg, than with Latin and medieval German texts.

Some scholars have claimed that the manuscript's text appears too sophisticated to be a hoax. In 2013, Marcelo Montemurro, a theoretical physicist from the University of Manchester, published findings claiming that semantic networks exist in the text of the manuscript, such as content-bearing words occurring in a clustered pattern, or new words being used when there was a shift in topic. With this evidence, he believes it unlikely that these features were intentionally "incorporated" into the text to make a hoax more realistic, as most of the required academic knowledge of these structures did not exist at the time the Voynich manuscript would have been written. In 2021, researchers at Yale University, using tf–idf analysis, further investigated the relation between clusters of subjects in the text and topics as they could be identified by analysis of the illustrations and paleography. Their conclusion is that clusters derived by computation match the topics of the illustrations to some degree, thus providing evidence that the Voynich manuscript contains meaningful text.
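For readers unfamiliar with tf–idf, the following is a minimal sketch of one common textbook variant (relative term frequency multiplied by the natural logarithm of inverse document frequency). The exact weighting used in the 2021 study is not described here, and the section names and tokens below are toy data, not transcriptions from the manuscript.

```python
import math
from collections import Counter


def tf_idf(documents: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Score each word in each (non-empty) document by tf * log(N / df)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for words in documents.values():
        doc_freq.update(set(words))  # count each word once per document
    scores = {}
    for name, words in documents.items():
        counts = Counter(words)
        scores[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


# Toy "sections" with invented tokens.
sections = {
    "herbal": ["daiin", "chol", "chol", "shol"],
    "balneological": ["daiin", "qokedy", "qokedy", "shedy"],
}
scores = tf_idf(sections)
print(max(scores["herbal"], key=scores["herbal"].get))  # chol
```

A word that appears in every section scores zero, while a word concentrated in one section scores highly there; clusters of such words are what can then be compared with the topics suggested by the illustrations.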

However, other scholars have argued that such sophisticated patterns could also appear in hoaxed documents. In 2016, Gordon Rugg and Gavin Taylor published another article in Cryptologia demonstrating that the grille method could reproduce many larger-scale features of the text. In 2019, Torsten Timm and Andreas Schinner published a paper arguing that the text was produced by a process of "self-citation" in which scribes copied and modified meaningless words from earlier in the text. Using a computer simulation of this process, they demonstrated that it could reproduce many of the statistical characteristics of the Voynich manuscript. In 2022, Yale University researchers Daniel Gaskell and Claire Bowern published the results of an experiment in which human participants intentionally tried to write meaningless text. They found that the resulting text was often highly non-random and exhibited many of the same unusual statistical properties as the Voynich manuscript, supporting the idea that some features of the text could have been produced in a hoax.
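The general idea of "self-citation" can be illustrated with a toy simulation. This is only a sketch of the copy-and-modify principle; the alphabet, look-back window, copy probability, and edit operations are all assumptions made for the example and are not taken from Timm and Schinner's paper.

```python
import random

ALPHABET = "acdehiklnoqrsty"  # assumed glyph inventory for the toy model


def mutate(word: str, rng: random.Random) -> str:
    """Return a copy of `word` with one letter replaced, inserted or deleted."""
    pos = rng.randrange(len(word))
    action = rng.choice(("replace", "insert", "delete"))
    if action == "replace":
        return word[:pos] + rng.choice(ALPHABET) + word[pos + 1:]
    if action == "insert":
        return word[:pos] + rng.choice(ALPHABET) + word[pos:]
    # Delete, but never shrink a word to nothing.
    return word[:pos] + word[pos + 1:] if len(word) > 1 else word


def self_citation_text(seed_words, length, window=20, copy_prob=0.5, seed=0):
    """Grow a text by copying recent words, sometimes with a small change."""
    rng = random.Random(seed)
    text = list(seed_words)
    while len(text) < length:
        source = rng.choice(text[-window:])   # look back at recent words
        if rng.random() < copy_prob:
            text.append(source)               # copy unchanged
        else:
            text.append(mutate(source, rng))  # copy with one edit
    return text[:length]


print(" ".join(self_citation_text(["daiin", "chol"], 30)))
```

Even this crude model produces many repeated and near-repeated words close to one another, a feature that such simulations are used to compare against the manuscript.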

Glossolalia
In their 2004 book, Gerry Kennedy and Rob Churchill suggest the possibility that the Voynich manuscript may be a case of glossolalia (speaking-in-tongues), channelling, or outsider art. If so, the author felt compelled to write large amounts of text in a manner which resembles stream of consciousness, either because of voices heard or because of an urge. This often takes place in an invented language in glossolalia, usually made up of fragments of the author's own language, although invented scripts for this purpose are rare.

Kennedy and Churchill use Hildegard von Bingen's works to point out similarities between the Voynich manuscript and the illustrations that she drew when she was suffering from severe bouts of migraine, which can induce a trance-like state prone to glossolalia. Prominent features found in both are abundant "streams of stars", and the repetitive nature of the "nymphs" in the balneological section.

The theory is controversial, and it is virtually impossible to prove or disprove it, short of deciphering the text. Kennedy and Churchill are themselves not convinced of the hypothesis, but consider it plausible. In the culminating chapter of their work, Kennedy states his belief that it is a hoax or forgery. Churchill acknowledges as the preeminent theory the possibility that the manuscript is either a synthetic forgotten language (as advanced by Friedman) or else a forgery. However, he concludes that, if the manuscript is a genuine creation, mental illness or delusion seems to have affected the author.

Decipherment claims
Since the manuscript's modern rediscovery in 1912, there have been a number of claimed decipherments.

William Romaine Newbold
One of the earliest efforts to decipher the book was made in 1921 by William Romaine Newbold of the University of Pennsylvania. His singular hypothesis held that the visible text is meaningless, but that each apparent "letter" is in fact constructed of a series of tiny markings discernible only under magnification. These markings were supposed to be based on ancient Greek shorthand, forming a second level of script that held the real content of the writing. Newbold claimed to have used this knowledge to work out entire paragraphs proving the authorship of Bacon and recording his use of a compound microscope four hundred years before van Leeuwenhoek. A circular drawing in the astronomical section depicts an irregularly shaped object with four curved arms, which Newbold interpreted as a picture of a galaxy, which could be obtained only with a telescope. However, Newbold's analysis has since been dismissed as overly speculative after John Matthews Manly of the University of Chicago pointed out serious flaws in his theory. For example, each shorthand character was assumed to have multiple interpretations, and as a result there was no reliable way to determine which was intended for any given case. Newbold's method also required rearranging letters at will until intelligible Latin was produced. These factors alone give the system enough flexibility that nearly anything at all could be discerned from the microscopic markings. Although evidence of micrography using the Hebrew language can be traced as far back as the ninth century, it is nowhere near as compact or complex as the shapes Newbold made out. Close study of the manuscript revealed the markings to be artefacts caused by the way ink cracks as it dries on rough vellum. Perceiving significance in these artefacts can be attributed to pareidolia. Thanks to Manly's thorough refutation, the micrography theory is now generally disregarded.

Joseph Martin Feely
In 1943, Joseph Martin Feely published Roger Bacon's Cipher: The Right Key Found, in which he claimed that the book was a scientific diary written by Roger Bacon. Feely's method posited that the text was a highly abbreviated medieval Latin written in a simple substitution cipher.

Leonell C. Strong
Leonell C. Strong, a cancer research scientist and amateur cryptographer, believed that the solution to the Voynich manuscript was a "peculiar double system of arithmetical progressions of a multiple alphabet". Strong published a translation of two pages in 1947, and claimed that the plaintext revealed the Voynich manuscript to be written by the 16th-century English author Anthony Ascham, whose works include A Little Herbal, published in 1550. Notes released after his death reveal that the last stages of his analysis, in which he selected words to combine into phrases, were questionably subjective.

Robert S. Brumbaugh
In 1978, Robert Brumbaugh, a professor of classical and medieval philosophy at Yale University, claimed that the manuscript was a forgery intended to fool Emperor Rudolf II into purchasing it, and that the text is Latin enciphered with a complex, two-step method.

John Stojko
In 1978, John Stojko published Letters to God's Eye, in which he claimed that the Voynich Manuscript was a series of letters written in vowelless Ukrainian. The theory caused some sensation among the Ukrainian diaspora at the time, and then in independent Ukraine after 1991. However, the date Stojko gives for the letters, the lack of relation between the text and the images, and the general looseness in the method of decryption have all been criticised.

Stephen Bax
In 2014, applied linguistics Professor Stephen Bax self-published a paper proposing a "provisional, partial decoding" of the Voynich Manuscript, offering a translation for ten proper nouns and fourteen letters from the manuscript using techniques similar to those used to successfully translate Egyptian hieroglyphs. He claimed the manuscript to be a treatise on nature, in a Near Eastern or Asian language, but no full translation was made before Bax's death in 2017.

Nicholas Gibbs
In September 2017, television writer Nicholas Gibbs claimed to have decoded the manuscript as idiosyncratically abbreviated Latin. He declared the manuscript to be a mostly plagiarised guide to women's health.

Despite initial excitement in the community surrounding Gibbs' theory, scholars judged Gibbs' hypothesis to be unoriginal. His work was criticised as patching together already-existing scholarship with a highly speculative and incorrect translation; Lisa Fagin Davis, director of the Medieval Academy of America, stated that Gibbs' decipherment "doesn't result in Latin that makes sense." Davis added that she was "surprised the TLS published it." Other researchers concurred.

Greg Kondrak
Greg Kondrak, a professor of natural language processing at the University of Alberta, and his graduate student Bradley Hauer used computational linguistics in an attempt to decode the manuscript. Their findings were presented at the Annual Meeting of the Association for Computational Linguistics in 2017, in the form of an article suggesting that the language of the manuscript is most likely Hebrew, but encoded using alphagrams, i.e. alphabetically ordered anagrams. However, the team admitted that experts in medieval manuscripts who reviewed the work were not convinced.
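An alphagram is simply a word with its letters sorted alphabetically. The sketch below illustrates only that step, using English words for readability; the lexicon is invented for the example and the code is not taken from Kondrak and Hauer's system.

```python
def alphagram(word: str) -> str:
    """Return the letters of `word` in alphabetical order."""
    return "".join(sorted(word))


def candidates(cipher_word: str, lexicon: list[str]) -> list[str]:
    """List the lexicon words whose alphagram matches the cipher word."""
    key = alphagram(cipher_word)
    return [word for word in lexicon if alphagram(word) == key]


print(alphagram("listen"))  # eilnst
print(candidates("eilnst", ["listen", "silent", "tinsel", "stone"]))
# ['listen', 'silent', 'tinsel']
```

As the example shows, one alphagram can correspond to several plaintext words, so any decoding of this kind has to choose among candidates, which leaves room for ambiguity.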

Ahmet Ardıç
In 2018, Ahmet Ardıç, an electrical engineer with an interest in Turkic languages, claimed in a YouTube video that the Voynich script is a kind of Old Turkic written in a "poetic" style. The text would then be written using "phonemic orthography", meaning the author spelled out words as they heard them. Ardıç claimed to have deciphered and translated over 30% of the manuscript. His submission to the journal Digital Philology was rejected in 2019.

Gerard Cheshire
In 2019, Gerard Cheshire, a biology research assistant at the University of Bristol, made headlines for his theory that the manuscript was written in a "calligraphic proto-Romance" language. He claimed to have deciphered the manuscript in two weeks using a combination of "lateral thinking and ingenuity." Cheshire has suggested that the manuscript is "a compendium of information on herbal remedies, therapeutic bathing, and astrological readings"; that it contains numerous descriptions of medicinal plants and passages that focus on female physical and mental health, reproduction, and parenting; and that the manuscript is the only known text written in proto-Romance. He further claimed: "The manuscript was compiled by Dominican nuns as a source of reference for Maria of Castile, Queen of Aragon."

In June 2023, Cheshire published his translation of the foldout illustration on page 158. He claims that it depicts a volcano, and theorises that it places the manuscript's creators near the island of Vulcano, which was an active volcano during the 15th century.

However, experts in medieval documents disputed this interpretation vigorously, with the executive director of the Medieval Academy of America, Lisa Fagin Davis, denouncing the paper as "just more aspirational, circular, self-fulfilling nonsense". Approached for comment by Ars Technica, Davis gave this explanation:

"As with most would-be Voynich interpreters, the logic of this proposal is circular and aspirational: he starts with a theory about what a particular series of glyphs might mean, usually because of the word's proximity to an image that he believes he can interpret. He then investigates any number of medieval Romance-language dictionaries until he finds a word that seems to suit his theory. Then he argues that because he has found a Romance-language word that fits his hypothesis, his hypothesis must be right. His 'translations' from what is essentially gibberish, an amalgam of multiple languages, are themselves aspirational rather than being actual translations."

The University of Bristol subsequently removed a reference to Cheshire's claims from its website, referring, in a statement, to concerns about the validity of the research and stating: "This research was entirely the author's own work and is not affiliated with the University of Bristol, the School of Arts nor the Centre for Medieval Studies".

Facsimiles
Many books and articles have been written about the manuscript. Copies of the manuscript pages were made by alchemist Georgius Barschius (the Latinized form of the name of Georg Baresch; cf. the second paragraph under "History" above) in 1637 and sent to Athanasius Kircher, and later by Wilfrid Voynich.

In 2004, the Beinecke Rare Book and Manuscript Library made high-resolution digital scans publicly available online, and several printed facsimiles appeared. In 2016, the Beinecke Library and Yale University Press co-published a facsimile, The Voynich Manuscript, with scholarly essays.

The Beinecke Library also authorised the production of a print run of 898 replicas by the Spanish publisher Siloé in 2017.

Cultural influence
The manuscript has inspired various creative works, including:


 * Between 1976 and 1978, Italian artist Luigi Serafini created the Codex Seraphinianus containing false writing and pictures of imaginary plants in a style reminiscent of the Voynich manuscript.
 * Contemporary classical composer Hanspeter Kyburz's 1995 chamber work The Voynich Cipher Manuscript, for chorus and ensemble, is inspired by the manuscript.
 * In 2015, the New Haven Symphony Orchestra commissioned Hannah Lash to compose a symphony inspired by the manuscript.
 * For the 500th strip of the webcomic Sandra and Woo, published 29 July 2013, writer Oliver Knörzer and artist Puri Andini created The Book of Woo, four illustrated pages inspired by the Voynich manuscript. All four pages show strange illustrations next to a cipher text. The strip was mentioned in MTV Geek and discussed in the Cipher Mysteries blog of cryptology expert Nick Pelling as well as Klausis Krypto Kolumne of cryptology expert Klaus Schmeh. The Book of Woo was also discussed in the 2017 book Unsolved! by Craig P. Bauer, about the history of famous ciphers. As part of the lead-up to the 1,000th strip, Knörzer posted the translated English text on 28 June 2018, revealing that the crucial obfuscation involved translating the plain text into the constructed language Toki Pona.

News and documentaries

 * news – summary of Gordon Rugg's paper directed towards a more general audience