The Voynich Manuscript: Six Hundred Years of Carefully Written, Beautifully Illustrated, Completely Undeciphered
A 15th-century manuscript on vellum, written in an alphabet that resembles no known writing system, illustrated with hundreds of unidentifiable plants, naked women in tubs, and astronomical diagrams — and untranslated despite a century of attempts by professional cryptographers, linguists, and mathematicians.
What the manuscript is, in a paragraph.
The Voynich Manuscript is a 240-page codex written on vellum that carbon-dates to between 1404 and 1438 CE (a 2009 University of Arizona analysis sampling four leaves). It is written in an alphabet of approximately 25–30 characters, arranged into what appear to be words, sentences, and paragraphs — but the script matches no known writing system, and the "language" it represents has not been identified. The manuscript is divided by its illustrations into apparent sections: a "Herbal" section depicting plants (most of which cannot be identified with known botanical species), an "Astronomical" section with circular diagrams featuring zodiac signs and what may be astrological figures, a "Balneological" or "Biological" section depicting nude female figures bathing in elaborate tubs and pipework, a "Cosmological" section with foldout maps and diagrams, a "Pharmaceutical" section with illustrated containers labeled with what appear to be lists, and a final "Recipes" section consisting almost entirely of dense text. The manuscript is named after Wilfrid Voynich, a Polish-Lithuanian book dealer who acquired it in 1912 from a Jesuit college near Rome. Voynich publicized it as a cipher manuscript and the search for a decryption has continued since. The provenance trail extends backward through Athanasius Kircher (the 17th-century Jesuit polymath, who received it from a correspondent in Prague), an attribution to the court of Holy Roman Emperor Rudolf II, and back to figures including the Bohemian alchemist Jakub Hořčický de Tepenec and possibly to Roger Bacon (a 13th-century attribution now disproven by the 2009 carbon dating). A century of dedicated cryptographic and linguistic work — including by William Friedman and other major figures of 20th-century cryptography — has produced no widely-accepted decipherment. 
Modern statistical analysis has established that the text exhibits the linguistic patterns of natural human language (Zipf's law compliance, word-length distributions, internal "topic" coherence) rather than randomly-generated noise, making a pure-hoax explanation harder to defend than it once was, but the actual content remains unread.
The documented record.
Physical characteristics
The manuscript is a codex of 102 surviving folios (originally probably ~134, based on quire-number gaps), measuring approximately 22.5 x 16 cm. It is written on vellum — processed calfskin — and the parchment quality is consistent across the document, suggesting a single source. The ink is iron gall ink, also consistent throughout. The illustrations are painted with a small range of pigments: ferrous oxide red, copper green, ultramarine blue (in limited quantities), and other earth pigments. None of the materials is anomalous for a 15th-century European manuscript. Verified [1]
The 2009 carbon dating
In 2009, a team at the University of Arizona, led by Greg Hodgins, performed accelerator mass spectrometry radiocarbon dating on four vellum samples from the manuscript. The results, published in 2011 and 2014: the vellum was produced from animals slaughtered between approximately 1404 and 1438 CE (95% confidence). This dating disposes definitively of the modern-forgery hypothesis that had been advanced by some skeptics (proposing Wilfrid Voynich himself as the forger), but does not by itself establish whether the text was added soon after the vellum was prepared or at a later date. The consensus view, based on the ink analysis (showing no anomalous chemical aging characteristics) and the absence of any palimpsest evidence, is that the text and illustrations were added shortly after vellum production. Verified [2]
The script
The Voynich script, informally "Voynichese", is usually transcribed in EVA (European Voynich Alphabet, the most-used transcription system, developed by René Zandbergen and Gabriel Landini in the 1990s). It uses approximately 25–30 distinct character forms. The exact count is debated because some characters appear to be variants or ligatures of others. Verified
The script is written left-to-right and is consistently spaced into discrete "words" of varying length. Sentences and paragraphs are demarcated by line breaks and (rarely) by punctuation-like marks. The same character set appears throughout the entire manuscript with consistent frequency distributions across sections.
Multiple distinct "hands" or scribes have been identified in the manuscript by paleographic analysis — most analyses identify two to five scribes — suggesting the document was produced by a collaborative process rather than a single author.
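The consistency of character frequencies across sections can be checked mechanically. A minimal Python sketch follows, assuming each section is available as a plain transliterated string (for example in EVA). The function names, the placeholder strings, and the choice of total variation distance are illustrative assumptions, not a method taken from the published analyses.

```python
from collections import Counter
from itertools import combinations


def char_distribution(text):
    """Relative frequency of each non-space character in a transcription."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(p.get(ch, 0.0) - q.get(ch, 0.0)) for ch in set(p) | set(q))


def compare_sections(sections):
    """Pairwise distance between the character distributions of named sections."""
    dists = {name: char_distribution(text) for name, text in sections.items()}
    return {
        (a, b): total_variation(dists[a], dists[b])
        for a, b in combinations(sorted(dists), 2)
    }


# Placeholder strings only: these are NOT excerpts from the manuscript.
sections = {
    "herbal": "abcab cabba abcca",
    "recipes": "bacab abcba cabac",
}
for pair, distance in compare_sections(sections).items():
    print(pair, round(distance, 3))
```

Distances near zero for every pair of sections would be consistent with a single character inventory used in the same proportions throughout. A real analysis would also need to account for section length and for the ambiguity in how glyph variants are transcribed.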
The illustrations
The illustrations comprise approximately 113 unidentifiable plants in the Herbal section, dozens of astronomical/astrological diagrams featuring recognizable zodiac figures (Pisces, Taurus, etc., consistent with the 15th-century European tradition), the distinctive "balneological" or "biological" section showing nude female figures connected by what appear to be plumbing or tubes (sometimes interpreted as alchemical-bath imagery, sometimes as obstetric/gynecological reference, sometimes as something else entirely), a series of large foldout maps with rosette designs at corners, and pharmaceutical-section illustrations of labeled containers (apothecary jars) alongside what look like medicinal herb portraits. Verified [3]
None of the plant illustrations have been confidently identified with known botanical species. Some bear resemblance to known plants (a sunflower-like figure has been proposed as evidence of New World origin postdating Columbus, but this identification is disputed); most do not match any known flora.
The provenance trail
The documented chain of custody, working backward from the present: Verified
- Present: Beinecke Library, Yale University. MS 408. Acquired from Hans P. Kraus, 1969.
- 1962–1969: Hans P. Kraus, antiquarian book dealer, New York.
- 1930–1962: Ethel Voynich (Wilfrid's widow) and her assistant Anne Nill.
- 1912–1930: Wilfrid Voynich, who purchased it from the Jesuit Collegio Romano (Villa Mondragone), where it had been part of the collection of Pier Antonio Petrini.
- 1665/1666: Sent by Johannes Marcus Marci to Athanasius Kircher with an accompanying letter. Marci's letter explicitly describes the manuscript and asks Kircher to attempt its decipherment. The Marci letter is among the most important provenance documents.
- Before 1665/1666: Marci's possession.
- ~1622: An owner identified in the Marci letter as "Jacobus Sinapius" (Jakub Hořčický de Tepenec, court chemist to Rudolf II). His signature is partially visible on folio 1r under ultraviolet examination.
- ~1576–1611: Possibly the court of Holy Roman Emperor Rudolf II of Bohemia, Prague. Rudolf reportedly purchased a manuscript "with characters in an unknown language" for 600 ducats in 1586, attributed by him to Roger Bacon. The identification of Rudolf's purchase with the Voynich Manuscript is plausible but not definitively established.
- Pre-1586: Unknown. The 2009 carbon dating places the manuscript's vellum production in the early-to-mid 15th century, leaving a ~150-year gap of unaccounted provenance.
The decoding history
Major decoding attempts: Verified
- 1921: William Romaine Newbold (University of Pennsylvania) claimed the manuscript was the work of Roger Bacon (13th century) and proposed a decryption involving multiple cipher layers. Newbold's claim received initial enthusiastic press; his cipher key was systematically refuted by John Manly (University of Chicago) in 1931 as relying on character-level pareidolia that produced effectively arbitrary readings.
- 1944–1959: William F. Friedman (the most distinguished American cryptographer of the 20th century) ran two informal study groups on the manuscript. He concluded that the text "was probably not a code in the ordinary sense" but possibly a synthetic language or a stenographic system. Friedman published his conclusion in 1959 in the form of an anagram; the solution, disclosed after his death in 1969, read: "The Voynich MSS was an early attempt to construct an artificial or universal language of the a priori type."
- 1976–Present: Various claims by professional and amateur cryptographers, none of which has been confirmed by independent peer review or accepted by the community of Voynich researchers. Among the most-cited: Robert Brumbaugh (multiple cipher proposals, 1970s-80s); Leo Levitov (Cathar prayers, 1987, refuted); Stephen Bax (2014, claimed partial decoding of proper names of plants, partially accepted but limited in scope); Greg Kondrak (2018, proposed Hebrew origins, broadly rejected by Voynich scholars); Gerard Cheshire (2019, proposed proto-Romance, rejected by professional medievalists).
Statistical analyses
Modern computational analysis has established several properties of the Voynich text: Verified
- Word-frequency distribution complies with Zipf's law, the empirical pattern observed in natural language.
- Word-length distribution is significantly more constrained than English, Latin, or modern European languages — words tend to cluster around lengths of 4–7 characters with few outliers.
- There are statistically significant differences in word frequencies between the manuscript's apparent sections (Herbal vs. Astronomical vs. Biological), suggesting genuine topical content.
- The text contains relatively few instances of consecutive identical characters, fewer than is typical for natural language but consistent with some constrained-syllable systems.
- Word repetition rates and contextual patterns are inconsistent with most known European languages but consistent with some abjad (consonant-only) or syllabic writing systems.
The statistical properties are inconsistent with pure random generation and inconsistent with simple substitution of any known language. They are consistent either with an unknown natural language (perhaps an extinct minor European language or a non-European language transcribed in an invented alphabet), with a deliberately-constructed cipher of considerable sophistication, or with an artificial-language project of the period (similar to but predating the philosophical "universal languages" of the 17th century).
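Three of the properties listed above (rank-frequency behaviour, word-length spread, and doubled characters) are simple to measure once a transcription is split into words. The sketch below is illustrative only: the function names are hypothetical, the corpus is synthetic rather than Voynich text, and a least-squares fit on log-log counts is a rough stand-in for the more careful estimators used in the literature.

```python
import math
from collections import Counter


def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic Zipf pattern seen in natural language.
    """
    counts = sorted(Counter(words).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def word_length_histogram(words):
    """Map each word length to the share of tokens having that length."""
    lengths = Counter(len(w) for w in words)
    total = sum(lengths.values())
    return {n: lengths[n] / total for n in sorted(lengths)}


def doubled_letter_rate(words):
    """Share of adjacent character pairs inside words that are identical."""
    pairs = [(a, b) for w in words for a, b in zip(w, w[1:])]
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 0.0


# Synthetic corpus (an assumption for demonstration): the word of rank r
# occurs about 120 / r times, which is Zipfian by construction.
corpus = [f"w{rank}" for rank in range(1, 21) for _ in range(120 // rank)]
print(f"Zipf slope: {zipf_slope(corpus):.2f}")
print("Word-length shares:", word_length_histogram(corpus))
print(f"Doubled-letter rate: {doubled_letter_rate(corpus):.3f}")
```

Running the same three functions over each section of a real transcription, and over comparison texts in Latin or a vernacular, is the kind of side-by-side measurement the claims above rest on. Passing such checks shows only that a text is language-like, not that it carries meaning.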
The candidate explanations.
Hypothesis: Encoded natural European language (Latin or vernacular)
Argument: the text is an encrypted Latin medical, alchemical, or herbal text using a cipher system that has resisted attack for six centuries. Claimed
Limits: 15th-century European cryptography was not sufficiently developed to produce a cipher that would resist modern statistical attack. The text's statistical signature does not match Latin or European vernacular plaintext under any tested cipher model. The cipher hypothesis remains popular among amateur cryptanalysts but has not produced a verifiable decryption.
Hypothesis: An unknown but real natural language
Argument: the manuscript is written in a language — perhaps a minor or now-extinct one — transcribed into an invented alphabet by a community using it for esoteric or specialized purposes. Candidates proposed have included: a Turkic language (Ahmet Ardiç, 2018, partially translated but contested); Manchu or a related Tungusic language; a constructed liturgical language for a religious or alchemical sect; an extinct dialect of a known language family. Claimed with several different proposed languages.
Limits: Each specific language hypothesis has produced partial decoding claims but none has reached the threshold of decoding extended coherent text. The "natural language" framing accounts for the statistical properties; the absence of a verified decoding makes specific identifications speculative.
Hypothesis: An artificial language
Argument: the manuscript represents an early attempt at a constructed language, either for philosophical or universal-language reasons (analogous to the 17th-century projects of John Wilkins and others, but earlier) or for ritual purposes. William Friedman's conclusion, disclosed after his death, was a version of this hypothesis. Claimed
Limits: No other 15th-century artificial-language project is documented in the surviving record. The hypothesis is parsimonious but unconfirmable without independent evidence of the language's existence elsewhere.
Hypothesis: A glossolalia / channeling production
Argument: the text was produced through a "writing trance" or glossolalia-like process by an author convinced they were transmitting meaningful but mysterious content. Statistical properties would emerge naturally from such production by a literate author internalizing the structure of language without intending coherent semantic content.
Limits: The hypothesis is difficult to falsify but accounts for the text's combination of language-like statistics with apparent untranslatability. Some recent researchers (notably Andreas Schinner) have argued from statistical evidence that the manuscript exhibits patterns consistent with hand-driven nonsense generation. Claimed; not consensus-supported but increasingly serious.
Hypothesis: A sophisticated hoax
Argument: the manuscript was produced as an elaborate hoax to be sold as a mysterious cipher. Most popularly: that Wilfrid Voynich himself produced it, with the goal of profitable sale.
Limits: The 2009 carbon dating definitively eliminates the Voynich-as-author hypothesis (the vellum predates Voynich's 1912 purchase by roughly 470 to 510 years). A 15th-century hoax explanation is not eliminated by carbon dating but must contend with the substantial effort required to produce 240 pages of statistically-coherent gibberish illustrated to a sophisticated standard. Hoax remains possible but is increasingly seen as less parsimonious than the natural-language or constructed-language hypotheses.
The unanswered questions.
The decipherment itself
The central question. Despite professional cryptographic work over a century, no decoding has been authenticated. The manuscript's statistical properties make it neither an easily defeated cipher nor obviously random noise; the right framework for attacking it has not been identified.
The pre-1586 provenance
The gap of roughly 150 years between the carbon-dated vellum production (~1404–1438) and the earliest documented owner (Rudolf II's court, ~1586) is unaccounted for in any surviving record. Where the manuscript was during that period, who held it, and whether the gap represents a single long-term location (e.g., a monastery library) or many short-term holdings are unknown.
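Because the vellum date is a range rather than a single year, the undocumented span is a range too. A small Python sketch of the arithmetic, using only the dates given in this article (the function name is hypothetical, and the 1586 anchor depends on the unconfirmed identification of Rudolf II's purchase with this manuscript):

```python
# Figures taken from the article: the 95% radiocarbon window and the
# approximate year of the earliest (tentatively) documented ownership.
VELLUM_EARLIEST = 1404
VELLUM_LATEST = 1438
FIRST_RECORD = 1586  # Rudolf II's reported purchase; identification not certain


def provenance_gap(earliest, latest, first_record):
    """Return the (shortest, longest) undocumented span in years."""
    return first_record - latest, first_record - earliest


shortest, longest = provenance_gap(VELLUM_EARLIEST, VELLUM_LATEST, FIRST_RECORD)
print(f"Undocumented span: {shortest} to {longest} years")
```

The commonly quoted figure of about 150 years sits at the short end of this range, which assumes the vellum was prepared at the latest date the radiocarbon window allows.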
The identity of the missing pages
The current 240 pages represent approximately 80% of the original quires based on numbering. The missing pages might contain section transitions, dedications, colophons, or other indications of authorship and purpose that the surviving folios lack.
The illustrations' purpose
The herbal-section illustrations show no consistent match to any known medieval botanical tradition (Dioscorides, the Tractatus de Herbis, etc.). Whether the plants are invented, are stylized representations of real plants that are no longer recognizable, or represent plants from a region beyond medieval European knowledge is undetermined. Similarly, the biological-section figures do not closely match any known medieval anatomical or balneological tradition.
The relationship to known traditions of esoteric knowledge
The manuscript has features (alchemical-adjacent imagery, astronomical diagrams, herbal section) that resonate with known 15th-century esoteric/alchemical traditions, but does not match any specific tradition known to scholars. Whether it represents a previously-unknown tradition or a deliberately-eclectic synthesis is open.
Primary material.
- The Voynich Manuscript itself, Beinecke Library MS 408, fully digitized and available at collections.library.yale.edu.
- The 1665/1666 Marci-to-Kircher letter, the primary provenance document.
- The 2009 University of Arizona carbon-dating results (Hodgins et al.).
- The EVA transcription of the text by Zandbergen, Landini, and collaborators, available at voynich.nu.
- The Friedman study-group papers, partially preserved in the George Marshall Foundation archives.
- The Newbold-Manly papers in the University of Chicago library.
The sequence.
- 1404–1438 Vellum produced (carbon-dated). Text and illustrations added shortly thereafter.
- 1438–1586 Unaccounted-for period.
- ~1586 Possibly purchased by Rudolf II of Bohemia for 600 ducats.
- Early 17th century In possession of Jakub Hořčický de Tepenec, court chemist to Rudolf II.
- ~1665 Johannes Marcus Marci sends the manuscript to Athanasius Kircher with the famous accompanying letter.
- 1665–1912 Held by the Jesuit Collegio Romano and then the Villa Mondragone outside Rome.
- 1912 Wilfrid Voynich purchases the manuscript from the Jesuits.
- 1921 William Romaine Newbold's claimed Bacon decryption.
- 1931 John Manly refutes Newbold.
- 1944–1959 William Friedman's study groups.
- 1959 Friedman's "artificial language" conclusion published as an anagram; the solution was disclosed after his death in 1969.
- 1969 Hans P. Kraus donates the manuscript to Yale.
- 1990s–present EVA transcription enables systematic computational analysis.
- 2009 Carbon dating definitively places vellum in 1404–1438.
- 2014 Bax claimed partial decoding (proper names of plants).
- 2018–2019 Kondrak (Hebrew), Cheshire (proto-Romance), and Ardiç (Turkic) decoding claims, all contested.
- 2026 Manuscript remains untranslated; computational and linguistic work continues.
Cases on this archive that connect.
Planned: the Phaistos Disc (parallel undeciphered single-artifact case); the Rohonc Codex (another untranslated European manuscript); the Beale Ciphers (sustained-attack cipher); the Kryptos sculpture (the modern parallel of a designed cipher that has resisted attack); the Zodiac Killer ciphers (the closest contemporary parallel of a cipher whose authorship is anonymous and which has been partially solved over decades).
Full bibliography.
- Beinecke Rare Book and Manuscript Library, Yale University. Voynich Manuscript (MS 408), full digital facsimile and physical metadata.
- Hodgins, Greg, et al. Radiocarbon dating analysis of Voynich Manuscript vellum samples. NSF-Arizona AMS Laboratory, 2009. Results reported in subsequent peer-reviewed publications.
- Clemens, Raymond, ed. The Voynich Manuscript. Yale University Press, 2016. The most authoritative single-volume scholarly treatment.
- Marci, Johannes Marcus. Letter to Athanasius Kircher, August 19, 1665/1666. Reproduced in Voynich's own publications and in subsequent provenance literature.
- D'Imperio, Mary E. The Voynich Manuscript: An Elegant Enigma. NSA, 1978; declassified 1995. The most comprehensive cryptanalytic treatment.
- Friedman, William F., and various contributors. Voynich study group papers, 1944–1959. George Marshall Foundation archives.
- Bax, Stephen. "A proposed partial decoding of the Voynich script." Self-published, 2014.
- Cheshire, Gerard. "The Language and Writing System of MS408 (Voynich) Explained." Romance Studies, 2019. (Widely rejected by Voynich specialists.)
- Zandbergen, René. voynich.nu. The most comprehensive single online research site, including the EVA transcription.
- Schinner, Andreas. "The Voynich Manuscript: Evidence of the Hoax Hypothesis." Cryptologia 31, 95–107 (2007). The most serious modern hoax-hypothesis treatment.