A most cryptic text: the Voynich manuscript

Is it a genuine medieval manuscript or a more modern fake - and what do all the weird symbols mean, if they mean anything at all?

A page from the Voynich Manuscript, showing some of the strange and so-far undeciphered symbols within.
The libraries of Yale University are home to many treasures, from first folios of the works of Shakespeare to Dark Age documents. Yet their most valuable acquisition may be a manuscript that no one has ever read - despite countless attempts to do so. Known as the Voynich Manuscript, it is of unknown authorship and date, although even a cursory glance through its 240-odd vellum pages suggests that it is many centuries old. Most perplexing of all are its contents: drawings of bizarre, alien-like creatures and plants, and what appear to be diagrams of astrological and medical significance. The handwritten text accompanying them offers no clues to their purpose or meaning, however, as it is in a language no one can read.

Not surprisingly, there are many theories about the nature of the Voynich Manuscript. The eponymous book dealer who found the manuscript at a Jesuit college in Italy in 1912 believed it to be the work of the 13th-century English monk Roger Bacon, famed for his knowledge of alchemy, philosophy and science. Others have claimed that it is a medieval hoax, or a forgery concocted by Voynich himself. Now some hard facts are starting to emerge about what has been described as the world's most mysterious manuscript. This month, the Austrian broadcasting company ORF revealed that tiny samples of the manuscript had been submitted to carbon dating at the University of Arizona, the results suggesting that the vellum had been made somewhere between 1404 and 1438. Such a date rules out a direct connection with Bacon, though not the possibility of a hoax: a determined forger could have acquired a supply of blank medieval vellum. Yet according to ORF, tests by experts at the McCrone Research Institute in Chicago show that the ink was applied to the vellum at around the same time as the vellum itself was made.

Taken together, these new findings suggest that at least the material of the Voynich Manuscript is genuine. Now the challenge is to make sense of its contents. Are they written in some lost language, or a secret code - or are they just plain gibberish? Over the years, many scholars have tried to unravel the enigmatic script, with little to show for their efforts. During the 1920s, a professor of philosophy at the University of Pennsylvania named William Newbold claimed that the Voynich Manuscript was written in a cipher devised by Roger Bacon. According to Newbold, the decoded text revealed that Bacon had access to telescopes and microscopes centuries before they were supposedly first invented. Not surprisingly, the idea created a sensation; sadly, it later emerged that Newbold's methods allowed a host of different decipherings, and he'd just chosen ones that supported his theories.

The Voynich Manuscript has since refused to divulge its secrets to far more competent investigators, including William Friedman, the celebrated American cryptologist who deciphered the Japanese wartime codes. At the end of the Second World War, he and his fellow cryptologists applied standard code-breaking methods to the manuscript, looking for signs of a real language concealed by the handwritten symbols. They soon found themselves in deep water. It proved impossible to pin down the exact size of the "alphabet" of the Voynich Manuscript: its 170,000-character text contains over 70 distinct symbols. Even so, some words and phrases clearly appeared more often than would be expected in standard languages, suggesting the text was not in code, as encryption typically suppresses such repetition.

At the same time, the brilliant British codebreaker John Tiltman - who played a key role in breaking Hitler's most secure ciphers - claimed to see hints of rules governing the construction of "Voynichese", with some symbols apparently playing the role of prefixes and suffixes. The codebreakers came to suspect that Voynichese is a so-called a priori language, in which words are based on conceptual rather than linguistic principles. A simple modern example is the Dewey system of library book classification, in which anything related to, say, a social science has a catalogue entry beginning with a 3, those specific to education having 7 as the second digit and so on, leading to the Dewey "word" for "higher education" being 378.
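
The Dewey example can be made concrete with a few lines of Python. This is only an illustrative sketch: the three-entry table and the function name read_word are assumptions made for the demonstration, and nothing here is a claim about how Voynichese itself is built. It simply shows how each extra digit of an a priori "word" narrows the concept.

```python
# Tiny, illustrative subset of the Dewey classification (not the full scheme).
DEWEY = {
    "3": "social sciences",
    "37": "education",
    "378": "higher education",
}


def read_word(code: str) -> list[str]:
    """Read a Dewey 'word' digit by digit, from the broadest concept to the narrowest.

    read_word is a hypothetical helper for this sketch: it looks up every
    prefix of the code and returns the meanings it finds, in order.
    """
    meanings = []
    for length in range(1, len(code) + 1):
        prefix = code[:length]
        if prefix in DEWEY:
            meanings.append(DEWEY[prefix])
    return meanings


if __name__ == "__main__":
    # Each step adds one digit and one level of detail.
    print(" > ".join(read_word("378")))
```

In such a scheme the leading symbols behave rather like prefixes shared by related words, which is the kind of pattern Tiltman thought he could see.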

Friedman and his colleagues believed the best hope of understanding Voynichese lay through using the power of a machine that was already transforming code-breaking: the computer. It is a prediction now starting to come true, as scholars apply the number-crunching power of computers along with concepts from Information Theory. Pioneered by the American communications theorist Claude Shannon in the 1940s, this branch of mathematics shows the connections between the raw statistics of characters in a text and their meaning. One key feature of real language is its so-called "entropy", which measures the chances of one symbol being followed by another. Random gobbledygook has a very high entropy, as there is a more or less equal chance of any symbol being followed by any other. In contrast, the rules of grammar underpinning real languages give them a relatively low entropy - unless they have been enciphered in some way to disguise their information content.
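
For readers who want to see the idea in action, here is a minimal Python sketch of one common way to put a number on this kind of entropy: the conditional entropy of the next symbol given the current one, estimated from counts of adjacent symbol pairs. The function name conditional_entropy and the sample texts are assumptions made for illustration; this is not the procedure used by any of the researchers mentioned here.

```python
import math
import random
from collections import Counter


def conditional_entropy(text: str) -> float:
    """Estimate, in bits, how uncertain the next symbol is given the current one.

    Computes H(next | current) = -sum p(a, b) * log2 p(b | a),
    with the probabilities estimated from adjacent-pair counts in the text.
    """
    pairs = Counter(zip(text, text[1:]))   # counts of (current, next) pairs
    firsts = Counter(text[:-1])            # counts of each symbol as 'current'
    total = sum(pairs.values())
    entropy = 0.0
    for (first, _second), count in pairs.items():
        p_pair = count / total                    # p(a, b)
        p_next_given_first = count / firsts[first]  # p(b | a)
        entropy -= p_pair * math.log2(p_next_given_first)
    return entropy


if __name__ == "__main__":
    rng = random.Random(0)

    # A repetitive, rule-bound sample standing in for 'real language'.
    structured = "the cat sat on the mat " * 200

    # The same symbols in scrambled order: the statistics of single
    # characters are unchanged, but the order no longer follows any rule.
    scrambled = list(structured)
    rng.shuffle(scrambled)
    scrambled = "".join(scrambled)

    print("structured:", round(conditional_entropy(structured), 3), "bits")
    print("scrambled: ", round(conditional_entropy(scrambled), 3), "bits")
```

Scrambling leaves the individual characters untouched but destroys the rules about which symbol follows which, so the scrambled sample scores higher than the structured one.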

Entropy tests of the Voynich Manuscript reveal that it has an unusually low entropy - apparently ruling out the idea that it is just gobbledygook, or has been encrypted using complex "polyalphabetic" ciphers, whose many different symbols would produce a high entropy. Ongoing research by Dr Marcello Montemurro of the University of Manchester in England suggests that the linguistic structure of Voynichese is consistent with that of texts written in known languages. What is needed now is some way of at least finding out what the text might be about, even if its precise meaning remains unclear. Dr Montemurro and his colleague Dr Damian Zanette of the Balseiro Institute in Argentina have developed techniques able to do this. Applied to known works, they can discriminate between discursive works of literature or history and information-rich textbooks. Applied to the Voynich Manuscript, they might confirm the long-held belief that it is some kind of encyclopaedia of esoteric knowledge.

It could still turn out that the manuscript really is a 15th-century forgery. Yet if this does prove to be the case, it is a forgery of such sophistication that its true nature could only be revealed using 21st-century computing power. Either way, Yale University can rest assured that in the Voynich Manuscript it has a truly extraordinary document in its possession.

Robert Matthews is Visiting Reader in Science at Aston University, Birmingham, England.