
What if we make radio contact with an extraterrestrial civilization, the only thing we can transmit is text, and we send the entire text of this dictionary? What can they learn from it?
A dictionary is a strange thing – it defines each word in terms of other words, all of which can also be found in the same dictionary. It is a perfect example of a totally closed system. Without the illustrations, it is as airtight as a closed system can be. Does such a system have any intrinsic information content? In other words, what can our extraterrestrial friends learn from this huge book? Anything? Something?
What they can definitely learn, by analyzing all the sentence structures, is the syntax of the English language. There are known techniques to derive the syntax of a language from a large collection of sample sentences, and the dictionary is full of sample sentences. Moreover, it has definitions of each and every word used in the dictionary. Therefore it should not be too difficult for an intelligent race to figure out the syntax. With this knowledge the aliens can write an endless variety of perfectly correct English sentences. The question is, will they know anything about what those sentences mean? Most probably not, since there is no clue in the dictionary to figure that out. The illustrations could have been a clue, even just a few of them, but they were not part of our transmission. The closed system has no leaks through which the real universe can enter the closed world of tangled words. One exception: if we include the page numbers, then it is almost certain that they can figure out our number system.
Taking it a step further, let’s say we transmit all the English language books in all the libraries of the world, and just to make sure we got it all, let’s also add the entire web – once again, just the text and nothing else. Will that give them any more to work with? Of course, now they have everything we have ever written in the English language – all of our literature, science, religion, philosophy, history, plus the mountain-load of web content we are creating every day, including this blog post. Even with no external clues, our alien friends may now be able to write flawless English. This time the text they produce will not only be grammatically correct; through clever statistical analysis of the vast collection, they may even be able to write more “meaningful” and better quality English. But they will probably still have no idea what they are talking about.
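To make the idea of statistical analysis without understanding a little more concrete, here is a minimal sketch of one very simple technique, a bigram (Markov chain) model. It is only an illustration, not a claim about how the aliens or any real system would do it. The function names `build_bigram_model` and `generate`, and the tiny sample corpus, are made up for this example. The program records which word follows which, then strings together words that have been seen next to each other, with no notion of what any of them mean.

```python
import random
from collections import defaultdict


def build_bigram_model(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return dict(model)


def generate(model, start, length, seed=0):
    """Emit up to `length` words by repeatedly sampling an observed successor."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    out = [start]
    # Stop early if the last word was never followed by anything in the corpus.
    while len(out) < length and out[-1] in model:
        out.append(rng.choice(model[out[-1]]))
    return " ".join(out)


# Hypothetical stand-in for the transmitted text.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = build_bigram_model(corpus)
sample = generate(model, "the", 8)
print(sample)
```

Every adjacent pair of words in the output has appeared somewhere in the corpus, so the result tends to look locally plausible, yet nothing in the program connects “cat” to any actual cat.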
Let’s imagine we extend it even further by including all text written in all human languages, including all the side-by-side bilingual books and bilingual dictionaries. Now they may be able to work out the grammar of all known languages, and even be able to translate a piece of text from one language to another. But still they probably won’t understand a thing. It will not be too different from the automated translators that we use on the web – they translate purely on the basis of logic and statistics, without any understanding of the content.
However, if our text included mathematical texts, then it should be possible for them to get some very significant clues. A school arithmetic text that includes a few equations like “2 + 3 = 5” would let them figure out our number system and the meanings of the mathematical operators. This is so not just because the mathematical language is very precise, but because mathematics deals with universal and self-consistent truths. With that starting point, it is not only possible to figure out the rest of our mathematical literature, but it may also provide clues to some of the English words that are often used in mathematical texts, such as “if” and “then”. Like rock climbing, once you have a toehold, it is possible to conquer a lot more.
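The reason a handful of equations can pin down an operator is simple elimination, which the rough sketch below tries to show. It assumes, generously, that the reader has already decoded the numerals, and that the unknown symbol stands for one of three candidate operations. The function name `infer_operators`, the candidate list, and the sample equations are all invented for this illustration.

```python
import operator

# Assumed candidate meanings for an unknown binary symbol.
CANDIDATES = {
    "addition": operator.add,
    "subtraction": operator.sub,
    "multiplication": operator.mul,
}


def infer_operators(equations):
    """For each symbol, keep the candidate operations that fit every equation."""
    surviving = {}
    for a, symbol, b, result in equations:
        fits = {name for name, fn in CANDIDATES.items() if fn(a, b) == result}
        # Intersect with what earlier equations using this symbol allowed.
        surviving[symbol] = surviving.get(symbol, set(CANDIDATES)) & fits
    return surviving


# Hypothetical lines from a school arithmetic text: (left, symbol, right, result).
equations = [
    (2, "+", 3, 5),
    (2, "+", 2, 4),  # ambiguous alone: fits addition and multiplication
    (7, "-", 4, 3),
    (3, "*", 4, 12),
]
inferred = infer_operators(equations)
print(inferred)
```

A single equation such as “2 + 2 = 4” is ambiguous, because multiplication fits it too, but a second equation such as “2 + 3 = 5” removes the doubt. This works only because arithmetic is the same everywhere, which is exactly the toehold the dictionary alone does not offer.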
