-The branch of knowledge that deals with interpretation, especially of the Bible or literary texts-
The limits of my language mean the limits of my world.
Gottfried Wilhelm von Leibniz, (probably) the greatest mathematicians and natural philosopher who ever walked fifteenth century continental Europe, dreamed of an all-embracing formal language, able to express the highest concepts that the human mind can explore: science, mathematics, metaphysics. He called this vocabulary characteristica universalis. He was also a declared rationalist, although the definition of "rationalist" was a little different from the modern one. Leibniz never went so far into the technical details of building such a language, to the point that many historians suspect that it never went beyond a simple unattainable aspiration, one of those cases in which the idea turns out to be bigger than the man who gave birth to it. On the other hand, Kurt Gödel was among the heavyweights who were convinced that Leibniz's dream was not just a feverish delirium. Yes, the same Kurt Gödel of the incompleteness theorem. I have no intention of expressing whether characteristica universalis can be possible or not (although a conversation i had with a linguist persuaded me that it is practically infeasible), I prefer to focus on the importance that formal languages have had in the transmutation of human society.
Scholars define prehistory as the set of events that occurred before the birth of written language but why is written language so important to the point of being the fundamental marker of our maturity? The gravity of this innovation lies in the fact that it allowed us to note dates, witnesses, thoughts and this facilitated the work of posterity in reconstructing what came before, what generated the world in which they live, in which we live. Writing has allowed historians the luxury of looking into the past simply by observing words, turning them into scholars of texts. Texts are powerful tools, through them it is possible to share experiences, secret informations, knowledge and cooking recipes. They are also fundamental for the perpetuation of religious cults. Imagine yourself thousands of years in the past, in the hot and muggy Indian subcontinent, while writing the Bhagavadgita or the Ramayana, most likely under the influence of powerful psychotropic substances called "The voice of God".
Or, if you prefer, imagine yourself in ancient Egypt-Israel as you write the old testament: different settings, same story.
Have you ever imagined that your works would have been the most widespread and the most studied in history? That would have given rise to global cults that would last thousands of years? Well maybe you would have imagined it given the omnipotence delirium that pervaded you while you were writing them.
If you have a preference for modern history you can put yourself in Marx's shoes in nineteenth-century Prussia as you write "Das Kapital" and "Manifest der Kommunistischen Partei". Would have you foreseen the Russian revolution, Stalinism, Cuba and North Korea? I believe I have rattled off enough examples to persuade you of the power of texts.
All the seminal works cited above were written in some natural language which means a language that has evolved naturally, perhaps from a common ancestor, and that men have modified and rearranged through the act of speaking it. Think of it as Darwinism applied to linguistics. When you talk to your friends, insult your neighbors or your boss, when you declare yourself to your crush, you are doing it through a natural language. However, the natural ones, are not the only languages that exist: remember Leibniz's characteristica universalis? In fact you've certainly heard of formal languages. In my field, compuer science, and by extension in mathematics (and of course in linguistics too), formal languages are of fundamental importance. But why is so and what are these languages? To put it in simple terms you can think of formal languages as a subset of natural languages which consists of all the strings (words) built from symbols taken from one or more alphabets, to which pre-established rules are applied. These rules, unlike those of natural languages (much more elastic), are not negligible, here we truly speak of grammars carved in granite [These grammars are called (guess what) formal grammars and represent the sets of rules for building formal words. Although we will not focus on formal grammars here, it seems fair to at least list a few: context-free grammar, regular grammars, Context-sensitive ]. What? Are you asking for an example? Okay, let's do this. Consider the following alphabet:
As a rule we only say that a string belongs to the language if and only if it is composed solely of symbols from the alphabet; we don't care about the order, we don't care if there are repetitions, we don't care about anything else. The following strings:
all belong to the language while these:
do not belong to the language.
Above we have defined a very simple formal language, an atomic one I would say.
So what's the point of all this? The importance and applications of formal languages are uncountable, for example they are used in linguistics to understand certain properties of natural languages (divide et impera). In mathematics they are studied for their combinatorial properties (combinatorics on words) but it is in computer science that they meet their greatest purpose. Once this essay is finished, you will be convinced that computer science is nothing but the study of the associations between words and formal languages. In this discipline we have a fundamental logical model called Turing Machine (synonymous with algorithm) which must perform computations on an input. An input to a TM is always a string of symbols, whether in BINARY (Among other things, the invention of the binary code is attributed to Leibniz), ASCII, UNICODE or otherwise encoded. We are often interested in decision problems: Given an input , does have certain properties? This kind of problems always have a binary answer: either reflects the properties we are looking for or it does not. We have already established that is always a string so a decision problem is isomorphic to determining whether or not a word belongs to a formal language. Any other kind of questions we can run into in computer science, be it an optimization problem, a counting problem (etc.) can be rephrased as a decision problem. Let's see another language. Given the alphabet , the language is the set of all the finite words over the binary alphabet. Example of words from : , , , , , , ,
Theorem: There exist a Turing Machine that recognize membership in the language. Proof: Such TM work as follow:
1) Start reading the input left to right, one symbol at a time. 2) If the symbol is or move right on the input. It is easy to note that this algorithm halts if and only if the input string belongs to the language .
Note that each Turing Machine recognize one and only one language but there may be more Turing Machines that recognize the same language. A further fact to be known is the following: for Cantor's diagonal argument, there are infinitely more formal languages than Turing Machines that recognize them, i.e. there are infinite problems that cannot be solved by algorithms. This fact in particular is mind-blowing.
At this point it should be clear to everyone how fundamental the study of formal languages is but, to be sure, let's see a more concrete example with direct and non-trivial real world applications.
We will focus on a relaxed version of RSA, one of the most exploited asymmetric encryption protocols in the world. Basically, what the algorithm does to get the public key, is to multiply two very large (hundreds of digits) prime numbers, and (which together represent the private key). At this point the public key is sent to the client, the information that the client must send to the server is "tied" to the public key via a simple power elevation and subsequent multiplication with the public key itself. RSA is suspected of being a one-way function: it is easy to factor the public key to obtain the information sent by the client once and are known but, if the factors are unknown, the problem is considered intractable.
Here we have two different problems, an attacker who intercepts the public key, to reconstruct the encrypted information, must recognize membership in this language:
But without the knowledge of and . To date no Turing Machine is known that can efficiently (that is in upper bounded polynomial time) recognize membership in this language, this is because, if it does not know , a priori, it must carry out a brute force attack on the input, trying all possible prime factors. Note that the number of transactions (search space) grows exponentially with the increase of . For the server, on the other hand, it is enough to executes a simple division to decrypt the information.
I don't know how you feel abut this but I deeply admire the researchers who have come to discover such a wonderful application for this alleged formal one-way language (discovery leaked directly from number theory) and that have given us a safe method to communicate online. Note that, without encryption, the internet as we know it will not exist: online shopping, home banking, secure e-mail, anonymity and much more. Cryptography is that thin mathematical line that separates 2.0 Economy from chaos. The above example, though very important, is only a drop in the ocean.
Optimization of airline flights, industrial optimizations, cryptography, automatic theorem proving, digital circuits design, folding of proteins, linear programming, combinatorics, data Compression, cellular automatons, natural languages recognition, primality testing, social networks, graph theory... all these fields (and many many many, many others) are all related to the study of syntactic and semantic properties of words belonging to different formal languages.
The very foundations of mathematics, whether you consider it to be Set theory or Homotopy type theory (or even Category theory) are, in fact, expressed in formal languages: set of symbols and rules that link these symbols. Perhaps Leibniz's dream will never come true but, over time and also thanks to his work, we have moved from studying religious texts to studying formal languages and this has brought us many more benefits, accelerating both our technological/economical/industrial development and our quantitative epistemological knowledge .