Sometimes in mathematics, you can write 20 slightly different definitions and find you have defined 20 slightly different things. Other times you can write many different formalizations and find they are all equivalent. Turing completeness is the latter. It turns up in Turing machines, register machines, tilings of the plane, Conway's Game of Life, and many other places. There are weaker and stronger possibilities, like finite state machines, stack machines and oracle machines. (I.e., a Turing machine with a magic black box that solves the halting problem is stronger than a normal Turing machine.)
Human brains are finite state machines. A Turing machine has unlimited memory and time.
Physical laws are generally continuous, but there exists a Turing machine that takes in a number N and computes the laws to accuracy 1/N. This isn't philosophically forced, but it seems to be the way things are. All serious physical theories proposed so far are computable in this sense.
We could conceivably be in a universe that wasn't simulatable by a Turing machine. Assuming our brains are simulatable, we could never know this absolutely, as simulators with a huge but finite amount of compute trying to trick us could never be ruled out. 0 and 1 aren't probabilities and you are never certain. Still, we could conceivably be in a situation where an uncomputable explanation is far simpler than any computable theory.
Thanks for the answer!
Human brains are finite state machines. A Turing machine has unlimited memory and time.
Oops! You're right, and it's something that I used to know. So IIRC, as long as your tape (and your time) is not infinite, you still have a finite state machine, so Turing machines are kind of finite state machines taken to the limit (of unbounded tape and time) – is that right?
I'll focus on the gears-level elaboration of why all those computational mechanisms are equivalent. In short: If you want to actually get anything done, Turing machines suck! They're painful to use, and that makes it hard to get insight by experimenting with them. Lambda calculus / combinator calculi are way better for actually seeing why this general equivalence is probably true, and then afterwards you can link that back to TMs. (But, if at all possible, don't start with them!)
Sure, in some sense Turing machines are "natural" or "obvious": If you start from finite state machines and add a stack, you get a push-down automaton. Do it again, and you essentially have a Turing machine. (Each stack is one end of the infinite tape.) There's more bookkeeping involved to properly handle "underflows" (when you reach the bottom of a stack and need to extend it) yadda yadda, so every "logical state" would require a whole group of states to actually implement, but that's a boring mechanical transformation. So just slap another layer of abstraction on it and don't actually work with 2-stack PDAs, and you're golden, right? Right…?
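To make that "each stack is one end of the infinite tape" picture concrete, here's a minimal sketch in Haskell (my own names, purely illustrative):

-- A bidirectional tape as two stacks: the cells left of the head
-- (nearest first), the head symbol, and the cells to its right.
data Tape a = Tape [a] a [a]

-- Moving the head pops from one stack and pushes onto the other.
-- The "underflow" bookkeeping from above: when a stack runs out,
-- extend the tape with a blank symbol on demand.
moveLeft, moveRight :: a -> Tape a -> Tape a
moveLeft  blank (Tape []     x rs) = Tape [] blank (x:rs)
moveLeft  _     (Tape (l:ls) x rs) = Tape ls l     (x:rs)
moveRight blank (Tape ls x []    ) = Tape (x:ls) blank []
moveRight _     (Tape ls x (r:rs)) = Tape (x:ls) r     rs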
Well… actually the same problem repeats again and again. TMs are just extremely low-level and not amenable to abstraction. By default, you get two tape symbols. If you need more, you can emulate those, which will require multiple "implementation states" to implement a single "logical state", so you add another layer of abstraction… Same with multiple tapes or whatever else you need. But you're still stuck in ugly implementation details, because you need to move the head left and right to skip across stuff, and you can't really abstract that away. (Likewise, chaining TMs or using one as a "subroutine" from another isn't easily possible and involves lots of manual "rewiring".) So except for trivial examples, actually working with Turing machines is extremely painful. It's not surprising that it's really hard to see why they should be able to run all effectively computable functions.
Enter lambda calculus. It's basically just functions, on a purely syntactic level. You have binders or (anonymous) function definitions – the eponymous lambdas. You have variables, and you have nested application of variables and terms (just like you're used to from normal functions.) There's exactly one rule: If a function is applied to an argument, you substitute that argument into the body wherever the placeholder variable was used. So e.g. (λx.((fx)y)(gx))z → ((fz)y)(gz) by substituting z for x. (Actual use of lambda calculus needs a few more rules: You have to make sure that you're not aliasing variables and changing the meaning, but that's a relatively minor (if tricky) bookkeeping problem that does abstract away fairly cleanly.)
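For a gears-level check of that one rule, here's a naive sketch in Haskell (my own minimal datatype; it dodges the aliasing problem by assuming all bound variable names are distinct):

-- Untyped lambda terms.
data Term = Var String | Lam String Term | App Term Term
  deriving Show

-- Substitute s for x. Naive: assumes no variable capture can occur.
subst :: String -> Term -> Term -> Term
subst x s (Var y)   | x == y    = s
                    | otherwise = Var y
subst x s (Lam y b) | x == y    = Lam y b              -- x is shadowed here
                    | otherwise = Lam y (subst x s b)
subst x s (App f a) = App (subst x s f) (subst x s a)

-- The one rule, applied at the root: (λx.b) a → b[x := a]
beta :: Term -> Maybe Term
beta (App (Lam x b) a) = Just (subst x a b)
beta _                 = Nothing

Running beta on the example term (λx.((fx)y)(gx))z indeed yields ((fz)y)(gz).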
Now, on top of that, you can quickly build a comfortable working environment. Want to actually name things? let x := y in t is just a ((λx.t)y). (The assignment being split around the whole code is a bit ugly, but you don't even need "actual" extensions, it's just syntax sugar.) Data can also be emulated quite comfortably, if you understand one key idea: Data is only useful because you eventually do something with it. If all you have is abstraction, just abstract out the "actually doing" part with a placeholder in the most general scheme that you could possibly need. (Generally an induction scheme or "fold".) [Note: I'll mostly use actual names instead of one-letter variables going forward, and may drop extra parentheses: f x y z = (((f x) y) z)]

true := λtt.λff.tt, false := λtt.λff.ff (Now your if c then x else y just becomes a c x y… or you add some syntax sugar.)

zero := λze.λsu.ze, succ := λn.λze.λsu.su(n ze su). (If you look closely, that's just the induction principle. And all that the number "does" is: (succ (succ (succ zero)) x f) = f (f (f x)))

nil := λni.λco.ni, cons := λx.λxs.λni.λco.co x (xs ni co). The usual list functions are easy too: map := λf.λxs.(xs nil (λx.λxs.cons (f x) xs)), fold := λxs.xs (yep, the list – as represented – is the fold)

And so on… so I think you'll believe me that you basically have a full functional programming language, and that lambda calculus is equivalent to all those more usual computational substrates. (…and maybe Turing machines too?)
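As a quick sanity check, those encodings also run as ordinary Haskell functions (the primed names are mine, to avoid clashing with the Prelude; Haskell pins down the type parameters, but the evaluation behaviour is exactly the encoding above):

-- Church booleans: pick one of two alternatives.
true', false' :: a -> a -> a
true'  t _ = t
false' _ f = f

-- Church numerals: a number n "does" n-fold iteration.
zero' :: a -> (a -> a) -> a
zero' z _ = z

succ' :: (a -> (a -> a) -> a) -> a -> (a -> a) -> a
succ' n z s = s (n z s)

-- Church lists: a list is its own fold.
nil' :: b -> (a -> b -> b) -> b
nil' n _ = n

cons' :: a -> (b -> (a -> b -> b) -> b) -> b -> (a -> b -> b) -> b
cons' x xs n c = c x (xs n c)

-- e.g. succ' (succ' (succ' zero')) 0 (+1)  ==  3
--      cons' 1 (cons' 2 nil') 0 (+)        ==  3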
Still… even if lambda calculus is extremely simple in some ways, it's also disturbingly complex in others. Substitution everywhere inside a term is "spooky action at a distance", a priori it's quite plausible that this does a lot of "heavy lifting". So: Let's turn to combinator calculi. Let's keep the term structure and application of things, but get rid of all (object-level) variables and binders. Instead, we define three substitution rules: (using variables only at the meta-level)
I x = x

K v x = v (or, fully parenthesized: (K v) x = v)

S a b x = (a x) (b x) (or, again more precisely: ((S a) b) x = (a x) (b x))

(Strictly speaking, we don't even need I, as SKKx = (Kx)(Kx) = x, so if we say I := SKK that's equivalent.) The interesting thing about these rules is that all lambda-terms can be translated into SKI-terms. (To get rid of a binder / λ, you recursively transform the body: Pretend the intended argument x is already applied to the body – how do you get it where it needs to go? If you have an application, you prefix it with S so that the argument is passed down into both branches. If a branch doesn't use the argument (that includes other variables), prefix it with K so it remains unchanged. Otherwise, it is the bound variable itself that should be substituted, so put an I and you get the value. Done.) Worst case, this will cause an exponential(!) increase of the term size, but it'll get the job done. After that, there's no more "looking inside terms" or dealing with variables, just extremely simple rewrite rules. To revisit the examples above:

true := λtt.λff.tt → λtt.(K tt) → S(KK)I, false := λtt.λff.ff → λtt.(I) → KI

zero := λze.λsu.ze → S(KK)I, succ := λn.λze.λsu.su(n ze su) → λn.λze.SI(S(K(n ze))I) → λn.S(K(SI))(S(S(KS)(S(KK)(S(KK)(S(K n)I)))(KI)) → (S(K(S(K(SI))))(S(S(KS)(S(K(S(KS)))(S(K(S(KK)))(S(S(KS)(S(KK)I))(KI)))))(K(KI))))

[…except I think I messed up somewhere.]

So actually you don't need lambda calculus (or variables, for that matter), just these two rules. You can compile your Haskell to this SKI-stuff.
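If you want to play with this yourself, here's the naive translation and the rewrite rules as a small Haskell sketch (my own datatypes; this is the exponential scheme exactly as described, not an optimized bracket abstraction):

infixl 9 :@

-- SKI terms, plus free variables so we can translate open terms.
data SKI = S | K | I | V String | SKI :@ SKI
  deriving Show

-- Lambda terms as input.
data Lam = Var String | Abs String Lam | App Lam Lam

-- "Pretend the intended argument x is already applied to the body":
abstract :: String -> SKI -> SKI
abstract x (a :@ b)       = S :@ abstract x a :@ abstract x b  -- pass x into both branches
abstract x (V y) | x == y = I                                  -- the bound variable itself
abstract _ t              = K :@ t                             -- doesn't use x: unchanged

compile :: Lam -> SKI
compile (Var x)   = V x
compile (App f a) = compile f :@ compile a
compile (Abs x b) = abstract x (compile b)

-- The rewrite rules, applied at the root.
step :: SKI -> Maybe SKI
step (I :@ x)           = Just x
step (K :@ v :@ _)      = Just v
step (S :@ a :@ b :@ x) = Just (a :@ x :@ (b :@ x))
step _                  = Nothing

As a check, compile (Abs "tt" (Abs "ff" (Var "tt"))) gives S :@ (K :@ K) :@ I, i.e. the S(KK)I from above, and compile (Abs "tt" (Abs "ff" (Var "ff"))) gives K :@ I.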
Take a moment to process that.
Now I hope it's also pretty obvious that you can implement a Turing machine in Haskell? Same for imperative programming languages – look at e.g. the Whitespace interpreter (Haskell version in the archive, Idris version on Github). So at this point I think you have a pretty good intuition as for why all existing computational mechanisms can be reduced down to SKI-calculus.
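For completeness, here's what such a TM-in-Haskell can look like (a minimal sketch with made-up types, reusing the two-stack tape from before – not the actual Whitespace interpreter):

import qualified Data.Map as Map

type State  = Int
type Symbol = Char
data Move   = L | R

-- Transition table: (state, read symbol) ↦ (next state, write, move).
type Rules = Map.Map (State, Symbol) (State, Symbol, Move)

-- The tape as two stacks around the head, with blanks on demand.
data Tape = Tape [Symbol] Symbol [Symbol]

blank :: Symbol
blank = '_'

shift :: Move -> Tape -> Tape
shift L (Tape []     x rs) = Tape [] blank (x:rs)
shift L (Tape (l:ls) x rs) = Tape ls l     (x:rs)
shift R (Tape ls x []    ) = Tape (x:ls) blank []
shift R (Tape ls x (r:rs)) = Tape (x:ls) r     rs

-- Step until no rule applies, i.e. the machine halts.
run :: Rules -> State -> Tape -> (State, Tape)
run rules q (Tape ls x rs) =
  case Map.lookup (q, x) rules of
    Nothing          -> (q, Tape ls x rs)
    Just (q', y, mv) -> run rules q' (shift mv (Tape ls y rs))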
As for the other direction… with a basic TM model, dealing with all those dynamically resizing parts would be extremely painful, but if you use an extended Turing machine that has lots of different symbols and also more than one tape, it's actually not that hard. Detecting the locations where a rule applies isn't that hard if you use one tape as scratch space for counting nested parentheses. Doing the "reductions" (which can actually greatly expand the space needed, because S duplicates the argument) can also happen on a separate tape, and then afterwards you copy the result back onto the "main tape", after adjusting the placement of the unchanged stuff on the sides. (I hope that's sufficiently clear? This is basically using multiple tapes as a fixed number of temporary helper variables.)
And that's it. Now you can run your Turing machine in your Haskell on your Turing machine, or whatever… Hope you're having fun!
Thanks for your reply! I had never heard of SKI-Calculus!
In short: If you want to actually get anything done, Turing machines suck! They're painful to use, and that makes it hard to get insight by experimenting with them.
I feel like one advantage Turing machines have is that they are in some sense low-level (actually quite close to how we build our computers)? They do seem to have a nice abstraction for memory. I would have no idea how to characterize NSPACE with lambda calculus, for example (though of course googling reveals someone seems to have figured it out).
I think it is just that once you hit a certain level of "systematicity" you are not missing out on anything. It is not that the entities referenced when defining a Turing machine are especially powerful. If there were "magic symbols", one could imagine that a mathematician who just scribbled the right sigils could get a leaps-and-bounds better outcome. If you disregard what the mathematics is about, it is black marks on a white background. What makes it "systematic" or "mathematical" is that the black marks give sufficient guidance on how to make more marks. One could try to relax this a bit. It is physically possible to write "2+2=5" (and it is possible to make systems where "2+2=5" is a valid move, but in distinguishing valid from invalid moves we are already assuming things away). But if what you write next doesn't have to do with what you have written before, it becomes doubtful to call that a "result".
One important idea is mutual emulation. If the mathematician can write a story about how a Turing machine would throw around symbols, and a Turing machine can throw around symbols describing how a mathematician might move a pen, you don't need to ask which capability is more "fundamental". If you have a result in one language, you can get it either by native operation or by one system emulating the other.
There are shadows of a possibility that interesting things go a little beyond computation or efficient operation. To put it provocatively: if you have advice that has an "asspull" in it, then that is not a valid algorithm. One example could be "1. Try a thing. 2. If it fails, try another thing". One can turn this into a good algorithm with the flavor of "1. Enumerate all the possible answers. 2. Check each". For some mathematical tasks it might be that you just do something and something ends up working; there might not be a method for coming up with mathematical discoveries. But everything that has a method is a technology and can be followed by a "dumb" instruction follower, i.e. the homework is "fair". But some intelligent system can benefit from advice that is not a (complete) method, i.e. when following the instructions they need to make genuine creative/executive decisions such as how to "try a bunch of different things".
One might be tempted to think that a good science result is a human encountering the world's sense data and executing correct inductive inference on it. But then you have stuff like Schrödinger's equation being just guessed (rather than being constructed from hints via an understandable construction). Another story would be that people hallucinate all kinds of hypotheses and natural selection just leaves alive those that lucked out to be in correspondence with reality. But if there is a method to the madness, the story is trackable and we can say something about its properties. So any kind of system that can deal with system descriptions will be able to do all describable work (or, rephrasing: emulate/employ any kind of madness that would be necessary for particular outcomes).
I agree with most of this except for:
...There are shadows of a possibility that interesting things go a little beyond computation or efficient operation. To put it provocatively: if you have advice that has an "asspull" in it, then that is not a valid algorithm. One example could be "1. Try a thing. 2. If it fails, try another thing". One can turn this into a good algorithm with the flavor of "1. Enumerate all the possible answers. 2. Check each". For some mathematical tasks it might be that you just do something and something ends up working; there might...
I don't think the Church-Turing thesis is true; it is not true if general relativity is correct. See this paper for details: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=a2ae95b5f4eb0c61c45522b2498004b16c8a19f8
"What if our brains just happen to be Turing machines?" Another way to ask this might be, "Is consciousness computable?" Might I suggest another question? "Does it matter that Turing machines are self-contained; in other words, that they do not sense or interact with their environment?" Our brains do not only sense and interact with our environment, they also sense and control our own bodies. And our bodies, at numerous levels, down to our individual cells, sense and control their own status. Does it matter that the canonical Turing machine has no sensory apparatus, other than that which detects ones and zeroes on the program tape? What if it instead also sensed the texture of the tape, the current through the motors and relays, the battery or mains voltage powering it, the feel of the tape advancing through the reader, whether or not it was in good repair, clean or dusty, oiled or unoiled. What if it detected tape jams and other malfunctions, and could take measures to correct the problem? What if it could sense the temperature of the room it was in, and control it, to the degree that temperature affected its function? What if it developed a taste for tapes of a particular brand? What if it found some programs more interesting than others? What if it detected the presence of other Turing machines? What if it had to compete with other Turing machines for access to electrical current, or to tapes of better quality? What if it found another Turing machine that shared some of its tastes, in tapes, programs, and electrical sources, but differed in others, and they discussed their preferences, each expanding the other's appreciation for the resources they needed? Or found that they disagreed so profoundly, that they had to destroy the other before still other machines became infected with such bizarre programs?
Could all this sensing, all these preferences, and all the control mechanisms, operate off that one tape, threading back and forth through the reader?
Just phrasing these questions convinces me that the Turing machine model of consciousness fails, that consciousness is not an algorithm, and is not remotely computable. It also convinces me that consciousness is not programmable. It must always self-develop in, not just a brain, but a body that it can control, and use to affect the world it lives in.
Our brains do not only sense and interact with our environment, they also sense and control our own bodies. And our bodies, at numerous levels, down to our individual cells, sense and control their own status.
I find this a bit confusing: I don't consciously control most of what's going on in my body. I don't have a sense of the status of my mitochondria, or of any individual cell in my body, beyond the cells specifically developed for sensing. So how is this related to consciousness?
...It also convinces me that consciousness is not programmable. It must always self-develop...
Why do you believe that everything "effectively computable" can be computed by a Turing machine? In the same vein: How would you recognize this to be false?
I don't particularly doubt that this is true. But I want to get a better gears-level understanding of why it might be true. When this came up during my "fundamentals of computer science" class, my professor communicated high confidence that it was essentially correct (joking that he could speed up the graduation of any student who showed it was false), but gave very little info on why he believed it in the first place.
I got a bit more out of the respective Wikipedia article. I get that people have come up with different formalisms for "effectively computable" and they all seem to be weaker than or equivalent to Turing machines, and that does seem like compelling evidence. But I want to understand whether there is more to it. There seem to be points in favor, like the quote from Gödel that I pulled from the Wikipedia article – his remark that a function computable in one of the systems Si, or even in a system of transfinite type, is already computable in S1, so that the notion "computable" is in a certain definite sense "absolute".
Does this mean: the act of showing that something would be computable in a system Si would effectively show how to do it with a system S1 (e.g. a Turing machine)? To me this seems kind of circular, and to rely on the fact that there is no "stronger" formal system, itself stronger than a Turing machine, that we would use to show the existence of such a system Si (this is probably a confused thought, so help in understanding this would be welcome)?
Another thought: What if our brains just happen to be Turing machines (but other things in the world are "stronger") and there is just no way to formally describe these higher level Algorithms with our limited brains (analogy: Could our brains come up with Turing machines if they were finite automatons?). Could there be ways for us to notice if this was the case?
So to sum up, I still find this quite confusing and thought this might be the right place to get good answers or to clarify these points.