LESSWRONG
is fundraising!
LW

[Incomplete] What is Computation Anyway? — LessWrong

Disclaimer

This post was written three months ago, and I haven't really touched it since then. I never fully completed it^[1], and I'm not satisfied with the current draft. But as is, I'm not on track to complete it anytime in the foreseeable future.

At the time I started it, I believed it to be pretty important as far as my LW essays go. Given that, it seems marginally more dignified to publish the draft as is^[2].

As at the time of publishing, I don't necessarily still endorse it.

Epistemic Status

Unconfident, exploratory. This is a dialogue and an exercise in deconfusion, not necessarily a novel contribution.

Preamble

I've been quite confused about computation, and I recently read a paper that made me feel significantly less confused. This is a review/exploration of the paper, and a documentation of my deconfusion. Because it's an exercise in my deconfusion, I will explore some of my own inferences and thinking. It's not intended to be a faithful representation of the paper^[3].

I'll leave any lingering confusion in my understanding of computation to Cunningham's Law.

Acknowledgements

Many thanks to @davidad, for discussing this topic with me and answering some of my questions. He also recommended the insightful paper.

Introduction

Automata theory answers how we compute. Models of computation give us idealised abstractions of methods to compute. But the question of what a "computation" actually is, remains unresolved. What does it mean to "compute". When we "compute" something, what are we actually doing? Why do we engage in that activity?

The paper poses two questions:

Why do we compute?
What do we compute?

I found the treatment of the first question very enlightening, and think a satisfactory account of it was given. The treatment of the second question, however, felt very unsatisfactory and left me confused. But, it was a productive confusion and led me to some interesting thinking.

Because I don't really understand/appreciate the paper's treatment of the second question, I'm going to set it aside, and instead explore the avenues of thinking it led me down in my confusion.

Why Do We Compute?

Naively, we compute to acquire/obtain information.

There are two objections to the naive view.

One problem is that information cannot be created de novo. Computation cannot actually create information out of nothing (information is conserved in a total system)^[4].

A second problem is that even if we could actually create information de novo, computation wouldn't create any information, as the output of a (deterministic)^[5] computation is a logical consequence of its inputs. The computation doesn't generate any new information. It doesn't give you any knowledge that wasn't already "knowable" in some sense^[6].

Responding to these two objections helps shed some light on the nature of computation.

Information Conservation

To resolve the first problem with the naive view, we can note that while information in a total system is conserved, information can flow between and to subsystems (in the same way that an object can gain/lose heat from/to its environment).

Thus, when we talk about computing, we are specifically talking about open systems that are situated within an environment. This may lead naturally to a workable definition of "computer":

Computer

A computer is an open information system situated in an environment.

("Open" in the sense that information can flow from it to its containing environment and vice versa.)

Agent

Subsystems that receive information from their environment (observations/percepts) and send outgoing information to said environment (actions) are agents^[7].

Agent interactions are intrinsic to information flow during computation.

Non-Generation of Information

We can resolve the second problem with the naive view by noting that we are not logically omniscient. I do not explicitly know whether is a prime number, even though that knowledge is implicit in my knowledge of the natural numbers and the definition of prime numbers.

In general, we don't explicitly know all the implications of our current knowledge.

Interlude: Rescuing the Naive View

Having resolved the above objections to the naive view of the purpose of computation as "information generation", I think it's possible to describe (in a loose sense), what computation actually is.

Interlude: Some Helpful Concepts

I'd like to cover a few concepts that are helpful to explain what computation actually is.

Intensions and Extensions

Wikipedia page

Sequences Post

Consider the set of numbers $X = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}$ . I'll use this set and its subsets to explain intensions and extensions.

Intension

An intensional definition of a concept is a description of the necessary and sufficient conditions^[8] for matching the concept (for a predicate it's a description of the necessary and sufficient conditions for satisfying the predicate).

Suppose we wish to refer to the even natural numbers less than $10$ .

Some intensions of the above are:

$S = {x \in X : 2 | x}$
$S = {x \in X : x mod 2 = 0}$
$S = {x \in N : 2 | x \land x < 10}$
$S = {x \in N : x mod 2 = 0 \land x < 10}$
The members of $X$ that are divisible by $2$
The natural numbers less than $10$ that are divisible by $2$
The even natural numbers less than $10$

Indeed, the very description I gave of the set desired was an intensional definition of it. In general, a concise definition of a concept in terms of other concepts is an intensional definition of that concept.

Extension

An extensional definition of a concept is its extension: the collection of all objects that satisfy/match that concept.

An extension of the even natural numbers less than $10$ . is: $S = {0, 2, 4, 6, 8}$ .

(The definition of $X$ I gave earlier was an extensional definition.)

Implicit and Explicit Knowledge

(I'd offer an explanation of these concepts in the context of computation.)

Implicit Knowledge

Knowledge of an intensional description.

Explicit Knowledge

Explicit knowledge is possessing an intensional description that is "essentially isomorphic" to an extension.

E.g. explicit knowledge of a number is possessing a numeral that corresponds to said number in a numbering system^[9].

I.e. you in some sense trivially have the entire extension (a cost may be required to apply the isomorphism, but it's a trivial cost).

A Less Naive Explanation of Computation

Computation is "work" to extract information latent in our extant knowledge. Just as you must do mechanical work to create changes in states of uniform motion, you must also do informational work to create changes in states of belief/knowledge. That informational "work" is computation.

If I were to give some semi-naive definitions of computation:

Informational work an agent performs to change its epistemic state (state of knowledge/belief).

The production of explicit knowledge of an output from implicit knowledge of an input - output mapping and a given input.

A transformation of intensional descriptions into extensional descriptions

Information Reduction

Another aspect of computation is data reduction: getting rid of information in the input.

Often, the information content of the input is quite voluminous, and computation reduces it considerable to extract more succinct information in the more useful extensional form.

(E.g. consider the information content of the natural numbers. All true and false propositions about the natural numbers is knowledge implicit in their definition. When you compute some property of a number, you're reducing that vast body of knowledge to get the particular knowledge you desire.)

This suggests a valuable inequality:

The information content of the outputs of a computation is $\leq$ the information content of its inputs.

This inequality is perhaps a first step towards a formulation of computation purely in terms of information dynamics^[10].

Taxonomy of Computation

I'll briefly explore a couple classes of computation and then expand on them in a bit more detail over the remainder of this post^[11].

Brief Description

The two classes are:

Closed computations
Open computations

They are classified according to their permeability to information flow "during" the computation.

Closed Computation

A closed computation is one that does not admit any information flow between the system and its environment while the computation is "ongoing".

It receives an input from its environment and later returns an output to it. Other than this input and output, there is not other information flow between the system and its environment.

Open Computation

An open computation is one that admits information flow between the system and its environments while the computation is ongoing.

It may receive many inputs from its environment, and return many outputs to its environment while the computation is ongoing.

One may think of an open computation as receiving a stream of inputs from its environment and returning a stream of outputs to its environment.

Or perhaps more poignantly:

An open computation may be considered a sequence of closed computations^[12].

A more detailed sketch of the above statement will be given in a later section.

A More Detailed Explanation

Let a program be properly understood as the implementation of a computation.

Closed Computation

An intension of a closed computation is an algorithm.

An extension of a closed computation is a function $f$ of the form $f : I \to O$

Where:

$I$ : the set of inputs

$O$ : the set of outputs

( $I$ and $O$ may also be considered the types of the inputs and outputs respectively).

It should be noted that not all functions admit intensions that are computations. The halting predicate does not admit a computable intension.

Computability theory discriminates for us, those functions that admit intensions that are computations.

Open Computations

There may be many ways to formulate an open computation.

Recall that an open computation admits a stream of inputs from its environment and returns a stream of outputs to its environment.

And that it can be considered a sequence of closed computations.

Extension

We could consider how to formulate this extensionally.

The way apparent to me is as a function $f^{'}$ of the form $f^{'} : T \times I \to O$ $($ or $f^{'} : T \to (I \to O)$ $)$ ^[13].

Where:

$T$ : is a set^[14] of time points (to admit both discrete and continuous computation)

$I$ : is a set of inputs

$O$ : is a set of outputs

Intension

The intension of an open computation is a process.

Given the above extension of an open computation we could try to reverse engineer its intension.

For discrete computations^[15], we can imagine an algorithm that:

At each time step:

Receives an input from the environment
Executes another algorithm
1. This algorithm may be a single operation or multiple operations, but regardless, it should complete within the "time step")
  1. In practice, this may manifest as different time steps admitting different lengths of time (as a result of different resources being required to perform the computation associated with a given time step)
2. The input to this algorithm is the history of inputs the process has received from initiation until the current time step
  1. Let $t_{0}$ represent the start of program execution/the first time step
  2. Let $I (t_{j})$ represent the input at time step $t_{j}$
  3. At time step $t_{k}$ , the input to the program is the set ${I (t) : t \in T (t \leq t_{k})}$
    1. The ordering of this set forms a "sequence" of inputs to the algorithm
    2. This input sequence, provides a notion of "state" for the process.
3. This is the intension of the " $(I \to O)$ " function produced by the process at each time step
Returns an output to its environment
1. Said output being the output of the algorithm executed in step 2

The above algorithm would seem to match the desired signature of $f^{'} : T \times I \to O | | f^{'} : T \to (I \to O)$ .

And also matches our intensional definitions of an open computation as:

Admitting a stream of inputs and returning a stream of outputs
A sequence of closed computations

The above sketch is not necessarily the only way to formulate a process, but it seems like a coherent/sensible intension of a discrete process^[16].

A formulation of a continuous process would be quite interesting^[17]. Sadly, it's beyond my current mathematics level.

Summary

To summarise:

We compute in order to obtain information
A computer is an open information system
- A system that admits information flow to and from its environment
Computation is informational work to:
- Update an agent's epistemic state
- Produce explicit knowledge of outputs from the inputs and implicit knowledge
- Transform intensional descriptions into extensional descriptions
Computation reduces the information in its inputs to produce its outputs
- This inequality is a first step towards a formulation of computation in terms of information dynamics
Closed computations do not permit information flow to/from the environment while the computation is ongoing.
- An intension of a closed computation is an algorithm
- An extension of a closed computation is a function from inputs to outputs
  - $f$ of the form $f : I \to O$
Open computations permit information flow to/from the environment while the computation is ongoing
- An open computation is a sequence of closed computations
- An extension of an open computation is a process
- An intension of an open computation is a function from the product of time points and inputs to outputs
  - $f^{'}$ of the form $f^{'} : T \times I \to O$ $($ or $f^{'} : T \to (I \to O)$ $)$

Conclusions

This post is intended to start a conversation. Do please:

Address the known inaccuracies/misunderstandings/confusions/gaps in this post
Point out unknown inaccuracies/misunderstandings/confusions/gaps in this post
Suggest improvements to this post
Extend this post
Discuss the questions raised in the paper
- Including providing your own answers to said questions
Provide your own account of "what computation actually is"
Suggest further reading to understand "what computation actually is"
- I would really appreciate this

Appendices

Appendix A: Logical Omniscience and Epistemic Logic

[To Do]

[Explore the principle of logical omniscience and its shortcomings in detail. Give an account of how computation bridges that gap.]

[Address the implication closure of epistemic logic.]

Appendix B: Subjectivity

Consider the equation $3 \times 5 = 15$ . That calculation may be represent different computations:

Integer multiplication: $3 * 5 \mapsto 15$
Prime factorisation: $15 \mapsto 3 * 5$

Just from the performance of that calculation, we can't tell which computation is actually being performed.

In order to discriminate the two computations, we'd need to know the starting knowledge of the agent performing that computation (the inputs to the computation).

Perhaps, this suggests that computation is inherently subjective and dependent on the agent performing the computation.

Are the aforementioned two computations identical from an information dynamics perspective?

I don't know. I don't know enough — or any — information theory? Does the integer tuple $(3, 5)$ have the same information content as the integer $15$ ?

If it does, then the information flow in the computations would be identical. But even then, the flow of information during a computation is not all there is to describing a computation. A proper account of computation would need to "dereference" the information flowing to and from a computation: its inputs and outputs.

The two closed computations aren't identical from a functional perspective. If the computation is properly considered extensionally as a function $f : I \to O$ , then we need to note that for arbitrary sets $X$ and $Y$ , then for a given function $g : X \to Y,$

$\forall h : Y \to X, h \neq g$

The above inequality is true, even if the internals (the "steps" taken by an algorithm evaluating those functions^[18]) of $h$ and $g$ are identical.

This is because for two functions to be equal, they must have the same domain and codomain (and be extensionally equivalent considered as sets of ordered pairs).

Integer multiplication and prime factorisation are thus two different computations.

$Multiply : N^{n} \to N$
$Factorise : N \to N^{n}$

Given that the signatures are different, the functions are different, and thus the computations they correspond to are different.

Whether it is computed as " $3 * 5 \mapsto 15$ " (integer multiplication) or " $15 \mapsto 3 * 5$ " (prime factorisation) depends on the observer/user/consumer of the computation.

Perhaps computation is in some sense inherently subjective. That is, to answer what computation is being performed, it may be necessary to refer to the agent that is consuming said computation.

Is computation subjective?
- Consider the equation: $3 \times 5 = 15$
  - That calculation may be represent different computations:
    - Integer multiplication: $3 * 5 \mapsto 15$
    - Prime factorisation: $15 \mapsto 3 * 5$
  - Are the different computations the calculation represents the same for an info theoretic perspective?
    - They have different inputs and outputs
      - Integer multiplication
        Input: $(3, 5)$
        Output: $15$
      - Prime factorisation
        Input: $15$
        Output: $(3, 5)$

Appendix C: Duality

[To Do]

[A formulation of duality for computations via invertible functions.]

Appendix C: Other Classes of Computation

Additional classes can be formulated which restrict information flow from the environment with respect to a computation. That said, these classes — if they indeed exist — can be considered as special cases of the aforementioned classes.

The additional classes are:

Open or closed computations that:

Receive no information from the environment at the start of the computation, and returns information to the environment at the end of the computation
- Perhaps programs that don't accept any input are an example of this
- Such programs must not consume^[19] any other programs
Receive information from the environment at the start of the computation, and do not return any information to the environment at the end of the computation
- Perhaps programs whose only consequences are side effects are an example of this
- Such programs must not be consumed by^[19] any other programs
Receive no information from the environment and return no information to the environment
- Perhaps standalone programs that take no input and return no output whose only consequences are side effects are an example of this
- Such programs must not interface with^[20] any other programs

However, it may be possible in each of the above cases to insist that the program does receive information from the environment:

E.g. the invocation or termination of the program
Changes in the states of the program invoked by any larger systems that contain it

And that the program does return information to the environment:

Via changes in the states of larger systems it is embedded in
- That is, side effects may themselves represent an information transfer
Via its continued operation
- I.e. the very fact that a program is still running provides a steady stream of information: namely, that the program has not yet halted or crashed.

And even if there was truly no information transfer, you could formulate these cases as computations where the information transfer in (a) particular direction(s) is zero bits.

And it is not clear to me what purpose computations that strictly met the criteria of those classes, that received no information from their environments or transferred no information to them, would actually have.

Transferring information to the environment without receiving any information from it sounds like creating information from nothing
A computation that received information from the environment but didn't transfer any information may as well be a black hole
A computation that neither received nor transferred information from its environment sounds like something that is not causally entangled with its environment

Because of the above objections, I am not convinced these "extra classes" merit special distinction — they haven't yet earned a name — hell, I'm not even confident that these extra classes merit being called "computations" at all.

They don't fulfill the purpose of a computation (obtaining information), and do not meet the definitions of computation presented earlier.

Appendix D: Known Outstanding Confusions, Misunderstandings and Gaps in Knowledge/Understanding

The things I know that I do not know about computation:

Information dynamics
- A comprehensive formulation of computation purely in terms of information dynamics
  - That is an adequate reduction of computation to the flow of information between a system and its environment
  - (A computer as an open information system and computation as data reduction of the inputs to the outputs seem to be first steps in that direction)
- A complete reduction of computation that dissolves the consumer?
  - Is such an account possible?
Thermodynamics
- How do the informational dynamics of computation cash out into thermodynamics
  - Data erasure in irreversible computation is physical work and requires energy
  - Do irreversible computations also relate to thermodynamics in the same way?
Is computation subjective?
- Are objective accounts of computation without making reference to any consumer of the computation possible?
- Can computation adequately be reduced to an account solely of the dynamics of information flow between a system and its environment?
Taxonomy of computation
- Are side effects information transfers from a computation to its environment?
  - Why?
  - A similar question can be asked if the environment causes a change in the internal state of a computation
- Are program invocations/terminations information transfers from an environment to a computation?
  - Why?
- Do the other potential classes of computations exist in any useful or meaningful sense?
  - It seems to me that they may not (I suspect the answer to the above questions about information transfer is "yes")
- Is my formulation of digital processes adequate?
- What is an adequate intensional description of continuous open computations (analogue processes)?
"What do we compute"
- I do not understand why this question is meaningful, interesting or worth considering; the first section of the post felt enlightening and this section just felt deeply confusing
- I don't understand what's unsatisfactory about a naive answer of "what we compute" as:
  - A function (in the set theoretic sense)
  - An intension(al description)
  - Information (via data reduction/the data processing inequality)
- That is after answering the first question of "why do we compute?" — and defending the naive answer to that question — I do not understand why there's any question as to "what do we compute" left
- After you define computation as producing explicit knowledge of outputs from inputs and extant implicit knowledge, what question about "what do you compute?" is left?
How do mental phenomena translate to computational phenomena?
- The main reason I want to understand computation is because I want to understand how mental phenomena translate to physical phenomena
  - I suspect that computational phenomena play an intermediate role here

I will return to the question of "what is computation anyway?" as I learn more about the theory of computation, information theory, thermodynamics, and the relevant mathematics.

^{^}
Best as I can recall, the outstanding content were confined to the appendices.
^{^}
If I do complete it at some later date, I can rewrite/[expand on] it then.
^{^}
Don't let the length of this post intimidate you: the majority of the post is contained in appendices and footnotes.
^{^}
I suspect that this follows directly from the laws of thermodynamics (presumably the second law), but I am not sure about the exact details. I don't yet understand thermodynamics well.
There is an interesting post in the Sequences that touches on conservation of information.
^{^}
"Computation" in the rest of this post can be understood to refer to deterministic computations. I believe that similar statements can be made about non deterministic computations (a non deterministic Turing machine can be simulated by a deterministic Turing machine).
^{^}
Specifically, the "logical omniscience principle of epistemic logic":
$\forall ϕ, ψ [K_{a} ϕ \land (ϕ ⟹ ψ) ⟹ K_{a} ψ]$
If an agent knows a proposition, then they know all its implications. They are logically omniscient.
This principle is of course not true in practice, but it helps to bear in mind for understanding what computation offers us.
Computation is a mechanism to bridge this gap between what we know explicitly, and what we know implicitly.

^{^}
Note that this neatly matches the definition of "agent" as formulated by Russell and Norvig in "Artificial Intelligence A Modern Approach".
I find this specification of "agent" in terms of information flow, to and from particular subsystems interesting. If it holds water, it seems like a nontrivial formulation. Perhaps it will allow objective, non-arbitrary specifications of agents.
^{^}
Let $P$ and $Q$ be arbitrary propositions:
" $P$ is a necessary condition for $Q$ " $: Q ⟹ P$
" $P$ is a sufficient condition for $Q$ " $: P ⟹ Q$
" $P$ is a necessary and sufficient condition for $Q$ " $: P ⟺ Q$
^{^}
Note that every string can be encoded by some number, so all data (that can be represented as binary strings) can be associated with a number. Explicit knowledge of data is possessing the associated number (in some constant predefined numeral system).
^{^}
I have been told that this is the "data processing inequality" of information theory.
^{^}
This section is not a representation of anything covered in the paper. Rather it's the product of me investigating my own confusion about the last section of the paper.
It is somewhat original to me, but is not necessarily a novel contribution.
^{^}
Or a closed computation expressed over time.
^{^}
Note that the second is just the curried form of the former, and thus the two are properly equivalent.
^{^}
Properly speaking, $T$ is a "well-ordered" set. Intuitively, think of it as an ordered set where each subset has a minimum element.
^{^}
I know basically zero continuous mathematics, so I cannot properly reason about continuous open computations.
However, I do think that I can make a stab at discrete computations. I think it's somewhat justified as digital computation is a discrete abstraction.
Furthermore, continuous computations can be approximated by discrete computations whose discrete elements represent ever smaller intervals of the underlying continuous quantity (in this case, time steps representing smaller intervals of time).
^{^}
It seems to me that such discrete processes can be implemented via a REPL or similar structures.
^{^}
I suspect that a "stream of consciousness" is one such continuous process.
Perhaps a better understanding of continuous processes will be helpful in understanding how mental phenomena translates to physical phenomena (via computational phenomena as an intermediary).
^{^}
Recall that an algorithm is the intension of a closed computation.
^{^}
"Consume" here refers to receiving information (input) from another computation.
A program $C$ consumes another program $B$ if the output of $B$ is an input to $C$ .
Likewise, we say that $B$ is "consumed by" $C$ .
^{^}
"Interface with" here refers to information transfer with another computation.
If a program $C$ consumes (or is consumed by) another program $B$ , we say that $B$ and $C$ "interface with" each other.
^{^}
"Process" here being a special kind of program that represents an open computation.