Hey, G Gordon Worley III!
I just finished reading this post because Steve2152 was one of the two people (you being the other) to comment on my (accidentally published) post on formalizing and justifying the concept of emotions.
It's interesting to hear that you're looking for a foundational grounding of human values because I'm planning a post on that subject as well. I think you're close with the concept of error minimization. My theory reaches back to the origins of life and what sets living systems apart from non-living systems. Living systems are locally anti-entropic which means: 1) According to the second law of thermodynamics, a living system can never be a truly closed system. 2) Life is characterized by a medium that can gather information such as genetic material.
The second law of thermodynamics means that all things decay, so it's not enough to simply gather information; the system must also preserve the information it gathers. This creates an interesting dynamic because gathering information inherently means encountering entropy (the unknown), which is inherently dangerous (what does this red button do?). It's somewhat at odds with the goal of preserving information. You can even see this fundamental dichotomy manifest in the collective intelligence of the human race playing tug-of-war between conservatism (which is fundamentally about stability and preservation of norms) and liberalism (which is fundamentally about seeking progress or new ways to better society).
Another interesting consequence of the 'telos' of life being to gather and preserve information is that it inherently provides a means of assigning value to information. That is: information is more valuable the more it pertains to the goal of gathering and preserving information. If an asteroid were about to hit Earth and you were chosen to live on a space colony until Earth's atmosphere allowed humans to return and start society anew, you would probably favor taking a 16 GB thumb drive with the entire English Wikipedia article text over a server rack full of several petabytes of high-definition recordings of all the reality television ever filmed, because the latter won't be super helpful toward the goal of preserving knowledge *relevant* to mankind's survival.
The theory also opens interesting discussions like: if all living things have a common goal, why do things like parasites, conflict, and war exist? Also, how has evolution led to a set of instincts that imperfectly approximate this goal? How do we implement this goal in an intelligent system? How do we guarantee such an implementation will not result in conflict? Etc.
Anyway, I hope you'll read it when I publish it and let me know what you think!
Thanks for the insight!
This is actually an incomplete draft that I didn't mean to publish, so I do intend to cover some of your points. It's probably not going to go into the depth you're hoping for since it's pretty much just a synthesis of the bit of information from a segment from a Radiolab episode and three theorems about neural networks.
My goal was to simply use those facts to provide an informal proof that a trade-off exists between latency and optimality* in neural networks and that said trade-off explains why some agents (including biological creatures) might use multiple models at different points in that trade-off instead of devoting all their computational resources to one very deep model or one low-latency model. I don't think it's a particularly earth-shattering revelation, but sometimes even pretty straightforward ideas can have an impact**.
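To make the trade-off concrete, here is a minimal sketch with made-up numbers. It is not derived from the theorems mentioned above; `Model`, `best_within_deadline`, and every latency and error figure are hypothetical, chosen only to show why an agent might keep several models around instead of one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Model:
    name: str
    latency_ms: float      # hypothetical time needed to produce an output
    expected_error: float  # hypothetical distance from the optimal output


def best_within_deadline(models: Sequence[Model], deadline_ms: float) -> Optional[Model]:
    """Return the lowest-error model that can respond before the deadline."""
    usable = [m for m in models if m.latency_ms <= deadline_ms]
    return min(usable, key=lambda m: m.expected_error) if usable else None


# Made-up numbers: deeper models are slower but closer to optimal.
MODELS = [
    Model("reflex", latency_ms=20, expected_error=0.40),
    Model("habit", latency_ms=150, expected_error=0.15),
    Model("deliberation", latency_ms=2000, expected_error=0.02),
]

if __name__ == "__main__":
    for deadline in (10, 50, 500, 5000):
        choice = best_within_deadline(MODELS, deadline)
        print(deadline, choice.name if choice else "no model is fast enough")
```

Under these assumed numbers, no single model is the best choice at every deadline, which is the whole argument for running several of them at different points in the trade-off.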
I also don't think that subconscious processing is exactly the same as emotions.
The position I present here is a little more subtle than that. It doesn't directly equate subconscious processing to emotions. I state that emotions are: a conscious recognition of physiological processes triggered by faster stimulus-response paths in your nervous system.
The examples given in the podcast focus mostly on fight-or-flight until they get later into the discussion about research on paraplegic subjects. I think that might hint at a hierarchy of emotional complexity. It's easy to explain the most basic 'emotion' that even the most primitive brains should express. As you point out, emotions like guilt are more difficult to explain. I don't know if I can give a satisfactory response to that point because it's beyond my lay understanding, but my best guess is: this feedback loop from stimulus to response back to stimulus and so on can be initiated from something other than direct sensory input, and the information fed back might include more than physiological state.
Each path has some input which propagates through it and results in some output. The output might include more than direct physiological control signals, such as those sent to various muscles. It might include more abstract information, such as a compact representation of the internal state of the path. The input might include more than sensory input. The feedback might be more direct.
For instance, I believe I've read that some parts of the brain receive a copy of recent motor commands which may or may not correspond to physiological change. Along with the indirect feedback from sensors that measure your sweaty palms, the output of a path may directly feed back the command signals to release hormones or to blink eyes or whatever as input to other paths. A path might output signals that don't correspond to any physiological control; they may be specifically meant to be feedback signals that communicate more abstract information.
Another example is: you don't cry at the end of Schindler's List because of any direct sensory input. The emotion arises from a more complex, higher-order cognition of the situation. Perhaps there are abstract outputs from the slower paths that feed back into the faster paths, which makes the whole feedback system more complex and allows higher-order cognition paths to indirectly result in physiological responses that they don't directly control.
Another piece of the puzzle may be that the slowest path, which I, perhaps erroneously, refer to as consciousness, is supposedly where the physiological state triggered by faster paths gets labeled. That slower path almost definitely uses other context to arrive at such a label. A physiological state can have multiple causes. If you've just run a marathon on a cold day, it's unlikely you'll feel frightened when you register an elevated heart rate, sweaty palms, goosebumps, etc.
I lump all those 'faster stimulus-response paths' including reflexes under the umbrella term 'subconscious' which might not be correct. I'm not sure if any of the related fields (neurology, psychology, etc.) have a more precise definition for subconscious. The word used in the podcast is the 'autonomic nervous system' which, according to Google means: the part of the nervous system responsible for control of the bodily functions not consciously directed, such as breathing, the heartbeat, and digestive processes.
There's a bit of a blurred line there, since reflexes are often included as part of the autonomic nervous system even though they govern responses that can also be consciously directed, such as blinking. Also, I believe the debate over what, exactly, 'consciously directed' means is still open since, AFAIK, there's no generally agreed upon formal definition of the word 'consciousness'.
In fact, the term "subconscious" lumps together "some of the things happening in the neocortex" with "everything happening elsewhere in the brain" (amygdala, tectum, etc.) which I think are profoundly different and well worth distinguishing. ... I think a neocortex by itself cannot do anything biologically useful.
I think there are a lot of words related to the phenomenon of intelligence and consciousness that have nebulous, informal meanings which vaguely reference concrete implementations (like the human mind and brain), but could and should be formalized mathematically. In that pursuit, I'd like to extract the essence of those words from the implementation details like the neocortex.
There are many other creatures, such as octopuses and crows, which are on a similar evolutionary path of increasing intelligence but have completely different anatomy to humans and each other. I agree that focusing research on the neocortex itself is a terrible way to understand intelligence. It's like trying to understand how a computer works by looking only at media files on the hard drive while ignoring the BIOS, operating system, file system, CPU, and other underlying systems that render that data useful.
I believe, for instance, that Artificial Intelligence is a misnomer. We should be studying the phenomenon of intelligence as an abstract property that a system can exhibit regardless of whether it's man-made. There is no scientific field of artificial aerodynamics or artificial chemistry. There's no fundamental difference between the way air behaves when it interacts with a wing that depends upon whether the wing is natural or man-made.
Without a formal definition of 'intelligence' we have no way of making basic claims like, "system X is more intelligent than system Y". It's similar to how fields like physics were stuck until previously vague words like force and energy were given formal mathematical definitions. The engineering of heat engines benefited greatly when thermodynamics was developed and formalized ideas like 'heat' and 'entropy'. Computer science wasn't really possible until Church and Turing formalized the vague ideas of computation and computability. Later Shannon formalized the concept of information and allowed even greater progress.
We can look to specific implementations of a phenomenon to draw inspiration and help us understand the more universal truths about the phenomenon in question (as I do in this post), but if an alien robot came from outer-space and behaved in every way like a human, I see no reason to treat its intelligence as a fundamentally distinct phenomenon. When it exhibits emotion, I see no reason to call it anything else.
Anyway, I haven't read your post yet, but I look forward to it! Thanks, again!
*here, optimality refers to producing the absolute best outputs for a given input. It's independent of the amount of resources required to arrive at those outputs.
**I mean: Special Relativity (SR) came from the fact that the velocity of light (measured in space/time) appeared constant across all reference frames according to Maxwell's equations (and backed up by observation). Einstein made the genius but obvious (in hindsight) conclusion that the only way it's possible for a value of space/time to remain constant between reference frames is if the measures of space and time themselves are variable. The Lorentz transform is the only transform consistent with such dimensional variability between reference frames. There are only three terms in c = space/time. If c is constant and different reference frames demand variability, time and space must not be constant.
Not that I think I'm presenting anything as amazing as Special Relativity or that I think I'm anywhere near Einstein. It's just a convenient example.
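For anyone who wants to check the footnote's example numerically, here is a small sketch using the standard one-dimensional relativistic velocity transformation and the Lorentz factor. The function names are my own, and it only covers motion along a single axis.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def velocity_in_moving_frame(u: float, v: float) -> float:
    """Velocity u (m/s) as measured from a frame moving at v along the same axis."""
    return (u - v) / (1.0 - u * v / C**2)


def lorentz_factor(v: float) -> float:
    """Factor by which time dilates and lengths contract at relative speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


if __name__ == "__main__":
    # Light still moves at c when measured from a frame moving at half of c.
    print(velocity_in_moving_frame(C, 0.5 * C) / C)
    # The price of that constancy: time and space measurements are rescaled.
    print(lorentz_factor(0.6 * C))
```

Feeding in u = c returns c for any frame speed below c, and the Lorentz factor is the amount by which the measures of time and space have to vary to make that work.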
In short, your second paragraph is what I'm after.
Philosophically, I don't think the distinction you make between a design choice and an evolved feature carries much relevance. It's true that some things evolve that have no purpose, and it's easy to imagine that emotions are one of those things, especially since people often conceptualize emotion as the "opposite" of rationality. However, some things evolve that clearly do serve a purpose (in other words, there is a justification for their existence), like the eye. Of course nobody sat down with the intent to design an eye. It evolved, was useful, and stuck around because of that utility. The utility of the eye (its justification for sticking around) exists independent of whether the eye exists. A designer recognizes the utility beforehand and purposefully implements it. Evolution "recognizes" the utility after stumbling into it.
How? The person I'm responding to gets the math of probability wrong and uses it to make a confusing claim that "there's nothing wrong" as though we have no more agency over the development of AI than we do over the chaotic motion of a die.
It's foolish to liken the development of AI to a roll of the dice. Given the stakes, we must try to study, prepare for, and guide the development of AI as best we can.
This isn't hypothetical. We've already built a machine that's more intelligent than any man alive and which brutally optimizes toward a goal that's incompatible with the good of mankind. We call it "Global Capitalism". There isn't a man alive who knows how to stock the shelves of stores all over the world with #2 pencils that cost only 2 cents each, yet it happens every day because *the system* knows how. The problem is: that system operates with a sociopathic disregard for life (human or otherwise) and has exceeded all limits of sustainability without so much as slowing down. It's a short-sighted, cruel leviathan and there's no human at the reins.
At this point, it's not about waiting for the dice to settle, it's about figuring out how to wrangle such a beast and prevent the creation of more.
This is a pretty lame attitude towards mathematics. If William Rowan Hamilton showed you his discovery of quaternions, you'd probably scoff and say "yeah, but what can that do for ME?".
Occam's razor has been a guiding principle for science for centuries without having any proof for why it's a good policy. Now Solomonoff comes along and provides a proof and you're unimpressed. Great.
After all, a formalization of Occam's razor is supposed to be useful in order to be considered rational.
Declaring a mathematical abstraction useless just because it is not practically applicable to whatever your purpose may be is pretty short-sighted. The concept of infinity isn't useful to engineers, but it's very useful to mathematicians. Does that make it irrational?
Thinking this through some more, I think the real problem is that S.I. is defined from the perspective of an agent modeling an environment, so the assumption that Many Worlds has to put any unobservable on the output tape is incorrect. It's like stating that Copenhagen has to output all the probability amplitudes onto the output tape and maybe whatever dice god rolled to produce the final answer as well. Neither of those is true.
That's a link to somebody complaining about how someone else presented an argument. I have no idea what point you think it makes that's relevant to this discussion.
output of a TM that just runs the SWE doesn't predict your and only your observations. You have to manually perform an extra operation to extract them, and that's extra complexity that isn't part of the "complexity of the programme".
First, can you define "SWE"? I'm not familiar with the acronym.
Second, why is that a problem? You should want a theory that requires as few assumptions as possible to explain as much as possible. The fact that it explains more than just your point of view (POV) is a good thing. It lets you make predictions. The only requirement is that it explains at least your POV.
The point is to explain the patterns you observe.
>The size of the universe is not a postulate of the QFT or General Relativity.
That's not relevant to my argument.
It most certainly is. If you try to run the Copenhagen interpretation in a Turing machine to get output that matches your POV, then it has to output the whole universe and you have to find your POV on the tape somewhere.
The problem is: That's not how theories are tested. It's not like people are looking for a theory that explains electromagnetism and why they're afraid of clowns and why their uncle "Bob" visited so much when they were a teenager and why there's a white streak in their prom photo as though a cosmic ray hit the camera when the picture was taken, etc. etc.
The observations we're talking about are experiments where a particular phenomenon is invoked with minimal disturbance from the outside world (if you're lucky enough to work in a field like Physics which permits such experiments). In a simple universe that just has an electron traveling toward a double-slit wall and a detector, what happens? We can observe that and we can run our model to see what it predicts. We don't have to run the Turing machine with input of 10^80 particles for 13.8 billion years then try to sift through the output tape to find what matches our observations.
Same thing for the Many Worlds interpretation. It explains the results of our experiments just as well as Copenhagen; it just doesn't posit any special phenomenon like observation. Observation is just what entanglement looks like from the perspective of one of the entangled particles (or system of particles if you're talking about the scientist).
Operationally, something like copenhagen, ie. neglect of unobserved predictions, and renormalisation , hasto occur, because otherwise you can't make predictions.
First of all: Of course you can use many worlds to make predictions. You do it every time you use the math of QFT. You can make predictions about entangled particles, can't you? The only thing is: while the math of probability is about weighted sums of hypothetical paths, in MW you take it quite literally as paths actually being traversed. That's what you're trading for the magic dice machine in non-deterministic theories.
Secondly: Just because Many Worlds says those worlds exist, doesn't mean you have to invent some extra phenomenon to justify renormalization. At the end of the day the unobservable universe is still unobservable. When you're talking about predicting what you might observe when you run experiment X, it's fine to ultimately discard the rest of the multiverse. You just don't need to make up some story about how your perspective is special and you have some magic power to collapse waveforms that other particles don't have.
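To illustrate what "weighted sums of paths" means in practice, here is a toy two-path calculation in the spirit of the double-slit experiment. It's a minimal sketch, not QFT: the equal amplitudes and the single `phase_difference` parameter are simplifying assumptions of mine. The arithmetic is the same whichever interpretation you prefer; the interpretations only disagree about what the discarded terms mean.

```python
import cmath
import math


def detection_probability(phase_difference: float) -> float:
    """Relative detection probability for two equal-amplitude paths."""
    a1 = 1 / math.sqrt(2)                                 # amplitude via path 1
    a2 = cmath.exp(1j * phase_difference) / math.sqrt(2)  # amplitude via path 2
    # Add the amplitudes first, then square; scaled so the maximum is 1.
    return abs(a1 + a2) ** 2 / 2


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        print(round(phase, 3), round(detection_probability(phase), 3))
```

Summing the amplitudes before squaring is what produces the interference pattern: the paths reinforce when they arrive in phase and cancel when they arrive half a cycle apart.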
Hence my comment about SU&C. Different adds some extra baggage about what that means -- occurred in a different branch versus didn't occur -- but the operation still needs to occur.
Please stop introducing obscure acronyms without stating what they mean. It makes your argument less clear. More often than not it results in *more* typing because of the confusion it causes. I have no idea what this sentence means. SU&C = Single Universe and Collapse? Like objective collapse? "Different" what?
Well, the original comment was about explaining lightning
You're right. I think I see your point more clearly now. I may have to think about this a little deeper. It's very hard to apply Occam's razor to theories about emergent phenomena, especially those several steps removed from basic particle interactions. There are, of course, other ways to weigh one theory against another. One of which is falsifiability.
If the Thor theory must be constantly modified to explain why nobody can directly observe Thor, then it gets pushed towards unfalsifiability. It gets ejected from science because there's no way to even test the theory, which in turn means it has no predictive power.
As I explained in one of my replies to Jimdrix_Hendri, though there is a formalization for Occam's razor, Solomonoff induction isn't really used. It's usually more like: individual phenomena are studied and characterized mathematically, then links between them are found that explain more with fewer and less complex assumptions.
In the case of Many Worlds vs. Copenhagen, it's pretty clear cut. Copenhagen has the same explanatory power as Many Worlds and shares all the postulates of Many Worlds, but adds some extra assumptions, so it's a clear violation of Occam's razor. I don't know of a *practical* way to handle situations that are less clear cut.
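For the clear-cut case, a toy version of the comparison looks like this. It's only a sketch in the spirit of Solomonoff induction: the real thing weights programs on a universal Turing machine and is uncomputable, and the bit counts below are hypothetical placeholders, not measurements of either theory.

```python
def occam_prior(description_bits: int) -> float:
    """Unnormalized prior weight 2^-length for a theory of the given length."""
    return 2.0 ** -description_bits


# Hypothetical bit costs: placeholders, not real measurements of either theory.
SHARED_POSTULATES_BITS = 100
EXTRA_COLLAPSE_BITS = 10

many_worlds = occam_prior(SHARED_POSTULATES_BITS)
copenhagen = occam_prior(SHARED_POSTULATES_BITS + EXTRA_COLLAPSE_BITS)

# Same predictions, so the likelihoods cancel and only the priors differ.
prior_ratio = many_worlds / copenhagen

if __name__ == "__main__":
    print(prior_ratio)
```

Whatever the shared postulates actually cost, they cancel in the ratio, so the only thing that matters is the extra assumptions. That's why this case is easy and cases without a shared core are not.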