Oregon State University PhD student working on AI alignment.

TurnTrout's Comments

April Fools: Announcing LessWrong 3.0 – Now in VR!

Amazing call. In these scary times, it's comforting to be reminded just how smart the LessWrong mod team is.

How special are human brains among animal brains?

Humans seem to have much higher degrees of consciousness and agency than other animals, and this may have emerged from our capacities for language. Helen Keller (who had been deaf and blind since infancy, and only started learning language when she was 6) gave an autobiographical account of how she was driven by blind impetuses until she learned the meanings of the words "I" and "me".

This is fascinating, but there's a bit of a potential confounder in that she was six years old. I'm anecdotally aware of several people who feel they weren't really conscious before a certain age.

TurnTrout's shortform feed

Don't have much of an opinion - I haven't rigorously studied infinitesimals yet. I usually just think of infinite / infinitely small quantities as being produced by limiting processes. For example, the intersection of all the ε-balls around a real number is just that number (under the standard topology), a set which has measure 0 and is, in a sense, "infinitely small".
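That limiting-process picture can be written out explicitly; assuming the standard metric topology on the reals and writing λ for Lebesgue measure:

```latex
\bigcap_{\varepsilon > 0} B_\varepsilon(x)
  \;=\; \bigcap_{\varepsilon > 0} (x - \varepsilon,\; x + \varepsilon)
  \;=\; \{x\},
\qquad \lambda(\{x\}) = 0.
```

Each ball has positive measure, yet the limit of the shrinking process is a single point of measure zero.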

TurnTrout's shortform feed

To stretch my medication supply by 200%, I've mixed similar-looking iron-supplement placebos in with my real medication. (To be clear, nothing serious happens to me if I miss days.)

How important are MDPs for AGI (Safety)?

The point of this post is mostly to claim that it's not a hugely useful framework for thinking about RL.

Even though I agree it's unrealistic, MDPs are still easier to prove things in, and I still think they can give us important insights. For example, if I had started with more complex environments when I was investigating instrumental convergence, I would've spent a ton of extra time grappling with the theorems for little perceived benefit. That is, the MDP framework let me more easily cut to the core insights. Sometimes it's worth thinking about more general computable environments, but probably not always.
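To give a sense of how little machinery the MDP framework demands, here is a hypothetical sketch (all states, actions, and rewards invented for illustration): a two-state MDP solved by value iteration in a few lines, with no function approximation or partial observability to grapple with.

```python
# Toy deterministic MDP: states 0 and 1, actions "stay" and "switch".
# All numbers here are made up purely to illustrate the framework.
GAMMA = 0.9  # discount factor

# transition[s][a] -> next state
transition = {0: {"stay": 0, "switch": 1},
              1: {"stay": 1, "switch": 0}}
# reward[s][a] -> immediate reward
reward = {0: {"stay": 0.0, "switch": 1.0},
          1: {"stay": 2.0, "switch": 0.0}}

def value_iteration(tol=1e-8):
    """Iterate the Bellman optimality backup until values stop changing."""
    v = {0: 0.0, 1: 0.0}
    while True:
        new_v = {s: max(reward[s][a] + GAMMA * v[transition[s][a]]
                        for a in ("stay", "switch"))
                 for s in (0, 1)}
        if max(abs(new_v[s] - v[s]) for s in v) < tol:
            return new_v
        v = new_v

values = value_iteration()
# Optimal play: from state 0, switch into state 1, then stay forever,
# so V(1) = 2 / (1 - 0.9) = 20 and V(0) = 1 + 0.9 * 20 = 19.
```

Proving convergence here is a one-line contraction argument; in more general computable environments even stating the analogous claim takes real work.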

TurnTrout's shortform feed

It seems to me that Zeno's paradoxes leverage incorrect, naïve notions of time and computation. We exist in the world, and we might suppose that the world is being computed in some way. If time is continuous, then the computer might need to do some pretty weird things to determine our location at an infinite number of intermediate times. However, even if that were the case, we would never notice it – we exist within time and we would not observe the external behavior of the system which is computing us, nor its runtime.
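The arithmetic side of the dichotomy paradox is also unproblematic: the infinitely many intermediate steps sum to a finite total. A minimal sketch:

```python
# Zeno's dichotomy: to cross a unit distance, first cover 1/2, then 1/4, ...
# The series 1/2 + 1/4 + 1/8 + ... converges to 1, so infinitely many
# "steps" cover a finite distance (and, at constant speed, take finite time).
partial_sum = sum(0.5 ** n for n in range(1, 60))
# partial_sum agrees with 1.0 to within floating-point error
```

Infinitely many tasks, finite total cost - the paradox only bites if you assume each step takes a fixed minimum time.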

ODE to Joy: Insights from 'A First Course in Ordinary Differential Equations'

Counterexample: is analytic but its derivatives don't satisfy your proposed condition for being analytic.

The human side of interaction

why do we even believe that human values are good?

Because they constitute, by definition, our goodness criterion? It's not like we have two separate modules - one for "human values", and one for "is this good?". (ETA: or are you pointing out how our values might shift over time as we reflect on our meta-ethics?)

Perhaps typical human behaviour, amplified by the capabilities of a superintelligence, would actually destroy the universe.

If I understand correctly, this is "are human behaviors catastrophic?" - not "are human values catastrophic?".

TurnTrout's shortform feed

Broca’s area handles syntax, while Wernicke’s area handles the semantic side of language processing. Subjects with damage to the latter can speak in syntactically fluent, jargon-filled sentences (fluent aphasia) – and they can’t even tell their utterances don’t make sense, because they can’t make sense of the words leaving their own mouths!

It seems like GPT2 : Broca’s area :: ??? : Wernicke’s area. Are there any cog psych/AI theories on this?
