G Gordon Worley III

Director of Research at PAISRI

G Gordon Worley III's Comments

Truth value as magnitude of predictions

This is kind of long, and skimming it I couldn't quite tell what the main points were or get a clear sense at the start of why I should read it. Can you provide a summary/abstract/tl;dr for your post? Or, alternatively, a pitch for why someone should read it?

What is the subjective experience of free will for agents?

I agree. I think Jessica does a good job of incidentally capturing why it doesn't, but to reiterate:

  • Eliezer is only answering the question of what the algorithm is like from the inside;
  • it doesn't offer a complete alternative model, only shows why a particular model doesn't make sense;
  • and so we are left with the problem of how to understand what it is like to make a decision from an outside perspective, i.e. how to talk about how someone makes a decision, and what a decision even is, from outside the subjective uncertainty of being the agent before the decision is made.

Finally, I don't think it totally rules out the possibility of talking about alternatives, only talking about them via a particular method, so maybe there is some other way to get an outside view on degrees of freedom in decision making after a decision has already been made.

Resources for AI Alignment Cartography

Thanks, that's really helpful for understanding your work better!

Which books provide a good overview of modern human prehistory?

Books on evolutionary psychology might be relevant, simply because evolutionary psychology relies on what evidence we have about how humans behaved in prehistory as part of its evidence set. For example, as I recall, The Evolution of Human Sexuality had to rely on a lot of anthropological and archaeological research to develop its theory and draw conclusions.

Also, anthropology and archaeology research touch on what it was like to be human before writing. Although ethnographies are a bit out of favor and have some clear issues with observer bias, ethnographies of foragers are probably our best look at what it was like to be human prior to civilization. Similarly, archaeology gives us some insight into what humans were like before writing via the artifacts they left behind. I think of it as akin to paleontology in that it uses the evidence the past left us to infer what it was like: it's not perfect, but it's all we've got.

Announcing Web-TAISU, May 13-17

Awesome, excited to see this happening again and online (I would have missed this because of starting a new job even without pandemic-related travel restrictions). I might not be there for the whole thing due to said new job, but I really enjoyed it last time and am looking forward to what I can participate in this time!

Resources for AI Alignment Cartography

I know you link/mention Rohin's map. I think Paul or Chris Olah had put together another map at one time. How do you see your work differing from or building on what they've done?

The horse-sized duck: a theory of innovation, individuals and society

First, I'm somewhat sympathetic to the idea of "be a nerd even if it doesn't pay off in your lifetime because it will make society better". That said, I think it's all the more reason to be skeptical about the argument here.

Just to zoom in on one thing, I doubt the story you tell about the discovery of astronomy. As you say, it's illustrative and not historical, but there is some historical and anthropological evidence about this, and I'm not sure your story is believable against that evidence. Skimming the book at that link, and knowing as a general prior that humans tend to overfit data (i.e. find patterns in everything), I would be surprised if it took some special person with special skills to produce the sort of basic astronomy you're suggesting. If anything, if you wiped the memories of every human living today and asked them to start over from scratch, I'd expect them to rapidly reinvent astronomy by noticing patterns in the stars and seeing them as representative of animals, plants, etc.

You then go on to say some things about bravery and working on neglected topics that might turn out to be hits, but this seems to have nothing to do with neurodivergence other than a kind of backwards causation: neurodivergent folks are more likely to work on neglected things, not because those things are neglected, but because the space of things one might care about is wide, and someone who isn't interested in popular things is unlikely to accidentally land on caring about something that happens to be popular.

So overall I'm not convinced your arguments support anything more than the claim that there is value in being brave and working on neglected things, either because you have good reason to think they are important or because you are pursuing a high-variance strategy; the rest is incidental.

What is the subjective experience of free will for agents?

Thanks, I'll revisit these. They seem like they might be pointing towards a useful resolution I can use to better model values.

G Gordon Worley III's Shortform

As I work towards becoming less confused about what we mean when we talk about values, I find that it feels a lot like working on a jigsaw puzzle where I don't know what the picture is. Worse, all the pieces have been scattered around the room, so before I can figure out how they fit together or what they depict, I first have to find them, digging between couch cushions, looking under the rug, and checking behind the bookcase.

Yes, we have some pieces already, and others think they know (infer, guess) what the picture is from those (it's a bear! it's a cat! it's a woman in a fur coat!). As I work, I find it helpful to keep updating my own guess, because even when it's wrong it sometimes helps me think of new ways to combine the pieces or tells me what pieces might be missing that I should go look for. But it also often feels like I'm failing all the time, because I'm updating rapidly on new information and that keeps changing my best guess.

I suspect this is a common experience for folks working on problems in AI safety and many other complex problems, so I figured I'd share this metaphor I recently hit on for making sense of what it is like to do this kind of work.