Shoshannah Tekofsky




Should we have a "rewrite the Rationalist Basics Discourse" contest?

Not that I think anything is gonna beat this. But still :D

PS: rewrites can change content, style, or both.

Thank you! I appreciate the in-depth comment.

Do you think any of these groups hold that all of the alignment problem can be solved without advancing capabilities?


And I appreciate the correction -- I admit I was confused about this, and may not have done enough of a deep-dive to untangle this properly. Originally I wanted to say "empiricists versus theorists" but I'm not sure where I got the term "theorist" from either.


And to both examples, how are you conceptualizing a "new idea"? Cause I suspect we don't have the same model on what an idea is.

Two things that worked for me:

  1. Produce stuff, a lot of stuff, and make it findable online. This makes it possible for people to see your potential and reach out to you.

  2. Send an email to anyone you admire asking if they are interested in going for a coffee (if you have the funds to fly out to them) or doing a video call. Explain why you admire them and why this would be high value to you. I did this for 4 people without filtering on 'how likely are they to answer', and one of them said 'yeah sure'. I think the email made them happy, cause a reasonable subset of people like learning how they have touched others' lives in a positive way.

Even in experiments, I think most of the value is usually from observing lots of stuff, more than from carefully controlling things.

I think I mostly agree with you but have the "observing lots of stuff" categorized as "exploratory studies" which are badly controlled affairs where you just try to collect more observations to inform your actual eventual experiment. If you want to pin down a fact about reality, you'd still need to devise a well-controlled experiment that actually shows the effect you hypothesize to exist from your observations so far.

If you actually go look at how science is practiced, i.e. the things successful researchers actually pick up during PhDs, there are multiple load-bearing pieces besides just that.


Note that a much simpler first-pass on all these is just "spend a lot more time reading others' work, and writing up and distilling our own".

I agree, but if people were good at finding necessary info as individuals and we also had better tools for coordinating (e.g., finding each other and relevant material faster), then that would speed up research even further. And I'd argue that any gain in research speed is as valuable as the same proportional delay in developing AGI.

There is an EU telegram group where they are, among other things, collecting data on where people are in Europe. I'll DM an invite.

That makes a lot of sense! And I was indeed also thinking of Elicit.

Note: The meetup this month is Wednesday, Jan 4th, at 15:00. I'm in Berkeley currently, and I couldn't see how times were displayed for you guys cause I have no option to change time zones on LW. I apologize if this has been confusing! I'll get a local person to verify dates and times next time (or even set them).

Did you accidentally forget to add this post to your research journal sequence?

I thought I added it but apparently hadn't pressed submit. Thank you for pointing that out!


  1. optimization algorithms (finitely terminating)
  2. iterative methods (convergent)

That sounds as if they are always finitely terminating or convergent, which they're not. (I don't think you wanted to say they are)

I was going by the Wikipedia definition:

To solve problems, researchers may use algorithms that terminate in a finite number of steps, or iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge).
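To make the three categories in that definition concrete, here is a minimal sketch on a toy objective. The objective `f` and all function names and parameters are my own illustrative assumptions, not anything from the Wikipedia article.

```python
import random

def f(x):
    """Toy objective (an assumption for illustration): x^2 - 6x + 10, minimum at x = 3."""
    return (x - 3.0) ** 2 + 1.0

def exact_minimum(a, b):
    """Finitely terminating: the minimizer of a*x^2 + b*x + c (a > 0) in one step."""
    return -b / (2.0 * a)

def gradient_descent(x0, lr=0.1, steps=200):
    """Iterative method: the iterates converge to the minimizer of f for this step size."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * (x - 3.0)  # derivative of f at x
        x -= lr * grad
    return x

def random_search(x0, steps=200, scale=1.0, seed=0):
    """Heuristic: keeps any random nearby candidate that improves f.
    It never gets worse than the start, but nothing guarantees convergence."""
    rng = random.Random(seed)
    best = x0
    for _ in range(steps):
        candidate = best + rng.uniform(-scale, scale)
        if f(candidate) < f(best):
            best = candidate
    return best
```

On this easy problem all three end up near x = 3, but only the first two come with a guarantee, and only "on some specified class of problems", which I think is the caveat you were pointing at.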

I don't quite understand this. What does the sentence "computational optimization can compute all computable functions" mean? Additionally, in my conception of "computational optimization" (which is admittedly rather vague), learning need not take place. 

I might have overloaded the phrase "computational" here. My intention was to point out what can be encoded by such a system. Maybe "coding" is a better word? E.g., neural coding. These systems can implement Turing machines, so they can potentially have the same properties as Turing machines.

these two options are conceptually quite different and might influence the meaning of the analogy. If intelligence computes only a "target direction", then this corresponds to a heuristic approach in which locally, the correct direction in action space is chosen. However, if you view intelligence as an actual optimization algorithm, then what's chosen is not only a direction but a whole path.

I'm wondering if our disagreement is conceptual or semantic. Optimizing a direction instead of an entire path is just a difference in time horizon in my model. But maybe this is a different use of the word "optimize"?


You write "Learning consists of setting the right weights between all the neurons in all the layers. This is analogous to my understanding of human intelligence as path-finding through reality"

  • Learning is a thing you do once, and then you use the resulting neural network repeatedly. In contrast, if you search for a path, you usually use that path only once. 

If I learn the optimal path to work, then I can use that multiple times. I'm not sure I agree with the distinction you are drawing here ... Some problems in life only need to be solved exactly once, but that's the same as anything you learn only being applicable once. I didn't mean to claim the processes are identical, but that they share an underlying structure. Though indeed, this might be an empty intuitive leap with no useful implementation. Or maybe not a good matching at all.

I do not know what you mean by "mapping a utility function to world states". Is the following a correct paraphrasing of what you mean?

"An aligned AGI is one that tries to steer toward world states such that the neurally encoded utility function, if queried, would say 'these states are rather optimal' "

Yes, thank you.


I don't quite understand the analogy to hyperparameters here. To me, it seems like childbirth's meaning is in itself a reward that, by credit assignment, leads to a positive evaluation of the actions that led to it, even though in the experience the reward was mostly negative. It is indeed interesting figuring out what exactly is going on here (and the shard theory of human values might be an interesting frame for that, see also this interesting post looking at how the same external events can trigger different value updates), but I don't yet see how it connects to hyperparameters.

A hyperparameter is a parameter across parameters. So say with childbirth, you have a parameter 'pain' that tracks physical pain, which is a direct physical signal, and you have a hyperparameter 'satisfaction from hard work' that takes 'pain' as input, as well as the output of some evaluative cognitive process, and outputs reward accordingly. Does that make sense?
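As a rough sketch of what I mean, with everything here (the function name, the boolean standing in for the evaluative process, and the weighting) being a hypothetical illustration and not a claim about how brains actually compute reward:

```python
def satisfaction_from_hard_work(pain, judged_meaningful, weight=2.0):
    """Hypothetical higher-level evaluation that takes the raw 'pain' signal as input.

    pain: non-negative intensity of the direct physical signal.
    judged_meaningful: stand-in for the evaluative cognitive process.
    weight: assumed strength of the reinterpretation (illustrative only).
    """
    raw_reward = -pain  # the direct signal on its own is negative
    if judged_meaningful:
        # The same pain now counts toward reward instead of against it.
        return raw_reward + weight * pain
    return raw_reward
```

So the same 'pain' value yields a positive overall reward when the experience is judged meaningful, and a negative one otherwise.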

What if instead of trying to build an AI that tries to decode our brain's utility function, we build the process that created our values in the first place and expose the AI to this process

Digging into shard theory is still on my todo list. [bookmarked]

Many models that do not overfit also memorize much of the data set. 

Is this on the sweet spot just before overfitting or should I be thinking of something else?


Thank you for your extensive comment! <3
