The problem under consideration is very important for some possible futures of humanity.
However, the author's eudaimonic wishlist is self-admittedly geared toward fiction production, and doesn't seem to be very enforceable.
It's a fine overview of modern language models. The idea of scaling all skills at the same time is highlighted, in contrast to human developmental psychology. Since publication, the 540B-parameter PaLM model seemed to show jumps on around 25% of the BIG-bench tasks.
The inadequacy of measuring average performance of LLMs is discussed: a proportion of outputs is good, and the rest are outright failures from a human point of view. Scale seems to help with the rate of success.
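As a toy illustration of that point (scores and the threshold are made up, not from the post): two models can share the same average score while differing completely in how often they produce an output a human would accept.

```python
# Toy illustration (scores and threshold are made up): two models with the
# same average score, where only the success-rate metric tells them apart.
mediocre = [0.5] * 10            # uniformly middling outputs
bimodal = [1.0] * 5 + [0.0] * 5  # half great, half outright failures

def mean(xs):
    return sum(xs) / len(xs)

def success_rate(xs, threshold=0.9):
    """Fraction of outputs a human would call an outright success."""
    return sum(x >= threshold for x in xs) / len(xs)

for name, scores in [("mediocre", mediocre), ("bimodal", bimodal)]:
    print(f"{name}: mean={mean(scores):.2f}, success rate={success_rate(scores):.0%}")
# Both means are 0.50, but the success rates are 0% and 50%.
```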
In the 7th footnote, it should be 5e9, not 5e6 (this doesn't seem to affect the reasoning qualitatively).
The argument against CEV seems cool, thanks for formulating it. I guess we leave some utility on the table with any particular approach.
The part about referring to a model to adjudicate itself seems really off. I have a hard time imagining a thing that performs better at the meta-level than at the object level. Do you have a concrete example?
Thanks for giving it a think.
Turning the system off is not a solved problem; see e.g. https://www.lesswrong.com/posts/wxbMsGgdHEgZ65Zyi/stop-button-towards-a-causal-solution
Finite utility doesn't help as long as you need to use probabilities: a 95% chance of 1 unit of utility is worse than a 99% chance, which is worse than a 99.9% chance, and so on. If you then apply the same bounding trick to the probabilities, you get a quantilizer, and that doesn't work either: https://www.lesswrong.com/posts/ZjDh3BmbDrWJRckEb/quantilizer-optimizer-with-a-bounded-amount-of-output-1
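A toy calculation of the first half of that argument (the numbers are mine, not from the thread): even with utility capped at a finite value, expected utility still strictly increases with the probability of success, so the optimization pressure doesn't go away.

```python
# Toy calculation (numbers made up): even with utility capped at 1 unit,
# expected utility still strictly prefers higher success probabilities,
# so bounding utility alone doesn't remove the pressure to optimize.
CAP = 1.0  # finite utility bound

def expected_utility(p_success, utility=1.0):
    return p_success * min(utility, CAP)

for p in (0.95, 0.99, 0.999, 0.9999):
    print(f"P(success)={p}: expected utility={expected_utility(p):.4f}")
# 0.9500 < 0.9900 < 0.9990 < 0.9999 -- the agent still wants to push p toward 1.
```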
Maybe the failure with people is caused by whatever they tweaked to avoid 'generating realistic faces and known persons'?
No particular philosophy: just add some kludge to make your life easier, then repeat until the kludges blot out the Sun.
My non-computer tool is pen and paper for notes, with everything useful filed into the inbox during the daily review. Everything else is based on org-mode, with Orgzly on mobile. Syncing is over SFTP; I'm not a cloud person.
I wrote an RSS reader in Python for filling the inbox, alongside org-capture. I wouldn't recommend the same approach, since elfeed should do the same reasonably easily. Having a script helps because running it automatically nightly and before the daily review fills the inbox with enough novel stuff to motivate going through it, and to avoid binging on other sites.
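For illustration only, a minimal sketch of that kind of script (this is not the actual code; it assumes the feedparser package, and the feed list and inbox path are placeholders):

```python
# Minimal sketch of the idea, not the actual script: pull RSS entries and
# append them as TODO headings to an org-mode inbox file.
# Assumes the feedparser package; the feed list and inbox path are placeholders.
import os

import feedparser

FEEDS = ["https://example.com/feed.xml"]  # placeholder feed URLs
INBOX = "~/org/inbox.org"                 # placeholder inbox path

def capture(entry) -> str:
    """Format one feed entry as an org-mode TODO heading."""
    return f"* TODO {entry.title}\n  {entry.link}\n"

def fill_inbox() -> None:
    path = os.path.expanduser(INBOX)
    with open(path, "a", encoding="utf-8") as inbox:
        for url in FEEDS:
            for entry in feedparser.parse(url).entries:
                inbox.write(capture(entry))

if __name__ == "__main__":
    fill_inbox()
```

A real version would also track already-seen links so that nightly reruns don't duplicate inbox entries.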
Other than the inbox, I have a project list and calendar within Emacs. I'm not maintaining good discipline with weekly/monthly reviews, but it's much smoother than keeping it all in your head.
I have a log file that org-mode keeps ordered by date, and a references file that doesn't get very organized or used often. Soon I'll try to link the contents of my massive folder of PDFs with it.
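One possible way to do that linking, sketched in Python (paths are hypothetical and nothing here is from the original setup): generate an org file with a file: link per PDF, which the references file can include or point to.

```python
# Sketch of one possible way to do the linking (paths are hypothetical):
# write an org heading with a file: link for every PDF in a folder, which
# the references file can then include or point to.
from pathlib import Path

PDF_DIR = Path("~/papers").expanduser()            # placeholder folder of PDFs
OUTPUT = Path("~/org/pdf-links.org").expanduser()  # placeholder output file

def org_link(pdf: Path) -> str:
    return f"* {pdf.stem}\n  [[file:{pdf}][{pdf.name}]]\n"

def main() -> None:
    pdfs = sorted(PDF_DIR.glob("*.pdf"))
    OUTPUT.write_text("".join(org_link(p) for p in pdfs), encoding="utf-8")
    print(f"Wrote {len(pdfs)} links to {OUTPUT}")

if __name__ == "__main__":
    main()
```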
Oh, sorry. JavaScript shenanigans seem to have sent me to another course; it works fine in a clean browser.
The post expands on the ML field's intuition that reinforcement learning doesn't always work and that getting it to work is a fiddly process.
In the final chapter, a DeepMind paper arguing that 'one weird trick' will work is demolished.