If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
Excellent, thanks!
Much research on deception (Anthropic's recent work, trojans, jailbreaks, etc.) is not targeting "real" instrumentally convergent deceptive reasoning, but learned heuristics.
If you have the slack, I'd be interested in hearing/chatting more about this, as I'm working (or trying to work) on the "real" "scary" forms of deception. (E.g. do you think that this paper has the same failure mode?)
Anybody know how Fathom Radiant (https://fathomradiant.co/) is doing?
They’ve been working on photonic computing for a long time, so I’m curious whether anyone knows the timelines on which they expect it to have practical effects on compute.
Also, Sam Altman and Scott Gray at OpenAI are both investors in Fathom. Not sure when they invested.
I’m guessing it’s still a long-term bet at this point.
OpenAI also hired someone who worked at PsiQuantum recently. My guess is that they are hedging their bets on the compute end and generally looking for opportunities on ...
For the pretraining-finetuning paradigm, this link is now made much more explicit in Cross-Task Linearity Emerges in the Pretraining-Finetuning Paradigm, which also links it to model ensembling through logit averaging.
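A toy numerical sketch of the linearity claim (this is my own illustration, not the paper's setup; the tiny network, sizes, and perturbation scale are all made up): for two models finetuned a short distance from a shared pretrained initialization, the logits of the weight-interpolated model approximately equal the interpolation of the two models' logits, which is what makes logit-averaging ensembles line up with weight averaging.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """Tiny two-layer net returning logits."""
    W1, W2 = params
    return np.tanh(x @ W1) @ W2

# Shared "pretrained" initialization (hypothetical tiny net).
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 4))

# Two "finetuned" models: small task-specific perturbations of the init.
eps = 0.05
params_a = (W1 + eps * rng.normal(size=W1.shape),
            W2 + eps * rng.normal(size=W2.shape))
params_b = (W1 + eps * rng.normal(size=W1.shape),
            W2 + eps * rng.normal(size=W2.shape))

x = rng.normal(size=(32, 8))  # a batch of inputs

# Cross-task linearity: logits of the weight-interpolated model are close
# to the interpolation of the two models' logits.
mid_params = tuple((a + b) / 2 for a, b in zip(params_a, params_b))
lhs = mlp(mid_params, x)                          # interpolate weights, then forward
rhs = (mlp(params_a, x) + mlp(params_b, x)) / 2   # forward, then interpolate logits

print(np.abs(lhs - rhs).max())  # small relative to the logit scale
```

The approximation is first-order in the finetuning distance, so it degrades as `eps` grows; that's consistent with the linearity being a property of the pretraining-finetuning regime specifically.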
I am trying to gather a list of answers/quotes from public figures to the following questions:
I am writing them down here if you want to look/help: https://docs.google.com/spreadsheets/d/1HH1cpD48BqNUA1TYB2KYamJwxluwiAEG24wGM2yoLJw/edit?usp=sharing
YouTube can generate those automatically, or you can rip the .mp4 with an online service (just Google around, there are tons), then pass it to something like Otter.ai
[Crossposted from Eukaryote Writes Blog.]
So you’ve heard about how fish aren’t a monophyletic group? You’ve heard about carcinization, the process by which ocean arthropods convergently evolve into crabs? You say you get it now? Sit down. Sit down. Shut up. Listen. You don’t know nothing yet.
“Trees” are not a coherent phylogenetic category. On the evolutionary tree of plants, trees are regularly interspersed with things that are absolutely, 100% not trees. This means that, for instance, either:
I thought I had a pretty good guess at this, but the situation...
“taxonomy is not automatically a great category for regular usage.”
This is great, and I love the specific example of trees as a failure to classify a large set into subsets.
Something that’s not exactly the same problem, but rhymes, is that of genre classification for content discovery. Consider Spotify playlists. There are millions of songs, and hundreds of classified genres. Genres are classified much like species/genus taxonomies: two songs share a genre if they share a common musical ancestor. Led Zeppelin and the Beatles are different, but they bo...
Suppose Alice and Bob are two Bayesian agents in the same environment. They both basically understand how their environment works, so they generally agree on predictions about any specific directly-observable thing in the world - e.g. whenever they try to operationalize a bet, they find that their odds are roughly the same. However, their two world models might have totally different internal structure, different “latent” structures which Alice and Bob model as generating the observable world around them. As a simple toy example: maybe Alice models a bunch of numbers as having been generated by independent rolls of the same biased die, and Bob models the same numbers using some big complicated neural net.
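The toy example can be made concrete in a few lines. A minimal sketch, with the die, sample size, and Bob's fitting procedure all invented for illustration: Alice predicts via empirical frequencies under her i.i.d.-biased-die model, while Bob fits a different internal parameterization (softmax logits trained by gradient ascent, standing in for the "big complicated neural net"), and the two converge to nearly identical predictive distributions over observables.

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = np.array([0.1, 0.1, 0.1, 0.2, 0.2, 0.3])  # a biased six-sided die
data = rng.choice(6, size=10_000, p=true_probs)

# Alice: models the data as i.i.d. rolls of one biased die, so her
# predictive distribution is just the empirical frequencies.
alice_pred = np.bincount(data, minlength=6) / len(data)

# Bob: a different internal structure (softmax over logits, fit by
# gradient ascent on the average log-likelihood) that nonetheless
# converges to the same predictive distribution.
counts = np.bincount(data, minlength=6)
logits = np.zeros(6)
for _ in range(2000):
    probs = np.exp(logits) / np.exp(logits).sum()
    grad = counts / len(data) - probs  # gradient of avg log-likelihood
    logits += 1.0 * grad
bob_pred = np.exp(logits) / np.exp(logits).sum()

# Different latent structure, ~identical odds on any observable bet.
print(np.abs(alice_pred - bob_pred).max())
```

The point the sketch makes is the same one in the text: agreement on all operationalized bets constrains only the predictive distribution, not the internal "latent" structure that generates it.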
Now suppose Alice goes poking around inside of her world model, and somewhere in there...
One thing I'd note is that AIs can learn from variables that humans can't learn much from, so I think part of what will make this useful for alignment per se is a model of what happens if one mind has learned from a superset of the variables that another mind has learned from.
The European Parliament election is currently taking place (voting period: 2024-06-06 to 2024-06-09). While I assume my vote[1] has ~no impact on x-risk either way, I'd nonetheless like to take into account whether parties have made public statements on AI. But I'm not sure they have. Does anyone know, or know how to find out?
I guess there's also the question of past legislative records on AI, which might be even more predictive of future behavior. There's the EU's AI Act, but I'm not sure what its current implementation status is, whether it's considered useful by AI x-risk folks, or whether any political parties have made statements on it. I'm also not sure how to interpret the parties' parliamentary voting records on it.[2]