simeon_c
Idea: Daniel Kokotajlo probably lost quite a bit of money by not signing an OpenAI NDA before leaving, which I consider a public service at this point. Could some of the funders in the AI safety landscape provide some money or social reward for this? I guess reimbursing everything Daniel lost might be a bit too much for funders, but providing some money, both to reward the act and to incentivize future safety people not to sign NDAs, would have very high value.
Anybody know how Fathom Radiant (https://fathomradiant.co/) is doing? They’ve been working on photonics compute for a long time so I’m curious if people have any knowledge on the timelines they expect it to have practical effects on compute.

Also, Sam Altman and Scott Gray at OpenAI are both investors in Fathom. Not sure when they invested. I’m guessing it’s still a long-term bet at this point.

OpenAI also hired someone who worked at PsiQuantum recently. My guess is that they are hedging their bets on the compute end and generally looking for opportunities on that side of things. Here’s his bio:

Ben Bartlett: "I'm currently a quantum computer architect at PsiQuantum working to design a scalable and fault-tolerant photonic quantum computer. I have a PhD in applied physics from Stanford University, where I worked on programmable photonics for quantum information processing and ultra high-speed machine learning. Most of my research sits at the intersection of nanophotonics, quantum physics, and machine learning, and basically consists of me designing little race tracks for photons that trick them into doing useful computations."
A list of some contrarian takes I have:

  • People are currently predictably too worried about misuse risks
  • What people really mean by "open source" vs "closed source" labs is actually "responsible" vs "irresponsible" labs, which is not affected by regulations targeting open source model deployment.
  • Neuroscience as an outer alignment[1] strategy is embarrassingly underrated.
  • Better information security at labs is not clearly a good thing, and if we're worried about great power conflict, probably a bad thing.
  • Much research on deception (Anthropic's recent work, trojans, jailbreaks, etc) is not targeting "real" instrumentally convergent deception reasoning, but learned heuristics. Not bad in itself, but IMO this places heavy asterisks on the results they can get.
  • ML robustness research (like FAR Labs' Go stuff) does not help with alignment, and helps moderately for capabilities.
  • The field of ML is a bad field to take epistemic lessons from. Note I don't talk about the results from ML.
  • ARC's MAD seems doomed to fail.
  • People in alignment put too much faith in the general factor g. It exists, and is powerful, but is not all-consuming or all-predicting. People are often very smart, but lack social skills, or agency, or strategic awareness, etc. And vice-versa. They can also be very smart in a particular area, but dumb in other areas. This is relevant for hiring & deference, but less for object-level alignment.
  • People are too swayed by rhetoric in general, and alignment, rationality, & EA too, but in different ways, and admittedly to a lesser extent than the general population. People should fight against this more than they seem to (which is not really at all, except for the most overt of cases). For example, I see nobody saying they don't change their minds on account of Scott Alexander because he's too powerful a rhetorician. Ditto for Eliezer, since he is also a great rhetorician. In contrast, Robin Hanson is a famously terrible rhetorician, so people should listen to him more.
  • There is a technocratic tendency in strategic thinking around alignment (I think partially inherited from OpenPhil, but also smart people are likely just more likely to think this way) which biases people towards more simple & brittle top-down models without recognizing how brittle those models are.

  1. ^

    A non-exact term
RobertM
EDIT: I believe I've found the "plan" that Politico (and other news sources) managed to fail to link to, maybe because it doesn't seem to contain any affirmative commitments by the named companies to submit future models to pre-deployment testing by UK AISI.

I've seen a lot of takes (on Twitter) recently suggesting that OpenAI and Anthropic (and maybe some other companies) violated commitments they made to the UK's AISI about granting them access for e.g. pre-deployment testing of frontier models. Is there any concrete evidence about what commitment was made, if any? The only thing I've seen so far is a pretty ambiguous statement by Rishi Sunak, who might have had some incentive to claim more success than was warranted at the time. If people are going to breathe down the necks of AGI labs about keeping to their commitments, they should be careful to only do it for commitments they've actually made, lest they weaken the relevant incentives. (This is not meant to endorse AGI labs behaving in ways which cause strategic ambiguity about what commitments they've made; that is also bad.)
Quote from Cal Newport's Slow Productivity book: "Progress in theoretical computer science research is often a game of mental chicken, where the person who is able to hold out longer through the mental discomfort of working through a proof element in their mind will end up with the sharper result."

Popular Comments

Recent Discussion

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

Nevin Wetherill
Thanks anyway :) Also, yeah, makes sense. Hopefully this isn't a horribly misplaced thread taking up people's daily scrolling bandwidth with no commensurate payoff. Maybe I'll just say something here to cash out my impression of the "first post" intro-message in question: its language has seemed valuable to my mentality in writing a post so far. Although I think I got a mildly misleading first impression about how serious the filter was. The first draft for a post I half-finished was a fictional explanatory dialogue involving a lot of extended metaphors... After reading that, I had the mental image of getting banned immediately with a message like "oh, c'mon, did you even read the prompt?" Still, that partially-mistaken mental frame made me go read more documentation on the editor and take a more serious approach to planning a post. A bit like a very mild temperature-drop shock to read "this is like a university application." I grok the intent, and I'm glad the community has these sorta norms. It seems likely to help my personal growth agenda on some dimensions.
habryka
Uploaded them both!
Lorxus

Excellent, thanks!

Lorxus
IMO you did! Like I said in my comment, for reasons that are temporarily secret I care about those two sequences a lot, but I might not have thought to just ask whether they could be added to the library, nor did I know that the blocker was suitable imagery.
Garrett Baker
A list of some contrarian takes I have:

  • People are currently predictably too worried about misuse risks
  • What people really mean by "open source" vs "closed source" labs is actually "responsible" vs "irresponsible" labs, which is not affected by regulations targeting open source model deployment.
  • Neuroscience as an outer alignment[1] strategy is embarrassingly underrated.
  • Better information security at labs is not clearly a good thing, and if we're worried about great power conflict, probably a bad thing.
  • Much research on deception (Anthropic's recent work, trojans, jailbreaks, etc) is not targeting "real" instrumentally convergent deception reasoning, but learned heuristics. Not bad in itself, but IMO this places heavy asterisks on the results they can get.
  • ML robustness research (like FAR Labs' Go stuff) does not help with alignment, and helps moderately for capabilities.
  • The field of ML is a bad field to take epistemic lessons from. Note I don't talk about the results from ML.
  • ARC's MAD seems doomed to fail.
  • People in alignment put too much faith in the general factor g. It exists, and is powerful, but is not all-consuming or all-predicting. People are often very smart, but lack social skills, or agency, or strategic awareness, etc. And vice-versa. They can also be very smart in a particular area, but dumb in other areas. This is relevant for hiring & deference, but less for object-level alignment.
  • People are too swayed by rhetoric in general, and alignment, rationality, & EA too, but in different ways, and admittedly to a lesser extent than the general population. People should fight against this more than they seem to (which is not really at all, except for the most overt of cases). For example, I see nobody saying they don't change their minds on account of Scott Alexander because he's too powerful a rhetorician. Ditto for Eliezer, since he is also a great rhetorician. In contrast, Robin Hanson is a famously terrible rhetorician... (read more)

Much research on deception (Anthropic's recent work, trojans, jailbreaks, etc) is not targeting "real" instrumentally convergent deception reasoning, but learned heuristics.

 

If you have the slack, I'd be interested in hearing/chatting more about this, as I'm working (or trying to work) on the "real" "scary" forms of deception. (E.g. do you think that this paper has the same failure mode?)

Christopher “Chris” Upshaw
All of these seem pretty cold tea, as in true but not contrarian.
Garrett Baker
Everyone I talk with disagrees with most of these. So maybe we just hang around different groups.

Anybody know how Fathom Radiant (https://fathomradiant.co/) is doing?

They’ve been working on photonics compute for a long time so I’m curious if people have any knowledge on the timelines they expect it to have practical effects on compute.

Also, Sam Altman and Scott Gray at OpenAI are both investors in Fathom. Not sure when they invested.

I’m guessing it’s still a long-term bet at this point.

OpenAI also hired someone who worked at PsiQuantum recently. My guess is that they are hedging their bets on the compute end and generally looking for opportunities on ... (read more)

jacquesthibs
Quote from Cal Newport's Slow Productivity book: "Progress in theoretical computer science research is often a game of mental chicken, where the person who is able to hold out longer through the mental discomfort of working through a proof element in their mind will end up with the sharper result."
keltan
Big fan of Cal’s work. He’s certainly someone who is pushing the front lines in the fight against akrasia. I’m currently reading “How to Win at College”. It’s a super information-dense package. Feels a bit like Rationality: A-Z, if it were specifically for college students trying to succeed. Why did you decide to share this quote? I feel like I’m missing some key context that could aid my understanding.

For the pretraining-finetuning paradigm, this link is now made much more explicit in Cross-Task Linearity Emerges in the Pretraining-Finetuning Paradigm, which also connects it to model ensembling through logit averaging.
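As a toy illustration of the property itself (not of the paper's experiments), here is a minimal numpy sketch. It makes the simplifying assumption that two "fine-tuned" checkpoints can be stood in for by small perturbations of a shared "pretrained" parameter vector; the architecture, sizes, and names are my own illustrative choices, not the paper's.

```python
# Minimal sketch (illustrative assumptions only): check that a weight-averaged
# model's logits approximately equal the average of the two models' logits when
# both "fine-tuned" checkpoints stay close to a shared "pretrained" base.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 32, 4

def init(scale=1.0):
    return {
        "W1": rng.normal(0, scale / np.sqrt(d_in), (d_in, d_hidden)),
        "W2": rng.normal(0, scale / np.sqrt(d_hidden), (d_hidden, d_out)),
    }

def logits(params, X):
    return np.tanh(X @ params["W1"]) @ params["W2"]

base = init()  # stand-in for pretrained weights
# Stand-ins for two fine-tuned checkpoints: small, task-specific perturbations.
delta1, delta2 = init(scale=0.05), init(scale=0.05)
theta1 = {k: base[k] + delta1[k] for k in base}
theta2 = {k: base[k] + delta2[k] for k in base}
theta_avg = {k: 0.5 * (theta1[k] + theta2[k]) for k in base}

X = rng.normal(size=(100, d_in))  # probe inputs
merged = logits(theta_avg, X)                               # weight averaging ("merging")
ensembled = 0.5 * (logits(theta1, X) + logits(theta2, X))   # logit averaging ("ensembling")

print("max |merged - ensembled| :", np.abs(merged - ensembled).max())
print("typical logit magnitude  :", np.abs(merged).mean())
```

Here the near-equality is just a first-order Taylor effect around the shared base; the paper's point, as I read it, is that something like this holds empirically for real fine-tuned networks, which this toy setup does not establish.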

habryka
At least Eliezer has been extremely clear that he is in favor of a stop, not a pause (indeed, that was the headline of his article "Pausing AI Developments Isn't Enough. We Need to Shut it All Down"), so I am confused why you list him with anything related to "pause". My guess is that Eliezer and I are both in favor of a pause, but mostly because a pause seems like it would slow down AGI progress, not because the next 6 months in particular will be the most risky period.

I am trying to gather a list of answers/quotes from public figures to the following questions:

  • What are the chances that AI will cause human extinction?
  • Will AI automate most human labour?
  • Should advanced AI models be open source?
  • Do humans have a moral duty to build artificial superintelligence?
  • Should there be international regulation of advanced AI?
  • Will AI be used to make weapons of mass destruction (WMDs)?

I am writing them down here if you want to look/help: https://docs.google.com/spreadsheets/d/1HH1cpD48BqNUA1TYB2KYamJwxluwiAEG24wGM2yoLJw/edit?usp=sharing 

fowlertm
We recently released an interview with independent scholar John Wentworth. It mostly centers on two themes: "abstraction" (forming concepts) and "agency" (dealing with goal-directed systems). Check it out!
Thomas Kwa
Is there an AI transcript/summary?

YouTube can generate those automatically, or you can rip the .mp4 with an online service (just Google around, there are tons), then pass it to something like Otter.ai.
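For the first route, here is a minimal sketch of grabbing YouTube's auto-generated captions programmatically. It assumes the third-party youtube-transcript-api package (and a version that still exposes the classic get_transcript helper); the video ID below is a hypothetical placeholder, not the actual interview's.

```python
# Minimal sketch: fetch YouTube's auto-generated captions for a video and dump
# them to a text file. Assumes `pip install youtube-transcript-api` and a version
# exposing the classic get_transcript helper; VIDEO_ID_HERE is a placeholder.
from youtube_transcript_api import YouTubeTranscriptApi

video_id = "VIDEO_ID_HERE"  # hypothetical placeholder for the interview's video ID

# Returns a list of {"text", "start", "duration"} caption segments.
segments = YouTubeTranscriptApi.get_transcript(video_id, languages=["en"])
transcript = " ".join(seg["text"] for seg in segments)

with open("transcript.txt", "w", encoding="utf-8") as f:
    f.write(transcript)

print(f"Saved {len(segments)} caption segments ({len(transcript)} characters).")
```

The resulting text file can then be summarized with whatever tool you prefer; Otter.ai and similar services are a separate, audio-based workflow.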


[Crossposted from Eukaryote Writes Blog.]

So you’ve heard about how fish aren’t a monophyletic group? You’ve heard about carcinization, the process by which ocean arthropods convergently evolve into crabs? You say you get it now? Sit down. Sit down. Shut up. Listen. You don’t know nothing yet.

“Trees” are not a coherent phylogenetic category. On the evolutionary tree of plants, trees are regularly interspersed with things that are absolutely, 100% not trees. This means that, for instance, either:

  • The common ancestor of a maple and a mulberry tree was not a tree.
  • The common ancestor of a stinging nettle and a strawberry plant was a tree.
  • And this is true for most trees or non-trees that you can think of.

I thought I had a pretty good guess at this, but the situation...

“taxonomy is not automatically a great category for regular usage.”

This is great, and I love the specific example of trees as a failure to classify a large set into subsets.

Something that’s not exactly the same problem, but rhymes, is that of genre classification for content discovery. Consider Spotify playlists. There are millions of songs, and hundreds of classified genres. Genres are classified much like species/genus taxonomies: two songs share a genre if they share a common ancestor of music. Led Zeppelin and the Beatles are different, but they bo... (read more)

Suppose Alice and Bob are two Bayesian agents in the same environment. They both basically understand how their environment works, so they generally agree on predictions about any specific directly-observable thing in the world - e.g. whenever they try to operationalize a bet, they find that their odds are roughly the same. However, their two world models might have totally different internal structure, different “latent” structures which Alice and Bob model as generating the observable world around them. As a simple toy example: maybe Alice models a bunch of numbers as having been generated by independent rolls of the same biased die, and Bob models the same numbers using some big complicated neural net. 

Now suppose Alice goes poking around inside of her world model, and somewhere in there...
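A minimal sketch of the biased-die-vs-neural-net toy example from the excerpt above, with all concrete choices (sample size, architecture, training loop) being my own illustrative assumptions rather than anything from the post: both agents see the same rolls, build very different internal structure, and still quote roughly the same odds on the next observation.

```python
# Minimal sketch (illustrative assumptions): Alice and Bob fit the same observed
# die rolls with very different internal latent structure, yet end up with
# near-identical predictive distributions over the next roll.
import numpy as np

rng = np.random.default_rng(0)
true_p = np.array([0.30, 0.25, 0.20, 0.10, 0.10, 0.05])   # hidden bias of the die
rolls = rng.choice(6, size=5000, p=true_p)                 # shared observations

# Alice: exchangeable "biased die" model with a uniform Dirichlet prior.
counts = np.bincount(rolls, minlength=6)
alice_pred = (counts + 1) / (counts.sum() + 6)             # posterior predictive

# Bob: a small neural net predicting each roll from the previous one -- a
# structurally different latent model. The data has no real sequential
# dependence, so it should converge toward (roughly) the marginal distribution.
X = np.eye(6)[rolls[:-1]]                                  # one-hot previous roll
y = rolls[1:]
H = 16
W1 = rng.normal(0, 0.1, (6, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, 6)); b2 = np.zeros(6)

def predict(X):
    h = np.tanh(X @ W1 + b1)
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return h, p / p.sum(axis=1, keepdims=True)

lr = 0.5
for _ in range(3000):                                      # full-batch gradient descent
    h, p = predict(X)
    g_logits = p.copy()
    g_logits[np.arange(len(y)), y] -= 1                    # softmax cross-entropy gradient
    g_logits /= len(y)
    gW2, gb2 = h.T @ g_logits, g_logits.sum(0)
    g_h = (g_logits @ W2.T) * (1 - h ** 2)                 # backprop through tanh
    gW1, gb1 = X.T @ g_h, g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

bob_pred = predict(np.eye(6))[1].mean(axis=0)              # average over contexts

print("Alice:", np.round(alice_pred, 3))
print("Bob:  ", np.round(bob_pred, 3))
print("largest disagreement:", np.abs(alice_pred - bob_pred).max())
```

The point is the post's setup, not the code: agreement on observable odds says nothing by itself about whether the two latent structures correspond to each other.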

One thing I'd note is that AIs can learn from variables that humans can't learn much from, so I think part of what will make this useful for alignment per se is a model of what happens if one mind has learned from a superset of the variables that another mind has learned from.

The European parliamentary election is currently taking place (voting period: 2024-04-07 to 2024-07-10). While I assume my vote[1] has ~no impact on x-risk either way, I'd nonetheless like to take into account whether parties have made public statements on AI, but I'm not sure they have. Does anyone know, or know how to find out?

I guess there's also the question of past legislative records on AI, which might be even more predictive of future behavior. There's the EU's AI Act, but I'm not sure what its current implementation status is, whether it's considered useful by AI x-risk folks, or which political parties, if any, have made statements on it. I'm also not sure how to interpret the parties' parliamentary voting record on it.[2]

  1. ^

    In case it matters, I'm from Germany, Bavaria.

  2. ^

    The roll-call vote results can be found here after clicking on "Artificial Intelligence Act". The 523/46/49 figures listed under +/-/0 are the votes in favor/against/abstentions.

LessOnline & Manifest Summer Camp

June 3rd to June 7th

Between LessOnline and Manifest, stay for a week of experimental events, chill coworking, and cozy late night conversations.

Prices rise by $100 on May 13th