simeon_c
Idea: Daniel Kokotajlo probably lost quite a bit of money by not signing an OpenAI NDA before leaving, which I consider a public service at this point. Could some of the funders in the AI safety landscape give some money or social reward for this? I guess reimbursing everything Daniel lost might be a bit too much for funders, but providing some money, both to reward the act and to incentivize future safety people not to sign NDAs, would have very high value.
A list of some contrarian takes I have:

  • People are currently predictably too worried about misuse risks.
  • What people really mean by "open source" vs "closed source" labs is actually "responsible" vs "irresponsible" labs, which is not affected by regulations targeting open-source model deployment.
  • Neuroscience as an outer alignment[1] strategy is embarrassingly underrated.
  • Better information security at labs is not clearly a good thing, and if we're worried about great-power conflict, probably a bad thing.
  • Much research on deception (Anthropic's recent work, trojans, jailbreaks, etc.) is not targeting "real" instrumentally convergent deception reasoning, but learned heuristics. Not bad in itself, but IMO this places heavy asterisks on the results they can get.
  • ML robustness research (like FAR Labs' Go stuff) does not help with alignment, and helps moderately for capabilities.
  • The field of ML is a bad field to take epistemic lessons from. Note I don't talk about the results from ML.
  • ARC's MAD seems doomed to fail.
  • People in alignment put too much faith in the general factor g. It exists, and is powerful, but is not all-consuming or all-predicting. People are often very smart but lack social skills, or agency, or strategic awareness, etc., and vice versa. They can also be very smart in a particular area but dumb in other areas. This is relevant for hiring & deference, but less for object-level alignment.
  • People are too swayed by rhetoric in general, and alignment, rationality, & EA too, but in different ways, and admittedly to a lesser extent than the general population. People should fight against this more than they seem to (which is not really at all, except for the most overt of cases). For example, I see nobody saying they don't change their minds on account of Scott Alexander because he's too powerful a rhetorician. Ditto for Eliezer, since he is also a great rhetorician. In contrast, Robin Hanson is a famously terrible rhetorician, so people should listen to him more.
  • There is a technocratic tendency in strategic thinking around alignment (I think partially inherited from OpenPhil, but also smart people are likely just more likely to think this way) which biases people towards simpler & more brittle top-down models without recognizing how brittle those models are.

1. A non-exact term
RobertM
EDIT: I believe I've found the "plan" that Politico (and other news sources) managed to fail to link to, maybe because it doesn't seem to contain any affirmative commitments by the named companies to submit future models to pre-deployment testing by UK AISI. I've seen a lot of takes (on Twitter) recently suggesting that OpenAI and Anthropic (and maybe some other companies) violated commitments they made to the UK's AISI about granting them access for e.g. predeployment testing of frontier models.  Is there any concrete evidence about what commitment was made, if any?  The only thing I've seen so far is a pretty ambiguous statement by Rishi Sunak, who might have had some incentive to claim more success than was warranted at the time.  If people are going to breathe down the necks of AGI labs about keeping to their commitments, they should be careful to only do it for commitments they've actually made, lest they weaken the relevant incentives.  (This is not meant to endorse AGI labs behaving in ways which cause strategic ambiguity about what commitments they've made; that is also bad.)
Anybody know how Fathom Radiant (https://fathomradiant.co/) is doing? They've been working on photonics compute for a long time, so I'm curious if people have any knowledge on the timelines they expect it to have practical effects on compute.

Also, Sam Altman and Scott Gray at OpenAI are both investors in Fathom. Not sure when they invested. I'm guessing it's still a long-term bet at this point.

OpenAI also hired someone who worked at PsiQuantum recently. My guess is that they are hedging their bets on the compute end and generally looking for opportunities on that side of things. Here's his bio:

Ben Bartlett: "I'm currently a quantum computer architect at PsiQuantum working to design a scalable and fault-tolerant photonic quantum computer. I have a PhD in applied physics from Stanford University, where I worked on programmable photonics for quantum information processing and ultra high-speed machine learning. Most of my research sits at the intersection of nanophotonics, quantum physics, and machine learning, and basically consists of me designing little race tracks for photons that trick them into doing useful computations."
Quote from Cal Newport's Slow Productivity book: "Progress in theoretical computer science research is often a game of mental chicken, where the person who is able to hold out longer through the mental discomfort of working through a proof element in their mind will end up with the sharper result."

Popular Comments

Recent Discussion

Dagon
So, https://en.wikipedia.org/wiki/PageRank ?
Abhimanyu Pallavi Sudhir
Oh right, lol, good point.

Also check out "personalized pagerank", where the rating shown to each user is "rooted" in what kind of content this user has upvoted in the past. It's a neat solution to many problems.
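For a concrete picture of the mechanism, here is a minimal sketch of personalized PageRank (purely illustrative: the graph, function name, and parameters are made up, not any site's actual ranking code). It is ordinary PageRank power iteration, except the "teleport" step jumps back to items this user has upvoted rather than to a uniform distribution, so the resulting scores are rooted in that user's history.

```python
import numpy as np

def personalized_pagerank(adj, upvoted, damping=0.85, iters=100):
    """adj[i][j] = 1 if node i links to / endorses node j;
    upvoted = indices of the nodes this user has upvoted."""
    n = adj.shape[0]
    # Row-normalize to get out-link distributions; dangling nodes spread uniformly.
    out_deg = adj.sum(axis=1, keepdims=True)
    trans = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n).T
    # Teleport distribution concentrated on the user's own upvotes
    # (plain PageRank would use the uniform vector 1/n here).
    teleport = np.zeros(n)
    teleport[upvoted] = 1.0 / len(upvoted)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = damping * (trans @ rank) + (1 - damping) * teleport
    return rank

# Toy example: 4 items, the user has upvoted item 0.
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
print(personalized_pagerank(adj, upvoted=[0]))
```

Different users get different teleport vectors, and hence different rankings over the same underlying vote graph.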

Predicting the future is hard, so it’s no surprise that we occasionally miss important developments.

However, several times recently, in the contexts of Covid forecasting and AI progress, I noticed that I missed some crucial feature of a development I was interested in getting right, and it felt to me like I could’ve seen it coming if only I had tried a little harder. (Some others probably did better, but I could imagine that I wasn't the only one who got things wrong.)

Maybe this is hindsight bias, but if there’s something to it, I want to distill the nature of the mistake.

First, here are the examples that prompted me to take notice:

Predicting the course of the Covid pandemic:

  • I didn’t foresee the contribution from sociological factors (e.g., “people not wanting
...

Relatedly, over time as capital demands increase, we might see huge projects which are collaborations between multiple countries.

I also think that investors could plausibly end up with more and more control over time if capital demands grow beyond what the largest tech companies can manage. (At least if these investors are savvy.)

(The things I write in this comment are commonly discussed amongst people I talk to, so not exactly surprises.)

ryan_greenblatt
My guess is this is probably right given some non-trivial, but not insane, countermeasures, but those countermeasures may not actually be employed in practice. (E.g. countermeasures comparable in cost and difficulty to Google's mechanisms for ensuring security and reliability. These required substantial work and some iteration but no fundamental advances.) I'm currently thinking about one of my specialties as making sure these countermeasures and tests of these countermeasures are in place. (This is broadly what we're trying to get at in the AI control post.)
zeshen
My impression is that safety evals were deemed irrelevant because a powerful enough AGI, being deceptively aligned, would pass all of them anyway. We didn't expect the first general-ish AIs to be so dumb, like how GPT-4 was so blatant and explicit about lying to the TaskRabbit worker.
Ben
I like this framework. Often when thinking about a fictional setting (reading a book, or worldbuilding) there will be aspects that stand out as not feeling like they make sense[1]. I think you have a good point that extrapolating out a lot of trends might give you something that at first glance seems like a good prediction, but if you tried to write that world as a setting, without any reference to how it got there, just writing it how you think it ends up, then the weirdness jumps out.

[1] E.g. in Dune, lasers and shields have an interaction that produces an unpredictably large nuclear explosion. To which the setting posits the equilibrium "no one uses lasers, it could set off an explosion". With only the facts we are given, and the fact that the setting is swarming with honourless killers and martyrdom-loving religious warriors, it seems like an implausible equilibrium. Obviously it could be explained with further details.
habryka
@Daniel Kokotajlo If you indeed avoided signing an NDA, would you be able to share how much you passed up as a result of that? I might indeed want to create a precedent here and maybe try to fundraise for some substantial fraction of it.

To clarify: I did sign something when I joined the company, so I'm still not completely free to speak (still under confidentiality obligations). But I didn't take on any additional obligations when I left.

Unclear how to value the equity I gave up, but it probably would have been about 85% of my family's net worth at least. But we are doing fine, please don't worry about us. 

Suppose Alice and Bob are two Bayesian agents in the same environment. They both basically understand how their environment works, so they generally agree on predictions about any specific directly-observable thing in the world - e.g. whenever they try to operationalize a bet, they find that their odds are roughly the same. However, their two world models might have totally different internal structure, different “latent” structures which Alice and Bob model as generating the observable world around them. As a simple toy example: maybe Alice models a bunch of numbers as having been generated by independent rolls of the same biased die, and Bob models the same numbers using some big complicated neural net. 

Now suppose Alice goes poking around inside of her world model, and somewhere in there...
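As a toy numerical illustration of the setup above (my own numbers, not from the post): Alice models each observation as an independent roll of a single biased die, while Bob posits a per-roll latent "source" with a different die for each source. Marginalizing out Bob's latent recovers exactly Alice's predictive distribution, so the two agree on any operationalized bet even though their internal structure differs.

```python
import numpy as np

# Alice: every roll comes from one biased six-sided die.
alice_p = np.array([0.10, 0.15, 0.15, 0.20, 0.20, 0.20])

# Bob: each roll first samples a latent source z in {0, 1}, then rolls
# a die whose bias depends on z. The latent is resampled independently
# for every observation.
bob_prior_z = np.array([0.4, 0.6])
bob_p_given_z = np.array([
    [0.25, 0.15, 0.15, 0.20, 0.125, 0.125],  # die used when z = 0
    [0.00, 0.15, 0.15, 0.20, 0.250, 0.250],  # die used when z = 1
])

# Law of total probability: P(x) = sum_z P(z) * P(x | z).
bob_marginal = bob_prior_z @ bob_p_given_z

print(np.allclose(alice_p, bob_marginal))  # True: identical observable predictions
```

Questions like "which source generated roll 7?" only make sense inside Bob's model; nothing in the observable data forces the two agents' latent structures to line up.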

tailcalled
One thing I'd note is that AIs can learn from variables that humans can't learn much from, so I think part of what will make this useful for alignment per se is a model of what happens if one mind has learned from a superset of the variables that another mind has learned from.

This model does allow for that. :) We can use this model whenever our two agents agree predictively about some parts of the world X; it's totally fine if our two agents learned their models from different sources and/or make different predictions about other parts of the world.

I am trying to gather a list of answers/quotes from public figures to the following questions:

  • What are the chances that AI will cause human extinction?
  • Will AI automate most human labour?
  • Should advanced AI models be open source?
  • Do humans have a moral duty to build artificial superintelligence?
  • Should there be international regulation of advanced AI?
  • Will AI be used to make weapons of mass destruction (WMDs)?

I am writing them down here if you want to look/help: https://docs.google.com/spreadsheets/d/1HH1cpD48BqNUA1TYB2KYamJwxluwiAEG24wGM2yoLJw/edit?usp=sharing 

Answer by teradimich

I have already tried to collect the most complete collection of quotes here. But it is already very outdated.


Historically, produce shopping was mostly done in open-air markets, but in the US produce is now typically sold in buildings. Most open-air produce sales are probably at farmers markets, but these focus on the high end. I like that Boston's Haymarket is more similar to the historical model: competing vendors selling conventional produce relatively cheaply.

It closes for the weekend at 7pm on Saturdays, and since food they don't sell by the end of the market is mostly going to waste, they start discounting a lot. You can get very good deals, though you need to be cautious: what's left at the end is often past the end of its human-edible life.

Today Lily was off at a scouting trip, and I asked Anna what she wanted to do. She remembered that a previous time Lily was...

I went a few times but eventually got grossed out by all the mold. (At least they don't sell live pangolins there.)

Marthinwurer
This is a fun slice of life. I'm glad y'all had a good time!

LessOnline & Manifest Summer Camp

June 3rd to June 7th

Between LessOnline and Manifest, stay for a week of experimental events, chill coworking, and cozy late night conversations.

Prices increase by $100 on May 13th