Zach Stein-Perlman

AI strategy & governance. ailabwatch.org. ailabwatch.substack.com. 

Comments (sorted by newest)

Drake Thomas's Shortform
Zach Stein-Perlman · 1d* · 170 karma

You may be interested in ailabwatch.org/resources/corporate-documents, which links to a folder where I have uploaded ~all past versions of the CoI. (I don't recommend reading it, although afaik the only lawyers who've read the Anthropic CoI are Anthropic lawyers and advisors, so it might be cool if one independent lawyer read it from a skeptical/robustness perspective. And I haven’t even done a good job diffing the current version from a past version; I wasn’t aware of the thing Drake highlighted.)

Zach Stein-Perlman's Shortform
Zach Stein-Perlman · 4d · 20 karma

I guess so! Is there reason to favor logit?

Eric Neyman's Shortform
Zach Stein-Perlman · 7d · 42 karma

Yep, e.g. donations sooner are better for getting endorsements. Especially for Bores and somewhat for Wiener, I think.

Zach Stein-Perlman's Shortform
Zach Stein-Perlman · 8d · 20 karma

Maybe the logistic success curve should actually be the cumulative normal success curve.
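
For concreteness, a minimal sketch of the two candidate curves (matching their slopes at the midpoint is my choice of normalization, not something from the shortform): they are nearly identical around 50%, but the cumulative normal has thinner tails, so at the low-success end it assigns even smaller probabilities than the logistic.

```python
from math import erf, exp, pi, sqrt

def logistic(x: float) -> float:
    """Logistic success curve; x is the log-odds of success."""
    return 1 / (1 + exp(-x))

def normal_cdf(x: float) -> float:
    """Cumulative normal success curve, with sigma chosen so its slope at
    the midpoint matches the logistic curve's slope of 1/4."""
    sigma = 4 / sqrt(2 * pi)
    return 0.5 * (1 + erf(x / (sigma * sqrt(2))))

for x in (-8, -4, -2, 0):
    print(f"x={x:+d}: logistic={logistic(x):.3g}  normal={normal_cdf(x):.3g}")
```

Under the logistic, each unit of x multiplies the odds of success by a constant factor; under the normal, a unit of x moves the log-odds by more the further out you already are.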

The Tale of the Top-Tier Intellect
Zach Stein-Perlman · 9d · 20 karma

There's often a logistic curve for success probabilities, you know? The distances are measured in multiplicative odds, not additive percentage points. You can't take a project like this and assume that by putting in some more hard work, you can increase the absolute chance of success by 10%. More like, the odds of this project's failure versus success start out as 1,000,000:1, and if we're very polite and navigate around Mr. Topaz's sense that he is higher-status than us and manage to explain a few tips to him without ever sounding like we think we know something he doesn't, we can quintuple his chances of success and send the odds to 200,000:1. Which is to say that in the world of percentage points, the odds go from 0.0% to 0.0%. That's one way to look at the “law of continued failure”.

If you had the kind of project where the fundamentals implied, say, a 15% chance of success, you’d then be on the right part of the logistic curve, and in that case it could make a lot of sense to hunt for ways to bump that up to a 30% or 80% chance.
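
As a quick check of that arithmetic (a minimal sketch; the helper names are mine, not anything from the post): quintupling the odds barely moves a project that starts at 1,000,000:1 against, but transforms one whose fundamentals imply ~15%.

```python
def odds_to_prob(odds_for: float) -> float:
    """Convert odds in favor of success (e.g. 1/1_000_000) to a probability."""
    return odds_for / (1 + odds_for)

def multiply_odds(prob: float, factor: float) -> float:
    """Multiply the odds in favor by `factor` and return the new probability."""
    odds = prob / (1 - prob)
    return odds_to_prob(odds * factor)

# Mr. Topaz's project: 1,000,000:1 against, then the odds are quintupled.
p0 = odds_to_prob(1 / 1_000_000)
print(f"{p0:.4%} -> {multiply_odds(p0, 5):.4%}")      # 0.0001% -> 0.0005%

# A project whose fundamentals imply ~15%: the same 5x odds boost is decisive.
print(f"{0.15:.0%} -> {multiply_odds(0.15, 5):.0%}")  # 15% -> 47%
```

The same multiplicative push is invisible in percentage points at one end of the curve and decisive in the middle.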

kave's Shortform
Zach Stein-Perlman · 13d · 20 karma

I observe that https://www.lesswrong.com/posts/BqwXYFtpetFxqkxip/mikhail-samin-s-shortform?commentId=dtmeRXPYkqfDGpaBj isn't frontpage-y but remains on the homepage even after many mods have seen it. This suggests that the mods were just patching the hack. (But I don't know what other shortforms they've hidden, besides the political ones, if any.)

Mikhail Samin's Shortform
Zach Stein-Perlman · 13d · 90 karma

fwiw I agree with most but not all details, and I agree that Anthropic's commitments and policy advocacy have a bad track record, but I think Anthropic's capabilities work is nevertheless net positive, because Anthropic has way more capacity and propensity to do safety stuff than other frontier AI companies.

I wonder what you believe about Anthropic's likelihood of noticing risks from misalignment relative to other companies, or of someday spending >25% of internal compute on (automated) safety work.

Zach Stein-Perlman's Shortform
Zach Stein-Perlman · 17d · 455 karma

I think "Overton window" is a pretty load-bearing concept for many LW users and AI people — it's their main model of policy change. Unfortunately there's lots of other models of policy change. I don't think "Overton window" is particularly helpful or likely-to-cause-you-to-notice-relevant-stuff-and-make-accurate-predictions. (And separately people around here sometimes incorrectly use "expand the Overton window" to just mean with "advance AI safety ideas in government.") I don't have time to write this up; maybe someone else should (or maybe there already exists a good intro to the study of why some policies happen and persist while others don't[1]).

Some terms: policy windows (and "multiple streams"), punctuated equilibrium, policy entrepreneurs, path dependence and feedback (yes this is a real concept in political science, e.g. policies that cause interest groups to depend on them are less likely to be reversed), gradual institutional change, framing/narrative/agenda-setting.

Related point: https://forum.effectivealtruism.org/posts/SrNDFF28xKakMukvz/tlevin-s-quick-takes?commentId=aGSpWHBKWAaFzubba.

[1]

    I liked the book Policy Paradox in college. (Example claim: perceived policy problems are strategically constructed through political processes; how issues are framed—e.g. individual vs collective responsibility—determines which solutions seem appropriate.) I asked Claude for suggestions on a shorter intro and I didn't find the suggestions helpful.

    I guess I think if you work on government stuff and you [don't have poli sci background / aren't familiar with concepts like "multiple streams"] you should read Policy Paradox (although the book isn't about that particular concept).

kave's Shortform
Zach Stein-Perlman · 20d · 20 karma

I guess I'll write non-frontpage-y quick takes as posts instead then :(

kave's Shortform
Zach Stein-Perlman · 20d · 229 karma

I'd like to be able to see such quick takes on the homepage, like how I can see personal blogposts on the homepage (even though logged-out users can't).

Are you hiding them from everyone? Can I opt into seeing them?

Posts (sorted by new)

- AI companies' policy advocacy (Sep 2025) · 44 karma · 1mo · 0 comments
- xAI's new safety framework is dreadful · 104 karma · 2mo · 6 comments
- AI companies have started saying safeguards are load-bearing · 52 karma · Ω · 3mo · 2 comments
- ChatGPT Agent: evals and safeguards · 15 karma · 4mo · 0 comments
- Epoch: What is Epoch? · 34 karma · 5mo · 1 comment
- AI companies aren't planning to secure critical model weights · 15 karma · 5mo · 0 comments
- AI companies' eval reports mostly don't support their claims · 207 karma · Ω · 5mo · 13 comments
- New website analyzing AI companies' model evals · 58 karma · 6mo · 0 comments
- New scorecard evaluating AI companies on safety · 72 karma · 6mo · 8 comments
- Claude 4 · 71 karma · 6mo · 24 comments
- Zach Stein-Perlman's Shortform · 4 karma · Ω · 4y · 302 comments

Wikitag Contributions

- Ontology · 2 years ago · (+45)
- Ontology · 2 years ago · (-5)
- Ontology · 2 years ago
- Ontology · 2 years ago · (+64/-64)
- Ontology · 2 years ago · (+45/-12)
- Ontology · 2 years ago · (+64)
- Ontology · 2 years ago · (+66/-8)
- Ontology · 2 years ago · (+117/-23)
- Ontology · 2 years ago · (+58/-21)
- Ontology · 2 years ago · (+41)