in case i forgot last month, here's a link to july
One proof of concept for the GSAI stack would be a well-understood mechanical engineering domain automated to the next level and certified to boot. How about locks? Needs a model of basic physics, terms in some logic for all the parts and how they compose, and some test harnesses that simulate an adversary. Can you design and manufacture a provably unpickable lock?
Zac Hatfield-Dodds (of hypothesis/pytest and Anthropic, was offered and declined authorship on the GSAI position paper) challenged Ben Goldhaber to a bet after Ben coauthored a post with Steve Omohundro. It seems to resolve in 2026 or 2027, the comment thread should get cleared up once Ben gets back from Burning Man. The arbiter is Raemon from LessWrong.
Zac says you can’t get a provably unpickable lock on this timeline. Zac gave (up to) 10:1 odds, so recall that the bet can be a positive expected value for Ben even if he thinks the event is most likely not going to happen.
For funsies, let’s map out one path of what has to happen for Zac to pay Ben $10k. This is not the canonical path, but it is a path:
Arguments take place in 5 parts.
- Claim: What do you want me to believe?
- Reasons: Why should I agree?
- Evidence: How do you know? Can you back it up?
- Acknowledgment and Response: But what about ... ?
- Warrant: How does that follow?
This can be modeled as a conversation with readers, where the reader prompts the writer to taking the next step on the list.
Claim ought to be supported with reasons. Reasons ought to be based on evidence. Arguments are recursive: a part of an argument is an acknowledgment of an anticipated response, and another argument addresses that response. Finally, when the distance between a claim and a reason grows large, we draw connections with something called warrants.
The logic of warrants proceeds in generalities and instances. A general circumstance predictably leads to a general consequence, and if you have an instance of the circumstance you can infer an instance of the consequence.
Arguing in real life papers is complexified from the 5 steps, because
Thinking about a top-level post on FOMO and research taste
Idk maybe this shortform is most of the value of the top level post
A trans woman told me
I get to have all these talkative blowhard traits and no one will punish me for it cuz I'm a girl. This is one major reason detrans would make my life worse. Society is so cruel to men, it sucks so much for them
And another trans woman had told me almost the exact same thing a couple months ago.
My take is that roles have upsides and downsides, and that you'll do a bad job if you try to say one role is better or worse than another on net or say that a role is more downside than upside. Also, there are versions of "women talk too much" as a stereotype in many subcultures, but I don't have a good inside view about it.
Getting the easy things right shows respect for your readers and is the best training for dealing with the hard things.
If they don't believe the evidence, they'll reject the reasons and, with them, your claim.
We saw previously that claims ought to be supported with reasons, and reasons ought to be based on evidence. Now we will look closer at reasons and evidence.
Reasons must be in a clear, logical order. Atomically, readers need to buy each of your reasons, but compositionally they need to buy your logic. Storyboarding is a useful technique for arranging reasons into a logical order: physical arrangements of index cards, or some DAG-like syntax. Here, you can list evidence you have for each reason or, if you're speculating, list the kind of evidence you would need.
When storyboarding, you want to read out the top level reasons as a composite entity without looking at the details (evidence), because you want to make sure the high-level logic makes sense.
...Readers will not accept a reason until they see it anchored in what they consider to be a bedrock of established fact. ... To count as evidence, a statement must report something tha
Primary sources provide you with the "raw data" or evidence you will use to develop, test, and ultimately justify your hypothesis or claim. Secondary sources are books, articles, or reports that are based on primary sources and are intended for scholarly or professional audiences. Tertiary sources are books and articles that synthesize and report on secondary sources for general readers, such as textbooks, articles in encyclopedias, and articles in mass-circulation publications.
The distinction between primary and secondary sources comes from 19th century historians, and the idea of tertiary sources came later. The boundaries can be fuzzy, and are certainly dependent on the task at hand.
I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.
The rest of chapter five is about how to use libraries and information technologies, and evaluating sources for relevance and reliability.
Chapter 6 starts off with the kind of thing you should be looking for while you read
Yesterday I quit my job for direct work on epistemic public goods! Day one of direct work trial offer is April 4th, and it'll take 6 weeks after that to know if I'm a fulltime hire.
I'm turning down
for
did anyone draw up an estimate of how much the proportion of code written by LLMs will increase? or even what the proportion is today
I think this is a crucial part of a lot of psychological maladaption and social dysfunction, very salient to EAs. If you're way more trait xyz than anyone you know for most of your life, your behavior and mindset will be massively effected, and depending on when in life / how much inertia you've accumulated by the time you end up in a different room where suddenly you're average on xyz, you might lose out on a ton of opportunities for growth.
In other words, the concept of "big fish small pond" is deeply insightful a...
I think a property of my theory of change is that academic and commercial speed is a bottleneck. I recently realized that my mass assignment for timelines synchronized with my mass assignment for the prosaic/nonprosaic axis. The basic idea is that let's say a radical new paper that blows up and supplants the entire optimization literature gets pushed to the arxiv tomorrow, signaling the start of some paradigm that we would call nonprosaic. The lag time for academics and industry to figure out what's going on, fi...
I feel like 7 years from AlexNet to the world of PyTorch, TPUs, tons of ML MOOCs, billion-parameter models, etc. is strong evidence against what you're saying, right? Or were deep neural nets already a big and hot and active ecosystem even before AlexNet, more than I realize? (I wasn't paying attention at the time.)
Moreover, even if not all the infrastructure of deep neural nets transfers to a new family of ML algorithms, much of it will. For example, the building up of people and money in ML, the building up of GPU / ASIC servers and the tools to use them, the normalization of the idea that it’s reasonable to invest millions of dollars to train one model and to fab ASICs tailored to a particular ML algorithm, the proliferation of expertise related to parallelization and hardware-acceleration, etc. So if it took 7 years from AlexNet to smooth turnkey industrial-scale deep neural nets and billion-parameter models and zillions of people trained to use them, then I think we can guess <7 years to get from a different family of learning algorithms to the analogous situation. Right? Or where do you disagree?
Cope isn't a very useful concept.
For every person who has a bad reason that they catch because you say "sounds like cope", there are 10x as many people who find their reason actually compelling. Saying "if that was my reason it would be a sign I was in denial of how hard I was coping" or "I don't think that reason is compelling" isn't really relevant to the person you're ostensibly talking to, who's trying to make the best decisions for the best reasons. Just say you don't understand why the reason is compelling.
I asked a friend whether I should TA for a codeschool called ${{codeschool}}.
You shouldn't hang around ${{codeschool}}. People at ${{codeschool}} are not pursuing excellence.
A hidden claim there that I would soak up the pursuit of non-excellence by proximity or osmosis isn't what's interesting (though I could see that turning out either way). What's interesting is the value of non-excellence, which I'll call adequacy.
${{codeschool}} in this case is effective and impactful at putting butts in seats at companies, and is thereby re...
I used to think "community builder" was a personality trait I couldn't switch off, but once I moved to the bay I realized that I was just desperate for serendipity and knew how to take it from 0% to 1%. Since the bay is constantly humming at 70-90% serendipity, I simply lost the urge to contribute.
Benefactors are so over / beneficiaries are so back / etc.
Let FairBot
be the player that sends an opponent to Cooperate
(C
) if it is provable that they cooperate with FairBot
, and sends them to Defect
(D
) otherwise.
Let FairBot_k
be the player that searches for proofs of length <= k
that it's input cooperates with FairBot_k
, and cooperates if it finds one, returning defect if all the proofs of length <= k
are exhausted without one being valid.
Critch writes that "100%" of the time, mathematicians and computer scientists report believing that FairBot_k(FairBot_k) = D
, owing to the basic vision of a stack overf...
I find myself, just as a random guy, deeply impressed at the operational competence of airports and hospitals. Any good books about that sort of thing?
Consider politics. You should take your political preferences/aesthetics, go to the tribes that are based on them, and help them be more sane. In the politics example, everyone's favorite tribe has failure modes, and it is sort of the responsibility of the clearest-headed members of that tribe to make sure that those failure modes don't become the dominant force of that tribe.
Speaking for myself, having been deeply in an activist tribe before I was a rat/EA, I regret I wasn't there to hel...
Broadly, the two kinds of claims are conceptual and practical.
Conceptual claims ask readers not to ask, but to understand. The flavors of conceptual claim are as follows:
There's essentially one flavor of practical claim
If you read between the lines, you might notice that a kind of claim of fact or cause/consequence is that a policy work...
Training for alignment research is one part competence (at math, cs, philosophy) and another part having an inside view / gears-level model of the actual problem. Competence can be outsourced to universities and independent study, but inside view / gears-level model of the actual problem requires community support.
A background assumption I'm working with is that training as a longtermist is not always synchronized with legible-to-academia training. It might be the case that jr researchers oug...
I'm excited for language model interpretability to teach us about the difference between compilers and simulations of compilers. In the sense that chatgpt and I can both predict what a compiler of a suitably popular programming language will do on some input, what's going on there---- surely we're not reimplementing the compiler on our substrate, even in the limit of perfect prediction? Will be an opportunity for a programming language theorist in another year or two of interp progress
In the Safeguarded AI programme thesis[1], proof certificates or certifying algorithms are relied upon in the theory of change. Let's discuss!
From the thesis:
Proof certificates are a quite broad concept, introduced by[33] : a certifying algorithm is defined as one that produces enough metadata about its answer that the answer's correctness can be checked by an algorithm which is so simple that it is easy to understand and to formally verify by hand.
The abstract of citation 33 (McConnell et al)[2]
...A certifying algorithm is an algorithm
any interest in a REMIX study group? in meatspace in berkeley, a few hours a week. https://github.com/redwoodresearch/remix_public/
Any tips for getting out of a "rise to the occasion mindset" and into a "sink to your training" mindset?
I'm usually optimizing for getting the most out of my A-game bursts. I want to start optimizing for my baseline habits, instead. I should cover the B-, C-, and Z-game; the A-game will cover itself.
Mathaphorically, "rising to the occasion" is taking a max of a max, whereas "sinking to the level of your habits" looks like a greatest lower bound.
Yall, this is a rant. It will be sloppy.
I'm really tired of high functioning super smart "autism" like ok we all have madeup diagnoses--- anyone with a IQ slightly above 90 knows that they can learn the slogans to manipulate gatekeepers to get performance enhancement, and they decide not to if they think theyre performing well enough already. That doesn't mean "ADHD" describes something in the world. Similarly, there's this drift of "autism" getting more and more popular. It's obnoxious because labels and identities are obnoxious, but i only find it repuls...
For the record, to mods: I waited till after petrov day to answer the poll because my first guess upon receiving a message on petrov day asking me to click something is that I'm being socially engineered. Clicking the next day felt pretty safe.
messy, jotting down notes:
Methods, famously, includes the line "I am a descendant of the line of Bacon", tracing empiricism to either Roger (13th century) or Francis (16th century) (unclear which).
Though a cursory wikiing shows an 11th century figure providing precedents for empiricism! Alhazen or Ibn al-Haytham worked mostly optics apparently but had some meta-level writings about the scientific method itself. I found this shockingly excellent quote
...The duty of the man who investigates the writings of scientists, if learning the truth is his goal, is to make himself an enemy of a
DM me for invite if you're at all interested in multipolar scenarios, cooperative AI, ARCHES, social applications & governance, computational social choice, heterogeneous takeoff, etc.
(side note I'm also working on figuring out what unipolar worlds and/or homogeneous takeoff worlds imply for MMD research).
Last time we discussed the difference between information and a question or a problem, and I suggested that the novelty-satisfied mode of information presentation isn't as good as addressing actual questions or problems. In chapter 3 which I have not typed up thoughts about, A three step procedure is introduced
Going over etiquette and the social contract, perhaps if it's software specific it talks about minimal reproducers, whatever else the author thinks is involved.
Event for the paper club: https://calendar.app.google/2a11YNXUFwzHbT3TA
blurb about the paper in last month's newsletter:
...... If you’re wondering why you just read all that, here’s the juice: often in GSAI position papers there’ll be some reference to expectations that capture “harm” or “safety”. Preexpectations and postexpectations with respect to particular pairs of programs could be a great way to cash this out, cuz we could look at programs as interventions and simulate RCTs (labeling one program
Does anyone use vim / mouse-minimal browser? I like Tridactyl better than the other one I tried, but it's not great when there's a vim mode in a browser window everything starts to step on eachother (like in jupyter, colab, leetcode, codesignal)