The ELK paper is long but I’ve found it worthwhile. After spending a bit of time noodling on it, one of my takeaways is that this is essentially a failure mode for the approaches to factored cognition I've been interested in. (Maybe it's a failure mode in factored cognition generally.)
I expect that I’ll want to spend more time thinking about ELK-like problems before spending a bunch more time thinking about factored cognition.
In particular it's now probably a good time to start separating a bunch of things I had jumbled together, namely:
This morning, I read about how close we came to total destruction during the Cuban missile crisis, where we randomly survived because some Russian planes were inaccurate and also separately several Russian nuclear sub commanders didn't launch their missiles even though they were being harassed by US destroyers. The men were in 130 DEGREE HEAT for hours and passing out due to carbon dioxide poisoning, and still somehow they had enough restraint to not hit back.
I just started crying. I am so grateful to those people. And to Khrushchev, for ridiculing ...
Danbooru2021 is out. We've gone from n=3m to n=5m (w/162m tags) since Danbooru2017. Seems like all the anime you could possibly need to do cool multimodal text/image DL stuff, hint hint.
Here are all the letters of the (English) alphabet: hrtmbvyxfoiazkpnqldjucsweg
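For anyone who wants to check the claim rather than eyeball it, here is a minimal sketch. The function name `is_full_alphabet` is my own, not the poster's; it only tests that the string has 26 characters and that they are exactly the lowercase English letters.

```python
import string

# The string exactly as posted above.
letters = "hrtmbvyxfoiazkpnqldjucsweg"


def is_full_alphabet(s: str) -> bool:
    """Return True if s uses each lowercase English letter exactly once."""
    # 26 characters whose set equals the alphabet implies no repeats.
    return len(s) == 26 and set(s) == set(string.ascii_lowercase)


print(is_full_alphabet(letters))  # True
```

So the post is accurate: every letter is there, just not in the usual order.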
Epistemic status: Leaning heavily into inside view, throwing humility to the winds.
Imagine TAI is magically not coming (CDT-style counterfactual). Then, the most notable-in-hindsight feature of modern times might be the budding of mathematical metaphysics (Solomonoff induction, AIXI, Yudkowsky's "computationalist metaphilosophy", UDT, infra-Bayesianism...). Perhaps this will lead to an "epistemic revolution" comparable only with the scientific revolution in magnitude. It will revolutionize our understanding of the scientific method (probably solving ...
You seem to be implying that they will be terrifying for the exact opposite reasons why the previous epistemic revolution's philosophical implications were.
What do you mean by "exact opposite reasons"? To me, it seems like a continuation of the same trend of humiliating the human ego:
Compare two views of "the universal prior"
Yes, that's right. It's the same basic issue that leads to the Anvil Problem
I engage too much w/ generalizations about AI alignment researchers.
Noticing this behavior seems useful for analyzing it and strategizing around it.
A sketch of a pattern to be on the lookout for in particular is "AI Alignment researchers make mistake X" or "AI Alignment researchers are wrong about Y". I think in the extreme I'm pretty activated/triggered by this, and this causes me to engage with it to a greater extent than I would have otherwise.
This engagement is probably encouraging more of this to happen, so I think more of a pause and reflection...
How much does the 'actual you' cost?
The actual you is the you right now, in this very moment, constructed from the set of all past experiences and all future precommitments. Now think about all that it took for the actual you to exist: the Big Bang, Earth, thousands of years of wars and history, billions of dollars spent on the movies you experienced, etc.
At first glance it might seem like your production cost is quite high! But you're not the only one being produced, the production cost is split among everyone, so you're actually a lot cheaper than you th...
The concept of cost requires alternatives. What do you cost, compared to the same universe with someone else in your place? Very little. What do you cost, compared to no universe at all? You cost the universe.
I think the root of many political disagreements between rationalists and other groups, is that other groups look at parts of the world and see a villain-shaped hole. Eg: There's a lot of people homeless and unable to pay rent, rent is nominally controlled by landlords, the problem must be that the landlords are behaving badly. Or: the racial demographics in some job/field/school underrepresent black and hispanic people, therefore there must be racist people creating the imbalance, therefore covert (but severe) racism is prevalent.
Having read Meditations o...
I think this would make a good top-level post.
I'm pretty confident that adversarial training (or any LM alignment process which does something like hard-mining negatives) won't work for aligning language models or any model that has a chance of being a general intelligence.
This has led to me calling these sorts of techniques 'thought policing' and the negative examples 'thoughtcrime' -- I think these are unnecessarily extra, but they work.
The basic form of the argument is that any concept you want to ban as thoughtcrime can be composed out of allowable concepts.
Take for example Redwood Rese...
... is just an arbitrary thing not to do.
I think this is the crux-y part for me. My basic intuition here is something like "it's very hard to get contemporary prosaic LMs to not do a thing they already do (or have high likelihood of doing)", and this intuition points me toward thinking that "conditionally training them to only do that thing in certain contexts" is instead easier in a way that matters.
My intuitions are based on a bunch of assumptions that I have access to and probably some that I don't.
Like, I'm basically only thinking about large langu...
We need a name for the following heuristic, I think. I think of it as one of those "tribal knowledge" things that gets passed on like an oral tradition without being citeable in the sense of being a part of a literature. If you come up with a name I'll certainly credit you in a top level post!
I heard it from Abram Demski at AISU'21.
Suppose you're either going to end up in world A or world B, and you're uncertain about which one it's going to be. Suppose you can pull lever LA which will be 100 valuable if you end up in world A, or you can pull lever LB whi...
Why are you specifying 100 or 0 value, and using fuzzy language like "acceptably small" for disvalue?
100 and 0 in this context make sense. Or at least in my initial reading: arbitrarily-chosen values that are in a decent range to work quickly with (akin to why people often work in percentages instead of 0..1)
Is this based on "value" and "disvalue" being different dimensions, and thus incomparable?
It is - I'm going to say "often", although I am aware this is suboptimal phrasing - often the case that you are confident in the sign of an outcome but not the ma...
If an anthropic update implicitly 'validates' (morally speaking) all historic events that led up to that anthropic update, then how do you retain moral values throughout the updating process?
For example: suppose you're tempted to drink a coke but you don't want to, but you drink the coke anyway. Then while you're drinking the coke, you perform an anthropic update and 'validate' the act of drinking the coke. How do you avoid that? Clearly, drinking the coke wasn't the good thing to do, but you somehow forced it to be good? Weird.
It only implies that you can have no moral imperative to change the past. It has no consequences whatsoever for morally evaluating the past.
In 1898, William Crookes announced that there was an impending crisis which required urgent scientific attention. The problem was that crops deplete Nitrogen from the soil. This can be remedied by using fertilizers; however, he had calculated that existing sources of fertilizers (mainly imported from South America) could not keep up with expected population growth, leading to mass starvation, estimated to occur around 1930-1940. His proposal was that we could entirely circumvent the issue by finding a way to convert some of our mostly Ni...
The problem with that is that the Nitrogen does not go back into the atmosphere. It goes into the oceans, and the resulting problems have been called a stronger violation of planetary boundaries than CO2 pollution.
100 Year Bunkers
I often hear that building bio-proof bunkers would be good for bio-x-risk, but it seems like not a lot of progress is being made on these.
It's worth mentioning a bunch of things I think probably make it hard for me to think about:
That seems worth considering!
[crossposted from EA Forum]
Reflecting a little on my shortform from a few years ago, I think I wasn't ambitious enough in trying to actually move this forward.
I want there to be an org that does "human challenge"-style RCTs across lots of important questions that are extremely hard to get at otherwise, including (top 2 are repeated from previous shortform):
Edited to add: I no longer think "human challen...
Thanks, I agree with this and it's probably not good branding anyway.
I was thinking the "challenge" was just doing the intervention (e.g. being vegan), but agree that the framing is confusing since it refers to something different in the clinical context. I will edit my shortforms to reflect this updated view.
I was reading a comment (linked below) by gwern and it hit me:
Jaynes’s Probability: The Logic of Science is so special because it presents a unified theory of probability. After reading it, I no longer think of “probability” and “statistics” as being different things. As many understand evolution - feeling there is a set of core principles, like selection and evolutionary pressure and mutation, even if the person isn’t familiar with many of the technical findings or machinery they’d need to actually do an analysis good enough to make good predictions from ...
For the last two years, typing for 5+ minutes hurt my wrists. I tried a lot of things: shots, physical therapy, trigger-point therapy, acupuncture, massage tools, wrist and elbow braces at night, exercises, stretches. Sometimes it got better. Sometimes it got worse.
No Beat Saber, no lifting weights, and every time I read a damn book I would start translating the punctuation into Dragon NaturallySpeaking syntax.
Text: "Consider a bijection f:X→Y"
My mental narrator: "Cap consider a bijection space dollar foxtrot colon cap x backslash tango oscar cap y do
It was probably just regression to the mean because lots of things are, but I started feeling RSI-like symptoms a few months ago, read this, did this, and now they're gone, and in the possibilities where this did help, thank you! (And either way, this did make me feel less anxious about it 😀)
Thick and Thin Concepts
Take for example concepts like courage, diligence and laziness. These concepts are considered thick concepts because they have both a descriptive component and a moral component. To call someone courageous is most often meant* not only to claim that the person undertook a great risk, but that doing so was morally praiseworthy. So the thick concept is often naturally modeled as a conjunction of a descriptive claim and a moral claim.
However, this isn't the only way to understand these concepts. An alternate would be along the following lines: Im...
Tonight my family and I played a trivia game (Wits & Wagers) with GPT-3 as one of the players! It lost, but not by much. It got 3 questions right out of 13. One of the questions it got right it didn't get exactly right, but was the closest and so got the points. (This is interesting because it means it was guessing correctly rather than regurgitating memorized answers. Presumably the other two it got right were memorized facts.)
Anyhow, having GPT-3 playing made the whole experience more fun for me. I recommend it. :) We plan to do this every year with whatever the most advanced publicly available AI (that doesn't have access to the internet) is.
I typed in the questions to GPT-3 and pressed "generate" to see its answers. I used a pretty simple prompt.
PSA for Edge browser users: if you care about privacy, make sure Microsoft does not silently enable syncing of browsing history etc. (Settings->Privacy, search and services).
They seemingly did so to me a few days ago (probably along with the Windows "Feature update" 20H2); it may be something that they currently do to some users and not others.