Shortform Content [Beta]

Alex Ray's Shortform

The ELK paper is long but I've found it worthwhile, and after spending a bit of time noodling on it, one of my takeaways is that this is essentially a failure mode for the approaches to factored cognition I've been interested in. (Maybe it's a failure mode in factored cognition generally.)

I expect that I’ll want to spend more time thinking about ELK-like problems before spending a bunch more time thinking about factored cognition.

In particular it's now probably a good time to start separating a bunch of things I had jumbled together, namely:

  • Develo
... (read more)
TurnTrout's shortform feed

This morning, I read about how close we came to total destruction during the Cuban missile crisis, where we randomly survived because some Russian planes were inaccurate and also separately several Russian nuclear sub commanders didn't launch their missiles even though they were being harassed by US destroyers. The men were in 130 DEGREE HEAT for hours and passing out due to carbon dioxide poisoning, and still somehow they had enough restraint to not hit back.

And...

I just started crying. I am so grateful to those people. And to Khrushchev, for ridiculing ... (read more)

gwern's Shortform

Danbooru2021 is out. We've gone from n=3m to n=5m (w/162m tags) since Danbooru2017. Seems like all the anime you could possibly need to do cool multimodal text/image DL stuff, hint hint.

rhollerith_dot_com's Shortform

Here are all the letters of the (English) alphabet: hrtmbvyxfoiazkpnqldjucsweg

irarseil (2d): Is the order (pseudo-)random? Does it have a hidden meaning I might not be aware of? What's your purpose in sharing this?
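One quick check a reader can run themselves (a minimal Python sketch): the string above is a permutation of all 26 letters, so nothing is missing or duplicated.

```python
from string import ascii_lowercase

# The shuffled string from the post above.
s = "hrtmbvyxfoiazkpnqldjucsweg"

# A permutation of the alphabet contains each of the 26 letters exactly once.
is_permutation = sorted(s) == list(ascii_lowercase)
print(is_permutation)  # → True
```

This rules out a missing-letter puzzle, but says nothing about whether the ordering itself encodes anything.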
Vanessa Kosoy's Shortform

Epistemic status: Leaning heavily into inside view, throwing humility to the winds.

Imagine TAI is magically not coming (CDT-style counterfactual[1]). Then, the most notable-in-hindsight feature of modern times might be the budding of mathematical metaphysics (Solomonoff induction, AIXI, Yudkowsky's "computationalist metaphilosophy"[2], UDT, infra-Bayesianism...) Perhaps, this will lead to an "epistemic revolution" comparable only with the scientific revolution in magnitude. It will revolutionize our understanding of the scientific method (probably solving ... (read more)

MackGopherSena (2d): You seem to be implying that they will be terrifying for the exact opposite reasons why the previous epistemic revolution's philosophical implications were. Only this time, even more so; imagine an epistemically-induced emotional state where your only option is to force feed yourself some Schopenhauer to help mellow you out until you can acclimate to what you just learned. And it almost wasn't enough. If an infohazard warning doesn't at least give you second thoughts, then you haven't actually internalized the concept. Lesser infohazards have the additional danger of jading you to future infohazards that will be infinitely more powerful. The opportunity to practice acute epistemic abstinence is quickly running out. Technological Singularity might be the least interesting one of the coming conjunction of singularities.

You seem to be implying that they will be terrifying for the exact opposite reasons why the previous epistemic revolution's philosophical implications were.

What do you mean by "exact opposite reasons"? To me, it seems like continuation of the same trend of humiliating the human ego:

  • you are not going to live forever
  • yes, you are mere atoms
  • your planet is not the center of the universe
  • even your sun is not special
  • your species is related to the other species that you consider inferior
  • instead of being logical, your mind is a set of short-sighted agents fighting e
... (read more)
Eigil Rischel's Shortform

Compare two views of "the universal prior"

  • AIXI: The external world is a Turing machine that receives our actions as input and produces our sensory impressions as output. Our prior belief about this Turing machine should be that it's simple, i.e. the Solomonoff prior
  • "The embedded prior": The "entire" world is a sort of Turing machine, which we happen to be one component of in some sense. Our prior for this Turing machine should be that it's simple (again, the Solomonoff prior), but we have to condition on the observation that it's complicated enough to c
... (read more)
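Both views invoke the same simplicity prior. As a toy illustration (a sketch, not real Solomonoff induction: the "programs" here are hypothetical finite bitstrings standing in for Turing machines), each hypothesis gets weight 2^(-length), then the weights are normalized:

```python
# Toy "simplicity prior": each hypothesis is a bitstring program,
# weighted by 2^(-length), then normalized. Real Solomonoff induction
# sums over all programs of a universal machine; this finite set is a stand-in.
hypotheses = ["0", "10", "110", "1110"]  # hypothetical program encodings

raw = {h: 2.0 ** -len(h) for h in hypotheses}
total = sum(raw.values())
prior = {h: w / total for h, w in raw.items()}

# Shorter programs dominate the prior mass.
for h in hypotheses:
    print(h, round(prior[h], 4))
```

The "embedded prior" in the second bullet would then condition this distribution on the program being complex enough to contain an observer, rather than using it directly.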
Pattern (3d): Why wouldn't they be the same? Are you saying AIXI doesn't ask 'where did I come from?'

Yes, that's right. It's the same basic issue that leads to the Anvil Problem.

Alex Ray's Shortform

I engage too much w/ generalizations about AI alignment researchers.

Noticing this behavior seems useful for analyzing it and strategizing around it.

A sketch of a pattern to be on the lookout for in particular is "AI Alignment researchers make mistake X" or "AI Alignment researchers are wrong about Y".  I think in the extreme I'm pretty activated/triggered by this, and this causes me to engage with it to a greater extent than I would have otherwise.

This engagement is probably encouraging more of this to happen, so I think more of a pause and reflection... (read more)

MackGopherSena's Shortform

How much does the 'actual you' cost?

The actual you is the you right now, in this very moment, constructed from the set of all past experiences and all future precommitments. Now think about all that it took for the actual you to exist: the Big Bang, Earth, thousands of years of wars and history, billions of dollars spent on the movies you experienced, etc.

At first glance it might seem like your production cost is quite high! But you're not the only one being produced, the production cost is split among everyone, so you're actually a lot cheaper than you th... (read more)

The concept of cost requires alternatives. What do you cost, compared to the same universe with someone else in your place? Very little. What do you cost, compared to no universe at all? You cost the universe.

Jimrandomh's Shortform

I think the root of many political disagreements between rationalists and other groups, is that other groups look at parts of the world and see a villain-shaped hole. Eg: There's a lot of people homeless and unable to pay rent, rent is nominally controlled by landlords, the problem must be that the landlords are behaving badly. Or: the racial demographics in some job/field/school underrepresent black and hispanic people, therefore there must be racist people creating the imbalance, therefore covert (but severe) racism is prevalent.

Having read Meditations o... (read more)


I think this would make a good top-level post.

Vaniver (6mo): I think I basically disagree with this, or think that it insufficiently steelmans the other groups. For example, the homeless vs. the landlords: when I put on my systems-thinking hat, it sure looks to me like there's a cartel, wherein a group that produces a scarce commodity is colluding to keep that commodity scarce to keep the price high. The facts on the ground are more complicated--property owners are a different group from landlords, and homelessness is caused by more factors than just housing prices--but the basic analysis that there are different classes, those classes have different interests, and those classes are fighting over government regulation as a tool in their conflict seems basically right to me. Like, it's really not a secret that many voters are motivated by keeping property values high, and politicians know this is a factor they will be judged on.

Maybe you're trying to condemn a narrow mistake here, where someone being an 'enemy' implies that they are a 'villain', which I agree is a mistake. But it sounds like you're making a more generic point: that when people have political disagreements with the rationalists, it's normally because they're thinking in terms of enemy action instead of thinking in systems. But a lot of what thinking in systems reveals is the way in which enemies act using systemic forces!
jimrandomh (6mo): I think this is correct as a final analysis, but ineffective as a cognitive procedure. People who start by trying to identify villains tend to land on landlords-in-general, with charging-high-rent as the significant act, rather than on a small subset of mostly non-landlord homeowners, with protesting against construction as the significant act.
Alex Ray's Shortform

I'm pretty confident that adversarial training (or any LM alignment process which does something like hard-mining negatives) won't work for aligning language models or any model that has a chance of being a general intelligence.

This has lead to me calling these sorts of techniques 'thought policing' and the negative examples as 'thoughtcrime' -- I think these are unnecessarily extra, but they work. 

The basic form of the argument is that any concept you want to ban as thoughtcrime, can be composed out of allowable concepts.
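A toy illustration of that compositionality problem (a hypothetical surface-level string filter, not Redwood's actual classifier): a ban expressed over surface forms is bypassed by stating the banned concept with individually allowable pieces.

```python
# Hypothetical blocklist filter: bans the surface form of a concept.
BANNED = {"injure"}

def naive_filter(output: str) -> bool:
    """Return True if the output passes (contains no banned token verbatim)."""
    return not any(bad in output for bad in BANNED)

# A direct statement of the banned concept is caught...
print(naive_filter("the knight will injure the king"))  # → False: blocked

# ...but the same concept, composed out of allowable words, slips through.
print(naive_filter("the knight will harm the king"))    # → True: passes
```

A learned classifier is harder to fool than a blocklist, but the structural point is the same: the banned concept lives in a space the allowable concepts can reach.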

Take for example Redwood Rese... (read more)

A Ray (4d): "The goal is" -- is this describing Redwood's research, or your research, or a goal you have more broadly? I'm curious how this is connected to "doesn't write fiction where a human is harmed".
paulfchristiano (4d): My general goal, Redwood's current goal, and my understanding of the goal of adversarial training (applied to AI-murdering-everyone) generally.

"Don't produce outputs where someone is injured" is just an arbitrary thing not to do. It's chosen to be fairly easy not to do (and to have the right valence so that you can easily remember which direction is good and which direction is bad, though in retrospect I think it's plausible that a predicate with neutral valence would have been better to avoid confusion).

... is just an arbitrary thing not to do.

I think this is the crux-y part for me. My basic intuition here is something like "it's very hard to get contemporary prosaic LMs to not do a thing they already do (or have a high likelihood of doing)", and this intuition points me toward "conditionally training them to do that thing only in certain contexts" being easier in a way that matters.

My intuitions are based on a bunch of assumptions that I have access to and probably some that I don't.

Like, I'm basically only thinking about large langu... (read more)

Quinn's Shortform

We need a name for the following heuristic. I think of it as one of those "tribal knowledge" things that gets passed on like an oral tradition without being citeable as part of a literature. If you come up with a name I'll certainly credit you in a top-level post!

I heard it from Abram Demski at AISU'21.

Suppose you're either going to end up in world A or world B, and you're uncertain about which one it's going to be. Suppose you can pull a lever which will be 100 valuable if you end up in world A, or you can pull a lever whi... (read more)


Why are you specifying 100 or 0 value, and using fuzzy language like "acceptably small" for disvalue?

100 and 0 in this context make sense. Or at least in my initial reading: arbitrarily-chosen values that are in a decent range to work quickly with (akin to why people often work in percentages instead of 0..1)

Is this based on "value" and "disvalue" being different dimensions, and thus incomparable?

It is - I'm going to say "often", although I am aware this is suboptimal phrasing - often the case that you are confident in the sign of an outcome but not the ma... (read more)

TLW (4d): It is often the case that you are confident in the sign of an outcome but not the magnitude of the outcome. This heuristic is what happens if you are simultaneously very confident in the sign of positive results, and have very little confidence in the magnitude of negative results.
MackGopherSena (6d): That simply sounds like negative utilitarianism. Positive utilitarianism (where you accept the risk of V(L|B) < 0 in exchange for marginal value gains from ending up in world A) is at least as rational.
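TLW's gloss can be sketched numerically (a sketch with hypothetical payoffs: the sign of each outcome is taken as known, but the magnitude of the negative one is not, so it's modeled as a range of scenarios):

```python
# Payoff of each lever in each world. Lever 1: 100 in world A, 0 in world B,
# so its downside is bounded at 0. Lever 2: 120 in world B, but some negative
# value in world A whose magnitude we cannot bound -- model several scenarios.
lever_1 = {"A": 100, "B": 0}
lever_2_scenarios = [{"A": -m, "B": 120} for m in (1, 10, 1000)]

def worst_case(payoffs):
    """Worst outcome across worlds, since we don't know which world obtains."""
    return min(payoffs.values())

# The heuristic: prefer the lever whose worst case is acceptably small.
print(worst_case(lever_1))                            # → 0
print(min(worst_case(s) for s in lever_2_scenarios))  # → -1000
```

This is just a maximin rule over sign-certain, magnitude-uncertain payoffs; whether one endorses it (vs. the expected-value reasoning MackGopherSena gestures at) is exactly what the thread is debating.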
MackGopherSena's Shortform

If an anthropic update implicitly 'validates' (morally speaking) all historic events that led up to that anthropic update, then how do you retain moral values throughout the updating process?

For example: suppose you're tempted to drink a coke but you don't want to, but you drink the coke anyway. Then while you're drinking the coke, you perform an anthropic update and 'validate' the act of drinking the coke. How do you avoid that? Clearly, drinking the coke wasn't the good thing to do, but you somehow forced it to be good? Weird.


It only implies that you can have no moral imperative to change the past. It has no consequences whatsoever for morally evaluating the past.

Richard_Kennaway (4d): "Ought implies can" in that linked article is about the present and future, not the past. There is nothing in that principle to disallow having a preference that the past had not been as it was, and to have regret for former actions. The past cannot be changed, but one can learn from one's past errors, and strive to become someone who would not have made that error, and so will not in the future.
Measure (4d): Is this just postulating that whatever did happen (historically) should have happened (morally)?
Adele Lopez's Shortform

The Averted Famine

In 1898, William Crookes announced that there was an impending crisis which required urgent scientific attention. The problem was that crops deplete Nitrogen from the soil. This can be remedied by using fertilizers, however, he had calculated that existing sources of fertilizers (mainly imported from South America) could not keep up with expected population growth, leading to mass starvation, estimated to occur around 1930-1940. His proposal was that we could entirely circumvent the issue by finding a way to convert some of our mostly Ni... (read more)

The problem with that is that the Nitrogen does not go back into the atmosphere. It goes into the oceans, and the resulting problems have been called a stronger violation of planetary boundaries than CO2 pollution.

Gunnar_Zarncke (12d): full story here: []
Alex Ray's Shortform

100 Year Bunkers

I often hear that building bio-proof bunkers would be good for bio-x-risk, but it seems like not a lot of progress is being made on these.

It's worth mentioning a bunch of things I think probably make it hard for me to think about:

  • It seems that even if I design and build them, I might not be the right pick for an occupant, and thus wouldn't directly benefit in the event of a bio-catastrophe
  • In the event of a bio-catastrophe, it's probably the case that you don't want anyone from the outside coming in, so probably you need people already livin
... (read more)

That seems worth considering!

ChristianKl (7d): What's your threat scenario where you would believe a bio-bunker to be helpful?
A Ray (6d): I'm roughly thinking of this sort of thing: []
elifland's Shortform

[crossposted from EA Forum]

Reflecting a little on my shortform from a few years ago, I think I wasn't ambitious enough in trying to actually move this forward.

I want there to be an org that does "human challenge"-style RCTs across lots of important questions that are extremely hard to get at otherwise, including (top 2 are repeated from previous shortform):

  1. Health effects of veganism
  2. Health effects of restricting sleep
  3. Productivity of remote vs. in-person work
  4. Productivity effects of blocking out focused/deep work

Edited to add: I no longer think "human challen... (read more)

rossry (7d): I'm confused why these would be described as "challenge" RCTs, and worry that the term will create broader confusion in the movement to support challenge trials for disease. In the usual clinical context, the word "challenge" in "human challenge trial" refers to the step of introducing the "challenge" of a bad thing (e.g., an infectious agent) to the subject, to see if the treatment protects them from it. I don't know what a "challenge" trial testing the effects of veganism looks like? (I'm generally positive on the idea of trialing more things; my confusion+comment is just restricted to the naming being proposed here.)

Thanks, I agree with this and it's probably not good branding anyway. 

I was thinking the "challenge" was just doing the intervention (e.g. being vegan), but agree that the framing is confusing since it refers to something different in the clinical context. I will edit my shortforms to reflect this updated view.

Maxwell Peterson's Shortform

I was reading a comment (linked below) by gwern and it hit me:

Jaynes’s Probability: The Logic of Science is so special because it presents a unified theory of probability. After reading it, I no longer think of “probability” and “statistics” as being different things. As many understand evolution - feeling there is a set of core principles, like selection and evolutionary pressure and mutation, even if the person isn’t familiar with many of the technical findings or machinery they’d need to actually do an analysis good enough to make good predictions from ... (read more)

TurnTrout's shortform feed

For the last two years, typing for 5+ minutes hurt my wrists. I tried a lot of things: shots, physical therapy, trigger-point therapy, acupuncture, massage tools, wrist and elbow braces at night, exercises, stretches. Sometimes it got better. Sometimes it got worse.

No Beat Saber, no lifting weights, and every time I read a damn book I would start translating the punctuation into Dragon NaturallySpeaking syntax.

Text: "Consider a bijection $f: X \to Y$"

My mental narrator: "Cap consider a bijection space dollar foxtrot colon cap x backslash tango oscar cap y do

... (read more)
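The translation loop he describes can be sketched as a simple token map (the spoken-command names here are illustrative stand-ins, not Dragon NaturallySpeaking's actual vocabulary):

```python
# Map characters in written math to spoken dictation tokens.
# Command names are hypothetical, for illustration only.
SPOKEN = {
    "$": "dollar",
    ":": "colon",
    "\\": "backslash",
    " ": "space",
}

def dictate(text: str) -> str:
    """Render text as a sequence of spoken dictation tokens."""
    words = []
    for ch in text:
        if ch in SPOKEN:
            words.append(SPOKEN[ch])
        elif ch.isupper():
            words.append("cap " + ch.lower())
        else:
            words.append(ch)
    return " ".join(words)

print(dictate("$f: X$"))  # → dollar f colon space cap x dollar
```

Doing this mapping mentally for every punctuation mark while reading is exactly the overhead the post is describing.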

It was probably just regression to the mean because lots of things are, but I started feeling RSI-like symptoms a few months ago, read this, did this, and now they're gone, and in the possibilities where this did help, thank you! (And either way, this did make me feel less anxious about it 😀)

Raj Thimmiah (10mo): You able to link to it now?
Steven Byrnes (1y): Happy to have (maybe) helped! :-)
Chris_Leong's Shortform

Thick and Thin Concepts

Take for example concepts like courage, diligence, and laziness. These are considered thick concepts because they have both a descriptive component and a moral component. To call someone courageous is most often meant* not only to claim that the person undertook a great risk, but that doing so was morally praiseworthy. So the thick concept is often naturally modeled as a conjunction of a descriptive claim and a moral claim.

However, this isn't the only way to understand these concepts. An alternate would be along the following lines: Im... (read more)

Daniel Kokotajlo's Shortform

Tonight my family and I played a trivia game (Wits & Wagers) with GPT-3 as one of the players! It lost, but not by much. It got 3 questions right out of 13. One of the questions it got right it didn't get exactly right, but was the closest and so got the points. (This is interesting because it means it was guessing correctly rather than regurgitating memorized answers. Presumably the other two it got right were memorized facts.)

Anyhow, having GPT-3 playing made the whole experience more fun for me. I recommend it. :) We plan to do this every year with whatever the most advanced publicly available AI (that doesn't have access to the internet) is.

Mitchell_Porter (7d): How did GPT-3 participate?

I typed in the questions to GPT-3 and pressed "generate" to see its answers. I used a pretty simple prompt.

ofer's Shortform

PSA for Edge browser users: if you care about privacy, make sure Microsoft does not silently enable syncing of browsing history etc. (Settings->Privacy, search and services).

They seemingly did so to me a few days ago (probably along with the Windows "Feature update" 20H2); it may be something that they currently do to some users and not others.
