lemonhope's Shortform

by lemonhope
27th Jan 2020
AI Alignment Forum
1 min read

This is a special post for quick takes by lemonhope. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
127 comments, sorted by top scoring
[-]lemonhope1y10154

A tricky thing about feedback on LW (or maybe just human nature or webforum nature):

  • Post: Maybe there's a target out there let's all go look (50 points)
    • Comments: so inspiring! We should all go look!
  • Post: What "target" really means (100 points)
    • Comments: I feel much less confused, thank you
  • Post: I shot an arrow at the target (5 points)
    • Comments: bro you missed
  • Post: Target probably in the NW cavern in the SE canyon (1 point)
    • Comments: doubt it
  • Post: Targets and arrows - a fictional allegory (500 points)
    • Comments: I am totally Edd in this story
  • Post: I hit the target. Target is dead. I have the head. (40 points)
    • Comments: thanks. cool.

Basically, if you try to actually do a thing or be particularly specific/concrete then you are held to a much higher standard.

There are some counterexamples. And LW is better than lots of sites.

Nonetheless, I feel here like I have a warm welcome to talk bullshit around the water cooler but angry stares when I try to mortar a few bricks.

I feel like this is almost a good site for getting your hands dirty and getting feedback and such. Just a more positive culture towards actual shots on target would be sufficient I think. Not sure how that could be achieved.

Maybe this is like publication culture vs workshop culture or something.

Reply116511
[-]rotatingpaguro1y1910

Unpolished first thoughts:

  1. Selection effect: people who go to a blog to read bc they like reading, not doing
  2. Concrete things are hard reads, math-heavy posts, doesn't feel ok to vote when you don't actually understand
  3. In general easier things have wider audience
  4. Making someone change their mind is more valuable to them than saying you did something?
  5. There are many small targets and few big ideas/frames, votes are distributed proportionally
Reply91
[-]faul_sname1y*1811

It's not perfect, but one approach I saw on here and liked a lot was @turntrout's MATS team's approach for some of the initial shard theory work, where they made an initial post outlining the problem and soliciting predictions on a set of concrete questions (which gave a nice affordance for engagement, namely "make predictions and maybe comment on your predictions"), and then they made a follow-up post with their actual results. Seemed to get quite good engagement.

A confounding factor, though, was that it was also an unusually impressive bit of research.

Reply
[-]ryan_greenblatt1y1711

At least as far as safety research goes, concrete empirical safety research is often well received.

Reply
[-]Elizabeth1y138

I think you're directionally correct and would like to see lesswrong reward concrete work more. But I think your analysis is suffering from survivorship bias. Lots of "look at the target" posts die on the vine so you never see their low karma, and decent arrow-shot posts tend to get more like 50 even when the comments section is empty. 

Reply
7lc1y
It's a lot easier to signal the kind of intelligence that LessWrong values-in-practice by writing a philosophical treatise than by actually accomplishing something.
6Jozdien1y
I think a large cause might be that posts talking about the target are more accessible to a larger number of people. Posts like List of Lethalities are understandable to people who aren't alignment researchers, while something like the original Latent Adversarial Training post (which used to be my candidate for the least votes:promising ratio post) is mostly relevant to and understandable by people who think about inner alignment or adversarial robustness. This is to say nothing of posts with more technical content. This seems like an issue with the territory: there are far more people who want to read things about alignment than people who work on alignment. The LW admins already try to counter similar effects by maintaining high walls for the garden, and with the karma-weighted voting system. On the other hand, it's not clear that pushing along those dimensions would make this problem better; plausibly you need slightly different mechanisms to account for this. The Alignment Forum sort-of seems like something that tries to address this: more vote balancing between posts about targets and posts about attempts to reach it because of the selection effect. This doesn't fully address the problem, and I think you were trying to point out the effects of not accounting for some topics having epistemic standards that are easier to meet than others, even when the latter is arguably more valuable. I think it's plausible that's more important, but there are other ways to improve it as well[1]. 1. ^ When I finished writing, I realized that what you were pointing out is also somewhat applicable to this comment. You point out a problem, and focus on one cause that's particularly large and hard to solve. I write a comment about another cause that's plausibly smaller but easier to solve, because that meets an easier epistemic standard than failing at solving the harder problem.
4Ben1y
I certainly see where you are coming from. One thing that might be confounding it slightly is that (depending on the target) the reward for actually taking home targets might not be LW karma but something real. So the "I have the head" only gives 40 karma. But the "head" might well also be worth something in the real world, like if it's some AI code toy example that does something cool it might lead to a new job. Or if it's something more esoteric like "meditation technique improves performance at work" then you get the performance boost.
5Elizabeth1y
Data point: my journeyman posts on inconclusive lit reviews get 40-70 karma (unless I make a big claim and then retract it. Those both got great numbers). But I am frequently approached to do lit reviews, and I have to assume the boring posts no one comments on contribute to the reputation that attracts those.
4niplav1y
Strong agree. I think this is because in the rest of the world, framing is a higher status activity than filling, so independent thinkers gravitate towards the higher-status activity of framing.
6aysja1y
Or independent thinkers try to find new frames because the ones on offer are insufficient? I think this is roughly what people mean when they say that AI is "pre-paradigmatic," i.e., we don't have the frames for filling to be very productive yet. Given that, I'm more sympathetic to framing posts on the margin than I am to filling ones, although I hope (and expect) that filling-type work will become more useful as we gain a better understanding of AI. 
2niplav1y
This response is specific to AI/AI alignment, right? I wasn't "sub-tweeting" the state of AI alignment, and was more thinking of other endeavours (quantified self, paradise engineering, forecasting research). In general, the bias towards framing can be swamped by other considerations.
3MinusGix1y
I see this as occurring with various pieces of Infrabayesianism, like Diffractor's UDT posts. They're dense enough mathematically (hitting the target) which makes them challenging to read... and then also challenging to discuss. There are fewer comments even from the people who read the entire post because they don't feel competent enough to make useful commentary (with some truth behind that feeling); the silence also further makes commenting harder. At least that's what I've noticed in myself, even though I enjoy & upvote those posts. Less attention seems natural because of specialization into cognitive niches, not everyone has read all the details of SAEs, or knows all the mathematics referenced in certain agent foundations posts. But it does still make it a problem in socially incentivizing good research. I don't know if there are any great solutions. More up-weighting for research-level posts? I view the distillation idea from a ~year ago as helping with drawing attention towards strong (but dense) posts, but it appeared to die down. Try to revive that more?
1lemonhope1y
What was the distillation idea from a year ago?
3MinusGix1y
https://www.lesswrong.com/posts/zo9zKcz47JxDErFzQ/call-for-distillers
1[comment deleted]1y
[-]lemonhope10mo37-6

I don't have a witty, insightful, neutral-sounding way to say this. The grantmakers should let the money flow. There are thousands of talented young safety researchers with decent ideas and exceptional minds, but they probably can't prove it to you. They only need one thing and it is money.

They will be 10x less productive in a big nonprofit and they certainly won't find the next big breakthrough there.

(Meanwhile, there are increasingly better ways to make money that don't involve any good deeds at all.)

My friends were a good deal sharper and more motivated at 18 than now at 25. None of them had any chance at getting grants back then, but they have an ok shot now. At 35, their resumes will be much better and their minds much duller. And it will be too late to shape AGI at all.

I can't find a good LW voice for this point but I feel this is incredibly important. Managers will find all the big nonprofits and eat their gooey centers and leave behind empty husks. They will do this quickly, within a couple years of each nonprofit being founded. The founders themselves will not be spared. Look how the writing of Altman or Demis changed over the years.

The funding situation needs to change very much and very quickly. If a man has an idea just give him money and don't ask questions. (No, I don't mean me.)

Reply11
[-]Jeremy Gillen10mo3319

I think I disagree. This is a bandit problem, and grantmakers have tried pulling that lever a bunch of times. There hasn't been any field-changing research (yet). They knew it had a low chance of success so it's not a big update. But it is a small update.

Probably the optimal move isn't cutting early-career support entirely, but having a higher bar seems correct. There are other levers that are worth trying, and we don't have the resources to try every lever.

Also there are more grifters now that the word is out, so the EV is also declining that way.

(I feel bad saying this as someone who benefited a lot from early-career financial support).

Reply21
[-]TsviBT10mo198

grantmakers have tried pulling that lever a bunch of times

What do you mean by this? I can think of lots of things that seem in some broad class of pulling some lever that kinda looks like this, but most of the ones I'm aware of fall greatly short of being an appropriate attempt to leverage smart young creative motivated would-be AGI alignment insight-havers. So the update should be much smaller (or there's a bunch of stuff I'm not aware of).

Reply1
7Jeremy Gillen10mo
The main thing I'm referring to is upskilling or career transition grants, especially from LTFF, in the last couple of years. I don't have stats, I'm assuming there were a lot given out because I met a lot of people who had received them. Probably there were a bunch given out by the FTX Future Fund also. Also when I did MATS, many of us got grants post-MATS to continue our research. Relatively little seems to have come of these. How are they falling short? (I sound negative about these grants but I'm not, and I do want more stuff like that to happen. If I were grantmaking I'd probably give many more of some kinds of safety research grant. But "If a man has an idea just give him money and don't ask questions" isn't the right kind of change imo).
[-]TsviBT10mo1812

upskilling or career transition grants, especially from LTFF, in the last couple of years

Interesting; I'm less aware of these.

How are they falling short?

I'll answer as though I know what's going on in various private processes, but I don't, and therefore could easily be wrong. I assume some of these are sort of done somewhere, but not enough and not together enough.

  • Favor insightful critiques and orientations as much as constructive ideas. If you have a large search space and little traction, a half-plane of rejects is as or more valuable than a guessed point that you knew how to even generate.
  • Explicitly allow acceptance by trajectory of thinking, assessed by at least a year of low-bandwidth mentorship; deemphasize agenda-ish-ness.
  • For initial exploration periods, give longer commitments with fewer required outputs; something like at least 2 years. Explicitly allow continuation of support by trajectory.
  • Give a path forward for financial support for out of paradigm things. (The Vitalik fellowship, for example, probably does not qualify, as the professors, when I glanced at the list, seem unlikely to support this sort of work; but I could be wrong.)
  • Generally emphasize judgem
... (read more)
Reply1
5Jeremy Gillen10mo
I agree this would be a great program to run, but I want to call it a different lever to the one I was referring to. The only thing I would change is that I think new researchers need to understand the purpose and value of past agent foundations research. I spent too long searching for novel ideas while I still misunderstood the main constraints of alignment. I expect you'd get a lot of wasted effort if you asked for out-of-paradigm ideas. Instead it might be better to ask for people to understand and build on past agent foundations research, then gradually move away if they see other pathways after having understood the constraints. Now I see my work as mostly about trying to run into constraints for the purpose of better understanding them. Maybe that wouldn't help though, it's really hard to make people see the constraints.
5TsviBT10mo
We agree this is a crucial lever, and we agree that the bar for funding has to be in some way "high". I'm arguing for a bar that's differently shaped. The set of "people established enough in AGI alignment that they get 5 [fund a person for 2 years and maybe more depending how things go in low-bandwidth mentorship, no questions asked] tokens" would hopefully include many people who understand that understanding constraints is key and that past research understood some constraints. I don't really agree with this. Why do you say this? I agree with this in isolation. I think some programs do state something about OOP ideas, and I agree that the statement itself does not come close to solving the problem. (Also I'm confused about the discourse in this thread (which is fine), because I thought we were discussing "how / how much should grantmakers let the money flow".)
6Jeremy Gillen10mo
Good point, I'm convinced by this.  That's my guess at the level of engagement required to understand something. Maybe just because when I've tried to use or modify some research that I thought I understood, I always realise I didn't understand it deeply enough. I'm probably anchoring too hard on my own experience here, other people often learn faster than me. I was thinking "should grantmakers let the money flow to unknown young people who want a chance to prove themselves."
6TsviBT10mo
Hm. A couple things:
  • Existing AF research is rooted in core questions about alignment.
  • Existing AF research, pound for pound / word for word, and even idea for idea, is much more unnecessary stuff than necessary stuff. (Which is to be expected.)
  • Existing AF research is among the best sources of compute-traces of trying to figure some of this stuff out (next to perhaps some philosophy and some other math).
  • Empirically, most people who set out to stuff existing AF fail to get many of the deep lessons.
  • There's a key dimension of: how much are you always asking for the context? E.g.: Why did this feel like a mainline question to investigate? If we understood this, what could we then do / understand? If we don't understand this, are we doomed / how are we doomed? Are there ways around that? What's the argument, more clearly?
  • It's more important whether people are doing that, than whether / how exactly they engage with existing AF research.
  • If people are doing that, they'll usually migrate away from playing with / extending existing AF, towards the more core (more difficult) problems.
Ah ok you're right that that was the original claim. I mentally autosteelmanned.
[-]Matt Putz10mo244

Just wanted to flag quickly that Open Philanthropy's GCR Capacity Building team (where I work) has a career development and transition funding program.

The program aims to provide support—in the form of funding for graduate study, unpaid internships, self-study, career transition and exploration periods, and other activities relevant to building career capital—for individuals at any career stage who want to pursue careers that could help reduce global catastrophic risks (esp. AI risks). It’s open globally and operates on a rolling basis.

I realize that this is quite different from what lemonhope is advocating for here, but nevertheless thought it would be useful context for this discussion (and potential applicants).

Reply
[-]habryka10mo*285

I would mostly advise people against making large career transitions on the basis of Open Phil funding, or if you do, I would be very conservative with it. Like, don't quit your job because of a promise of 1 year of funding, because it is quite possible your second year will only be given conditional on you aligning with the political priorities of OP funders or OP reputational management, and career transitions usually take longer than a year. To be clear, I think it often makes sense to accept funding from almost anyone, but in the case of OP it is funding with unusually hard-to-notice strings attached that might bite you when you are particularly weak-willed or vulnerable.

Also, if OP staff tells you they will give you future grants, or guarantee you some kind of "exit grant" I would largely discount that, at least at the moment. This is true for many, if not most, funders, but my sense is people tend to be particularly miscalibrated for OP (who aren't particularly more or less trustworthy in their forecasts than random foundations and philanthropists, but I do think often get perceived as much more).

Of course, different people's risk appetite might differ, and mileage might vary, but if you can, I would try to negotiate for a 2-3 year grant, or find another funder to backstop you for another year or two, even if OP has said they would keep funding you, before pursuing some kind of substantial career pivot.

Reply1
9Matt Putz10mo
Regarding our career development and transition funding (CDTF) program:
  • The default expectation for CDTF grants is that they're one-off grants. My impression is that this is currently clear to most CDTF grantees (e.g., I think most of them don't reapply after the end of their grant period, and the program title explicitly says that it's "transition funding").
  • (When funding independent research through this program, we sometimes explicitly clarify that we're unlikely to renew by default.)
  • Most of the CDTF grants we make have grant periods that are shorter than a year (with the main exception that comes to mind being PhD programs). I think that's reasonable (esp. given that the grantees know this when they accept the funding). I'd guess most of the people we fund through this program are able to find paid positions after <1 year.
(I probably won't have time to engage further.)
[-]habryka10mo136

Yeah, I was thinking of PhD programs as one of the most common longer-term grants. 

Agree that it's reasonable for a lot of this funding to be shorter, but also think that given the shifting funding landscape where most good research by my lights can no longer get funding, I would be quite hesitant for people to substantially sacrifice career capital in the hopes of getting funding later (or more concretely, I think it's the right choice for people to choose a path where they end up with a lot of slack to think about what directions to pursue, instead of being particularly vulnerable to economic incentives while trying to orient towards the very high-stakes feeling and difficult to navigate existential risk reduction landscape, which tends to result in the best people predictably working for big capability companies). 

This includes the constraints of "finding paid positions after <1 year", where the set of organizations that have funding to sponsor good work is also very small these days (though I do think that has a decent chance of changing again within a year or two, so it's not a crazy bet to make).

Given these recent shifts and the harsher economic incentives of transitioning into the space, I think it would make sense for people to negotiate with OP about getting longer grants than OP has historically granted (which I think aligns with what OP staff think makes sense as well, based on conversations I've had).

Reply
5[anonymous]10mo
Do you mean something more expansive than "literally don't pursue projects that are either conservative/Republican-coded or explicitly involved in expanding/enriching the Rationality community"? Which, to be clear, would be less-than-ideal if true, but should be talked about in more specific terms when giving advice to potential grant-receivers. I get an overall vibe from many of the comments you've made recently about OP, both here and on the EA forum, that you believe in a rather broad sense they are acting to maximize their own reputation or whatever Dustin's whims are that day (and, consequently, lying/obfuscating this in their public communications to spin these decisions the opposite way), but I don't think[1] you have mentioned any specific details that go beyond their own dealings with Lightcone and with right-coded figures. 1. ^ Could be a failure of my memory, ofc
[-]habryka10mo*282

Yes, I do not believe OP funding constraints are well-described by either limitations on grants specifically to "rationality community" or "conservative/republican-coded activities".

Just as an illustration, if you start thinking or directing your career towards making sure we don't torture AI systems despite them maybe having moral value, that is also a domain OP has withdrawn funding from. Same if you want to work on any wild animal or invertebrate suffering. I also know of multiple other grantees that cannot receive funding despite not straightforwardly falling into any of the domains OP has announced it is withdrawing funding from.[1]

I think the best description for predicting what OP is avoiding funding right now, and will continue to avoid funding into the future is broadly "things that might make Dustin or OP look weird, and are not in a very small set of domains where OP is OK with taking reputational hits or defending people who want to be open about their beliefs, or might otherwise cost them political capital with potential allies (which includes but is not exclusive to the democratic party, AI capability companies, various US government departments, and a vague conc... (read more)

Reply1
3gyfwehbdkch10mo
Can't Dustin donate $100k anonymously (bitcoin or cash) to researchers in a way that decouples his reputation from the people he's funding?
[-]ChristianKl10mo142

My friends were a good deal sharper and more motivated at 18 than now at 25. 

How do you tell that they were sharper back then?

Reply
[-]interstice10mo104

It sounds pretty implausible to me, intellectual productivity is usually at its peak from mid-20s to mid-30s (for high fluid-intelligence fields like math and physics)

Reply21
2interstice10mo
People asked for a citation so here's one: https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.kellogg.northwestern.edu/faculty/jones-ben/htm/age%2520and%2520scientific%2520genius.pdf&ved=2ahUKEwiJjr7b8O-JAxUVOFkFHfrHBMEQFnoECD0QAQ&sqi=2&usg=AOvVaw0HF9-Ta_IR74M8df7Av6Qe Although my belief was more based on anecdotal knowledge of the history of science. Looking up people at random: Einstein's annus mirabilis was at 26; Cantor invented set theory at 29; Hamilton discovered Hamiltonian mechanics at 28; Newton invented calculus at 24. Hmmm I guess this makes it seem more like early 20s - 30. Either way 25 is definitely in peak range, and 18 typically too young(although people have made great discoveries by 18, like Galois. But he likely would have been more productive later had he lived past 20)
2lemonhope10mo
Einstein started doing research a few years before he actually had his miracle year. If he started at 26, he might have never found anything. He went to physics school at 17 or 18. You can't go to "AI safety school" at that age, but if you have funding then you can start learning on your own. It's harder to learn than (eg) learning to code, but not impossibly hard. I am not opposed to funding 25 or 30 or 35 or 40 year olds, but I expect that the most successful people got started in their field (or a very similar one) as a teenager. I wouldn't expect funding an 18-year-old to pay off in less than 4 years. Sorry for being unclear on this in the original post.
4interstice10mo
Yeah I definitely agree you should start learning as young as possible. I think I would usually advise a young person starting out to learn general math/CS stuff and do AI safety on the side, since there's way more high-quality knowledge in those fields. Although "just dive in to AI" seems to have worked out well for some people like Chris Olah, and timelines are plausibly pretty short so ¯\_(ツ)_/¯
6Nathan Helm-Burger10mo
This definitely differs for different folks. I was nowhere near my sharpest in late teens or early twenties. I think my peak was early 30s. Now in early 40s, I'm feeling somewhat less sharp, but still ahead of where I was at 18 (even setting aside crystalized knowledge). I do generally agree though that this is a critical point in history, and we should have more people trying more research directions.
[-]Richard_Ngo10mo120

In general people should feel free to DM me with pitches for this sort of thing.

Reply
3kave10mo
Perhaps say some words on why they might want to?
[-]Richard_Ngo10mo100

Because I might fund them or forward it to someone else who will.

Reply
[-]lemonhope8mo160

Nobody has a deal where they'll pay you to not take an offer from an AI lab, right? I realize that would be weird incentives, just curious.

Reply
[-]Garrett Baker8mo101

In some sense that's just like hiring you for any other job, and of course if an AGI lab wants you, you end up with greater negotiating leverage at your old place, and could get a raise (depending on how tight capital constraints are, which, to be clear, in AI alignment are tight).

Reply
[-]lemonhope2mo14-2

Long have I searched for an intuitive name for motte & bailey that I wouldn't have to explain too much in conversation. I might have finally found it. The "I was merely saying" fallacy. Verb: merelysay. Noun: merelysayism. Example: "You said you could cure cancer and now you're merelysaying you help the body fight colon cancer only."

Reply
[-]sjadler2mo17-1

I've long been confused why people don't just use something like "bait and switch" or "rope-a-dope"?

It's possible they're not the exact same concept, but they seem pretty close, and the former (maybe the latter too) already has an intuitive meaning to people

Reply2
[-]sjadler2mo3510

"Overclaim and retreat" also seems better than motte & bailey imo

Reply
2DirectedEvolution2mo
There are a lot of similar terms, but motte and bailey is a uniquely apt metaphor for describing a specific rhetorical strategy. I think the reason it often feels unhelpful in practice is because it’s unusually unnecessary to be so precise when our goal is just to call out bullshit. I personally like “motte and bailey” quite a bit, but as a tool for my own private thinking rather than as a piece of rhetoric to persuade others with.
1Drake Morrison2mo
I would guess something like historical momentum is the reason people keep using it. Nicholas Shackel coined the term in 2005, then it got popularized in 2014 from SSC. 20 years is a long time for people to be using the term.
7sjadler2mo
20 years is a long time sure, but I don't think that would be a good argument for keeping it! (I understand you're likely just describing, not justifying) Motte & bailey has a major disadvantage of "nobody who hears it for the first time has any understanding of what it means." Even as someone who knows the concept, I'm still not even 100% positive that motte and bailey do in fact mean "overclaim and retreat" respectively. People are welcome to use the terms they want, of course. But I'd think there should be a big difference between M&B and some simpler name in order to justify M&B
[-]lemonhope1y100

Where has the "rights of the living vs rights of the unborn" debate already been had? In the context of longevity. (Presuming that at some point an exponentially increasing population consumes its cubically increasing resources.)

Reply
7Raemon1y
I couldn't easily remember this, and then tried throwing it into our beta-testing LessWrong-contexted-LLM. (I'm interested in whether the following turned out to be helpful) (it said more stuff but much of it seemed less relevant) It pulled in these posts as potentially relevant (some of this doesn't seem like what you meant but filtering it manually didn't feel worth it).
2lemonhope1y
Thank you! Seems like this bot works quite well for this task
[-]lemonhope1y109

I wish LW questions had an "accepted answer" thing like stackexchange

Reply2
[-]lemonhope3y100

Practice speedruns for rebuilding civilization?

Reply
[-]lemonhope1y*83

I wonder how many recent trans people tried/considered doubling down on their assigned sex (eg males taking more testosterone) instead first. Maybe (for some people) either end of the gender spectrum is comfortable and being in the middle feels bad¿ Anybody know? Don't want to ask my friends because this Q will certainly anger them

Reply
7Ann1y
If it worked, sounds potentially compatible with whatever the inverse(s) of agender is/are? Can at least say that many cisgender people get hormone therapy when they aren't getting what they would like out of their hormones (i.e., menopause, low testosterone, etc). Hormones do useful things, and having them miscalibrated relative to your preferences can be unpleasant. It's also not uncommon to try to 'double down' on a quality you're repressing, i.e., if someone's actively trying to be their assigned sex, they may in fact try particularly hard to conform to it, consciously or otherwise. Even if not repressed, I know I've deliberately answered a few challenges in life where I discovered 'this is particularly hard for me' with 'then I will apply additional effort to achieving it', and I'm sure I've also done it subconsciously.
3Sasha Lišková1y
Hey, Luke. I don't know if I'm still your friend, but I'm not angered, and I'll bite --- plenty of people I know have tried this. Joining the military is common, although I have no idea if this is to effect hypermasculinity or not (most of my trans friends are dmab.) Janae Marie Kroc is probably the most extreme example I can name, but I expect if you find a forum for exmilitary trans folk somewhere you'll be able to find a lot more data on this. I think I could argue that in the years I knew you personally (like 2015 to 2017) I was trying to do this in some kind of way. LK was one of the first people I publicly floated my name to --- we were out running around campus, I don't know if you were still dating at the time. I have absolutely no idea if either of you care. N=1. They are, consciously or not, trying to hide in the closet. This is not the worst idea anyone's ever had, especially in a hostile environment. I appreciate that you're still working in an environment I gave up on ever making progress in. I just...wasn't equal to it. I hope you're well.
2lemonhope1y
Hey!!! Thanks for replying. But did you or anyone you know consider chemical cisgenderization? Or any mention of such in the forums? I would expect it to be a much stronger effect than eg joining the military. Although I hear it is common for men in the military to take steroids, so maybe there would be some samples there.... I imagine taking cis hormones is not an attractive idea, because if you dislike the result then you're worse off than you started. (Oh and we were still together then. LK has a child now, not sure how that affects the equation.)
3Sasha Lišková1y
"Chemical cisgenderization" is usually just called "detransition." To do it, you stop taking hormones. Unless you've had the appropriate surgeries (which most of us haven't because it's very expensive) your body will do it by itself. Transfeminine HRT consists of synthetic estrogen and an anti-androgen of some sort (usually spironolactone or finasteride.) Estrogen monotherapy, in higher doses, is coming more into vogue now that more has been published that suggests it's more effective. Anyway, I know some people who have tried. I'm told the dysphoria comes right back, worse than ever. I know at least one (AMAB nonbinary) person who actually needed to take low-dose T after their orchiectomy, although the dose was an order of magnitude less than what their body naturally produced, but that's rather an exceptional case. Actual desistance rates are on the order of a few percent*, and >90% of those are for reasons other than "I'm not actually trans." [0] [0] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8213007/
3Michael Roe1y
Well, there's this frequently observed phenomenon where someone feels insecure about their gender, and then does something hypermasculine like joining Special Forces or becoming a cage fighter or something like that. They are hoping that it will make them feel confident of their birth-certificate-sex. Then they discover that nope, this does not work and they are still trans. People should be aware that there are copious examples of people who are like -- nope, still trans --- after hoping that going hard on their birth-certificate-gender will work.
3Michael Roe1y
Ascertainment bias, of course, because we only see the cases where this did not work, and do not know exactly how many members of e.g. Delta Force were originally in doubt as to their gender. We can know it doesn't work sometimes.
1Michael Roe1y
While I was typing this, quetzal_rainbow made the same point
3quetzal_rainbow1y
I mean, the problem is if it works we won't hear about such people - they just live happily ever after and don't talk about uncomfortable period of their life.
[-]lemonhope1y70

Is there a good like uh "intro to China" book or YouTube channel? Like something that teaches me (possibly indirectly) what things are valued, how people think and act, extremely basic history, how politics works, how factories get put up, etc etc. Could be about government, industry, the common person, or whatever.. I wish I could be asking for something more specific, but I honestly do not even know the basics.

All I've read is Shenzhen: A Travelogue from China which was quite good although very obsolete. Also it is a comic book.

I'm not much of a reader ... (read more)

Reply
6Thomas Kwa1y
I'm a fan of this blog which is mainly translations and commentary on Chinese social media posts but also has some history posts.
2lemonhope1y
Thank you!
2lemonhope1y
This is so much better than what claude was giving me
[-]lemonhope1mo6-1

Having some vague thoughts about "evil people". In the movies the heroes love everything and fight to save it, and the villains hate everything and fight to destroy it. I feel like in real life, the heroes love power and control, they want the world to be a certain way, and that happens to be a good world for some others too, and they happen to have good ideas about how to do it (eg USA founding fathers, Xi Jinping) instead of stupid bad ideas (Mao, Stalin). There seems to be an innate human drive for revenge and for genocide of other ethnicities, but besi... (read more)

Reply
9Random Developer1mo
Most of the "evil" people I have encountered in life didn't especially care what happened to other people. They didn't seem to have much of a moral system or a conscience. If these people have a strong ability to predict the consequences of their actions, they will often respond to incentives. If they're bad at predicting consequences, they can be a menace. I've also seen (from a distance) a different behavior that I might describe as "vice signaling." The goal here may be establishing credibility with other practitioners of various vices, sort of a mutually assured destruction.
6Cole Wyeth1mo
I recommend “Beyond Good and Evil” by Nietzsche 
4Viliam1mo
I think the problem with trying to learn from psychopaths is the fact that they lie all the time. Their words have no relation to reality other than "does it seem that saying these specific words will help me get something I want?" (including amusement). Even in things that are easy to verify, if they believe that the expected damage from you catching them lying is negligible, or smaller than the expected benefit. Their words are literally "speech acts", sequences of otherwise meaningless syllables emitted only to make you do something; not reflections of their actual model of the world. So if you believe that Mao was a psychopath in the clinical sense of the word (I don't know if he was, maybe yes, maybe no), then no matter how many books of his collected writings or speeches you would read, they contain almost zero evidence about what he actually believed. They will probably contradict themselves because why not. The only source of information would be to examine their actions (while carefully ignoring their words, because those are designed to mislead you). But do you have a reliable source for that? How many things that other people say Mao did, were actually something Mao told them that he did? How many things no one dared to put in writing? Either because they were afraid of punishment, or because they realized that story would be unbelievable and they had no hard evidence to support it. What I am trying to say is that learning from actual psychopaths seems so difficult to me that it's practically impossible.
[-]lemonhope9mo60

What is the current popular (or ideally wise) wisdom wrt publishing demos of scary/spooky AI capabilities? I've heard the argument that moderately scary demos drive capability development into secrecy. Maybe it's just all in the details of who you show what when and what you say. But has someone written a good post about this question?

Reply
5RHollerith9mo
The way it is now, when one lab has an insight, the insight will probably spread quickly to all the other labs. If we could somehow "drive capability development into secrecy," that would drastically slow down capability development.
[-]lemonhope8mo51

Kinda sucks how it is easy to have infinity rules or zero rules but really hard to have a reasonable amount of rules. It reminds me of how I check my email — either every 5 seconds or every 5 weeks.

Reply
[-]lemonhope1y4-1

It's hard to grasp just how good backprop is. Normally in science you estimate the effect of 1-3 variables on 1-3 outcomes. With backprop you can estimate the effect of a trillion variables on an outcome. You don't even need more samples! Around 100 is typical for both (n vs batch_size)
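A minimal sketch of the scale difference (assuming PyTorch; the sizes below are illustrative, not from the original comment):

```python
import torch

# One "experiment": a batch of ~100 samples, about the n of a typical study.
n, d_in, d_out = 100, 1000, 10
x, y = torch.randn(n, d_in), torch.randn(n, d_out)

# ~10 million "variables" (parameters) whose effect on the outcome we want to estimate.
w1 = torch.randn(d_in, 10_000, requires_grad=True)
w2 = torch.randn(10_000, d_out, requires_grad=True)

loss = ((x @ w1).relu() @ w2 - y).pow(2).mean()
loss.backward()  # a single backward pass gives d(loss)/d(parameter) for every parameter

print(w1.grad.shape, w2.grad.shape)  # torch.Size([1000, 10000]) torch.Size([10000, 10])
```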

Reply
[-]lemonhope1y40

I wonder how a workshop that teaches participants how to love easy victory and despise hard-fought battles could work

Reply
3Viliam1y
Give people a long list of tasks, a short time interval, and then reward them based on the number of tasks solved. Repeat until they internalize the lesson that solving a problem quickly is good, spending lots of time on a problem is bad, so if something seems complicated they should ignore it and move on to the next task.
[-]lemonhope1y40

I wonder if a chat loop like this would be effective at shortcutting years of confused effort maybe in research andor engineering. (The AI just asks the questions and the person answers.)

  • "what are you seeking?"
  • "ok how will you do it?"
  • "think of five different ways to do that"
  • "describe a consistent picture of the consequences of that"
  • "how could you do that in a day instead of a year"
  • "give me five very different alternate theories of how the underlying system works"

Questions like that can be surprisingly easy to answer. Just hard to remember to ask.
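A minimal sketch of such a loop (plain Python; the question list is copied from the comment above, everything else is just one assumed way to wire it up):

```python
QUESTIONS = [
    "What are you seeking?",
    "OK, how will you do it?",
    "Think of five different ways to do that.",
    "Describe a consistent picture of the consequences of that.",
    "How could you do that in a day instead of a year?",
    "Give me five very different alternate theories of how the underlying system works.",
]

def run_session() -> dict[str, str]:
    """Ask each question in turn and collect the answers."""
    answers = {}
    for q in QUESTIONS:
        answers[q] = input(q + "\n> ")
    return answers

if __name__ == "__main__":
    for q, a in run_session().items():
        print(f"{q}\n  -> {a}\n")
```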

Reply
2Seth Herd1y
Note that even current LLMs can pretty decently answer all but the first of those as well as ask them.
[-]lemonhope1y43

I notice I strong upvote on LW mobile a lot more than desktop because double-tap is more natural than long-click. Maybe mobile should have a min delay between the two taps?

Reply
[-]lemonhope4mo30

Is there a list of rationality one-liners that can possibly actually make you more rational? Just barely enough words to be saying something rather than nothing, and to have it maybe land. Eg

  • Wide mugs are always bigger than tall mugs.
  • newyep:newnope = oldyep * chanceifyep : oldnope * chanceifnope (assuming independent events! — worked out below)
  • Tasks take sample(lognormal(median)) hours to complete.
  • If it took you a long time to think of something, then figure out how you coulda figured it out faster. Then next time figure fast.
  • People care very much what words you choose to convey the same meaning. They really really care.
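My gloss on the second one-liner, written out as the odds form of Bayes' rule (the "independent events" caveat matters when multiplying in several pieces of evidence):

```latex
% posterior odds = prior odds × likelihood ratio
\frac{P(H \mid E)}{P(\neg H \mid E)}
  = \frac{P(H)}{P(\neg H)} \cdot \frac{P(E \mid H)}{P(E \mid \neg H)}
% i.e.  newyep : newnope  =  oldyep * chanceifyep : oldnope * chanceifnope
```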
Reply
2Joseph Miller4mo
Can you explain the mugs one?
4Cole Wyeth4mo
Though it’s of course not literally true, I believe it refers to the fact that a mug’s volume scales as width^2 but only height^1 
2lemonhope4mo
Yeah it is a popular psychology experiment to trick people with container shapes.
[-]lemonhope1y30

Is it rude to make a new tag without also tagging a handful of posts for it? A few tags I kinda want:

  • explanation: thing explained.
  • idea: an idea for a thing someone could do (weaker version of "Research Agenda" tag)
  • stating the obvious: pointing out something obviously true but maybe frequently overlooked
  • experimental result
  • theoretical result
  • novel maybe: attempts to do something new (in the sense of novelty requirements for conference publications)
Reply
6kave1y
Good question! From the Wiki-Tag FAQ: I believe all tags have to be approved. If I were going through the morning moderation queue, I wouldn't approve an empty tag.
2Gunnar_Zarncke1y
At times, I have added tags that I felt were useful or missing, but usually, I add it to at least a few important posts to illustrate. At one time, one of them was removed but a good explanation for it was given.
[-]lemonhope6y31

Zettelkasten in five seconds with no tooling

Have one big textfile with every thought you ever have. Number the thoughts and don't make each thought too long. Reference thoughts with a pound (e.g. #456) for easy search.
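A minimal sketch of querying such a file (Python; the filename and the "123. ..." entry format are assumptions, not part of the original suggestion):

```python
import re
from pathlib import Path

NOTES = Path("thoughts.txt")  # one big textfile; each thought starts with "123. ..."

def find_thought(number: int) -> str:
    """Return the thought whose entry starts with '<number>.'."""
    for line in NOTES.read_text().splitlines():
        if line.startswith(f"{number}."):
            return line
    raise KeyError(number)

def backlinks(number: int) -> list[str]:
    """Return every thought that references #<number>."""
    pattern = re.compile(rf"#{number}\b")
    return [line for line in NOTES.read_text().splitlines() if pattern.search(line)]
```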

Reply
[-]lemonhope22d20

When controlling a theoretical RL agent, what's the problem with asking the AI to be 99% sure that it mopped 99% of the floor and then stop?

I remember that if you just ask for 99% floormop then the agent will spend forever getting 99.99999% sure that at least 99% is mopped, but I can't remember the problem with this little patch.

Reply
9J Bostock22d
This can kind of work if you assume you have a friendly prior over actions to draw from, and no inner misalignment issues. Suppose an AI gets a score of 1 for mopping and 0 otherwise. If you draw from the set of all action sequences (according to some prior) which get expected reward >0.99, then you're probably fine as long as the prior isn't too malign. For example drawing from the distribution of GPT-4-base trajectories probably doesn't kill you. This is a kind of satisficer, which is an old MIRI idea. The real issues occur when you need the AI to do something difficult, like "Save the world from OpenBrain's upcoming Agent-5 release". In that case, there's no way to really construct a friendly distribution to satisfice over. There's also the problem of accurately specifying what "mopped" means in the first place, but thankfully GPT-4 already knows what mopping is. Having a friendly prior does an enormous amount of work here. And there's the whole inner misalignment failure thingy as well.
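A toy sketch of the satisficer construction described above (rejection sampling from a base prior; `base_prior` and `expected_reward` are stand-in callables, not real APIs):

```python
import random
from typing import Callable, Sequence

def satisfice(
    base_prior: Callable[[], Sequence[str]],            # samples an action sequence from the friendly prior
    expected_reward: Callable[[Sequence[str]], float],  # e.g. P(floor counts as "99% mopped")
    threshold: float = 0.99,
    max_tries: int = 10_000,
) -> Sequence[str]:
    """Return the first prior-sampled trajectory whose expected reward clears the bar.

    The safety burden sits entirely on the prior: we never optimize, we only
    filter, so the output is an ordinary draw from base_prior conditioned on
    "good enough"."""
    for _ in range(max_tries):
        traj = base_prior()
        if expected_reward(traj) > threshold:
            return traj
    raise RuntimeError("no satisficing trajectory found under this prior")

# Toy usage: a "prior" over mopping plans and a crude reward model.
plans = [["watch TV"], ["fetch mop", "mop floor"], ["fetch mop", "mop floor", "double-check corners"]]
print(satisfice(lambda: random.choice(plans),
                lambda t: 0.995 if "mop floor" in t else 0.1))
```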
3Vladimir_Nesov22d
An LLM is still going to have essentially everything in its base distribution, the trajectories that solve very difficult problems aren't going to be absurdly improbable, they just won't ever be encountered by chance, without actually doing the RL. If the finger is put on the base model distribution in a sufficiently non-damaging way, it doesn't seem impossible that a lot of concepts and attitudes from the base distribution survive even if solutions to the very difficult problems move much closer to the surface. Alien mesa-optimizers might take over, but also they might not, and the base distribution is still there, even if in a somewhat distorted form.
4Vladimir_Nesov22d
What if it's only 99.99% sure that it's 99% sure? Also, in some sense levels of credence are ill-defined, and worse any abstractions of ontology in the real world will be leaky, even computation. It's not even possible to define what "stop" means without assuming sufficient intent alignment, it's not fundamentally more difficult to take over the reachable universe than to shut down without leaving the factory. And it also may well turn out to be possible to take over the reachable universe while also in some borderline inadmissible sense technically shutting down without leaving the factory.
4the gears to ascension22d
Stop can be done with thermodynamics and boundaries, I think? You need to be able to address all the locations the AI is implemented and require that their energy release goes to background. Still some hairy ingredients for asymptotic alignment, but not as bad as "fetch a coffee as fast as possible without that being bad".
[-]lemonhope4mo20

Who predicted that AI will have a multi-year "everything works" period where the prerequisite pieces come together and suddenly every technique works on every problem? Like before electricity you had to use the right drill bit or saw blade for a given material, but now you can cut anything with anything if you are only slightly patient.

Reply
[-]lemonhope10mo20

I can only find capabilities jobs right now. I would be interested in starting a tiny applied research org or something. How hard is it to get funding for that? I don't have a strong relevant public record, but I did quite a lot of work at METR and elsewhere.

Reply
4ryan_greenblatt10mo
It might be easier to try to establish some track record by doing a small research project first. I don't know if you have enough runway for this though.
2lemonhope10mo
Yeah I just wanted to check that nobody is giving away money before I go do the exact opposite thing I've been doing. I might try to tidy something up and post it first
4habryka10mo
What do you mean by "applied research org"? Like, applied alignment research?
2lemonhope10mo
Yes.
2lemonhope10mo
I do think I could put a good team together and make decent contributions quickly
[-]lemonhope1y21

The acceptable tone of voice here feels like 3mm wide to me. I'm always having bad manners

Reply
[-]lemonhope1y2-9

LW mods, please pay somebody to turn every post with 20+ karma into a diagram. Diagrams are just so vastly superior to words.

Reply1
4the gears to ascension1y
can you demonstrate this for a few posts? (I suspect it will be much harder than you think.)
1lemonhope1y
The job would of course be done by a diagramming god, not a wordpleb like me. If I got double dog dared...
4the gears to ascension1y
Link some posts you'd like diagrams of at least, then. If this were tractable, it might be cool. But I suspect most of the value is in even figuring out how to diagram the posts.
1lemonhope1y
From the frontpage: https://www.lesswrong.com/posts/zAqqeXcau9y2yiJdi/can-we-build-a-better-public-doublecrux https://www.lesswrong.com/posts/bkr9BozFuh7ytiwbK/my-hour-of-memoryless-lucidity https://www.lesswrong.com/posts/Lgq2DcuahKmLktDvC/applying-refusal-vector-ablation-to-a-llama-3-70b-agent https://www.lesswrong.com/posts/ANGmJnZL2fskHX6tj/dyslucksia https://www.lesswrong.com/posts/BRZf42vpFcHtSTraD/linkpost-towards-a-theoretical-understanding-of-the-reversal Like all of them basically. Think of it like a TLDR. There are many ways to TLDR but any method that's not terrible is fantastic
[-]lemonhope6mo*10
[This comment is no longer endorsed by its author] Reply
2Gurkenglas6mo
Link an example, along with how cherry-picked it is?
[-]lemonhope1y*10

maybe you die young so you don't get your descendants sick

I've always wondered why evolution didn't select for longer lifespans more strongly. Like, surely a mouse that lives twice as long would have more kids and better knowledge of safe food sources. (And lead their descendants to the same food sources.) I have googled for an explanation a few times but not found one yet.

I thought of a potential explanation the other day. The older you get, the more pathogens you take on. (Especially if you're a mouse.) If you share a den with your grandkids then you mig... (read more)

Reply11
6Alexander Gietelink Oldenziel1y
I like this.  Another explanation I have heard:  a popular theory of aging is the mitochondrial theory of aging.  There are several variants of this theory some of which are definitely false, while some are plausibly in sorta-the-direction. It's a big controversy and I'm not an expert yada yada yada. Let me assume something like the following is true: aging is a metabolic phenomenon where mitochondria degrade over time and at some point start to leak damaging byproducts, which is substantially responsible for aging. Mitochondrial DNA has fewer repair mechanisms than nuclear DNA. Over time it accrues mutations that are bad (much quicker than nuclear DNA).  Species that reproduce fast & many may select less on (mitochondrial) mutational load since it matters less. On the other hand, species that have more selection on mitochondrial mutational load for whatever reason are less fecund. E.g. fetuses may be spontaneously aborted if the mitochondria have too many mutations.  Some pieces of evidence: eggs contain the mitochondria and are 'kept on ice', i.e. they do not metabolize. Birds have a much stronger selection pressure for high-functioning metabolism (because of flight)[1] and plausibly 'better mitochondria'.  [There are also variant hypotheses possible that have a similar mutational meltdown story but don't go through mitochondria per se. There is some evidence and counterevidence for epigenetic and non-mitochondrial mutational meltdown theories of aging too. So not implausible.] 1. ^ compare bats? what about their lifespans?
4lemonhope4mo
Your argument is quite good
4Nathan Helm-Burger1y
My cached mental explanation from undergrad when I was learning about the details of evolution and thinking about this was something along the lines of a heuristic like:  "Many plants and animals seem to have been selected for dying after a successful reproduction event. Part of this may be about giving maximal resources to that reproduction event (maybe your only one, or just your last one). But for animals that routinely survive their last reproductive event, and survive raising the children until the children become independent, then there's probably some other explanation. I think about this with mice as my prototypical example a lot, since they seem to have this pattern. Commonly both male and female mice will survive reproduction, potentially even multiple cycles. However, mice do seem to be selected for relatively fast senescence. What might underlie this? My guess is that senescence can cause you to get out of the way of your existing offspring. Avoiding being a drag on them. There are many compatible (potentially co-occurring) ways this could happen. Some that I can think of off the top of my head are:
  • Not being a vector for disease, while in a relatively weakened state of old age
  • Not feeding predators, which could then increase in population and put further stress on the population of your descendants / relatives.
  • Not consuming resources which might otherwise be more available to your descendants / relatives, including:
    • food
    • good shelter locations
    • potential mating opportunities
    • etc"
2lemonhope1y
Thanks for the cached explanation, this is similar to what I thought before a few days ago. But now I'm thinking that an older-but-still-youthful mouse would be better at avoiding predators and could be just as fertile, if mice were long lived. So the food & shelter might be "better spent" on them, in terms of total expected descendants. This would only leave the disease explanation, yes?
3cdt1y
My understanding was the typical explanation was antagonistic pleiotropy, but I don't know whether that's the consensus view. This seems to have the name 'pathogen control hypothesis' in the literature - see review. I think it has all the hallmarks of a good predictive hypothesis, but I'd really want to see some simulations of which parameter scenarios induce selection this way. 
2lemonhope1y
The keywords are much appreciated. That second link is only from 2022! I wonder if anybody suggested this in like 1900. Edit: some of the citations are from very long ago
[-]lemonhope1y10

I wonder how well a water cooled stovetop thermoelectric backup generator could work.

This is only 30W but air cooled https://www.tegmart.com/thermoelectric-generators/wood-stove-air-cooled-30w-teg

You could use a fish tank water pump to bring water to/from the sink. Just fill up a bowl of water with the faucet and stick the tube in it. Leave the faucet running. Put a filter on the bowl. Float switch to detect low water, run wire with the water tube

Normal natural gas generator like $5k-10k and you have to be homeowner

I think really wide kettle with coily ... (read more)

Reply
[-]lemonhope1y10

(Quoting my recent comment)

Apparently in the US we are too ashamed to say we have "worms" or "parasites", so instead we say we have "helminths". Using this keyword makes google work. This article estimates at least 5 million people (possibly far more) in the US have one of the 6 considered parasites. Other parasites may also be around. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7847297/ (table 1)

This is way more infections than I thought!!

Note the weird symptoms. Blurry vision, headache, respiratory illness, blindness, impaired cognition, fever... Not j... (read more)

Reply
[-]lemonhope1y10

I was working on this cute math notation the other day. Curious if anybody knows a better way or if I am overcomplicating this.

Say you have z := c∗x²∗y. And you want m := dz/dx = 2∗c∗x∗y to be some particular value.

Sometimes you can control x, sometimes you can control y, and you can always easily measure z. So you might use these forms of the equation:

m = 2∗c∗x∗y = 2∗z/x = 2∗√(c∗y∗z)

It's kind of confusing that m seems proportional to both z and √z. So here's where the notation comes in. Can write above like

m(x=x, y=y, z=?) = 2∗c∗x∗y
m(x=x, y=?, z=z) = 2∗z/x
m(x=?, y=y, z=z) = 2∗√(c... (read more)
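A quick check of the algebra (sympy; just verifying that the three forms agree, not part of the original comment):

```python
import sympy as sp

c, x, y = sp.symbols("c x y", positive=True)
z = c * x**2 * y
m = sp.diff(z, x)                                    # m := dz/dx = 2*c*x*y

assert sp.simplify(m - 2 * z / x) == 0               # form used when x and z are known
assert sp.simplify(m - 2 * sp.sqrt(c * y * z)) == 0  # form used when y and z are known
```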

Reply
[-]lemonhope1y10

Seems it is easier / more streamlined / more googlable now for a teenage male to get testosterone blockers than testosterone. Latter is very frowned upon — I guess because it is cheating in sports. Try googling eg "get testosterone prescription high school reddit -trans -ftm". The results are exclusively people shaming the cheaters. Whereas of course googling "get testosterone blockers high school reddit" gives tons of love & support & practical advice.

Females however retain easy access to hormones via birth control.

Reply
[-]lemonhope1y*10

I wonder what experiments physicists have dreamed up to find floating point errors in physics. Anybody know? Or can you run physics with large ints? Would you need like int256?

Reply
[-]lemonhope1y-10

Andor is a word now. You're welcome everybody. Celebrate with champagne andor ice cream.

Reply1
-1lemonhope1y
What monster downvoted this
[-]lemonhope1y*-2-5

I wonder how much testosterone during puberty lowers IQ. Most of my high school math/CS friends seemed low-T and 3/4 of them transitioned since high school. They still seem smart as shit. The higher-T among us seem significantly brain damaged since high school (myself included). I wonder what the mechanism would be here...

Like 40% of my math/cs Twitter is trans women and another 30% is scrawny nerds and only like 9% big bald men.

Reply1
4Gunnar_Zarncke1y
Testosterone influences brain function but not so much general IQ. It may influence which areas your attention, and thus most of your learning, goes to. For example, lower testosterone increases attention to happy faces while higher testosterone increases attention to angry faces.
3lemonhope1y
Hmm I think the damaging effect would occur over many years but mainly during puberty. It looks like there's only two studies they mention lasting over a year. One found a damaging effect and the other found no effect.
3Viliam1y
Also, accelerate education, to learn as much as possible before the testosterone fully hits. Or, if testosterone changes attention (as Gunnar wrote), learn as much as possible before the testosterone fully hits... and afterwards learn it again, because it could give you a new perspective.
2quetzal_rainbow1y
It's really weird hypothesis because DHT is used as nootropic. I think the most effect of high T, if it exists, is purely behavioral.
1metachirality1y
The hypothesis I would immediately come up with is that less traditionally masculine AMAB people are inclined towards less physical pursuits.