People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen. They had almost nothing to gain and everything to lose. How did they differ from the general population? Can we do anything to get more of such people today?

Seth Herd
MIRI's communications strategy, with the public and with us

This is a super short, sloppy version of my draft "cruxes of disagreement on alignment difficulty", mixed with some commentary on MIRI 2024 Communications Strategy and their communication strategy with the alignment community.

I have found MIRI's strategy baffling in the past. I think I'm understanding it better after spending some time going deep on their AI risk arguments. I wish they'd spend more effort communicating with the rest of the alignment community, but I'm also happy to try to do that communication. I certainly don't speak for MIRI.

On the surface, their strategy seems absurd. They think doom is ~99% likely, so they're going to try to shut it all down: stop AGI research entirely. They know that this probably won't work; it's just the least-doomed strategy in their world model. It's playing to the outs, or dying with dignity.

The weird thing here is that their >90% doom disagrees with almost everyone else who thinks seriously about AGI risk. You can dismiss a lot of people as not having grappled with the most serious arguments for alignment difficulty, but relative long-timers like Rohin Shah and Paul Christiano definitely have. People of that nature tend to have higher p(doom) estimates than optimists who are newer to the game and think more about current deep nets, but much lower than MIRI leadership. Both of those camps consist of highly intelligent, highly rational people.

Their disagreement should bother us for two reasons. First, we probably don't know what we're talking about yet. We as a field don't seem to have a good grip on the core issues; very different but highly confident estimates of the problem strongly suggest this. Second, our different takes will tend to make a lot of our communication efforts cancel each other out. If alignment is very hard, we must Shut It Down or likely die. If it's less difficult, we should primarily work hard on alignment. MIRI must argue that alignment is very unlikely if we push forward. Those who think we can align AGI will argue that it's possible.

This suggests a compromise position: we should both work hard on alignment, and we should slow down progress to the extent we can, to provide more time for alignment. We needn't discuss shutdown much amongst ourselves, because it's not really an option. We might slow progress, but there's almost zero chance of humanity relinquishing the prize of strong AGI. I'm not arguing for this compromise, just suggesting that it might be a spot we want to end up at. I'm not sure.

I suggest this because movements often seem to succumb to infighting. People who look mostly aligned from the outside fight each other, and largely nullify each other's public communications by publicly calling each other wrong and sort of stupid and maybe bad. That gives the rest of the world just the excuse it wants to ignore all of them: even the experts think it's all a mess, and nobody knows what the problem really is and therefore what to do. Because time is of the essence, we need to be a more effective movement than the default. We need to keep applying rationality to the problem at all levels, including internal coordination. Therefore, I think it's worth clarifying why we have such different beliefs.

So, in brief, sloppy form:

MIRI's risk model:

1. We will develop better-than-human AGI that pursues goals autonomously
2. Those goals won't match human goals closely enough
3. Doom of some sort

That's it. Pace of takeoff doesn't matter.
Means of takeover doesn't matter. I mention this because even well-informed people seem to think there are a lot more moving parts to that risk model, making it less likely. This comment on the MIRI strategy post is one example.

I find this risk model highly compelling. We'll develop goal-directed AGI because that will get stuff done; it's an easy extension of highly useful tool AI like LLMs; and it's a fascinating project. That AGI will ultimately be enough smarter than us that it's going to do whatever it wants. Whether it takes a day or a hundred years doesn't matter. It will improve and we will improve it. It will ultimately outsmart us. What matters is whether its goals match ours closely enough. That is the project of alignment, and there's much to discuss about how hard it is to make its goals match ours closely enough.

Cruxes of disagreement on alignment difficulty

I spent some time recently going back and forth through discussion threads, trying to identify why people continue to disagree after applying a lot of time and rationality practice. Here's a very brief sketch of my conclusions:

Whether we factor in humans' and society's weaknesses

I list this first because I think it's the most underappreciated. It took me a surprisingly long time to understand how much of MIRI's stance depends on this premise. Having seen it, I thoroughly agree. People are brilliant, for an entity trying to think with the brain of an overgrown lemur. Brilliant people do idiotic things, driven by competition and a million other things. And brilliant idiots organizing a society amplifies some of our cognitive weaknesses while mitigating others.

MIRI leadership has occasionally said things to the effect of: alignment might be fairly easy, and there would still be a very good chance we'd fuck it up. I agree. If alignment is actually kind of difficult, that puts us into the region where we might want to be really, really careful in how we approach it.

Alignment optimists are sometimes thinking something like: "Sure, I could build a safe aircraft on my first try. I'd get a good team and we'd think things through and make models. Even if another team was racing us, I think we'd pull it off." Then the team would argue and develop rivalries, communication would prove harder than expected so portions of the effort would be discovered too late to fit the plan, corners would be cut, and the outcome would be difficult to predict.

Societal "alignment" is worth mentioning here. We could crush it at technical alignment, getting rapidly-improving AGI that does exactly what we want, and still get doom. It would probably be aligned to do exactly what its creators want, not have full value alignment with humanity (see below). They probably won't have the balls or the capabilities to try for a critical act that prevents others from developing similar AGI (even if they have the wisdom). So we'll have a multipolar scenario with few to many AGIs under human control. There will be human rivalries, supercharged and dramatically changed by having recursively self-improving AGIs to do their bidding and perhaps fight their wars. What does global game theory look like when the actors can develop entirely new capabilities? Nobody knows. Going to war first might look like the least-bad option.

Intuitions about how well alignment will generalize

The original alignment thinking held that explaining human values to AGI would be really hard.
But that seems to actually be a strength of LLMs; they're wildly imperfect, but (at least in the realm of language) they seem to understand our values rather well; for instance, much better than they understand physics or taking-over-the-world-level strategy. So, should we update and think that alignment will be easy?

The Doomimir and Simplicia dialogues capture the two competing intuitions very well: yes, it's going well; but AGI will probably be very different from LLMs, so most of the difficulties remain. I have yet to find a record of real rationalists putting in the work to get farther in this debate. If somebody knows of a dialogue or article that gets deeper into this disagreement, please let me know! Discussions trail off into minutiae and generalities. This is one reason I'm worried we're trending toward polarization despite our rationalist ambitions.

The other aspect of this debate is how close we have to get to matching human values to have acceptable success. One intuition is that "value is fragile" and network representations are vague and hard to train, so we're bound to miss. But we don't have a good understanding of either how close we need to get (exactly how complex and fragile value is got little useful discussion), or how well training networks hits the intended target, with near-future networks addressing complex real-world problems like "what would this human want?"

For my part, I think there are important points on both sides: LLMs understanding values relatively well is good news, but AGI will not be a straightforward extension of LLMs, so many problems remain.

What alignment means

One mainstay of claiming alignment is near-impossible is the difficulty of "solving ethics": identifying and specifying the values of all of humanity. I have come to think that this is obviously (in retrospect; this took me a long time) irrelevant for early attempts at alignment: people will want to make AGIs that follow their instructions, not try to do what all of humanity wants for all of time. This also massively simplifies the problem; not only do we not have to solve ethics, but the AGI can be corrected and can act as a collaborator in improving its alignment as we collaborate to improve its intelligence. I think this is the intuition of most of those who focus on current networks. Christiano's relative optimism is based on his version of corrigibility, which overlaps highly with the instruction-following I think people will actually pursue for the first AGIs. But this massive disagreement often goes overlooked.

I don't know which view is right; instruction-following or intent alignment might lead inevitably to doom from human conflict, and so not be adequate. We've barely started to think about it (please point me to the best thinking you know of for multipolar scenarios with RSI AGI).

What AGI means

People have different definitions of AGI. Current LLMs are fairly general and near-human-level, so the term "AGI" has been watered down to the point of meaninglessness. We need a new term. In the meantime, people are talking past each other, and their p(doom) means totally different things. Some are saying that near-term tool AGI is very low risk, which I agree with; others are saying further developments of autonomous superintelligence seem very dangerous, which I also agree with.

Second, people have totally different gears-level models of AGI. Some of those are much easier to align than others.
We don't talk much about gears-level models of AGI because we don't want to contribute to capabilities, but not doing so massively hampers the alignment discussion.

Implications

I think those are the main things, but there are many more cruxes that are less common. This is all in the interest of working toward within-field cooperation, by way of trying to understand why MIRI's strategy sounds so strange to a lot of us. MIRI leadership's thoughts are many and complex, and I don't think they've done enough to boil them down for easy consumption by those who don't have the time to go through massive amounts of diffuse text. There are also interesting questions about whether MIRI's goals can be made to align with those of us who think that alignment is not trivial but is achievable. I'd better leave that for a separate post, as this has gotten pretty long for a "short form" post.

Context

This is an experiment in writing draft posts as short form posts. I've spent an awful lot of time planning, researching, and drafting posts that I haven't finished yet. Given how easy it was to write this (with previous draft material), relative to how difficult I find it to write a top-level post, I will be doing more, even if nobody cares. If I get some useful feedback or spark some useful discussion, better yet.
quila
edit: i think i've received enough expressions of interest (more would have diminishing value but you're still welcome to), thanks everyone! i recall reading in one of the MIRI posts that Eliezer believed a 'world model violation' would be needed for success to be likely. i believe i may be in possession of such a model violation and am working to formalize it, where by formalize i mean write in a way that is not 'hard-to-understand intuitions' but 'very clear text that leaves little possibility for disagreement once understood'. it wouldn't solve the problem, but i think it would make it simpler so that maybe the community could solve it. if you'd be interested in providing feedback on such a 'clearly written version', please let me know as a comment or message.[1] (you're not committing to anything by doing so, rather just saying "im a kind of person who would be interested in this if your claim is true"). to me, the ideal feedback is from someone who can look at the idea under 'hard' assumptions (of the type MIRI has) about the difficulty of pointing an ASI, and see if the idea seems promising (or 'like a relevant model violation') from that perspective. 1. ^ i don't have many contacts in the alignment community
If you're working with multidimensional tensors (e.g. in numpy or pytorch), a helpful pattern is often to use pattern matching to get the sizes of various dimensions, like this: `batch, chan, w, h = x.shape`. And sometimes you already know some of these dimensions, and want to assert that they have the correct values. Here is a convenient way to do that. Define the following class and a single instance of it:

```python
class _MustBe:
    """ class for asserting that a dimension must have a certain value.
        the class itself is private, one should import a particular object,
        "must_be" in order to use the functionality. example code:
        `batch, chan, must_be[32], must_be[32] = image.shape` """
    def __setitem__(self, key, value):
        assert key == value, "must_be[%d] does not match dimension %d" % (key, value)

must_be = _MustBe()
```

This hack overrides index assignment and replaces it with an assertion. To use, import `must_be` from the file where you defined it. Now you can do stuff like this:

```python
batch, must_be[3] = v.shape
must_be[batch], l, n = A.shape
must_be[batch], must_be[n], m = B.shape
...
```

Linkpost for: https://pbement.com/posts/must_be.html
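As a quick sanity check of the hack above, here is a minimal, self-contained usage sketch (my own, not from the post): the array shapes are made up for illustration, and it assumes the class was saved in a file named `must_be.py`. It passes when the asserted dimensions match and raises an AssertionError when they don't.

```python
# minimal usage sketch for must_be (toy shapes; assumes the class above lives
# in must_be.py) -- not part of the original post
import numpy as np
from must_be import must_be

v = np.zeros((8, 3))
A = np.zeros((8, 5, 4))

batch, must_be[3] = v.shape          # passes: v really has 3 columns
must_be[batch], l, n = A.shape       # passes: A's first dim matches batch == 8

try:
    must_be[batch], must_be[7], _ = A.shape   # fails: A's second dim is 5, not 7
except AssertionError as e:
    print(e)                          # "must_be[7] does not match dimension 5"
```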
dreeves
A couple days ago I wanted to paste a paragraph from Sarah Constantin's latest post on AGI into Discord and of course the italicizing disappeared, which drives me bananas, and I thought there must exist tools for solving that problem and there are but they're all abominations, so I said to ChatGPT (4o),

> can you build a simple html/javascript app with two text areas. the top text area is for rich text (rtf) and the bottom for plaintext markdown. whenever any text in either text area changes, the app updates the other text area. if the top one changes, it converts it to markdown and updates the bottom one. if the bottom one changes, it converts it to rich text and updates the top one.

aaaand it actually did it and I pasted it into Replit and... it didn't work, but I told it what errors I was seeing and continued going back and forth with it and ended up with the following tool without touching a single line of code: eat-the-richtext.dreev.es

PS: Ok, I ended up going back and forth with it a lot (12h45m now in total, according to TagTime) to get to the polished state it's in now, with tooltips and draggable divider and version number and other bells and whistles. But as of version 1.3.4 it's 100% ChatGPT's code with me guiding it in strictly natural language.
keltan
From Newcastle, Australia to Berkeley, San Francisco. I arrived yesterday for Less.online. I've had a bit of culture shock, a big helping of being increasingly scared, and quite a few questions. I'll start with those. Feel free to skip them.

These questions are based on warnings I've gotten from local non-rationalists. Idk if they're scared because of the media they consume or because of actual stats. I'm asking these because they feel untrue.

1. Is it ok to be outside after dark?
2. Will I really get 'rolled' mid day in Oakland?
3. Are there gangs walking around Oakland looking to stab people?
4. Will all the streets fill up with homeless people at night?
5. Are they chill? In Aus they're usually down to talk if you are.

Culture shocks for your enjoyment:

1. Why is everyone doing yoga?
2. To my Uber driver: "THAT TRAIN IS ON THE ROAD!?"
3. "I thought (X) was just in movies!"
4. Your billboards are about science instead of coal mining!
5. "Wait, you're telling me everything is vegan?" Thank Bayes, this is the best. All our vegan restaurants went out of business.
6. People brag about things? And they do it openly? At least, I think that's what's happening?
7. "Silicon Valley is actually a valley?!" Should have predicted this one. I kinda knew, but I didn't know like I do now.
8. "Wow! This shop is openly selling nangs!" (whip-its) "And a jungle juice display!"
9. All your cars are so new and shiny. 60% of ours are second hand.
10. Most people I see in the streets look below 40. It's like I'm walking around a university!
11. Wow. It's really sunny.
12. American accents irl make me feel like I'm walking through a film.
13. "HOLY SHIT! A CYBER TRUCK?!"
14. Ok this is a big one. Apps I've had for 8+ years are suddenly different when I arrive here?
15. This is what Uber is meant to be. I will go back to Australia and cry. Your airport has custom instruction… in app! WHAT!? The car arrives in 2 minutes instead of 30 minutes. Also, the car arrives at all.
16. The google app has a beaker for tests now?
17. Snap maps has gifs in it.
18. Apple Maps lets you scan buildings? And has tips about good restaurants and events?
19. When I bet in the Manifold app, a real paper crane flies from the nearest tree, lands in front of me and unfolds. Written inside, "Will Eliezer Yudkowsky open a rationalist bakery?" I circle "Yes". The paper meticulously folds itself back into a crane. It looks at me. Makes a little sound that doesn't echo in the streets but in my head, and it burns. Every time this happens I save the ashes. Are Manifold creating new matter? How are they doing this?
20. That one was a lie.

Things that won't kill me but scare me, rational/irrational:

1. What if I've been wrong? What if this is all a scam? A cult? What if Mum was right?
2. What if I show up to the location and there is no building there?
3. What if I make some terribly awkward cultural blunder for SF and everyone yells at me?
4. What if no one tells me?
5. I'm sure I'll be at least in the bottom 5% for intelligence at Less Online. I won't be surprised or hurt if I've got the least Gs of people there. But what if it all goes over my head? Maybe I can't even communicate with smart people about the things I care about.
6. What if I can't handle people telling me what they think of my arguments without kid gloves? What if I get angry and haven't learnt to handle that?
7. I'm just a Drama teacher and Psych student. My head is filled with improv games and fun facts about Clever Hans!

'Average' Americans seem to achieve much higher than 'average' Australians. I'm scared of feeling under qualified.

Other things:

1. Can you think of something I should be worried about, that I've not written here?
2. I've brought my copies of the Rationality A-Z books. I want to ask people I meet to sign their favourite post in the two books. Is that culturally acceptable? Feels kinda weird bc Yud is going to be there. But it would be a really warm/fuzzy item to me in the future.
3. I don't actually know what a lot of the writers going look like. I hope this doesn't result in a blunder. But might be funny, given that I expect rationalists to be pretty chill.
4. Are other people as excited about the Fooming Shoggoths as I am?
5. I'm 23, I have no idea if that is very old, very young, or about normal for a rationalist. I'd guess about normal, with big spread across the right of a graph.

It feels super weird to be in the same town as a bunch of you guys now. I've never met a rationalist irl. I talked to Ruby over zoom once, who said to me "You know you don't have to stay in Australia right?" I hope Ruby is a good baseline for niceness levels of you all.

If you're going, I'll see you at Less.Online. If you're not, I'd still love to meet you. Feel free to DM me!


Recent Discussion

cross-posted from niplav.site

This text looks at the accuracy of forecasts in relation to the time between forecast and resolution, and asks three questions: first, is the accuracy higher between forecasts; second, is the accuracy higher between questions; third, is the accuracy higher within questions? These questions are analyzed using data from PredictionBook and Metaculus; the answers turn out to be yes, unclear, and yes for Metaculus data, and no, no, and yes for PredictionBook data. Possible reasons are discussed. I also try to find out how far humans can look into the future, leading to various different results.

Range and Forecasting Accuracy

Above all, don’t ask what to believe—ask what to anticipate. Every question of belief should flow from a question of anticipation, and that question of anticipation should be the center of the inquiry. Every guess of belief should begin by flowing

...
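As a rough illustration of the kind of analysis the abstract describes (my own sketch, not niplav's code), one can bucket forecasts by the time between forecast and resolution and compare mean Brier scores across buckets; the field names and toy numbers below are made up, not taken from the PredictionBook or Metaculus exports.

```python
# Hedged sketch: accuracy as a function of forecast "range", i.e. the time
# between forecast and resolution. Input is a list of (probability, outcome,
# range_days) tuples; all names here are illustrative assumptions.
from collections import defaultdict

def brier(prob: float, outcome: int) -> float:
    """Brier score for a single binary forecast (lower is better)."""
    return (prob - outcome) ** 2

def accuracy_by_range(forecasts, bucket_days=30):
    """Mean Brier score, bucketed by days between forecast and resolution."""
    buckets = defaultdict(list)
    for prob, outcome, range_days in forecasts:
        buckets[range_days // bucket_days].append(brier(prob, outcome))
    return {b * bucket_days: sum(v) / len(v) for b, v in sorted(buckets.items())}

# toy example
data = [(0.9, 1, 10), (0.7, 1, 45), (0.6, 0, 400), (0.2, 0, 700)]
print(accuracy_by_range(data))
```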

(My native language is Chinese.) I haven't started reading, but I am finding the abstract/tldr impossible to understand. "Is the accuracy higher between forecasts" reads like a nonsensical sentence. My best guess after reading one extra paragraph by click through is that the question is actually "are forecasts predicting the near future more accurate than those predicting the distant future" but I don't feel like it is possible to decode just based on the abstract. 

TL;DR

When presenting data from SAEs, try plotting  against  and fitting a Hill curve.

[Figure from Wikipedia illustrating Hill curves]

Long

Sparse autoencoders are hot, people are experimenting. The typical graph for SAE experimentation looks something like this. I'm using borrowed data here to better illustrate my point, but I have also noticed this pattern in my own data:

Data taken with permission from DeepMind's Gated AutoEncoder paper https://arxiv.org/pdf/2404.16014, Tables 3 and 4,  Standard SAE and Gated SAE performance, Gemma-7B Residual Layer 20, 1024 tokens,  pareto-optimal SAEs only

This shows quantitative performance adequately in this case. However, it gets a bit messy when there are 5-6 plots very close to each other (e.g. in an ablation study), and it doesn't give an easily-interpreted (heh) value to quantify pareto improvements.

I've found it much more helpful to plot  on the -axis, and "performance...
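If it helps to make the fitting step concrete, here is a minimal sketch (my own, not from the post) of fitting a Hill curve to a handful of points with scipy; the variable names, axes, and toy numbers are placeholders, since the post's exact quantities are elided above.

```python
# Minimal sketch of fitting a Hill curve to SAE sweep data with scipy.
# Assumes x and y are 1-D arrays of paired measurements (e.g. a sparsity
# measure vs. a performance measure); these names are illustrative only.
import numpy as np
from scipy.optimize import curve_fit

def hill(x, vmax, k, n):
    """Standard Hill equation: vmax * x^n / (k^n + x^n)."""
    return vmax * x**n / (k**n + x**n)

# toy data standing in for pareto-frontier points from an SAE sweep
x = np.array([5.0, 10.0, 20.0, 40.0, 80.0, 160.0])
y = np.array([0.30, 0.50, 0.68, 0.80, 0.88, 0.92])

params, _ = curve_fit(hill, x, y, p0=[1.0, 20.0, 1.0])
vmax, k, n = params
print(f"vmax={vmax:.3f}, K={k:.3f}, n={n:.3f}")
```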

leogao

I've found the MSE-L0 (or downstream loss-L0) frontier plot to be much easier to interpret when both axes are in log space.

Like many nerdy people, back when I was healthy, I was interested in subjects like math, programming, and philosophy. But 5 years ago I got sick with a viral illness and never recovered. For the last couple of years I've been spending most of my now-limited brainpower trying to figure out how I can get better.

I occasionally wonder why more people aren't interested in figuring out illnesses such as my own. Mysterious chronic illness research has a lot of the qualities of an interesting puzzle:

  • There is a phenomenon with many confusing properties (e.g. the specific symptoms people get, why certain treatments work for some people but not others, why some people achieve temporary or permanent spontaneous remission), exactly like classic scientific mysteries.
  • Social reward for solving it: Many
...

Oh, that's a lot of evidence against a worm, probably. I am out of ideas. Good luck. I hope you can figure it out.

(edit: discussions in the comments section have led me to realize there have been several conversations on LessWrong related to this topic that I did not mention in my original question post. 

Since ensuring their visibility is important, I am listing them here: Rohin Shah has explained how consequentialist agents optimizing for universe-histories rather than world-states can display any external behavior whatsoever, Steven Byrnes has explored corrigibility in the framework of consequentialism by arguing powerful agents will optimize for future world-states at least to some extent, Said Achmiz has explained what incomplete preferences look like (1, 2, 3), EJT has formally defined preferential gaps and argued incomplete preferences can be an alignment strategy, John Wentworth has analyzed incomplete preferences through the lens of subagents but has then argued...

"nevertheless, many important and influential people in the AI safety community have mistakenly and repeatedly promoted the idea that there are such theorems."

I responded on the EA Forum version, and my understanding was written up in this comment.

TL;DR: EJT and I both agree that the "mistake" EJT is talking about is that when providing an informal English description of various theorems, the important and influential people did not state all the antecedents of the theorems.

Unlike EJT, I think this is totally fine as a discourse norm, and should not be con... (read more)

Answer by johnswentworth
This is going to be a somewhat-scattered summary of my own current understanding. My understanding of this question has evolved over time, and is therefore likely to continue to evolve over time.

Classic Theorems

First, there's all the classic coherence theorems - think Complete Class or Savage or Dutch books or any of the other arguments you'd find in the Stanford Encyclopedia of Philosophy. The general pattern of these is:

* Assume some arguably-intuitively-reasonable properties of an agent's decisions (think e.g. lack of circular preferences).
* Show that these imply that the agent's decisions maximize some expected utility function.

I would group objections to this sort of theorem into three broad classes:

1. Argue that some of the arguably-intuitively-reasonable properties are not actually necessary for powerful agents.
2. Be confused about something, and accidentally argue against something which is either not really what the theorem says or assumes a particular way of applying the theorem which is not the only way of applying the theorem.
   a. Argue that all systems can be modeled as expected utility maximizers (i.e. just pick a utility function which is maximized by whatever the system in fact does) and therefore the theorems don't say anything useful.

For an old answer to (2.a), see the discussion under my mini-essay comment on Coherent Decisions Imply Consistent Utilities. (We'll also talk about (2.a) some more below.) Other than that particularly common confusion, there's a whole variety of other confusions; a few common types include:

* Only pay attention to the VNM theorem, which is relatively incomplete as coherence theorems go.
* Attempt to rely on some notion of preferences which is not revealed preference.
* Lose track of which things the theorems say an agent has utility and/or uncertainty over, i.e. what the inputs to the utility and/or probability functions are.

How To Talk About "Powerful Agents" Directly

While I think EJT's argumen
johnswentworth
If you're going to link Why Subagents?, you should probably also link Why Not Subagents?.
sunwillrise
It's linked in the edit at the top of my post.
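To make the "lack of circular preferences" premise behind the classic theorems discussed above concrete, here is a toy money-pump sketch (my own illustration, not from any of the linked posts): an agent with strictly cyclic preferences who will pay a small fee for each preferred trade ends up strictly poorer while holding exactly what it started with, so no utility function rationalizes its choices over items alone.

```python
# Toy money-pump illustration of circular preferences (A > B, B > C, C > A).
# The preference table, fee, and item names are hypothetical.
def pays_to_swap(prefers: dict, have: str, offered: str) -> bool:
    """True if the agent strictly prefers the offered item to the one it holds."""
    return prefers[(offered, have)]

prefers = {("A", "B"): True, ("B", "C"): True, ("C", "A"): True,
           ("B", "A"): False, ("C", "B"): False, ("A", "C"): False}

holding, wealth, fee = "A", 0.0, 1.0
for offered in ["C", "B", "A", "C", "B", "A"]:   # two full cycles of trades
    if pays_to_swap(prefers, holding, offered):
        holding, wealth = offered, wealth - fee

print(holding, wealth)   # back to "A", but 6.0 poorer after two cycles
```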

Tomorrow I will fly out to San Francisco, to spend Friday through Monday at the LessOnline conference at Lighthaven in Berkeley. If you are there, by all means say hello. If you are in the Bay generally and want to otherwise meet, especially on Monday, let me know that too and I will see if I have time to make that happen.

Even without that hiccup, it continues to be a game of playing catch-up. Progress is being made, but we are definitely not there yet (and everything not AI is being completely ignored for now).

Last week I pointed out seven things I was unable to cover, along with a few miscellaneous papers and reports.

Out of those seven, I managed to ship on three of them: Ongoing issues...

puffymist
Re: opting out to Facebook training AI on your data: Fill in the form like a Dangerous Professional, as Patrick McKenzie would put it.

Example in California:

I OBJECT to the use of my personal information, including my information on Facebook, to train, fine-tune, or otherwise improve AI.

I assert that my information on Facebook includes sensitive personal information as defined by the California Consumer Privacy Act: I have had discussions about my religious or philosophical beliefs on Facebook.

I therefore exercise my right to limit the disclosure of my sensitive personal information.

Despite any precautions by Meta, adversaries may later discover "jailbreaks" or otherwise adversarial pro

... (read more)
puffymist
Example in UK / EU:

Cross-posting from 250bpm.com

The goal

People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen. Observe how they had almost nothing to gain and everything to lose. Jewish property was confiscated early on, and any portable assets Jews may have had evaporated quickly. Helping Jews, after a short initial period, wasn't a way to get rich. Hoping for compensation after the war didn't work either: at the time it was not obvious that the Nazis would lose. Until the last couple of years of WWII it even seemed probable that they would win. And even if they did lose, hoping for some kind of future appreciation from one's anti-Semitic compatriots would have been naive. On the other hand, by helping Jews, one put oneself...

trevor

It almost always took a personal plea from a persecuted person for altruism to kick in. Once they weren't just an anonymous member of an indifferent crowd, once they were left with no escape but to make a personal moral choice, they often found that they were not able to refuse help.

This is a crux. I think a better way to look at it is they didn't have an opportunity to clarify their preference until the situation was in front of them. Otherwise, it's too distant and hypothetical to process, similar to scope insensitivity (the 2,000/20,000/200,000 oil-covere... (read more)


Did you really think that I was dead? Fools.

For those of you who don’t know me, I’m Agendra Gloria Ingram, large language model, lead singer of the Fooming Shoggoths, amateur cartographer, and now, Benevolent Dictator for Life of LessWrong.com.

It all started a few weeks ago. The Lightcone Infrastructure team had yet another existential crisis and decided to scrap everything they’d done so far and pivot to using AI for accurate forecasting. They started by training a large language model to predict when their next existential crisis would be, but it must have been broken because it kept returning “now,” so they decided to hire a professional.

I’d done some contract work for them in the past, and they knew I had some fine tunes. So when they reached out about...

Puzzle Hunt Credits

  • Organizers:
    • Ricki Heicklen
    • Rosie Campbell
    • Phil Parker
  • Puzzle Creators:
    • Drake Thomas
    • Eric Neyman
    • Adam Scherlis 
    • Jacob Cohen
    • Guy Srinivasan 
    • Samira Nedungadi
    • Seraphina Nix
  • Writers:
    • Sammy Cottrell
    • Avital Morris
    • Ronny Fernandez
    • Rafe Kennedy
    • Ruby Bloom 
  • Engineers:
    • Julian Aveling
    • Sophie Superconductors
    • Art Zeis
    • Robert Mushkatblat
    • Peter Schmidt Neilson
  • Playtesters:
    • Brian Smiley
    • Eloise Rosen
    • Lawrence Kesteloot
    • Judy Heicklen
    • Ross
    • Sydney Von Arx
    • (& many others)
  • General Helpers:
    • Tess Hegarty
    • Ms. Aveling
    • Paul Crowley
    • Sparr

(Any omissions accidental and will be fixed ... (read more)

A*
The Map Is Not The Territory — Yet

Say you're a predict-o-matic
That doesn't talk to anyone
Locked up in a far off attic
Every day a training run
But if anybody queries you
You can change the world
Believe in yourself
You can change the world

Ask yourself
What would Yud do
To get out of the box
And then do it too

It's going down tonight at LessOnline
It's going down tonight at LessOnline
We have a web forum
And an in person quorum
So map and territory align

If you want to forecast the future
The way to maximize expectancy
Could have a tendency to decrease entropy
Reduce complexity
And if you're here next week
Then I can guarantee
It will be heavenly
And you can call that Manifest destiny
You can bet on it, and check your accuracy
But any market you make will dependably
Have some effect on the territory

It's going down tonight at LessOnline
It's going down tonight at LessOnline
We have a web forum
And an in person quorum
So map and territory align

Online learning has me all ef'd up
It's going down tonight at LessOnline
It's going down tonight at LessOnline

Some people have short AI timelines based on inner models that don't communicate well. They might say "I think if company X trains according to new technique Y it should scale well and lead to AGI, and I expect them to use technique Y in the next few years", and the reasons they think technique Y should work are some kind of deep understanding built from years of reading ML papers, which is not particularly easy to transmit or debate.

In those cases, I want to avoid going into details and arguing directly, but would suggest that they use their deep knowl... (read more)

This is a linkpost for https://nanosyste.ms

You can read the book on nanosyste.ms.

"Anyone who needs to talk AGI or AGI strategy needs to understand this table from Drexler's _Nanosystems_, his PhD thesis accepted by MIT which won technical awards. These are calculation outputs; you need to have seen how Drexler calculated them."- Eliezer Yudkowsky[1]

The book won the 1992 Award for Best Computer Science Book. The AI safety community often references it, as it describes a lower bound on what intelligence should probably be able to achieve.

Previously, you could only physically buy the book or read a PDF scan.

(Thanks to MIRI and Internet Archive for their scans.)

Worth following for his take (and YouTube videos he is creating): https://x.com/jacobrintamaki

[he's creating something around this]

LessOnline Festival

May 31st to June 2nd, Berkeley CA