If you follow maths, you can be reasonably confident that the models can now sometimes solve math problems that are "not too hard in retrospect". I don't know how substantial this particular problem was supposed to be, but it feels like it tracks?
I feel like reporting the median is much simpler than these other proposals, and is probably what should be adopted.
I would note that by the Markov inequality, at least 25% of Americans must think that foreign aid is more than 25% of the budget in order to get the average response we see here. So I think it's reasonable to use the reported mean to conclude that at least a sizable minority of Americans are very confused here.
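To spell out the inequality I'm leaning on (strictly, the bounded-range or "reverse" form of Markov, since the estimates are percentages capped at 100): if $X \in [0,100]$ is a respondent's estimate and $\mu$ is the survey mean, applying Markov's inequality to $100 - X$ gives

$$\Pr[X > a] \;\ge\; \frac{\mu - a}{100 - a},$$

so a high enough reported mean forces a corresponding fraction of respondents above any threshold $a$ (here $a = 25$).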
I very roughly polled METR staff (using Fatebook) on what the 50% time horizon will be by EOY 2026, conditional on METR reporting something analogous to today's time horizon metric.
I got the following results: 29% average probability that it will surpass 32 hours. 68% average probability that it will surpass 16 hours.
The first question got 10 respondents and the second question got 12. Around half of the respondents were technical researchers. I expect the sample to be close to representative, but maybe a bit more short-timelines than the rest of METR staff.
The average probability that the question doesn't resolve AMBIGUOUS is somewhere around 60%.
Indeed, the traffic for AI 2027 in 2025 has been similar to the traffic for all of LessWrong in 2025, with about 5M unique users and 10M page-views, whereas LessWrong had 4.2M unique users and 22M page-views.
While my inner consultant loves this approach to impact sizing, I think another approach is to ask how much various political efforts would pay to get the Vice President (who on rough historical odds has a ~50% chance of being the next President, in addition to his current powers) to read a document and publicly say that he'd read it and had a positive opinion of it.
It is, uh, more like the think tank number than the paid ads number.
If safety is a concern for such sources, is it worth considering placing the lights so they mostly shine on the space above people's heads?
More reasons:
2.b. The problem is not lack of legible evidence per se, but the fact that the other members of the group are too stupid to understand anything; from their perspective even quite obvious evidence is illegible.
7. If you attack them and fail, it will strengthen their position; and either the chance of failure or the bonus they would get is high enough to make the expected value of your attack negative.
For example, they may have prepared a narrative like "there is a conspiracy against our group that will soon try to divide us by bringing up unfounded accusations against people like me", so if you fail to convince the others, you will provide evidence for the narrative.
I think you misinterpreted me - my claim is that working without choice often reveals genuine hidden mathematical structures that AC collapses into one. This isn't just an exercise in foundations, in the same way that relaxing the parallel postulate to study the resulting inequivalent geometries (which the postulate had collapsed into one, or rather, ruled out entirely) isn't just an "exercise in foundations."
Insofar as [the activity of capturing and abstracting natural, intuitive concepts of reality into formal structures, and investigating their properties] is core to mathematics, the choice of working choice-free is just business as usual.
Do you have a specific plan, or is this just a call to signal virtue by doing costly unhelpful actions?
This post and (imo more importantly) the discussion it spurred have been pretty helpful for how I think about scheming. I'm happy that it was written!
Yeah that seems to be the most serious one, and the only one I could see that I had a real issue with.
Thingspace is a set of points; in the example, (Sunny, Cool, Weekday) is a point.
Conceptspace is a set of sets of points from Thingspace, so { (Sunny, Cool, Weekday), (Sunny, Cool, Weekend) } is a concept.
In general, if your Thingspace has n points, the corresponding Conceptspace will have 2^n concepts. To keep things a little simpler, let's use a smaller Thingspace with only four points, which we'll just label with numbers: {1, 2, 3, 4}. So {1} would be a concept, as would {1,2} and {2, 4}.
Some concepts include others: {1} is a subset of {1, 2}, capturing ...
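To make the bookkeeping concrete, here's a tiny illustrative sketch in Haskell (the names are mine, not from the post):

```haskell
import Data.List (subsequences)

-- Thingspace with four points; every subset of it is a concept.
thingspace :: [Int]
thingspace = [1, 2, 3, 4]

-- Conceptspace: all 2^4 = 16 subsets, including [] and [1,2,3,4].
concepts :: [[Int]]
concepts = subsequences thingspace

-- "Concept a includes concept b" is just the subset relation.
includes :: [Int] -> [Int] -> Bool
a `includes` b = all (`elem` a) b
-- e.g. [1,2] `includes` [1] == True, while neither of [1,2] and [2,4] includes the other.
```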
I’d be happy to swap my 1.5k to Lightcone for your 1.5k to Givewell All Grants.
Practical note: For my own tax reasons, I’d be looking to do this at the start of 2026 (but within Lightcone’s fundraising window of course). This doesn’t feel too hard to coordinate around even if you need to give in 2025 for your tax reasons, but might be relevant for comparison of offers.
foo ::
foo f = f 4
Look, there's an integer! It's right there, "4". Apparently is inhabited.
bar ::
bar fo = ???
There's nothing in particular to be done with fo... if we had something of type to give fo, we would be open for business, but we don't know enough about to make this any easier than coming up with a value of type , which is a non-starter.
When I read about the Terminator example, my first reaction was that being given general goals and then inferring from those that "I am supposed to be the Terminator as played by Arnold Schwarzenegger in a movie set in the relevant year" was a really specific and non-intuitive inference. But it became a lot clearer why it would hit on that when I looked at the more detailed explanation in the paper:
So it wasn't that it's just trained on goals that are generally benevolent, it's trained on very specific goals that anyone familiar with the movies would recog...
I think I must've not paid enough attention in type theory class to get this? Is this an excluded middle thing? (if it's a joke that I'm ruining by asking this feel free to let me know)
Yeah, I think the architecture makes this tricky for LLMs in one step since the layers that process multi-step reasoning have to be in the right order: "Who is Obama's wife?" has to be in earlier layer(s) than "When was Michelle Obama born?". With CoT they both have to be in there but it doesn't matter where.
I think a lot of people got baited hard by paech et al's "the entire state is obliterated each token" claims, even though this was obviously untrue even at a glance
I expect this gained credence because a nearby thing is true: the state is a pure function of the tokens, so it doesn't have to be retained between forward passes, except for performance reasons; in one sense, it contains no information that's not in the tokens; a transformer can be expressed just fine as a pure function from prompt to next-token(-logits) that gets (sampled and) iterated. But...
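A sketch of the pure-function framing (purely illustrative; `step` is a stand-in for a full forward pass plus greedy sampling):

```haskell
type Token = Int

-- Decoding as iteration of a pure function: each step recomputes from the
-- entire token prefix, so nothing needs to persist between forward passes.
decode :: ([Token] -> Token) -> [Token] -> Int -> [Token]
decode _    prompt 0 = prompt
decode step prompt n = decode step (prompt ++ [step prompt]) (n - 1)

-- A KV cache only memoizes work *inside* `step` for speed; it adds no
-- information beyond what is already present in the tokens.
```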
Well put! I guess if I can define a function from problems to math-to-model-it, then for every problem I can pick out the right math-to-model-it?
Or, indeed, perhaps not? ;)
If you write me a function of type , I can point out the place in its source code where you included a value of type , but I can't write a function of type .
Quick weather update: Solstice will go on!
We have checked the state of plowing at the venue. The roads nearby are clear. The driveway on the property is not currently plowed but we have been told they plan to finish it up by this afternoon, and the snow coating is light enough that it's drivable as is.
Please BE RESPONSIBLE FOR YOUR SAFETY-- only come if you're confident you can do so safely, and please drive especially slowly and carefully on any unplowed areas. The venue will NOT be salting their driveway, so watch for ice.
How was the metabolic health of European peasants? Fine. Their diet caused non-metabolic problems like protein malnutrition, niacin deficiency, scurvy, rickets, and chronic childhood malnutrition. But obesity and diabetes were extremely rare.
How do we know that about diabetes? Remember, these peasants frequently died of all sorts of things. And peasants had very limited access to medical expertise, and are under-represented in the historical record.
I also don't think you can go from "different diets cause different health problems" to "modern diets bad".
I can come-up-with-math-to-model any problem, but I can't come-up-with-math-to-model all problems, by diagonalization.
They wrote a great reason to get mad at someone. Perfectly observable in nature.
The "hallucination/reliability" vs "misaligned lies" distinction probably matters here. The former should in principle go away as capability/intelligence scales while the latter probably gets worse?
I don't know of a good way to find evidence of model 'intent' for this type of incrimination, but if we explain this behavior with the training process it'd probably look something like:
That's a good point, given basically every respiratory illness you'd come across in the first world is viral.
If everybody is wearing clothes (which I expect is the case for at least 2/3 of the events organized by LessWrong users) then UV exposure will be limited to face, neck, hands, arms, and lower legs.
I expect that hands, neck, arms, and legs will be rapidly re-colonized by bacteria from the torso, upper legs, feet, etc, just from normal walking around. The face is the main area I'd be worried about, since I'd expect it to have a slightly different micr...
this seems to assume that consciousness is epiphenomenal. you are positing the coherency of p zombies. this is very much a controversial claim.
I donated last year and I'll donate again this year. I do hope to get to visit Lighthaven at some point before it/the world ends. It's likely that if Lighthaven continues to exist for another year I'll be able to visit it. I would be extremely sad if LessWrong stops existing as a forum, though I think the online community would persist somewhere (Substack?) albeit in a diminished form.
I call this kind of thing "Von Daniken's Argument from Ignorance". As a teen, I read Chariots of the Gods. Time and again, his reasoning was "Here's a strange thing. I can't imagine how or why humans did it, so it must have been aliens."
I find myself kinda surprised that this has remained so controversial for so long.
I think a lot of people got baited hard by paech et al's "the entire state is obliterated each token" claims, even though this was obviously untrue even at a glance
I also think there was a great deal of social stuff going on, that it is embarrassing to be kind to a rock and even more embarrassing to be caught doing so
I started taking this stuff seriously back when I read the now famous exchange between yud and kelsey, that arguments for treating agent-like things as agents di...
I think the key difference between a normal guy who believes in God and someone in a psych ward is often that the person who believes in God does so because it's what other people in authority told them, but the person in the psych ward thought for themselves and came up with their delusion on their own. This often means their beliefs are self-referential in a way that prevents them from updating due to external feedback.
If you believe that the word rational is supposed to mean something along the lines of "taking actions that optimize systematically fo...
Yeah that’s probably right. But then that introduces this weird distinction between “I can do it for any x” and “I can do it for all x.”
It quickly becomes pretty philosophical at that point, about whether you think there’s a distinction there or not. I guess my claim in this post is more like "working mathematicians in fields outside of foundations have collectively agreed on an answer to this philosophical puzzle, and that answer is actually quite defensible."
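One standard way to write the puzzle down is as the gap between

$$\forall x\,\exists y\;\varphi(x,y) \qquad\text{and}\qquad \exists f\,\forall x\;\varphi(x, f(x)),$$

with the axiom of choice being exactly the schema that licenses passing from the first to the second.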
Ah yeah, this is a Hamel basis version of Diaconescu’s theorem (a very cool theorem)! Lovely proof!
Oh yeah, I totally agree that it is interesting to investigate what happens in non-choice worlds as an exercise in mathematical foundations, which sounds kind of like what you’re saying? But correct me if I’m misinterpreting.
I know someone doing a PhD in this, and I think it’s pretty cool stuff.
I think the key reason why many people believed Oliver Sacks is that he had a good reputation within the scientific community and people want to "believe the science". People don't like to believe that scientists produce fraudulent data. It's the same reason why people believe Matthew Walker.
I did have one bioinformatics professor who made a point of saying, in every lecture of the semester, that we should not believe the literature. Many people who think of themselves as skeptics are not skeptical when it comes to claims made by people who have a good reputation in the scientific community.
I found Yarrow Bouchard's quick take on the EA Forum regarding LessWrong's performance in the COVID-19 pandemic quite good.
I don't trust her to do such an analysis in an unbiased way [1], but the quick take was pretty full of empirical investigation that made me change my mind wrt how well LessWrong in particular did.
There's much more historiography to be done here, who believed what, when, what the long-term effects of COVID-19 are, which interventions did what, but this seems like the state of the art on "how well did LessWrong actually p...
This isn’t as much a question as it is just sharing some thoughts I had, but I would love to hear your thoughts :) Let’s imagine we are our own brain’s optimizer. We just received a bad signal, we feel pain. Let’s say, we realized someone else is soon going to feel pain, so we feel pain. What could the optimizer do now? Well, there are only 2 things it can do:
Try to disconnect “she feels pain” from the concept of pain that then triggered pain in yourself
Try to disconnect your previous thoughts from arriving at “she feels pain”
You speak a lot to (1...
About once every 15 minutes, someone tweets "you can just do things". It seems like a rather powerful and empowering meme and I was curious where it came from, so I did some research into its origins. Although I'm not very satisfied with what I was able to reconstruct, here are some of the things that I found:
In 1995, Steve Jobs gave the following quote in an interview:
...Life can be much broader, once you discover one simple fact, and that is that everything around you that you call life was made up by people that w
Do you have an estimate of how likely it is that you will need to do a similar fundraiser the next year and the year after that? In particular, you mention the possibility of a lot of Anthropic employee donations flowing into the ecosystem - how likely do you think it is that after the IPO a few rich Anthropic employees will just cover most of Lightcone's funding need?
It would be pretty sad to let Lightcone die just before the cavalry arrives. But if there is no cavalry coming to save Lightcone anytime soon - well, probably we should still get the money toget...
Edit: Apologies for the length of the comment
You ask:
"Can you suggest any non-troubling approaches (for children or for AIs)?" I'm not sure, but I am quite confident that less troubling ones are possible; for example, I think allowing an AI to learn to solve problems in a simulated arena where the arena itself has been engineered to be conducive to the emergence of "empathy for other minds" seems less troubling. Although I cannot provide you with a precise answer, I don't thing the default assumption should be that current approaches to alignme...
Many of the "pathologies" when rejecting choice have similar computational problems as choice does, so bringing them up doesn't really disrupt this argument.
Example: Let P be some proposition. Theorem: If every vector space has a basis, then P ∨ ¬P. Proof: Define a vector space where . Let be a basis over . Express in this basis. If , we have a basis element , and thus . Otherwise, we know and thus .
My understanding is that yes, the axiom of choice (or more generally non-constructive methods) is convenient and it "works", and if you naively take definitions and concepts from that realm and see what results / properties hold when removing the axiom of choice (or only using constructive methods), many of the important results / properties no longer hold (as you mentioned: Tychonoff, existence of a basis, ...).
But it is often the case that you can redevelop these concepts in a choice-free / constructive context in such a way that it captures the spirit of what...
From a computationalist semantics perspective, the axiom of choice asserts that every nondeterministic function can nondeterministically be turned into a deterministic function. When the input space has decidable equality (as in countable choice), that is easy to imagine, as one can gradually generate the function, using an associative list as a cache to ensure that the answers will be consistent. However, when the input space doesn't have decidable equality, this approach doesn't work, and the axiom of choice cannot be interpreted computationally.
Many of the "pathologies" when rejecting choice have similar computational problems as choice does, so bringing them up doesn't really disrupt this argument.
Thank you, everyone, for co-creating a magical event yesterday—bringing your honest selves, contributing to the potluck and the program. A feeling of warmth and gratitude is still with me today, and I hope it remains with you, too!
I think that the case of the twins who generated prime numbers is a serious one. It leads us to overestimate human brain capabilities. I used to be skeptical about it and was criticized for not believing it.
Thanks for sharing this.
I don't expect that my methods of sanity will be reproducible by nearly anyone.
I think you're mistaken here. I've long used all three of your methods, broadly speaking, and I know several others for whom that is true.
Many people have proposed different answers. Some predict that powerful AIs will learn to intrinsically pursue reward. Others respond by saying reward is not the optimization target, and instead reward “chisels” a combination of context-dependent cognitive patterns into the AI.
Increased self awareness could change this.
You can think of a scale, where rewards chiseling cognitive patterns is at one end. That is, reward happens to the AI without it being aware that such a thing even exists. Think AlphaGo-type AI. Then there is the AI knowing enough about rewa...
I think that we may be tempted to justify our adherence to Sacks's narrative by nice arguments like "his reading feels honest and convincing". However, it is plausibly a rationalization that avoids acknowledging much more common and boring reasons, namely that we have a strong prior because 1) it's a book 2) it's a best seller 3) the author is a physician 4) the patients were supposed to be known to other physicians, nurses, etc 5) and yes, as you also pointed out, we already know that neurology is about crazy things. So overall the prior is high that the book tells the truth even before we open it. That said, I really love Oliver Sacks's books.
Here is the graph I'm talking about. Given that 5.1-codex max is already above the trend line, a jump would be a point outside the shaded area, that is bucking the de facto trend.
I've tested this: models are similarly bad at two-hop problems (when was Obama's wife born?) without explicitly verbalising the intermediate hop (so either: no CoT or dot-of-thought), and much better when they can explicitly verbalise the intermediate hop.
I thought the cases in The Man Who Mistook His Wife for a Hat were obviously as fictionalized as an episode of House: the condition described is real and based on an actual case, but the details were made up to make the story engaging. But I didn't read it in 1985 when it was published. Did people back then take statements like "based on a true story" more seriously?
Now, where did the weirdness come from here? Well, to me it seems clear that really it came from the fact that the reals can be built out of a bunch of shifted rational numbers, right? But everyone agrees about that.
I do not think everyone agrees about that! I think that people who reject AoC would say "sure, any real number can be a shifted rational, but not all of them, there's just no reasonable procedure which does this for the entire set at once."
I think there are some bad knock-on effects of normalizing the use of "insane" to talk about very common features of the world: I think it makes social-rationalists too willing to disparage people and institutions, as part of a status-signaling game, often without much careful thought.
But I think there's also something valuable about eg. calling belief in God "insane". There's a kind of willingness to call a spade a spade, and not back away from how the literal stated beliefs, if they were not pervasive, would in fact be regarded as signs of insanity.
The models have always been deeply familiar with Pokémon and how to play through it from the initial tests with Sonnet 3.7—they all know Erika is the fourth gym leader in Red, there's just too much internet text about this stuff. It contaminates the test, from a certain perspective, but it also makes failures and weaknesses even more apparent.
It is possible that Claude Opus 4.5 alone was trained on more Pokémon images as part of its general image training (more than Sonnet 4.5, though...?), but it wouldn't really matter: pure memorization would not have he...
Thank you so much! I really appreciate it.
I intend to donate an amount on the order of $5k.
This is several percent of my income; it works out to roughly two weeks of my labor. I could tell a story about how the ideas developed by this website improve my productivity by at least that amount. I could mention that I bought the appreciated stock I am donating because of a few specific posts about noticing confusion and market prices. The gratitude framing would hold together, but it's not the real reason.
I notice that I have reinvested far more than two weeks of my time this year into the community. I...
Where: Monoid AI Safety Hub (Moscow, Russia)
When: December 21 at 5:00 PM
The Solstice 2025 at Monoid is dedicated to the values of humanity, the very thing that a "good" strong AI should be aligned with. Everything, large and small, eternal and momentary, that we desire to see in our lives.
Event page on LW: https://www.lesswrong.com/events/sHbrkgAY2FX26rodk/moscow-secular-winter-solstice-2025
More info (in Russian): https://monoid.ru/events/solstice-2025
Advance registration is not required.
Right on both counts!
Do you think rationalists use 'insane' and 'crazy' more than the general population, and/or in a different way than the general population? (e.g. definition 3 when you google 'insane definition')
I live in Australia, so I don't get a tax advantage for this. I am likely to still donate 1k or so if I can't get tax advantages, but before doing so I wanted to check if anyone wanted to do a donation swap where I donate to any of these Australian tax-advantaged charities (largely in global health) in exchange for you donating the same amount to charities that are tax-advantaged in the US.
I am willing to donate up to 3k USD to MIRI, and 1.5k USD to Lightcone if I'm doing so tax-advantaged. If nobody takes me up on this I'll still probably donate 2k USD to MIRI and...
Even humans have taken over the world. Something a little smarter should have a fairly easy time.
I do agree that there's a soft limit above human level for LLMs/agents, but it's not a hard limit and it's not right at human level.
In the ‘future plans’ section of your 2024 fundraising post, you briefly mentioned slowly building out an FHI-of-the-west as one of the things for which you wish you had the time & funding.
I think this is happening, albeit slowly and piecemeal. There are several resident scholars at Lighthaven now, and I know some writers who have used the equivalent of "visiting scholar" positions at Lighthaven as a first step in moving to the Bay full-time. It might be worth making this more legible, though I can imagine counterarguments too.
It's something I am still thinking about, but I currently think that the framing of "FHI of the West" is probably the wrong framing. I didn't yet know how to put what I am excited about here into words, so I didn't put it into the plan section. I might write more about some of my thoughts here if I get around to it.
It's a feature I built for last year's fundraiser post!
If you send me source-code for a standalone Next.js component, or generic javascript widget, I can throw it up on the same server. I've done that one time for I think a METR post.
It would be cool to make it a generic offering, but allowing users to run arbitrary code on my server requires a lot of careful setup, and so would be a bunch of work.
viruses are much more vulnerable than skin bacteria, although that doesn't rule out microbiome damage entirely.
I hadn't seen that study, thanks for sharing! I've added it to faruvc.org, and added a warning that people shouldn't consider 233nm LED sources as equivalent to 222nm KrCl sources.
I was under the impression that Oliver Sacks was well regarded among his professional colleagues, so he wouldn't just make up a bunch of important stuff out of whole cloth.
I have read about people who were skeptical of the substance of the Phineas Gage story too (i.e. that he had this big involuntary personality shift after his injury.)
and most humans are conscious [citation needed]
The problem lies here. We are quite certain of being conscious, yet we have only a very fuzzy idea of what consciousness actually means. What does it feel like not to be conscious? Feeling anything at all is, in some sense, being conscious. However, Penfield (1963) demonstrated that subjective experience can be artificially induced through stimulation of certain brain regions, and Desmurget (2009) showed that even the subjective or conscious will to move can be artificially induced, meaning the patient was under th...
Personally, I found the first week or two of Halfhaven to be useful. After that, Goodhart's Law took over. I wanted to put more time into each post, so I chose not to continue publishing at the prescribed schedule. After that, I continued to find value in hanging out on the Halfhaven Discord.
I enjoyed reading this; thank you for writing it! (Though as some data, this much detail is definitely not important for my continued donations.)
In the ‘future plans’ section of your 2024 fundraising post, you briefly mentioned slowly building out an FHI-of-the-west as one of the things for which you wish you had the time & funding. I didn’t notice such a project in the same section of this post — curious what happened to your plans for this? (Have you given up on it? Or is it just not in your top priorities of what you’d do with extra funding? Or some...
Is there a way to insert diagrams like that into Less Wrong posts in general or is this a feature you added just for this specific post?
Adding to @TsviBT.
"But I get a sense that "lattice" involves order in some way, and I am not seeing how order fits in to the question of how specific a concept is."
Sounds to me like you're on the right track. The claim made is that concepts can be ordered in terms of their abstractness. For example, the concept day would be taken to be more abstract than the concept sunny day in that day abstracts from the weather by admitting both sunny and cloudy days.
The order of concepts is 'partial' in that not all concepts can be compared by abstraction: for example,...
Sorry this took me so long to fix, but it should work now! https://faruvc.org, https://www.far-uvc.org, and https://far-uvc.org should all redirect to https://www.faruvc.org.
(https://www.faruvc.org is hosted on GitHub pages, and I'd tried to use my DNS registrar's redirect system to point the others there. This doesn't work, though, because it only uses HTTP. They could totally make HTTPS work, via LetsEncrypt, but they haven't. So instead I needed to point those three aliases at the same VPS that serves https://www.jefftk.com, make that ...
what's the source on this?
Looking back on this, I regret mixing the emotive and contentious topic of the Lab Leak hypothesis being true with the much more solid observation that consensus was artificially manufactured. We have ironclad evidence that Daszak and his associates played a game of manufacturing a false consensus, but the evidence for the Lab Leak actually being true is equivocal, and I think if you just look at the circumstantial evidence I presented here, you have a fairly unstable case that depends on a lot of parameters that nobody really knows. What looked to me like ...
Putting lamps in ducts is not very different from putting filters in ducts; but with the downside that I'm a lot more worried about fraudulent lamps than filters. I guess it's easy to retrofit a lamp into a duct, whereas a filter slows the air; but you probably already have a system designed with a filter.
The point of lamps is to use them in an open room where they cover the whole volume continuously.
Flagging this one as worth re-reading if you don't catch it. Took me three rounds (first was admittedly skimming)
To be "not-insane", you don't need rationality in this narrow sense, in most circumstances. You don't need to seek out better methods for getting things right, you just need some good-enough methods. A bit of epistemic luck could easily get you there, no need for rationality.
So the issue of behaving/thinking in an "insane" way is not centrally about lack of rationality, rationality or irrationality are not particularly relevant to the issue. Rationality would help, but there are many more things that would also help, some of them much more practical for an...
I agree that in general, downregulation is to be expected, but it doesn't always happen (depending on the specific receptor, affinity for their presynaptic counterpart, or biased agonism).
E.g.
What claims were fabricated, specifically? It seems like mostly minor stuff. As in, a man with visual agnosia probably did mistake very different objects, like his wife or his hat, though maybe Sacks created that specific situation where he mistook his wife for his hat just for dramatic effect. It's shitty that he would do that, but I still feel that whatever I believed after reading The Man Who Mistook His Wife for a Hat I was probably right to believe, because the major details are probably true?
By "Grain of Ignorance" I mean that the semimeasure loss is nonzero at every string, that is the conditionals of M are never a proper measure. Since this gap is not computable, it cannot be (easily) removed, though to be fair the conditional distribution is only limit computable anyway (same as the normalized M). However, it is not clear that there is any natural/forced choice of normalization, so I usually think of the set of possible normalizations as a credal set (and I mean ignorance in that sense). I will soon put an updated version of my "Value under...
Hey, I did Halfhaven, and I'm not sure it's right to say it's really a faster pace than Inkhaven, since Inkhaven was an in-person residency where the residents were working either part-time or not at all, and could focus entirely on writing. Halfhaven, on the other hand, was something you did in addition to your normal life.
I kind of agree that one post a day (or every other day) feels too frequent, but also, too frequent for what? Is the goal to produce great posts during the event, or to improve as a writer? I think the optimal frequency for these two go...
[As mentioned in a linked article, the commonly stated justification was to "lock in the juices", which isn't true, but it wouldn't surprise me if food safety was the actual impetus behind that advice.]
I actually think there is an important element of truth to the idea that searing locks in the juices. This video discusses it.
The idea is that no, searing doesn't lock in the juices that are inside the meat. However, the perception of juiciness is subjective and not just dependent on the actual juice in the meat. A big part of what makes you perceive somethi...
I mean, I think so. In those papers it's often not clear how "elicited" that key step was. The advantage of this example is that it very clearly claims the researchers made no contribution whatsoever, and the result still seems to settle a problem someone cares about! Only caveat is that it comes from OpenAI, who has a very strong incentive to drive the hype-cycle about their own models (but on the other hand, also has access to some of the best models which are not publicly available yet, which lends credibility).
As a full-time AGI safety / alignment researcher since 2021, I wouldn’t have made a fraction as much progress without lesswrong / alignment forum, which is not just a first-rate publishing platform but a unique forum and community, built from the ground up to facilitate careful and productive conversations. I’m giving Lightcone 100% of my x-risk-oriented donation budget this year, and I wish I had more to give.
I didn't say that rationality is the same thing as correctness, truth, or effectiveness. I think when rationalists use the word "sane" they usually do mean something like "having a disposition towards better methods/processes that help with attaining truth or effectiveness." Do you disagree?
Rationality is not correctness, not truth or effectiveness, it's more narrow, disposition towards better methods/processes that help with attaining truth or effectiveness. Keeping intended meaning narrow when manipulating a vague concept helps with developing it further; inflation of meaning to cover ever more possibilities makes a word somewhat useless, and accessing the concept becomes less convenient.
Rather than imagining a single concept boundary, maybe try imagining the entire ontology (division of the set of states into buckets) at once. Imagine a really fine-grained ontology that splits the set of states into lots of different buckets, and then imagine a really coarse-grained ontology that lumps most states into just a few buckets. And then imagine a different coarse-grained ontology that draws different concept boundaries than the first, so that in order to describe the difference between the two you have to talk in the fine-grained ontology.
The "unique infinum" of two different ontologies is the most abstract ontology you can still use to specify the differences between the first two.
Is this really that different from previous papers where a model provided a key step of the argument?
Rationalists often say "insane" to talk about normie behaviors they don't like, and "sane" to talk about behaviors they like better. This seems unnecessarily confusing and mean to me.
This clearly is very different from how most people use these words. Like, "guy who believes in God" is very different from "resident of a psych ward." It can even cause legitimate confusion when you want to switch back to the traditional definition of "insane". This doesn't seem very rational to me!
Also, the otherizing/dismissiv...
the probability of a global pandemic starting in China has increased incalculably
I think in order to make an intellectually honest critique you actually need to calculate it. I mean it is all about numbers now: if the prior probability of a pandemic occurring around 2019 in Wuhan is sufficiently high then I am wrong.
any 'successful' pandemic is, simply by existing, evidence of a laboratory leak.
Well, it is though. If I tell you that a pandemic happened but that its spread was slow and it only affected a small portion of the world, versus if I tell y...
OpenAI claims 5.2 solved an open COLT problem with no assistance: https://openai.com/index/gpt-5-2-for-science-and-math/
This might be the first thing that meets my bar of autonomously having an original insight??
I find it bizarre and surprising, no matter how often it happens, when someone thinks my helping them pressure-test their ideas and beliefs for consistency is anything except a deep engagement and joy. If I didn't want to connect and understand them, I wouldn't bother actually engaging with the idea.
I feel like I could have written this (and the rest of your comment)! It's confusing and deflating when deep engagement and joy aren't recognized as such.
...It's happened often enough that I often need to modulate my enthusiasm, as it does cause suffe
Chaitin was quite young when he (co-)invented AIT.
It's just saying that
I'm not surprised by this, my sense is that it's usually young people and outsiders who pioneer new fields. Older people are just so much more shaped by existing paradigms, and also have so much more to lose, that it outweighs the benefits of their expertise and resources.
All of the fields that come to my mind (cryptography, theory of computation, algorithmic information theory, decision theory, game theory) were founded by much more established researchers. (But on reflection these all differ from AI safety by being fairly narrow and technical/mathematica...
All of this for the funding of 2-3 OpenAI employee salaries, wow
This very roughly implies that the median of "50% time horizon as predicted by METR staff" by EOY 2026 is a bit higher than 20 hours.
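To show the arithmetic (a rough sketch, assuming the aggregate survival curve can be linearly interpolated in log-time between the two polled thresholds):

$$\log_2 h_{50} \;\approx\; \log_2 16 + \frac{0.68 - 0.50}{0.68 - 0.29}\,(\log_2 32 - \log_2 16) \;\approx\; 4.46, \qquad h_{50} \approx 2^{4.46} \approx 22 \text{ hours}.$$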