Reputation is lazily evaluated
When evaluating the reputation of your organization, community, or project, many people reach for surveys that ask randomly selected people what they think of it, or what their attitudes towards your organization, community, or project are.
If you do this, you will very reliably get back data suggesting that people are indifferent to you and your projects, and your results will probably be dominated by extremely shallow things like "do the words in your name evoke positive or negative associations".
People l...
practically all metrics of the EA community's health and growth have sharply declined, and the extremely large and negative reputational effects have become clear.
I want more evidence on your claim that FTX had a major effect on EA reputation. Or: why do you believe it?
Don't induce psychosis intentionally. Don't take psychedelics while someone probes your beliefs. Don't let anyone associated with Michael Vassar anywhere near you during an altered state.
Edit: here is a different report from three years ago with the same person administering the methods:
Mike Vassar's followers practice intentionally inducing psychosis via psychedelic drugs. "Inducing psychosis" is a verbatim self-report of what they are doing. I would say they practice drug-induced brainwashing. TBC, they would dispute the term brainwashing and probably...
Thanks for answering; good to hear that you don't think you've had any severe or long-lasting consequences (though it sounds like LSD was at one point a contributor to your episode of bad mental health).
I guess here's another question that seems natural: it's been said that some people take LSD either on the personal advice of Michael Vassar, or otherwise as a result of reading/discussing his ideas. Is either of those true for you?
I've been thinking of writing up a piece on the implications of very short timelines, in light of various people recently suggesting them (eg Dario Amodei, "2026 or 2027...there could be a mild delay")
Here's a thought experiment: suppose that this week it turns out that OAI has found a modified sampling technique for o1 that puts it at the level of the median OAI capabilities researcher, in a fairly across-the-board way (ie it's just straightforwardly able to do the job of a researcher). Suppose further that it's not a significant additional compute expens...
If we could trust OpenAI to handle this scenario responsibly, our odds would definitely seem better to me.
it's quite plausible (40% if I had to make up a number, but I stress this is completely made up) that someday there will be an AI winter or other slowdown, and the general vibe will snap from "AGI in 3 years" to "AGI in 50 years". when this happens it will become deeply unfashionable to continue believing that AGI is probably happening soonish (10-15 years), in the same way that suggesting that there might be a winter/slowdown is unfashionable today. however, I believe in these timelines roughly because I expect the road to AGI to involve both fast periods and slow bumpy periods. so unless there is some super surprising new evidence, I will probably only update moderately on timelines if/when this winter happens
What do you think would be the cause(s) of the slowdown?
For anyone curious about what the sPoOkY and mYsTeRiOuS Michael Vassar actually thinks about various shit, many of his friends have blogs and write about what they chat about, and he's also been on several long form podcasts.
https://naturalhazard.xyz/ben_jess_sarah_starter_pack
https://open.spotify.com/episode/1lJY2HJNttkwwmwIn3kyIA?si=em0lqkPaRzeZ-ctQx_hfmA
https://open.spotify.com/episode/01z3WDSIHPDAOuVp1ZYUoN?si=VOtoDpw9T_CahF31WEhZXQ
https://open.spotify.com/episode/2RzlQDSwxGbjloRKqCh1xg?si=XuFZB1CtSt-FbCweHtTnUA
I want to say I have to an extent (for all three), though I guess there have been second-hand in-person interactions, which maybe count. I dunno if there's any sort of central thesis I could summarize, but if you pointed me at any more specific topics I could take a shot at translating. (Though I'd maybe prefer to avoid the topic for a little while.)
In general, I think an actual analysis of the ideas involved and their merits/drawbacks would've been a lot more helpful for me than just... people having a spooky reputation.
My other explanation is probably that it's way easier to work with an already almost-executed object than with a specification, because in any reasonable amount of time we can only think through a subset of the possibilities.
In other words, constraints are useful precisely because you are already severely constrained: they limit the space of possibilities you have to consider.
Risk is a great case study in why selfish egoism fails.
I took an ethics class at university, and mostly came to the opinion that morality was utilitarianism with an added deontological rule not to impose negative externalities on others, i.e. "Help others, but if you don't, at least don't hurt them." Both of these are tricky, because anytime you try to "sum over everyone" or rely on any sort of "universal rule", the logic breaks down (due to Descartes' evil demon and Russell's vicious circle). Really, selfish egoism seemed to make more logical sense, but it doesn't h...
Yeah, my view is that as long as our features/intermediate variables form human-understandable circuits, it doesn't matter how "atomic" they are.
Basically every time a new model is released by a major lab, I hear from at least one person (not always the same person) that it's a big step forward in programming capability/usefulness. And then David gives it a try, and it works qualitatively the same as everything else: great as a substitute for stack overflow, can do some transpilation if you don't mind generating kinda crap code and needing to do a bunch of bug fixes, and somewhere between useless and actively harmful on anything even remotely complicated.
It would be nice if there were someone who t...
I find them quite useful despite being buggy. I spend about 40% of my time debugging model code, 50% writing my own code, and 10% prompting. Having a planning discussion first with s3.6, and asking it to write code only after 5 or more exchanges, works a lot better.
Also helpful is asking for lots of unit tests along the way to confirm things are working as you expect.
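For concreteness, here's a minimal sketch of the test-as-you-go habit I mean; the helper function and its behavior are hypothetical, just to illustrate asking the model for unit tests alongside each new piece of code:

```python
# Hypothetical example: after the model writes a small parsing helper,
# ask it to also emit tests like these so regressions surface immediately.
import pytest

def parse_duration(s: str) -> int:
    """Parse strings like '5m' or '2h' into seconds (illustrative helper)."""
    units = {"s": 1, "m": 60, "h": 3600}
    value, unit = int(s[:-1]), s[-1]
    return value * units[unit]

def test_parse_minutes():
    assert parse_duration("5m") == 300

def test_parse_hours():
    assert parse_duration("2h") == 7200

def test_unknown_unit_raises():
    with pytest.raises(KeyError):
        parse_duration("3d")
```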
they lacked the conviction to push any other distinctive brand (and in some cases the situation made alternatives infeasible).
I guess it is difficult to promote the brand of Tough No-Nonsense Prosecutor in the age of Defund The Police.
Which is really unfortunate, because it seems like "defund the police" was actually what woke white people wanted. Black people were probably horrified by the idea of giving up and letting crime grow exponentially in the places where they live. Unfortunately, the woke do not care about the actual opinions of the people they spe...
If Biden pardons people like Fauci for crimes like perjury, that would set a bad precedent.
There's a reason why perjury is forbidden, and if you just give pardons to any government official who committed crimes at the end of an administration, that's a very bad precedent.
One way out of that would be to find a different way to punish government criminals when they are pardoned. One aspect of a pardon is that it removes the Fifth Amendment defense.
You can subpoena pardoned people in front of Congress and ask them under oath to speak about all the crimes...
Misgivings about Category Theory
[No category theory is required to read and understand this screed]
A week does not go by without somebody asking me what the best way to learn category theory is. Despite being set to mark its 80th anniversary, Category Theory has an evergreen reputation as the Hot New Thing, a way to radically expand the braincase of the user through an injection of abstract mathematics. Its promise is alluring, intoxicating for any young person desperate to prove they are the smartest kid on the block.
Recently, there has been sig...
I was at an ARIA meeting with a bunch of category theorists working on safeguarded AI and many of them didn't know what the work had to do with AI.
epistemic status: short version of post because I never got around to doing the proper effort post I wanted to make.
What is the current popular (or, ideally, wise) wisdom on publishing demos of scary/spooky AI capabilities? I've heard the argument that moderately scary demos drive capability development into secrecy. Maybe it all comes down to the details of who you show what, when, and what you say. But has someone written a good post about this question?
The way it is now, when one lab has an insight, the insight will probably spread quickly to all the other labs. If we could somehow "drive capability development into secrecy," that would drastically slow down capability development.
I kinda feel like I literally have more subjective experience after experiencing ego death/rebirth. I suspect that humans vary quite a lot in how often they are conscious, and to what degree. And if you believe, as I do, that consciousness is ultimately algorithmic in nature (like, in the "surfing uncertainty" predictive processing view, that it is a human-modeling thing which models itself to transmit prediction-actions) it would not be crazy for it to be a kind of mental motion which sometimes we do more or less of, and which some people lack entirely.
I ...
interesting. what if she has her memories and some abstract theory of what she is, and that theory is about as accurate as anyone else's theory, but her experiences are not very vivid at all. she's just going through the motions, running on autopilot all the time - like when people get into a kind of trance while driving.
Putting down a prediction I have had for quite some time.
The current LLM/Transformer architecture will stagnate before AGI/TAI (that is, the ability to do any cognitive task as effectively as, and more cheaply than, a human).
From what I have seen, Tesla Autopilot learns >10,000× more slowly than a human, data-wise.
We will get AGI by copying nature, at the scale of a simple mammal brain, then scaling up, like this kind of project:
https://x.com/Andrew_C_Payne/status/1863957226010144791
https://e11.bio/news/roadmap
I expect AGI to be 0-2 years after a mammal brain is mapped. In...
OK, fair point. If we are going to use analogies, then my point #2 about a specific neural code shows our different positions, I think.
Let's say we are trying to get a simple aircraft off the ground, and we have detailed instructions for a large passenger jet. Our problem is that the metal is too weak and cannot be used to make wings, engines, etc. In that case, detailed plans for an aircraft are no use; a single-minded focus on getting better metal is what it's all about. To me, the neural code is like the metal and all the neuroscience is like the plane schematics. ...
Lighthaven clearly needs to get an actual Gerver's sofa, particularly if the proof that it's optimal comes through.
It does look uncomfortable, I'll admit; maybe it should go next to the sand table.
I was just thinking of adding some kind of donation tier where if you donate $20k to us we will custom-build a Gerver sofa, and dedicate it to you.
We still don't know whether this will actually happen, but it seems that OpenAI is considering removing the clause that shuts Microsoft out of its most advanced models once AGI is reached. It seems they want to be able to keep their partnership with Microsoft (and just go full for-profit (?)).
Here's the Financial Times article:
...OpenAI seeks to unlock investment by ditching ‘AGI’ clause with Microsoft
OpenAI is in discussions to ditch a provision that shuts Microsoft out of its most advanced models when the start-up achieves “artificial general intelligence”, as
need any help on post drafts? whatever we can do to reduce those trivial inconveniences
New AWS Trainium 2 cluster offers compute equivalent to 250K H100s[1], and under this assumption Anthropic implied[2] their previous compute was 50K H100s (possibly what was used to train Claude 3.5 Opus).
So their current or imminent models were probably trained with 1e26-2e26 FLOPs (2-4 months on 50K H100s at 40% compute utilization in BF16)[3], and the upcoming models in mid-to-late 2025 will be 5e26-1e27 FLOPs, ahead of what the 100K-H100 clusters of other players (possibly excepting Google) can deliver by that time.
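As a sanity check on the 1e26-2e26 range, here's the arithmetic written out; the only number not taken from the estimate above is the per-GPU throughput, where I assume the ~1e15 FLOP/s dense BF16 peak of an H100:

```python
# Rough check of the 1e26-2e26 FLOPs figure: 50K H100s, 40% utilization, 2-4 months.
H100_BF16_FLOPS = 0.99e15     # dense BF16 peak per H100, ~989 TFLOP/s (assumption)
GPUS = 50_000
UTILIZATION = 0.40            # assumed compute utilization
SECONDS_PER_MONTH = 30 * 24 * 3600

for months in (2, 4):
    total_flops = H100_BF16_FLOPS * GPUS * UTILIZATION * months * SECONDS_PER_MONTH
    print(f"{months} months: {total_flops:.1e} FLOPs")
# -> ~1.0e26 FLOPs at 2 months, ~2.1e26 at 4 months, matching the quoted range.
```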
SemiAnalysis gives an estimate of 24-27 kilowatts per
This might require bandwidth of about 300 Tbps for 500K B200s systems (connecting their geographically distributed parts), based on the below estimate. It gets worse with scale.
The "cluster" label applied in this context might be a bit of a stretch, for example the Llama 3 24K H100s cluster is organized in pods of 3072 GPUs, and the pods themselves are unambiguously clusters, but at the top level they are connected with 1:7 oversubscription (Section 3.3.1).
Only averaged gradients need to be exchanged at the top level, once at each optimizer step (minibatch...
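To make the shape of that bandwidth estimate concrete, here's a minimal sketch of the gradient-exchange arithmetic. Every number below (parameter count, gradient precision, allowed exchange time) is a hypothetical placeholder rather than the footnoted assumptions, so treat the output as an order-of-magnitude illustration of why figures in the hundreds of Tbps fall out of this kind of calculation:

```python
# Sketch: inter-site bandwidth if only averaged gradients cross the top-level
# (geographically distributed) links, once per optimizer step.
PARAMS = 7e12              # hypothetical parameter count for a 500K-B200-scale run
BYTES_PER_GRADIENT = 2     # BF16 gradients (assumption)
EXCHANGE_TIME_S = 0.5      # assumed slice of each step available for the exchange

gradient_bits = PARAMS * BYTES_PER_GRADIENT * 8
# Each site both sends its local gradients and receives the averaged result.
bits_per_step = 2 * gradient_bits
bandwidth_tbps = bits_per_step / EXCHANGE_TIME_S / 1e12
print(f"~{bandwidth_tbps:.0f} Tbps")   # ~448 Tbps with these placeholder values
```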