Interesting, hadn't heard of this! I haven't fully grasped the "No evidence for nudging after adjusting for publication bias" study yet, but at first glance it looks to me like evidence for small effect sizes rather than for no effect at all? Generally, when people say "nudging doesn't work", this can mean a lot of things, from "there's no effect at all" to "there often is an effect, but it's not very large, and it's not worth it to focus on this in policy debates", to "it has a significant effect, but it will never solve a problem fully because it only affects the behavior of a minority of subjects".
There's also this article making some similar points, overall defending the effectiveness of nudging while also pushing for more nuance in the debate. They cite one very large study in particular that showed significant effects while avoiding publication bias (emphasis mine):
The study was unique because these organizations had provided access to the full universe of their trials—not just ones selected for publication. Across 165 trials testing 349 interventions, reaching more than 24 million people, the analysis shows a clear, positive effect from the interventions. On average, the projects produced an average improvement of 8.1 percent on a range of policy outcomes. The authors call this “sizable and highly statistically significant,” and point out that the studies had better statistical power than comparable academic studies. So real-world interventions do have an effect, independent of publication bias. (...) We can start to see the bigger problem here. We have a simplistic and binary “works” versus “does not work” debate. But this is based on lumping together a massive range of different things under the “nudge” label, and then attaching a single effect size to that label.
Personally I have a very strong prior that nudging must have an effect > 0 - it would just be extremely surprising to me if the effect of an intervention that clearly points in one direction were exactly 0. This may however still be compatible with the effects in many cases being too small to be worth putting the spotlight on, and I suspect it just strongly depends on the individual case and intervention.
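To make the "small effect, not zero effect" reading concrete: a quick simulation (purely hypothetical numbers - the effect size, standard error, and significance filter are all made up for illustration) shows how a literature filtered by statistical significance overstates a small true effect, so that a bias correction can pull the pooled estimate toward zero even when the true effect is positive.

```python
import random
import statistics

random.seed(0)

# Hypothetical setup: each "study" estimates a small true effect (0.1)
# with sampling noise (SE = 0.15, i.e. underpowered studies). If only
# estimates clearing a crude significance bar (> 1.96 * SE) get
# published, the published record overstates the true effect.
TRUE_EFFECT = 0.1
SE = 0.15
N_STUDIES = 10_000

estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(N_STUDIES)]
published = [e for e in estimates if e > 1.96 * SE]  # significance filter

all_mean = statistics.mean(estimates)   # close to the true 0.1
pub_mean = statistics.mean(published)   # substantially inflated

print(f"mean of all studies:    {all_mean:.3f}")
print(f"mean of published only: {pub_mean:.3f}")
```

The inflated published mean is what a publication-bias adjustment tries to undo - and after the adjustment, a genuinely small effect can look indistinguishable from zero, which is consistent with both readings of the study.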
Unless I misunderstand your comment, isn't it rather the opposite of odd that user stories are so popular, given that this is what the bias would predict? That being said, maybe I've argued a bit too strongly in one direction with this post - I wouldn't even say that user stories are detrimental or useless. Depending on your product, it may well be that a significant share of your users have strong intent. My main claim is that in most situations, the number of people who are closer to the middle of the spectrum is >0. But it's not necessary for that group to dominate the distribution.
So in my view, it can still make sense to focus on a subgroup of your users who know what they're doing, as long as you remain aware that this will not apply to all users. E.g. when A/B testing, you should expect by default that making any feature even mildly less convenient to use will have negative effects. So you should not be surprised to see that result - but it may still be the right choice to make such a change nonetheless, depending on what benefits you hope to get from it.
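To illustrate why the negative result shouldn't surprise you: with typical traffic volumes, even a mild drop in convenience shows up as statistically significant. A minimal sketch with a standard two-proportion z-test (all numbers here are hypothetical - 100k users per arm, conversion dropping from 10.0% to 9.5%):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test with pooled variance."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)          # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical A/B test: arm A keeps the convenient feature (10.0%
# conversion), arm B removes it (9.5%), 100k users in each arm.
z = two_proportion_z(10_000, 100_000, 9_500, 100_000)
print(f"z = {z:.2f}")  # comfortably above 1.96, i.e. significant at p < .05
```

So a "significant negative effect" is the expected baseline result of such a change; the real question is whether the other benefits of the change outweigh that cost.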
During winter, opening windows will raise your heating bills like mad.
Opening several windows/doors widely for a few minutes every couple of hours, rather than keeping one of them open for longer periods, is supposed to mostly prevent this, as it exchanges the air in your room without significantly cooling down the floor/walls/furniture. But of course you're still right that it's a trade-off, and for some people it's much easier to achieve consistently good CO2 levels than for others. For many it may be worth at least getting a CO2 monitor to be able to make better-informed decisions.
One could certainly argue that improving an existing system while keeping its goals the same may be an easier (or at least different) problem to solve than creating a system from scratch and instilling some particular set of values into it (where part of the problem is to even find a way to formalize the values, or know what the values are to begin with - both of which would be fully solved for an already existing system that tries to improve itself).
I would be very surprised if an AGI would find no way at all to improve its capabilities without affecting its future goals.
Side point: this whole idea is arguably somewhat opposed to what Cal Newport in Deep Work describes as the "any benefit mindset", i.e. people's tendency to use tools when they can see any benefit in them (Facebook being one example, as it certainly does come with the benefit of keeping you in touch with people you would otherwise have no connection to), while ignoring the hidden costs of these tools (such as the time/attention they require). I think both ideas are worth keeping in mind when evaluating the usefulness of a tool. Ask yourself both whether the usefulness of the tool can be deliberately increased, and whether the tool's benefits are ultimately worth its costs.
I think it does relate to examples 2 and 3, although I would still differentiate between perfectionism in the sense that you actually keep working on something for a long time to reach perfection on the one hand, and doing nothing because a hypothetical alternative deters you from some immediate action on the other hand. The latter is more what I was going for here.
Good point, agreed. If "pay for a gym membership" turns out to be "do nothing and pay $50 a month for it", then it's certainly worse than "do nothing at home".
I would think that code generation has a much greater appeal to people / is more likely to go viral than code review tools. The latter surely is useful and I'm certain it will be added relatively soon to github/gitlab/bitbucket etc., but if OpenAI wanted to start out building more hype about their product in the world, then generating code makes more sense (similar to how art generating AIs are everywhere now, but very few people would care about art critique AIs).
Can you elaborate? Were there any new findings about the validity of the contents of Predictably Irrational?
This is definitely an interesting topic, and I too would like to see a continued discussion as well as more research in the area. I also think that Jeff Nobbs' articles are not a great source, as he seems to twist the facts quite a bit in order to support his theory. This is particularly the case for part 2 of his series - looking into practically any of the linked studies, I found issues with how he summarized them. Some examples:
(note I wrote this up from memory, so it's possible I've mixed something up in the examples above - might be worth writing a post about it with properly linked sources)
I still think he's probably right about many things, and it's most certainly correct that oils high in Omega-6 in particular aren't healthy (which might indeed include Canola oil, which I was not aware of before reading his articles). Still, he seems to be pursuing an agenda to an extent that prevents him from summarizing studies accurately, which is not great. It doesn't mean he's wrong, but it does mean I won't trust anything he says without checking the sources.