Jonas Hallgren


Man, this was a really fun and exciting post! Thank you for writing it. 


Maybe there's a connection to the FEP here? I remember Karl Friston saying something about how we can see morality as downstream from the FEP, which is (kinda?) used here, maybe? 

I actually read fewer books than I used to. The 3x thing is that I listen to audiobooks at 3x speed, so I get through less non-fiction overall, but at a faster pace.

Another example of "weird but useful" in my head is looking into population dynamics to understand alignment failures. When does ecology predict that mode collapse will happen inside of large language models? Understanding these areas and writing about them is weird, but it could also be a useful bet for at least someone to take.

However, this also depends on how saturated the normal approaches are. I would recommend trying to understand the problems and current approaches really well and then coming up with ways of tackling them. To get the bits of information on how to tackle them, you might want to check out weirder fields, since those bits aren't already in the common pool of "alignment information", if that makes sense?

Generally a well-argued post; I enjoyed it even though I didn't agree with all of it. 

I do want to point out the bitter lesson when it comes to capabilities increase. On current priors, it seems like intelligence should be something that can solve a lot of tasks at the same time. This would point towards higher capabilities in individual AIs, especially once you add online learning to the mix. An AGI will not have a computational storage limit on the amount of knowledge it can hold. The division of agents you propose can most likely be merged into a single agent; it's more a question of storage and retrieval time, and storing an activation module for "play chess" is something that will not be computationally intractable for an AGI to do.

This means that the most probable current path forward is highly capable general AI that generalises across tasks.

Since you asked so nicely, I can give you two other models. 

1. Meditation is like slow recursive self-improvement and reprogramming of your mind. It gives focus and mental-health benefits that are hard to get elsewhere. If you want to accelerate your growth, I think it's really good. Reading a mechanistic model of meditation and then working through the stages in the book The Mind Illuminated will give you this. (At least, this is how it has been for me.)
2. Try to be weird and useful. If you have a weird background, you will catch ideas that might fall through the cracks for other people. Yet to make those ideas worth something, you have to be able to actually act on them, meaning you need to know how to, for example, communicate. So try to find the Pareto frontier between weird and useful by following and doing stuff you find interesting, but also valuable and challenging.

(Read a fuckton of non-fiction books as well if that isn't obvious. Just look up 30 different top 10 lists and you will have a good range to choose from.)

I just wanted to say that you have my vote of confidence on this. It makes the intuitions behind the idea more salient as well.

Sorry for not responding earlier; I'm working on a post that goes through related things in more detail. 

Well I'm just saying that the red blob goes outside the striped circle. The red blob is our viscera, which has now flowed outside our boundary. 

I imagine boundaries as a way of depicting the world: the viscera is the "object" in the territory, while the boundary is our map of that object. This means the viscera can change without the boundary changing, which in turn leads to a mismatch, and to both exfiltration and infiltration in this case.

I wanted to bring up a mode of cognition related to undirected time, inspired by Kaj Sotala's multiagent models of mind, which I think of as directed undirected time. It basically means defining a vague area, such as how to bridge natural abstractions to interpretability, within which I allow my thoughts to roam free. If I get a thought like, "Ah man, I should have been nicer in that situation yesterday", then I don't engage with it. It's related to "problem-solving meditations" and has been very beneficial to me, especially in the context of a walk. 
For those of you who like visualisations, you can imagine it as defining an area inside which thoughts are allowed to bounce around.


(A caveat is that I have meditated quite a bit, so I have pretty good introspective awareness, which might be required for directed undirected exploration; I'm pretty sure it helps, at least.)


I want to point out the two main assumptions I think your model has to make for it to be the best model for prediction. (I think they're good assumptions.)

1. There is a steep difficulty increase in training NNs that act over larger time spans. 
2. This is the best metric to use as it outcompetes all other metrics when it comes to making useful predictions about the efficacy of NNs. 

I like the model and I think it's better than not having one. I do think it misses out on some of the things Steven Byrnes responds with. There's a danger of it being too much of a Procrustean bed, or overfitted, as specific subtasks and cognition that humans have evolved might be harder to replicate than others. The main bottlenecks might then lie not in the temporal planning distance but in something else.

My prior on the t-AGI framework not being overfitted is probably something like 60-80%, due to the bitter lesson, which to some extent tells us that cognition can be replicated quite well with deep learning. So I tend to agree, but I would have liked to see a bit more epistemic humility on this point, I guess.

I feel like this is trying to say something important, but my brain isn't parsing it.

First and foremost, what categorisation are we talking about? Secondly, in what way are the categories framed in terms of social perception? Thirdly, what do you mean by direction and how does Buck confuse the direction?

(Sorry if this is obvious)
