Linda Linsefors

Hi, I am a Physicist, an Effective Altruist and AI Safety student/researcher.


Comments

Recording thought in progress...

I notice that I don't expect FOOM-like RSI, because I don't expect we'll get a mesa-optimiser with coherent goals. It's not hard to give the outer optimiser (e.g. gradient descent) a coherent goal. For the outer optimiser to have a coherent goal is the default. But I don't expect that to translate to the inner optimiser. The inner optimiser will just have a bunch of heuristics and proxy goals, and not be very coherent, just like humans.

The outer optimiser can't FOOM, since it doesn't do planning and doesn't have strategic self-awareness. It can only do some combination of hill climbing and random trial and error. If something is FOOMing it will be the inner optimiser, but I expect that one to be a mess.

I notice that this argument doesn't quite hold. More coherence is useful for RSI, but complete coherence is not necessary.

I also notice that I expect AIs to make fragile plans, but on reflection, I expect them to get better and better at this. By fragile I mean that the longer the plan is, the more likely it is to break. This is true for humans too, though. But we are self-aware enough about this fact to mostly compensate, i.e. make plans that don't have too many complicated steps, even if the plan spans a long time.
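
A minimal sketch of what I mean by fragile, under the strong and purely illustrative assumption that each step of a plan succeeds independently with the same probability. The function name and the numbers are made up for the example.

```python
def plan_success_probability(n_steps: int, step_reliability: float) -> float:
    """Chance that every step works, assuming independent, equally reliable steps."""
    if not 0.0 <= step_reliability <= 1.0:
        raise ValueError("step_reliability must be between 0 and 1")
    return step_reliability ** n_steps


# Same per-step reliability, different plan lengths (illustrative numbers only).
for n_steps in (3, 10, 30):
    print(n_steps, plan_success_probability(n_steps, 0.95))
```

In this toy picture, what matters is the number of steps that can fail, not the calendar time the plan spans, which is how a long plan with few complicated steps can still hold up.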

Like, as a crappy toy model, if every alignment-visionary's vision would ultimately succeed, but only after 30 years of study along their particular path, then no amount of new visionaries added will decrease the amount of time required from “30y since the first visionary started out”.

 

I think that a closer-to-true model is that most current research directions will lead approximately nowhere, but we don't know until someone goes and checks. Under this model, adding more researchers increases the probability that at least someone is working on a fruitful research direction. And I don't think you (So8res) disagree, at least not completely?
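
As a rough sketch of this model, assume (my simplification, not a claim about real numbers) that each research direction independently turns out to be fruitful with the same small probability. The function name and the value 0.02 are hypothetical.

```python
def p_at_least_one_fruitful(n_directions: int, p_each: float) -> float:
    """Probability that at least one of n independent directions is fruitful."""
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be between 0 and 1")
    # Complement of "every single direction leads nowhere".
    return 1.0 - (1.0 - p_each) ** n_directions


# Illustrative only: 0.02 is an arbitrary per-direction success chance.
for n in (1, 10, 50, 100):
    print(n, round(p_at_least_one_fruitful(n, 0.02), 3))
```

Under these assumptions each extra researcher on a new direction raises the chance that someone is on a fruitful path, with diminishing returns as the number of directions grows.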

I don't think we're doing something particularly wrong here. Rather, I'd say: the space to explore is extremely broad; humans are sparsely distributed in the space of intuitions they're able to draw upon; people who have an intuition they can follow towards plausible alignment-solutions are themselves pretty rare; most humans don't have the ability to make research progress without an intuition to guide them. Each time we find a new person with an intuition to guide them towards alignment solutions, it's likely to guide them in a whole new direction, because the space is so large. Hopefully at least one is onto something.

I do think that researchers stack, because there are lots of different directions that can and should be explored in parallel. So maybe the crux is what fraction of people can do this? Most people I talk to do have research intuitions. I think it takes time and skill to cultivate one's intuition into an agenda that one can communicate to others, but just having enough intuition to guide oneself is a much lower bar. However, most people I talk to think they have to fit into someone else's idea of what AIS research looks like in order to get paid. Unfortunately, I think this is a correct belief for everyone without exceptional communication skills and/or connections. But I'm honestly uncertain about this, since I don't have a good understanding of the current funding landscape.

Aside from money, there are also imposter-syndrome-type effects going on. A lot of people I talk to don't feel like they are allowed to have their own research direction, for vague social reasons. Some things that I have noticed sometimes help:

  • Telling them "Go for it!", and similar things. Repetition helps.
  • Talking about how young AIS is as a field, and the implications of this, including the fact that their intuitions about the importance of expertise are probably wrong when applied to AIS.
  • Handing over a post-it note with the text "Hero Licence".

I believe you that in some parts of Europe this is happening, which is good.

I "feel shocked that everyone's dropping the ball".

 

Maybe not everyone
The Productivity Fund (nonlinear.org)
Although this project has been "Coming soon!" for several months now. If you want to help with the non-dropping of this ball, you could check in with them to see if they could use some help.

Funding is not truly abundant. 

  • There are people who have above zero chance of helping that don't get upskilling grants or research grants. 
  • There are several AI Safety orgs that are for-profit in order to get investment money, and/or to be self-sufficient, because given their particular network, it was easier to get money that way (I don't know the details of their reasoning).
  • I would be more efficient if I had some more money and did not need to worry about budgeting in my personal life. 

I don't know to what extent this is due to the money not existing, or due to grant evaluation being hard and there being some reasons not to give out money too easily.

Is this... not what's happening?


Not by default.

I did not have this mindset right away. When I was new to AI Safety I thought it would require much more experience before I was qualified to question the consensus, because that is the normal situation in all the old sciences. I knew AI Safety was young, but I did not understand the implications at first. I needed someone to prompt me to get started.

Because I've run various events and co-founded AI Safety Support, I've talked to loooots of AI Safety newbies. Most people are too cautious when it comes to believing themselves and too ready to follow authorities. It usually only takes a short conversation pointing out how incredibly young AI Safety is, and what that means, but many people do need this one push.

Yes, that makes sense. Having a bucket is definitely helpful for finding advice.

I can't answer for Duncan, but I have had similar enough experiences that I will answer for myself. When I notice that someone is chronically typical minding (not just typical minding as a prior, but showing signs that they are unable to even consider that others might be different in unexpected ways), then I leave as fast as I can, because such people are dangerous. Such people will violate my boundaries until I have a full meltdown. They will do so in the full belief that they are helpful, and override anything I tell them with their own prior convictions.

I tried to get over the feeling of discomfort when I felt misunderstood, and it did not work. Because it's not just a reminder that the world isn't perfect (something I can update on and get over), but an active warning signal.

Learning to interpret this warning signal, and knowing when to walk away, has helped a lot.

Different people and communities are more or less compatible with my style of weird. Keeping track of this is very useful.  

I think this comment is pointing in the right direction. But I disagree with

E.g. today we have buckets like "ADHD" and "autistic" with some draft APIs attached

There are buckets, but I don't know what the draft APIs would be. Unless you count "finding your own tribe and staying away from the neurotypicals" as an API.

If you know something I don't, let me know!

Yes, that is a thing you can do with decision transformers too. I was referring to a variant of the decision transformer (see link in original short form) where the AI samples the reward it's aiming for.
